Generalized Representation of Syntactic Structures
Abstract
Analysis of language provides important insights into the underlying psychological properties of individuals and groups. While the majority of language analysis work in psychology has focused on semantics, psychological information is encoded not just in what people say, but how they say it. In the current work, we propose Conversation Level Syntax Similarity Metric-Group Representations (CASSIM-GR). This tool builds generalized representations of syntactic structures of documents, thus allowing researchers to distinguish between people and groups based on syntactic differences. CASSIM-GR builds off of Conversation Level Syntax Similarity Metric by applying spectral clustering to syntactic similarity matrices and calculating the center of each cluster of documents. This resulting cluster centroid then represents the syntactical structure of the group of documents. To examine the effectiveness of CASSIM-GR, we conduct three experiments across three unique corpora. In each experiment, we calculate the clustering accuracy and compare our proposed technique to a bag-of-words approach. Our results provide evidence for the effectiveness of CASSIM-GR and demonstrate that combining syntactic similarity and tf-idf semantic information improves the total accuracy of group classification.
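The "center of each cluster" step can be illustrated with a minimal sketch. It assumes that a pairwise syntactic similarity matrix and cluster labels (for example, from spectral clustering) are already available, and it picks the most central document of each cluster as a medoid, i.e. the document with the highest mean similarity to the other members. The abstract does not specify how the centroid is computed, so the medoid choice, the function name `cluster_medoids`, and the toy similarity values are all assumptions made for illustration, not the authors' procedure.

```python
from collections import defaultdict


def cluster_medoids(similarity, labels):
    """Return {cluster_label: index of the most central document}.

    Hypothetical helper: `similarity` is a symmetric matrix (list of lists)
    of pairwise document similarities, `labels` gives one cluster id per
    document. The medoid is an assumed stand-in for the cluster centroid.
    """
    # Group document indices by cluster label.
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)

    medoids = {}
    for label, idxs in members.items():
        def mean_sim(i, idxs=idxs):
            # Mean similarity of document i to the other cluster members.
            others = [similarity[i][j] for j in idxs if j != i]
            return sum(others) / len(others) if others else 1.0

        medoids[label] = max(idxs, key=mean_sim)
    return medoids


# Toy similarity matrix for five documents (made-up values).
similarity = [
    [1.00, 0.90, 0.80, 0.10, 0.20],
    [0.90, 1.00, 0.85, 0.20, 0.10],
    [0.80, 0.85, 1.00, 0.15, 0.10],
    [0.10, 0.20, 0.15, 1.00, 0.70],
    [0.20, 0.10, 0.10, 0.70, 1.00],
]
labels = [0, 0, 0, 1, 1]  # assumed output of a clustering step

medoids = cluster_medoids(similarity, labels)
```

Here the representative of each cluster is an actual document, which keeps the sketch independent of any particular encoding of syntax trees; a true centroid would require a vector or tree representation that the abstract does not describe.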