eScholarship
Open Access Publications from the University of California

Grammar-Based and Lexicon-Based Techniques to Extract Personality Traits from Text

Abstract

Language provides an important source of information to predict human personality. However, most studies that have predicted personality traits using computational linguistic methods have focused on lexicon-based information. We investigate how the performance of lexicon-based and grammar-based methods compares when predicting personality traits. We analyzed a corpus of student essays and their authors' personality traits using two lexicon-based approaches, one top-down (Linguistic Inquiry and Word Count (LIWC)) and one bottom-up (topic models), and one grammar-driven approach (Biber model), as well as combinations of these models. Results showed that the models and their combinations performed similarly: lexicon-based top-down and bottom-up models did not differ, and neither did lexicon-based and grammar-based models. Moreover, combining models did not improve performance. These findings suggest that predicting personality traits from text remains difficult, but that the performance of lexicon-based and grammar-based models is on par.
