Lexical Predictors of Personality Type
Shlomo Argamon, Sushant Dhawle, Moshe Koppel and James Pennebaker
In: 2005 Joint Annual Meeting of the Interface and the Classification Society of North America, 8-12 Jun 2005, St. Louis, MO.
We are developing methods for “author profiling”, in which aspects of an author’s
identity are inferred from a text without necessarily requiring a corpus of
documents by the same individual. A key component of such an identity
profile is personality; this paper addresses distinguishing high from low neuroticism
and extraversion in authors of informal text. We consider four different sets of lexical
features for this task: a standard function word list, conjunctive phrases, modality indicators,
and appraisal adjectives and modifiers. SMO, a support vector machine learner,
was used to learn linear separators for the high and low classes in each of the two tasks.
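
As a rough illustration of the setup described above, the sketch below turns a text into relative frequencies over a small word list and fits a linear separator between a high and a low class. It is not the authors' implementation: the word list FUNCTION_WORDS is a hypothetical stand-in for the paper's function word list, the use of relative frequencies as feature values is an assumption, and the trainer uses plain subgradient descent on the hinge loss rather than SMO, chosen only to keep the example free of dependencies. The names featurize, train_linear, and predict are likewise illustrative.

```python
import re

# Hypothetical, tiny stand-in for a function word list; the paper's actual
# list is not reproduced here.
FUNCTION_WORDS = ["i", "the", "we", "but", "not", "very"]


def featurize(text, lexicon=FUNCTION_WORDS):
    """Relative frequency of each lexicon word among the text's tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [tokens.count(w) / total for w in lexicon]


def train_linear(X, y, epochs=200, lr=0.1, reg=0.01):
    """Fit weights w and bias b for a linear separator.

    Labels in y are +1 (high) or -1 (low). Training uses subgradient
    descent on the L2-regularized hinge loss, a simple substitute for SMO.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink the weights slightly on every step (L2 regularization).
            w = [wi * (1 - lr * reg) for wi in w]
            if margin < 1:  # inside the margin or misclassified
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separator."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

With a linear separator of this kind, the magnitude and sign of each learned weight indicate how strongly the corresponding lexical feature pulls toward the high or the low class, which is what makes the most heavily weighted features open to inspection.
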
We find that appraisal use is the best predictor for neuroticism, and that function words
work best for extraversion. Further, examining the most important individual features
yields insight into how neuroticism and extraversion differentially affect language