PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Online Entropy-based Model of Lexical Category Acquisition
Grzegorz Chrupala and Afra Alishahi
In: CoNLL 2010, Uppsala, Sweden (2010).


Children learn a robust representation of lexical categories at a young age. We propose an incremental model of this process which efficiently groups words into lexical categories based on their local context using an information-theoretic criterion. We train our model on a corpus of child-directed speech from CHILDES and show that the model learns a fine-grained set of intuitive word categories. Furthermore, we propose a novel evaluation approach by comparing the efficiency of our induced categories against other category sets (including traditional part-of-speech tags) in a variety of language tasks. We show that the categories induced by our model typically outperform the other category sets.

EPrint Type: Conference or Workshop Item (Talk)
Subjects: Computational, Information-Theoretic Learning with Statistics; Natural Language Processing
ID Code: 8853
Deposited By: Grzegorz Chrupala
Deposited On: 21 February 2012