PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Confidence-Weighted Learning of Factored Discriminative Language Models
Viet Ha-Thuc and Nicola Cancedda
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011), 2011.


Language models based only on word surface forms cannot benefit from available linguistic knowledge, and they tend to suffer from poor estimates for rare features. We propose an approach that addresses both limitations. We use factored features that can flexibly capture linguistic regularities, and we adopt confidence-weighted learning, a form of discriminative online learning that can better exploit a heavy tail of rare features. Finally, we extend confidence-weighted learning to handle label noise in the training data, a common situation in discriminative language modeling.
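
The abstract does not spell out the update rule, so the following is only an illustrative sketch of the general idea behind confidence-weighted online learning: each feature carries a mean weight and a variance, and features seen rarely keep a high variance, so they receive larger updates when they do occur. The sketch uses a diagonal form of AROW (Adaptive Regularization of Weights), a related algorithm from the confidence-weighted family that is often chosen for its tolerance to noisy labels. It is swapped in here for illustration and is not the paper's own method. The class name DiagonalAROW, the parameter r, and the toy feature names are all assumptions.

```python
from collections import defaultdict


class DiagonalAROW:
    """Illustrative diagonal AROW classifier over sparse features.

    Hypothetical sketch: not the update rule from the paper.
    """

    def __init__(self, r=1.0):
        self.r = r                                # regularization strength (assumed default)
        self.mu = defaultdict(float)              # mean weight per feature
        self.sigma = defaultdict(lambda: 1.0)     # variance (uncertainty) per feature

    def score(self, x):
        # x maps feature name -> value; unseen features contribute zero.
        return sum(self.mu[f] * v for f, v in x.items())

    def update(self, x, y):
        # y is +1 or -1. Update only when the margin is below 1.
        margin = y * self.score(x)
        if margin >= 1.0:
            return
        # Confidence of the current prediction: x^T Sigma x for diagonal Sigma.
        v = sum(self.sigma[f] * val * val for f, val in x.items())
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - margin) * beta
        for f, val in x.items():
            s = self.sigma[f]
            # High-variance (rarely seen) features move further.
            self.mu[f] += alpha * y * s * val
            # Each observation shrinks the variance of the features it contains.
            self.sigma[f] = s - beta * s * s * val * val


# Toy usage: "common" appears three times, "rare_a" only once.
model = DiagonalAROW()
examples = [
    ({"common": 1.0, "rare_a": 1.0}, 1),
    ({"common": 1.0}, 1),
    ({"common": 1.0}, 1),
    ({"neg": 1.0}, -1),
]
for features, label in examples:
    model.update(features, label)
```

After these updates the variance stored for "common" is lower than the one stored for "rare_a", which is the mechanism that lets this family of learners keep adapting weights for features in the heavy tail.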

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 8874
Deposited By: Nicola Cancedda
Deposited On: 21 February 2012