PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

TAG, Dynamic Programming, and the Perceptron for Efficient, Feature-rich Parsing
Xavier Carreras, Michael Collins and Terry Koo
In: 12th Conference on Computational Natural Language Learning (CoNLL), 16-17 August 2008, Manchester, UK.


We describe a parsing approach that makes use of the perceptron algorithm, in conjunction with dynamic programming methods, to recover full constituent-based parse trees. The formalism allows a rich set of parse-tree features, including PCFG-based features, bigram and trigram dependency features, and surface features. A severe challenge in applying such an approach to full syntactic parsing is the efficiency of the parsing algorithms involved. We show that efficient training is feasible, using a Tree Adjoining Grammar (TAG) based parsing formalism. A lower-order dependency parsing model is used to restrict the search space of the full model, thereby making it efficient. Experiments on the Penn WSJ treebank show that the model achieves state-of-the-art performance, for both constituent and dependency accuracy.

EPrint Type: Conference or Workshop Item (Oral)
Additional Information: CoNLL-2008 Best Paper Award
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 4510
Deposited By: Xavier Carreras
Deposited On: 13 March 2009