PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Inducing Compact but Accurate Tree-Substitution Grammars
Trevor Cohn, Sharon Goldwater and Phil Blunsom
In: NAACL 2009, 31 May - 05 Jun 2009, Boulder, CO, USA.


Tree substitution grammars (TSGs) are a compelling alternative to context-free grammars for modelling syntax. However, many popular techniques for estimating weighted TSGs (under the moniker of Data Oriented Parsing) suffer from the problems of inconsistency and overfitting. We present a theoretically principled model which solves these problems using a Bayesian non-parametric formulation. Our model learns compact and simple grammars, uncovering latent linguistic structures (e.g., verb subcategorisation), and in doing so far out-performs a standard PCFG.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Natural Language Processing
ID Code: 5886
Deposited By: Trevor Cohn
Deposited On: 08 March 2010