PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Modeling Inflection and Word-Formation in SMT
Alexander Fraser, Marion Weller, Aoife Cahill and Fabienne Cap
In: EACL 2012, 23-27 Apr 2012, Avignon, France.

Abstract

The current state of the art in statistical machine translation (SMT) suffers from sparsity and inadequate modeling power when translating into morphologically rich languages. We model both inflection and word-formation for the task of translating into German. For inflection, we generalize over different inflected forms of a word, while also ensuring coherence of the inflected output. For word-formation, we address compounding, which is highly productive in German. For both inflection and word-formation, we address the problem of portmanteaus. We translate from English words to an underspecified German representation and then use linear-chain CRFs to predict the fully specified German representation. We show that improved modeling of inflection and word-formation leads to improved translation performance.
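
As a rough illustration of the second step described in the abstract, the sketch below decodes a label sequence under a linear-chain model with Viterbi search. It is not the authors' system: the lemma sequence, the two-label case inventory, and all scores are hypothetical and hand-set here, whereas a real CRF would learn feature weights from data. The point is only to show how chain transitions can encourage agreement across an inflected phrase.

```python
from typing import Dict, List, Tuple

Score = Dict[Tuple[str, str], float]


def viterbi(tokens: List[str], labels: List[str],
            emit: Score, trans: Score) -> List[str]:
    """Return the highest-scoring label sequence for a linear chain.

    The score of a labeling is the sum of emit[(token, label)] over all
    positions plus trans[(previous_label, label)] over adjacent positions.
    Missing entries count as 0.0.
    """
    if not tokens:
        return []
    # best[i][y]: best score of any labeling of tokens[:i+1] that ends in y
    best = [{y: emit.get((tokens[0], y), 0.0) for y in labels}]
    back: List[Dict[str, str]] = []
    for tok in tokens[1:]:
        prev_scores = best[-1]
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the best previous label for ending in y at this position.
            p = max(labels, key=lambda q: prev_scores[q] + trans.get((q, y), 0.0))
            scores[y] = prev_scores[p] + trans.get((p, y), 0.0) + emit.get((tok, y), 0.0)
            pointers[y] = p
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy input: lemmas of a German noun phrase whose case is
# not yet specified. None of these values come from the paper.
TOKENS = ["d", "klein", "Haus"]
LABELS = ["Nom", "Dat"]
# Assumed evidence that the first word should be dative.
EMIT = {("d", "Dat"): 1.0}
# Assumed transition scores that reward adjacent words sharing a case.
TRANS = {("Nom", "Nom"): 2.0, ("Dat", "Dat"): 2.0}

if __name__ == "__main__":
    print(viterbi(TOKENS, LABELS, EMIT, TRANS))
```

With these assumed scores, the single piece of evidence on the first word propagates along the chain, so all three words receive the same case label; a separate generation step would then be needed to produce the inflected surface forms.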

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Natural Language Processing
ID Code: 9510
Deposited By: Alexander Fraser
Deposited On: 16 March 2012