PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Syntactic discriminative language model rerankers for statistical machine translation
Simon Carter and Christof Monz
Machine Translation, Volume 25, pp. 317-339, 2011.

Abstract

This article describes a method that successfully exploits syntactic features for n-best translation candidate reranking using perceptrons. We motivate the utility of syntax by demonstrating the superior performance of parsers over n-gram language models in differentiating between Statistical Machine Translation output and human translations. Our approach uses discriminative language modelling to rerank the n-best translations generated by a statistical machine translation system. The performance is evaluated for Arabic-to-English translation using NIST’s MT-Eval benchmarks. While deep features extracted from parse trees do not consistently help, we show how features extracted from a shallow Part-of-Speech annotation layer outperform a competitive baseline and a state-of-the-art comparative reranking approach, leading to significant BLEU improvements on three different test sets.
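
As a rough illustration of the general idea of perceptron reranking over Part-of-Speech features, the following minimal sketch may help. It is not the implementation from the article: it assumes a plain (unaveraged) perceptron, POS-tag bigram counts as the only features, and an oracle index per n-best list that is supplied from outside (for example, the candidate closest to the reference). All function names (pos_ngram_features, score, rerank, train) are hypothetical.

```python
from collections import Counter


def pos_ngram_features(tags, n=2):
    """Count POS-tag n-grams (with boundary markers) for one candidate."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    return Counter(tuple(padded[i:i + n]) for i in range(len(padded) - n + 1))


def score(weights, feats):
    """Dot product between the sparse weight vector and a feature vector."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def rerank(weights, candidates):
    """Return the index of the highest-scoring candidate.

    Each candidate is a list of POS tags; ties go to the lowest index.
    """
    return max(
        range(len(candidates)),
        key=lambda i: score(weights, pos_ngram_features(candidates[i])),
    )


def train(nbest_lists, oracle_indices, epochs=5):
    """Plain perceptron training (assumption: no averaging, unit learning rate)."""
    weights = {}
    for _ in range(epochs):
        for candidates, oracle in zip(nbest_lists, oracle_indices):
            predicted = rerank(weights, candidates)
            if predicted == oracle:
                continue
            # Move the weights towards the oracle and away from the wrong guess.
            for f, v in pos_ngram_features(candidates[oracle]).items():
                weights[f] = weights.get(f, 0.0) + v
            for f, v in pos_ngram_features(candidates[predicted]).items():
                weights[f] = weights.get(f, 0.0) - v
    return weights
```

In this sketch, train would be called with POS-tagged n-best lists and their oracle indices, and rerank would then pick a candidate from an unseen n-best list. A practical system would typically combine such a score with the baseline translation model score, which is omitted here.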

EPrint Type: Article
Subjects: Natural Language Processing
ID Code: 9341
Deposited By: Christof Monz
Deposited On: 16 March 2012