PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Bilingual segmentation for phrasetable pruning in Statistical Machine Translation
Germán Sanchis-Trilles, Daniel Ortiz-Martínez, Jesús González Rubio, Jorge Gonzalez and Francisco Casacuberta
In: EACL 2011 (2011).


Statistical machine translation systems have greatly improved in recent years. However, this boost in performance usually comes at a high computational cost, yielding systems that are often not suitable for integration in hand-held or real-time devices. We describe a novel technique for reducing such cost by performing a Viterbi-style selection of the parameters of the translation model. We present results with finite-state transducers and phrase-based models showing a 98% reduction in the number of parameters and a 15-fold increase in translation speed without any significant loss in translation quality.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 8793
Deposited By: Alfons Juan
Deposited On: 21 February 2012