PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Statistical approaches to computer-assisted translation
S. Barrachina, O. Bender, Francisco Casacuberta, Jorge Civera, E. Cubel, S. Khadivi, A. Lagarda, H. Ney, J. Tomás and Enrique Vidal
Computational Linguistics Volume 35, Number 1, pp. 3-28, 2009. ISSN 0891-2017


Current machine translation (MT) systems are still not perfect. In practice, the output from these systems needs to be edited to correct the errors. A way of increasing the productivity of the whole translation process (MT plus human work) is to incorporate the human correction activities within the translation process itself, thereby shifting the MT paradigm to that of computer-assisted translation. This model entails an iterative process in which the human translator activity is included in the loop: In each iteration, a prefix of the translation is validated (accepted or amended) by the human and the system computes its best (or N-best) translation suffix hypothesis to complete this prefix. A successful framework for MT is the so-called statistical (or pattern recognition) framework. Interestingly, within this framework, the adaptation of MT systems to the interactive scenario affects mainly the search process, allowing a great reuse of successful techniques and models. In this article, alignment templates, phrase-based models and stochastic finite-state transducers are used to develop computer-assisted translation systems. These systems were assessed in a European project (TransType2) in two real tasks: The first was the translation of printer manuals; the second was the translation of the Bulletin of the European Union. In each task, the following three pairs of languages were involved (in both translation directions): English-Spanish, English-German and English-French.
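The iterative prefix/suffix loop described in the abstract can be illustrated with a minimal sketch. It is not the search used by the systems in the article: a small, hand-written list of scored hypotheses stands in for the alignment-template, phrase-based, or finite-state search space, and the simulated user simply accepts the longest correct prefix and types one character. The names HYPOTHESES, best_suffix, and interactive_session, as well as the example sentences and scores, are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical scored full-translation hypotheses (toy stand-in for a real
# statistical MT search space; the sentences and scores are invented).
HYPOTHESES: List[Tuple[str, float]] = [
    ("haga clic en el icono", 0.5),
    ("haga clic en Aceptar", 0.3),
    ("pulse Aceptar", 0.2),
]


def best_suffix(prefix: str,
                hypotheses: List[Tuple[str, float]] = HYPOTHESES) -> Optional[str]:
    """Return the suffix of the best-scoring hypothesis that starts with prefix.

    Returns None when no hypothesis is compatible with the validated prefix.
    """
    compatible = [(text, score) for text, score in hypotheses
                  if text.startswith(prefix)]
    if not compatible:
        return None
    text, _ = max(compatible, key=lambda pair: pair[1])
    return text[len(prefix):]


def interactive_session(reference: str,
                        hypotheses: List[Tuple[str, float]] = HYPOTHESES) -> int:
    """Simulate the loop and return the number of characters the user typed.

    In each iteration the system proposes a suffix, the simulated user accepts
    the longest part that agrees with the reference, then amends the prefix by
    typing the next reference character.
    """
    prefix = ""
    keystrokes = 0
    while prefix != reference:
        suffix = best_suffix(prefix, hypotheses) or ""
        candidate = prefix + suffix
        # Length of the longest common prefix of candidate and reference.
        agreed = 0
        limit = min(len(candidate), len(reference))
        while agreed < limit and candidate[agreed] == reference[agreed]:
            agreed += 1
        prefix = reference[:agreed]
        if prefix == reference:
            break
        # The user corrects the first wrong (or missing) character.
        prefix = reference[:agreed + 1]
        keystrokes += 1
    return keystrokes
```

In this toy setting, the number of typed characters returned by interactive_session gives a rough sense of the human effort that the interactive scheme aims to reduce; a real system would search its translation model under the prefix constraint rather than filter a fixed list.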

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: User Modelling for Computer Human Interaction; Natural Language Processing
ID Code: 5821
Deposited By: Alfons Juan
Deposited On: 08 March 2010