PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Source-Language Entailment Modeling for Translating Unknown Terms
Shachar Mirkin, Lucia Specia, Nicola Cancedda, Ido Dagan, Marc Dymetman and Idan Szpektor
In: ACL-IJCNLP 2009, 2-7 Aug 2009, Singapore.


This paper addresses the task of handling unknown terms in SMT. We propose using source-language monolingual models and resources to paraphrase the source text prior to translation. We further present a conceptual extension to prior work by allowing translations of entailed texts rather than paraphrases only. A method for performing this process efficiently is presented and applied to some 2500 sentences with unknown terms. Our experiments show that the proposed approach substantially increases the number of properly translated texts.

PDF - Requires Adobe Acrobat Reader or other PDF viewer.
EPrint Type:Conference or Workshop Item (Lecture)
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Natural Language Processing
ID Code:5416
Deposited By:Shachar Mirkin
Deposited On:18 July 2009