PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Source-Language Entailment Modeling for Translating Unknown Terms.
Shachar Mirkin, Lucia Specia, Nicola Cancedda, Ido Dagan, Marc Dymetman and Idan Szpektor
ACL 2009.

Abstract

This paper addresses the task of handling unknown terms in SMT. We propose using source-language monolingual models and resources to paraphrase the source text prior to translation. We further present a conceptual extension to prior work by allowing translations of entailed texts rather than paraphrases only. A method for performing this process efficiently is presented and applied to some 2500 sentences with unknown terms. Our experiments show that the proposed approach substantially increases the number of properly translated texts.

EPrint Type:Article
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Natural Language Processing
ID Code:5864
Deposited By:Marc Dymetman
Deposited On:08 March 2010