PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Linguistically Enriched Word-Sequence Kernels for Discriminative Language Modeling
Pierre Mahé and Nicola Cancedda
In: Learning Machine Translation (2009) MIT Press, Cambridge, Massachusetts, USA, pp. 111-128.


This chapter introduces a method for exploiting background linguistic resources in statistical machine translation. Morphological, syntactic and possibly semantic properties of words are combined by means of an enriched word-sequence kernel. In contrast to alternative formulations, linguistic resources are integrated so as to generate rich composite features defined across the various word representations. Word-sequence kernels find natural applications in discriminative language modeling, where they can help correct specific problems of the translation process. As a first step in this direction, experiments on an artificial problem, the detection of word misordering, demonstrate the value of the proposed kernel construction.

EPrint Type: Book Section
Subjects: Natural Language Processing
ID Code: 4235
Deposited By: Nicola Cancedda
Deposited On: 19 December 2008