PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Asymmetric Term Alignment with Selective Contiguity Constraints by Multi-Tape Automata
Madalina Barbaiani, Nicola Cancedda, Chris Dance, Szilard Fazekas, Tamas Gaal and Eric Gaussier
In: 6th International Workshop on Finite-State Methods and Natural Language Processing, 14-16 September 2007, Potsdam, Germany.


This article describes an HMM-based word-alignment method that can selectively enforce a contiguity constraint. The method applies directly to the extraction of a bilingual terminological lexicon from a parallel corpus, and can also serve as a preliminary step for extracting phrase pairs in a Phrase-Based Statistical Machine Translation system. Contiguous source words that compose a term are aligned to contiguous target-language words. The HMM is transformed into a Weighted Finite State Transducer (WFST), and the contiguity constraints are enforced by specific multi-tape WFSTs. The proposed method is especially well suited to cases where basic linguistic resources (morphological analyzers, part-of-speech taggers and term extractors) are available for the source language only.
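
As a rough illustration of what a selective contiguity constraint means, the sketch below checks a finished alignment rather than enforcing the constraint with multi-tape WFSTs as the article does. It assumes, hypothetically, that an alignment is a list giving for each target word the index of the source state that generated it, and that only the states listed in `constrained` (for example, states corresponding to source terms) must cover one unbroken run of target words. The function name and data layout are illustrative and not taken from the article.

```python
from typing import Hashable, Sequence, Set


def is_contiguous(alignment: Sequence[Hashable], constrained: Set[Hashable]) -> bool:
    """Return True if every constrained state occupies a single unbroken run.

    alignment[j] is the (assumed) source state aligned to target word j.
    States outside `constrained` may reappear freely.
    """
    closed = set()   # constrained states whose run has already ended
    previous = None
    for state in alignment:
        if state != previous:
            # The previous run just ended; remember it if it was constrained.
            if previous in constrained:
                closed.add(previous)
            # Re-entering a constrained state after leaving it breaks contiguity.
            if state in constrained and state in closed:
                return False
        previous = state
    return True
```

For example, with state 1 constrained, the alignment `[0, 1, 1, 2]` passes the check, while `[1, 0, 1, 2]` fails because state 1 is split by state 0.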

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 3151
Deposited By: Nicola Cancedda
Deposited On: 29 December 2007