A Discriminative Matching Approach to Word Alignment
Ben Taskar, Simon Lacoste-Julien and Dan Klein
In: Conference on Empirical Methods in Natural Language Processing (EMNLP 2005), 6-8 Oct 2005, Vancouver, BC, Canada.
We present a discriminative, large-margin approach to feature-based matching for word alignment. In this framework, pairs of word tokens receive a matching score, which is based
on features of that pair, including measures of association between the words, distortion between their positions, similarity
of the orthographic form, and so on. Even with only 100 labeled training examples and simple features that incorporate counts from a large unlabeled corpus, we achieve alignment error rate (AER) performance
close to IBM Model 4, in much less time. Including Model 4 predictions as features, we achieve a relative AER reduction of 22% over intersected Model 4 alignments.
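
To make the matching view concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each word pair is scored by a linear function of its features, that an alignment is the highest-scoring one-to-one matching, and that quality is measured with the standard AER definition over sure and possible links. The feature names, weights, and toy sentence pair are hypothetical, and the brute-force search stands in for a proper matching solver and is only practical for very short sentences.

```python
from itertools import permutations


def pair_score(weights, features):
    """Linear score w . f(pair). The linear form is an assumption of this sketch."""
    return sum(weights[name] * value for name, value in features.items())


def best_matching(scores):
    """Return the highest-scoring one-to-one matching by brute force.

    scores[j][k] is the score for linking source word j to target word k.
    A source word may stay unaligned (slot None), so links that would lower
    the total are never selected.
    """
    n_src, n_tgt = len(scores), len(scores[0])
    slots = list(range(n_tgt)) + [None] * n_src
    best_links, best_total = set(), 0.0
    for assignment in permutations(slots, n_src):
        links = {(j, k) for j, k in enumerate(assignment) if k is not None}
        total = sum(scores[j][k] for j, k in links)
        if total > best_total:
            best_links, best_total = links, total
    return best_links


def aer(predicted, sure, possible):
    """Alignment error rate: 1 - (|A & S| + |A & P|) / (|A| + |S|), with S a subset of P."""
    possible = possible | sure
    hits = len(predicted & sure) + len(predicted & possible)
    return 1.0 - hits / (len(predicted) + len(sure))


# Hypothetical weights and per-pair features for "le chat" / "the cat".
weights = {"association": 2.0, "distortion": -1.0, "orthographic": 1.0}
features = {
    (0, 0): {"association": 0.9, "distortion": 0.0, "orthographic": 0.0},
    (0, 1): {"association": 0.1, "distortion": 0.5, "orthographic": 0.0},
    (1, 0): {"association": 0.1, "distortion": 0.5, "orthographic": 0.0},
    (1, 1): {"association": 0.8, "distortion": 0.0, "orthographic": 0.5},
}
scores = [[pair_score(weights, features[(j, k)]) for k in range(2)] for j in range(2)]

links = best_matching(scores)
gold_sure = {(0, 0), (1, 1)}
error = aer(links, gold_sure, set())
```

In this sketch, `links` holds the selected (source index, target index) pairs and `error` is their AER against the toy gold links. In a large-margin setting the weights would be learned from the labeled examples rather than set by hand as they are here.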