Minimum Error Rate Training by Sampling the Translation Lattice
Samidh Chatterjee and Nicola Cancedda
In: Empirical Methods in Natural Language Processing, EMNLP 2010, 9-11 Oct 2010, Cambridge, Massachusetts, USA.
Minimum Error Rate Training is the most widely used algorithm for log-linear model parameter tuning in state-of-the-art Statistical Machine Translation systems. In its original formulation, the algorithm uses N-best lists output by the decoder to grow the Translation Pool that shapes the surface on which the actual optimization is performed. Recent work has extended the algorithm to use the entire translation lattice built by the decoder instead of N-best lists. We present here a third, intermediate approach: growing the translation pool using samples randomly drawn from the translation lattice. We empirically measure an improvement in BLEU scores compared to training with N-best lists, without incurring the increase in computational complexity associated with operating on the whole lattice.
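The core idea of the intermediate approach can be illustrated with a minimal sketch: represent the translation lattice as a directed acyclic graph and grow the translation pool by repeatedly sampling random paths through it. The toy lattice, node names, and uniform edge choice below are illustrative assumptions for this sketch; the paper's actual sampling is guided by the decoder's model scores.

```python
import random

# Hypothetical toy lattice: a DAG mapping each node to a list of
# (next_node, phrase) edges; "end" is the final node. Node names and
# phrases are illustrative only, not taken from the paper.
LATTICE = {
    "start": [("a", "the cat"), ("b", "a cat")],
    "a": [("end", "sat down"), ("end", "sat")],
    "b": [("end", "sat down")],
}

def sample_path(lattice, start="start", end="end", rng=random):
    """Draw one translation by walking the lattice from start to end,
    picking an outgoing edge uniformly at random at each node.
    (The paper samples edges according to model scores; uniform
    choice is a simplification for this sketch.)"""
    node, words = start, []
    while node != end:
        node, phrase = rng.choice(lattice[node])
        words.append(phrase)
    return " ".join(words)

def grow_pool(lattice, n_samples, rng=random):
    """Grow a translation pool from n_samples random draws,
    deduplicating identical surface strings."""
    return {sample_path(lattice, rng=rng) for _ in range(n_samples)}

pool = grow_pool(LATTICE, 20)
```

Each call to `grow_pool` touches only the sampled paths, which is why the cost stays closer to N-best-list training than to exhaustive lattice traversal.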