Multi-task minimum error rate training for SMT
We present experiments on multi-task learning for discriminative training in statistical machine translation (SMT), extending standard minimum error rate training (MERT) with techniques that take advantage of the similarity of related tasks. We apply our techniques to German-to-English translation of patents from 8 tasks defined according to the International Patent Classification (IPC) system. Our experiments show statistically significant gains over task-specific training for techniques that model commonalities through shared parameters. However, more fine-grained combinations of shared parameters with task-specific ones could not be brought to bear on models with a small number of dense features. The software used in the experiments is released as an open-source tool.