Kernel-based machine translation
In this chapter, we introduce a novel machine translation framework based on kernel regression techniques. In our model, the translation task is viewed as a string-to-string mapping, learned by ridge regression with source and target sentences embedded into their respective kernel-induced feature spaces. Compared with previous probabilistic models, which usually require strong conditional independence assumptions, this approach offers a more direct and flexible way to model translational equivalence, and it can be expected to capture much higher-dimensional correspondences between inputs and outputs. We propose a scalable training procedure based on the blockwise matrix inversion formula, as well as sparse approximations via retrieval-based subset selection. However, because of the complexity of kernel methods, the contribution of this work remains mainly conceptual. We report experimental results on a small-scale, reduced-domain corpus to demonstrate the potential advantages of our method over an existing phrase-based log-linear model.
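As a rough illustration of the blockwise matrix inversion formula mentioned above, the following sketch extends the inverse of a regularised Gram matrix when new training examples are appended, without inverting the enlarged matrix from scratch. It is not the chapter's training procedure: the function name `extend_inverse`, the ridge parameter `lam`, and the linear kernel on random vectors (standing in for a string kernel) are all assumptions made for the example.

```python
import numpy as np


def extend_inverse(A_inv, B, D):
    """Return the inverse of the symmetric block matrix [[A, B], [B.T, D]].

    A_inv is the already-computed inverse of the symmetric block A.
    Only the Schur complement S = D - B.T A^{-1} B is inverted directly.
    """
    AiB = A_inv @ B
    S = D - B.T @ AiB                 # Schur complement of A
    S_inv = np.linalg.inv(S)
    top_left = A_inv + AiB @ S_inv @ AiB.T
    top_right = -AiB @ S_inv
    return np.block([[top_left, top_right],
                     [top_right.T, S_inv]])


# Toy data (assumption): a linear kernel on random vectors replaces a string kernel.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
lam = 0.1                              # hypothetical ridge parameter
G = X @ X.T + lam * np.eye(8)          # regularised Gram matrix, symmetric positive definite

n = 5                                  # size of the block whose inverse is already known
A_inv = np.linalg.inv(G[:n, :n])
G_inv = extend_inverse(A_inv, G[:n, n:], G[n:, n:])

# Compare against a direct inversion of the full matrix.
print(np.allclose(G_inv, np.linalg.inv(G)))
```

The update only inverts a matrix whose size equals the number of newly added examples, which is what makes a block-by-block training scheme plausible for larger corpora.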