PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

An Intelligent Agent that Autonomously Learns how to Translate
Marco Turchi, Tijl De Bie and Nello Cristianini
In: WI-IAT 2009, 15-18 Sep 2009, Milano.

Abstract

We describe the design of an autonomous agent that can teach itself how to translate from a foreign language by first assembling its own training set and then using it to improve its vocabulary and language model. The key idea is that a Statistical Machine Translation package can be used for the Cross-Language Retrieval task of assembling a training set from a vast amount of available text (e.g. a large multilingual corpus, or the Web), and can then be trained on that data, with the process repeated several times. The stability issues raised by such a feedback loop are addressed by a mathematical model that connects statistical and control-theoretic aspects of the system. We test the agent on real-world tasks, showing that it can improve its translation performance autonomously and in a stable fashion when seeded with a very small initial training set. The modelling approach we develop for this agent is general, and we believe it will be useful for an entire class of self-learning autonomous agents working on the Web.
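
The retrieve-then-train loop summarised above can be illustrated with a toy sketch in Python. It is not the system described in the paper: a word-for-word lexicon stands in for the Statistical Machine Translation package, a simple word-overlap score stands in for Cross-Language Retrieval, and an elimination rule stands in for training. All function names, the acceptance threshold and the example sentences are hypothetical, chosen only to show how a small seed can grow over repeated rounds.

```python
def translate(sentence, lexicon):
    """Word-for-word translation; words missing from the lexicon are dropped."""
    return [lexicon[w] for w in sentence.split() if w in lexicon]


def overlap_score(foreign, english, lexicon):
    """Fraction of words matched between the draft translation and a candidate."""
    e_words = english.split()
    hits = len(set(translate(foreign, lexicon)) & set(e_words))
    return hits / max(len(foreign.split()), len(e_words))


def retrieve(foreign_pool, english_pool, lexicon, threshold):
    """Toy cross-language retrieval: pair each foreign sentence with its best match."""
    pairs = []
    for f in foreign_pool:
        # Ties on the overlap score are broken in favour of similar length.
        best = max(
            english_pool,
            key=lambda e: (overlap_score(f, e, lexicon),
                           -abs(len(f.split()) - len(e.split()))),
        )
        # Only confident matches enter the training set (assumed threshold).
        if overlap_score(f, best, lexicon) >= threshold:
            pairs.append((f, best))
    return pairs


def train(pairs, lexicon):
    """Toy training: learn a word when exactly one word is unexplained on each side."""
    updated = dict(lexicon)
    for f, e in pairs:
        unknown = [w for w in f.split() if w not in updated]
        covered = {updated[w] for w in f.split() if w in updated}
        unexplained = [w for w in e.split() if w not in covered]
        if len(unknown) == 1 and len(unexplained) == 1:
            updated[unknown[0]] = unexplained[0]
    return updated


def self_learn(foreign_pool, english_pool, seed_lexicon, threshold=0.5, max_rounds=10):
    """Alternate retrieval and training until the lexicon stops changing."""
    lexicon = dict(seed_lexicon)
    for _ in range(max_rounds):
        pairs = retrieve(foreign_pool, english_pool, lexicon, threshold)
        updated = train(pairs, lexicon)
        if updated == lexicon:
            break
        lexicon = updated
    return lexicon


# Hypothetical unaligned pools and a one-word seed lexicon.
foreign_pool = ["il gatto", "il gatto dorme", "dorme qui"]
english_pool = ["sleeps here", "the cat sleeps", "the cat"]
learned = self_learn(foreign_pool, english_pool, {"il": "the"})
print(learned)
```

In this sketch the threshold is the only guard on the feedback loop. Lowering it admits weaker matches into the training set, and a wrongly paired sentence can then teach a wrong word that distorts later retrieval rounds, which is a small-scale version of the stability concern the abstract raises.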

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Natural Language Processing
ID Code: 5931
Deposited By: Tijl De Bie
Deposited On: 08 March 2010