PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A Model of Semantics and Corrections in Language Learning
Dana Angluin and Leonor Becerra
(2010) Technical Report. -, USA.


We present a computational model of language learning via a sequence of interactions between a teacher and a learner. The utterances of the teacher and learner refer to shared situations, and the learner uses cross-situational correspondences to learn to comprehend the teacher’s utterances and produce appropriate utterances of its own. We show that in this model the teacher and learner come to be able to understand each other’s meanings. Moreover, the teacher is able to produce meaning-preserving corrections of the learner’s utterances, and the learner is able to detect them. We test our model with limited sublanguages of several natural languages in a common domain of situations. The results show that learning to a high level of performance occurs after a reasonable number of interactions. Moreover, even if the learner does not treat corrections specially, in several cases a high level of performance is achieved significantly sooner by a learner interacting with a correcting teacher than by a learner interacting with a non-correcting teacher. To demonstrate the benefit of semantics to the learner, we compare the number of interactions needed to reach a high level of performance in our system with the number of similarly generated utterances (with no semantics) required by the ALERGIA algorithm to achieve the same level of performance. We also define and analyze a simplified model of a probabilistic process of collecting corrections to help understand the possibilities and limitations of corrections in our setting.

EPrint Type: Monograph (Technical Report)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Natural Language Processing
ID Code: 7571
Deposited By: Leonor Becerra
Deposited On: 17 March 2011