PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Learning Context-Free Languages
Colin de la Higuera and Jose Oncina
AI Review 2005.


Language learning is also referred to as grammatical inference or grammar induction. Whereas the problem of learning or inferring regular languages (usually represented by deterministic finite state automata) has been well studied, the problem of learning context-free languages has received less attention and is recognised to be harder. In this survey we present a number of results, some well known and others less so, covering all the important learning tasks related to the class of context-free languages: learning from text and from an informant; learning with queries or with mistakes; and learning with additional help, which may be partial knowledge of the structure or the hypothesis that the underlying distribution is modelled by a stochastic context-free grammar. We show that the state of the art consists mainly of negative results. Nevertheless, because these languages are of considerable practical importance, many specific heuristics have been proposed for learning them. We explore some of these heuristics and propose some research directions.

EPrint Type: Article
Subjects: Computational, Information-Theoretic Learning with Statistics
ID Code: 1646
Deposited By: Jose Oncina
Deposited On: 28 November 2005