PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A note on phase transition and computational pitfalls of learning from sequences
Antoine Cornuéjols and Michele Sebag
Journal of Intelligent Information Systems (JIIS) 2006.


An ever greater range of applications call for learning from sequences. Grammar induction is one prominent tool for sequence learning, it is therefore important to know its properties and limits. This paper presents a new type of analysis for inductive learning. A few years ago, the discovery of a phase transition phenomenon in inductive logic programming proved that fundamental characteristics of the learning problems may affect the very possibility of learning under very general conditions. We show that, in the case of grammatical inference, while there is no phase transition when considering the whole hypothesis space, there is a much more severe ``gap'' phenomenon affecting the effective search space of standard grammatical induction algorithms for deterministic finite automata (DFA). Focusing on standard search heuristics, we show that they overcome this difficulty to some extent, but that they are subject to overgeneralization. The paper last suggests some directions to alleviate this problem.

PDF - Requires Adobe Acrobat Reader or other PDF viewer.
EPrint Type:Article
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Computational, Information-Theoretic Learning with Statistics
Natural Language Processing
ID Code:3745
Deposited By:Antoine Cornuéjols
Deposited On:16 February 2008