A note on phase transition and computational pitfalls of learning from sequences
Antoine Cornuéjols and Michele Sebag
Journal of Intelligent Information Systems (JIIS)
An ever greater range of applications calls for learning from sequences. Grammar induction is one prominent tool for sequence learning; it is therefore important to know its properties and limits.
This paper presents a new type of analysis for inductive learning. A few years ago, the discovery of a phase transition phenomenon in inductive logic programming proved that, under very general conditions, fundamental characteristics of a learning problem may affect the very possibility of learning.
We show that, in the case of grammatical inference, while there is no phase transition when considering the whole hypothesis space, there is a much more severe "gap" phenomenon affecting the effective search space of standard grammatical induction algorithms for deterministic finite automata (DFA). Focusing on standard search heuristics, we show that they overcome this difficulty to some extent, but that they are subject to overgeneralization. Finally, the paper suggests some directions for alleviating this problem.