|
Representing Languages by Learnable Rewriting Systems

Abstract

Powerful methods and algorithms are known for learning regular languages. Aiming to extend them to more complex grammars, we choose to change the way we represent these languages. Among the formalisms that allow one to define classes of languages, string-rewriting systems (SRS) have outstanding properties. Indeed, SRS are expressive enough to define, in a uniform way, a noteworthy and nontrivial class of languages that contains all the regular languages, $\{a^n b^n : n \geq 0\}$, $\{w \in \{a,b\}^* : |w|_a = |w|_b\}$, the Dyck languages of balanced parentheses, the Łukasiewicz language, and many others. Moreover, SRS constitute an efficient (often linear) parsing device for strings, and are thus promising and challenging candidates for forthcoming applications of Grammatical Inference. In this paper, we pioneer the study of their learnability. We propose a novel and sound algorithm that identifies them in polynomial time. We illustrate the execution of our algorithm on a large number of examples, and finally raise some open questions and research directions.
|
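To make the idea of an SRS as a parsing device concrete, here is a minimal Python sketch. It is not the algorithm or the exact formalism of the paper: it assumes plain length-reducing rules with non-empty left-hand sides, and it takes a word to be in the language when it rewrites to the empty string. The names `reduce_word`, `accepts`, `DYCK`, and `BALANCED` are hypothetical, chosen only for this illustration.

```python
def reduce_word(word, rules):
    """Rewrite `word` until no rule applies and return the result.

    `rules` is a list of (lhs, rhs) pairs. Assumption: every lhs is
    non-empty and longer than its rhs, so the loop always terminates.
    """
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            i = word.find(lhs)  # leftmost occurrence of lhs, or -1
            if i != -1:
                word = word[:i] + rhs + word[i + len(lhs):]
                changed = True
                break
    return word


def accepts(word, rules):
    """A word is accepted when it rewrites to the empty string."""
    return reduce_word(word, rules) == ""


# One pair of parentheses, with 'a' as the opening and 'b' as the closing symbol.
DYCK = [("ab", "")]

# Words over {a, b} with as many a's as b's.
BALANCED = [("ab", ""), ("ba", "")]

print(accepts("aabb", DYCK))      # True
print(accepts("ba", DYCK))        # False
print(accepts("bbaa", BALANCED))  # True
print(accepts("aab", BALANCED))   # False
```

This naive loop rescans the word after every rewrite, so it runs in quadratic time in the worst case; it only illustrates the reduction principle and makes no attempt at the linear-time parsing mentioned in the abstract.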