Identification in the limit of systematic noisy languages
Frédéric Tantini, Colin de la Higuera and Jean-Christophe Janodet
In: ICGI 2006, 20 Sept - 22 Sept 2006, Tokyo, Japan.
To study the problem of learning from noisy data, the common approach is to use a statistical model of noise. The influence of the noise is then considered according to pragmatic or statistical criteria, by using a paradigm taking into account a distribution of the data. In this article, we study the noise as a nonstatistical phenomenon, by defining the concept of systematic noise. We establish various ways of learning (in the limit) from noisy data. The first is based on a technique of reduction between problems and consists in learning from the data which one knows noisy, then in denoising the learned function. The second consists in denoising on the fly the training examples, thus to identify in the limit good examples, and then to learn from noncorrupted data. We give in both cases sufficient conditions so that learning is possible and we show through various examples (coming in particular from the field of the grammatical inference) that our techniques are complementary.
|EPrint Type:||Conference or Workshop Item (Paper)|
|Project Keyword:||Project Keyword UNSPECIFIED|
|Subjects:||Theory & Algorithms|
|Deposited By:||Frédéric Tantini|
|Deposited On:||01 July 2006|