PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Using the H-Divergence to Prune Probabilistic Automata
Marc Bernard, Baptiste Jeudy, Jean-Philippe Peyrache, Marc Sebban and Franck Thollard
In: IEEE 23rd International Conference on Tools with Artificial Intelligence, ICTAI 2011, November 7-9, United States.


A common problem in learning probabilistic automata is the difficulty of handling large training samples and/or large alphabets. This is partly due to the size of the resulting Probabilistic Prefix Tree (PPT), the structure to which state-merging learning algorithms are generally applied. In this paper, we propose a novel method to prune PPTs using the H-divergence dH, recently introduced in the field of domain adaptation. dH is based on the classification error made by a hypothesis learned from unlabeled examples drawn from the two distributions being compared. Through a thorough comparison with state-of-the-art divergence measures, we provide experimental evidence that demonstrates the efficiency of our method, which is based on this simple and intuitive criterion.
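
To illustrate the idea of a divergence driven by classification error, here is a minimal sketch of an empirical estimate that is common in the domain adaptation literature: pool the two samples, label each example by its origin, find the lowest-error hypothesis in a class H, and return 2(1 - 2·err). This is not necessarily the estimator used in the paper. The function names, the restriction to one-dimensional numeric data, and the choice of threshold classifiers as H are all assumptions made for illustration, and the formula assumes samples of roughly equal size.

```python
def best_stump_error(xs, labels):
    """Lowest training error of any threshold rule on 1-D data.

    Hypothetical helper: the hypothesis class H is the set of rules
    "predict 1 when x > t" and their opposite-polarity versions.
    """
    n = len(xs)
    # Candidate thresholds: one below every point, then each distinct value.
    thresholds = [float("-inf")] + sorted(set(xs))
    best = 1.0
    for t in thresholds:
        # Error of the rule "predict 1 when x > t".
        err = sum((x > t) != bool(y) for x, y in zip(xs, labels)) / n
        # The opposite-polarity rule has error 1 - err.
        best = min(best, err, 1.0 - err)
    return best


def proxy_h_divergence(sample_p, sample_q):
    """Classification-error-based divergence estimate between two samples.

    Examples from sample_p get label 0 and examples from sample_q get
    label 1. If no hypothesis separates them better than chance, the
    estimate is near 0; if one separates them perfectly, it is 2.
    """
    xs = list(sample_p) + list(sample_q)
    labels = [0] * len(sample_p) + [1] * len(sample_q)
    err = best_stump_error(xs, labels)
    return 2.0 * (1.0 - 2.0 * err)


if __name__ == "__main__":
    print(proxy_h_divergence([1, 2, 3, 4], [1, 2, 3, 4]))  # identical samples
    print(proxy_h_divergence([0, 1, 2], [10, 11, 12]))     # separable samples
```

In this sketch the error is measured on the same pooled data used to pick the hypothesis, which tends to overestimate the divergence on small samples; a held-out split would be a natural refinement.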

EPrint Type: Conference or Workshop Item (Talk)
Subjects: Learning/Statistics & Optimisation
ID Code: 8527
Deposited By: Marc Sebban
Deposited On: 12 February 2012