PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Learning Constrained Edit State Machines
Laurent Boyer, Olivier Gandrillon, Amaury Habrard, Mathilde Pellerin and Marc Sebban
In: 21st International Conference on Tools with Artificial Intelligence, November 2-5, 2009, Newark (NYC Metropolitan Area), New Jersey, USA.


Learning the parameters of the edit distance has been increasingly studied in recent years as a way to improve the assessment of similarity between structured data such as strings, trees or graphs. The learned models are often obtained by optimizing the likelihood of pairs of data and usually take the form of probabilistic state machines, such as pair-Hidden Markov Models (pair-HMMs), stochastic transducers, or probabilistic deterministic automata. Although such models have led to significant improvements in edit distance-based classification tasks, a new challenge has emerged: how can background knowledge be integrated into the learning process? This paper addresses that question for (input, output) pairs of strings. We present a generalization of the pair-HMM in the form of a constrained state machine, in which a transition between two states is driven by constraints fulfilled on the input string. Experimental results are provided on a molecular biology task: detecting transcription factor binding sites.
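
For readers unfamiliar with the baseline that the abstract generalizes, the following is a minimal sketch of a single-state (memoryless) stochastic edit model, which scores an (input, output) pair by summing over all edit scripts with a forward recursion. It is not the constrained state machine of the paper: the function name `joint_probability`, the dictionary layout, and the toy probabilities are all assumptions made for illustration, and the parameters would normally be learned (for example by likelihood maximization) rather than fixed by hand.

```python
# Illustrative sketch only: a one-state stochastic edit model, NOT the
# constrained state machine described in the abstract.
# All names and numeric values below are hypothetical.

def joint_probability(x, y, p_sub, p_del, p_ins, p_end):
    """Return p(x, y) under a memoryless edit model.

    p_sub maps (input_symbol, output_symbol) to a substitution probability,
    p_del maps an input symbol to a deletion probability,
    p_ins maps an output symbol to an insertion probability,
    p_end is the termination probability.
    The model is well formed when all of these sum to 1.
    """
    n, m = len(x), len(y)
    # alpha[i][j]: total probability of all edit scripts that consume
    # x[:i] and emit y[:j].
    alpha = [[0.0] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # delete x[i-1]
                alpha[i][j] += alpha[i - 1][j] * p_del.get(x[i - 1], 0.0)
            if j > 0:  # insert y[j-1]
                alpha[i][j] += alpha[i][j - 1] * p_ins.get(y[j - 1], 0.0)
            if i > 0 and j > 0:  # substitute x[i-1] by y[j-1]
                alpha[i][j] += alpha[i - 1][j - 1] * p_sub.get(
                    (x[i - 1], y[j - 1]), 0.0
                )
    return alpha[n][m] * p_end


# Toy parameters over the one-letter alphabet {"a"}; they sum to 1.
P_SUB = {("a", "a"): 0.5}
P_DEL = {"a": 0.1}
P_INS = {"a": 0.1}
P_END = 0.3

if __name__ == "__main__":
    print(joint_probability("a", "a", P_SUB, P_DEL, P_INS, P_END))
```

In this sketch every position of the input uses the same parameters. The model in the abstract differs in that it has several states, with transitions between them driven by constraints checked on the input string.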

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 5642
Deposited By: Marc Sebban
Deposited On: 08 March 2010