Learning Unbiased Stochastic Edit Distance in the form of
a Memoryless Finite-State Transducer
Jose Oncina and Marc Sebban
In: Grammatical Inference Applications: Successes and Future Challenges (IJCAI05), 31 July 2005, Edinburgh, UK.
We aim to learn an unbiased stochastic edit distance in the form of a finite-state transducer from a corpus of (input, output) pairs of strings. Unlike other standard methods, which generally use the Expectation-Maximization algorithm, our algorithm learns a transducer independently of the marginal probability distribution of the input strings. Proceeding in this unbiased way requires optimizing the parameters of a conditional transducer instead of a joint one. Such a transducer can be very useful in many domains of pattern recognition and machine learning, such as noise management or DNA alignment. Several experiments carried out with our algorithm show that it is able to correctly estimate theoretical target distributions.
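To make the notion of a conditional memoryless transducer more concrete, the following minimal sketch scores p(y | x) with a forward dynamic program over edit operations. It is an illustration under stated assumptions, not the authors' implementation: the function name `conditional_prob`, the parameter dictionaries, the normalization scheme (for each input symbol, its substitution and deletion probabilities plus all insertion probabilities sum to 1, and the stop probability plus all insertion probabilities sum to 1) and the numeric values are all hypothetical. Learning the parameters is not shown.

```python
from itertools import product

def conditional_prob(x, y, sub, dele, ins, stop):
    """Return p(y | x) by summing over all edit paths (forward DP).

    Assumed (hypothetical) parameterization:
      sub[(a, b)]: probability of rewriting input symbol a as output b
      dele[a]:     probability of deleting input symbol a
      ins[b]:      probability of inserting output symbol b
      stop:        probability of halting once the input is consumed
    """
    n, m = len(x), len(y)
    # alpha[i][j]: probability mass of paths that read x[:i] and emit y[:j]
    alpha = [[0.0] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # deletion of x[i-1]
                alpha[i][j] += alpha[i - 1][j] * dele.get(x[i - 1], 0.0)
            if j > 0:  # insertion of y[j-1]
                alpha[i][j] += alpha[i][j - 1] * ins.get(y[j - 1], 0.0)
            if i > 0 and j > 0:  # substitution x[i-1] -> y[j-1]
                alpha[i][j] += alpha[i - 1][j - 1] * sub.get((x[i - 1], y[j - 1]), 0.0)
    return alpha[n][m] * stop

# Toy parameters over the alphabet {a, b}; the values are made up.
INS = {"a": 0.1, "b": 0.1}
STOP = 0.8  # 1 - sum(INS.values())
DELE = {"a": 0.1, "b": 0.1}
SUB = {("a", "a"): 0.5, ("a", "b"): 0.2, ("b", "a"): 0.2, ("b", "b"): 0.5}

def mass_up_to(x, max_len):
    """Sum p(y | x) over every output string y with len(y) <= max_len."""
    total = 0.0
    for length in range(max_len + 1):
        for chars in product("ab", repeat=length):
            total += conditional_prob(x, "".join(chars), SUB, DELE, INS, STOP)
    return total
```

Because the model is normalized per input string, `mass_up_to("ab", 8)` approaches 1 regardless of how likely the input "ab" itself is. That independence from the input distribution is the property the abstract refers to as unbiased.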