PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Learning Stochastic Edit Distance: application in handwritten character recognition
Jose Oncina and Marc Sebban
Journal of Pattern Recognition Volume 39, Number 9, pp. 1575-1587, 2006.


Many pattern recognition algorithms are based on nearest-neighbour search and use the well-known edit distance, for which the primitive edit costs are usually fixed in advance. In this article, we aim to learn an unbiased stochastic edit distance, in the form of a finite-state transducer, from a corpus of (input, output) pairs of strings. Unlike the other standard methods, which generally use the Expectation Maximisation algorithm, our algorithm learns a transducer independently of the marginal probability distribution of the input strings. Proceeding in this unbiased way requires optimising the parameters of a conditional transducer instead of a joint one. We apply our new model to handwritten digit recognition. Through a large series of experiments, we show that it always outperforms the standard edit distance.
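
For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard edit distance with primitive costs fixed in advance, used inside a nearest-neighbour classifier; it is not the learned transducer from the article. The function names, the default unit costs, and the toy prototype strings are assumptions made for this sketch.

```python
def edit_distance(x, y, sub_cost=1.0, ins_cost=1.0, del_cost=1.0):
    """Standard edit distance between strings x and y.

    The three primitive costs are fixed in advance (assumed defaults: 1.0),
    which is the setting the abstract contrasts with learned costs.
    """
    m, n = len(x), len(y)
    # d[i][j] = cheapest cost of turning x[:i] into y[:j]
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + del_cost   # delete every symbol of x[:i]
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + ins_cost   # insert every symbol of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = 0.0 if x[i - 1] == y[j - 1] else sub_cost
            d[i][j] = min(
                d[i - 1][j - 1] + match,   # substitution or exact match
                d[i - 1][j] + del_cost,    # deletion of x[i-1]
                d[i][j - 1] + ins_cost,    # insertion of y[j-1]
            )
    return d[m][n]


def nearest_neighbour(query, labelled_prototypes):
    """Return the label of the prototype string closest to query.

    labelled_prototypes is a list of (string, label) pairs (hypothetical data).
    """
    best = min(labelled_prototypes, key=lambda p: edit_distance(query, p[0]))
    return best[1]
```

A learned stochastic edit distance would replace the fixed `sub_cost`, `ins_cost` and `del_cost` with symbol-dependent values estimated from the training pairs, while the nearest-neighbour step stays the same.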

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 2613
Deposited By: Marc Sebban
Deposited On: 22 November 2006