PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Embedded Bernoulli Mixture HMMs for Continuous Handwritten Text Recognition
Alfons Juan and Adrià Giménez
In: 13th International Conference on Computer Analysis of Images and Patterns, 2-4 Sep 2009, Germany.


Hidden Markov Models (HMMs) are now widely used in off-line handwritten text recognition. As in speech recognition, they are usually built from shared, embedded HMMs at the symbol level, in which state-conditional probability density functions are modelled with Gaussian mixtures. In contrast to speech recognition, however, it is unclear which kind of real-valued features should be used and, indeed, very different feature sets are in use today. In this paper, we propose to bypass feature extraction and directly feed columns of raw, binary image pixels into embedded Bernoulli mixture HMMs, that is, embedded HMMs in which the emission probabilities are modelled with Bernoulli mixtures. The idea is to ensure that no discriminative information is filtered out during feature extraction, which is, in some sense, integrated into the recognition model. Good empirical results are reported on the well-known IAM database.
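
To illustrate what a Bernoulli mixture emission looks like for one binary pixel column, here is a minimal sketch based on the standard Bernoulli mixture definition, p(x) = sum_k w_k prod_d p_kd^x_d (1 - p_kd)^(1 - x_d). The function name, the parameter names, the clamping constant and the toy values are assumptions made for this illustration; they are not taken from the paper, and the sketch covers only the emission term of a single HMM state, not training or decoding.

```python
import math

def bernoulli_mixture_log_prob(column, weights, prototypes, eps=1e-6):
    """Log-probability of a binary pixel column under a Bernoulli mixture.

    column:     list of 0/1 pixel values (one image column)
    weights:    mixture coefficients, assumed to sum to 1
    prototypes: one list per component; prototypes[k][d] is the
                probability that pixel d is 1 under component k
    eps:        hypothetical clamping constant to avoid log(0)
    """
    component_logs = []
    for weight, prototype in zip(weights, prototypes):
        log_p = math.log(weight)
        for pixel, p in zip(column, prototype):
            p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
            log_p += math.log(p) if pixel else math.log(1.0 - p)
        component_logs.append(log_p)
    # log-sum-exp over components for numerical stability
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(v - peak) for v in component_logs))

# Toy example: a 4-pixel column and a 2-component mixture (made-up values).
column = [1, 0, 1, 1]
weights = [0.5, 0.5]
prototypes = [[0.9, 0.1, 0.9, 0.9],
              [0.1, 0.9, 0.1, 0.1]]
log_prob = bernoulli_mixture_log_prob(column, weights, prototypes)
```

In an embedded HMM, each state would hold its own weights and prototypes, and a value like this would serve as that state's emission score for the current column of the text line image.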

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 5661
Deposited By: Alfons Juan
Deposited On: 08 March 2010