PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Multimodal Interactive Transcription of Text Images
Alejandro Héctor Toselli, Verónica Romero, Moisés Pastor i Gadea and Enrique Vidal
Pattern Recognition Volume 43, Number 5, pp. 1814-1825, 2010.


To date, automatic handwriting recognition systems are far from perfect, and heavy human intervention is often required to check and correct their results. This “post-editing” process is both inefficient and uncomfortable for the user. An example is the transcription of historic documents: state-of-the-art handwritten text recognition technology is not suitable for performing this task automatically, and expensive work by paleography experts is needed to achieve correct transcriptions. As an alternative to fully manual transcription and post-editing, a multimodal interactive approach is proposed here in which user feedback is provided by means of touchscreen pen strokes and/or more traditional keyboard and mouse operation. User feedback directly improves system accuracy, while multimodality increases system ergonomics and user acceptability. Multimodal interaction is approached in such a way that both the main and the feedback data streams help each other to optimize overall performance and usability. Empirical tests on three cursive handwriting tasks suggest that, using this approach, considerable amounts of user effort can be saved with respect to both purely manual work and non-interactive post-editing.

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: User Modelling for Computer Human Interaction; Natural Language Processing; Multimodal Integration
ID Code: 7448
Deposited By: Alfons Juan
Deposited On: 17 March 2011