PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Confidence Measures for Error Correction in Interactive Transcription of Handwritten Text
L. Tarazón, D. Pérez, Nicolás Serrano, V. Alabau, O. Ramos Terrades, Alberto Sanchis and Alfons Juan
In: 15th International Conference on Image Analysis and Processing, Italy (2009).

Abstract

An effective approach to transcribing old text documents is to follow an interactive-predictive paradigm in which the system is guided by the human supervisor and the supervisor is, in turn, assisted by the system to complete the transcription task as efficiently as possible. In this paper, we focus on a particular system prototype called GIDOC, which can be seen as a first attempt to provide user-friendly, integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. More specifically, we focus on the handwriting recognition part of GIDOC, for which we propose the use of confidence measures to guide the human supervisor in locating possible system errors and deciding how to proceed. Empirical results are reported on two datasets, showing that a word error rate of no more than $10\%$ can be achieved by checking only the $32\%$ of words recognised with the lowest confidence.
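
The idea of using confidence measures to direct the supervisor's attention can be illustrated with a minimal sketch. It is not GIDOC's implementation: the `Word` record, the confidence scores, the helper names and the five-word toy sample are all hypothetical, and the error rate is simplified to a per-word mismatch count over already aligned words rather than an edit-distance-based word error rate. The sketch ranks recognised words by confidence, sends the lowest-scoring fraction to the supervisor, and measures the errors that would remain if every reviewed word were corrected.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str          # recognised word (hypothetical system output)
    reference: str     # ground-truth word, available only for evaluation
    confidence: float  # hypothetical confidence score in [0, 1]


def select_for_review(words, fraction):
    """Return the indices of the `fraction` of words with the lowest confidence."""
    n_review = round(fraction * len(words))
    ranked = sorted(range(len(words)), key=lambda i: words[i].confidence)
    return set(ranked[:n_review])


def residual_error_rate(words, reviewed):
    """Fraction of words still wrong, assuming every reviewed word gets corrected."""
    errors = sum(
        1
        for i, w in enumerate(words)
        if i not in reviewed and w.text != w.reference
    )
    return errors / len(words)


# Toy sample, invented purely for illustration (not data from the paper).
words = [
    Word("the", "the", 0.95),
    Word("olde", "old", 0.40),
    Word("house", "house", 0.90),
    Word("vas", "was", 0.30),
    Word("solt", "sold", 0.85),
]

reviewed = select_for_review(words, fraction=0.4)
before = residual_error_rate(words, reviewed=set())
after = residual_error_rate(words, reviewed)
print(f"reviewed indices: {sorted(reviewed)}")
print(f"error rate before review: {before:.2f}, after review: {after:.2f}")
```

In this toy sample one misrecognised word carries a high confidence score and so escapes review, which is the kind of trade-off between supervision effort and residual error that the reported results quantify.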

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 5671
Deposited By: Alfons Juan
Deposited On: 08 March 2010