PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Language Identification for Interactive Handwriting Transcription of Multilingual Documents
Miguel A. Del Agua, Nicolás Serrano and Alfons Juan
In: IbPRIA 2011 (2011).

Abstract

An effective approach to handwriting transcription of (old) documents is to follow a sequential, line-by-line transcription of the whole document, in which a continuously retrained system interacts with the user. In the case of multilingual documents, however, a minor yet important issue for this interactive approach is to first identify the language of the current text line image to be transcribed. In this paper, we propose a probabilistic framework and three techniques for this purpose. Empirical results are reported on an entire 764-page multilingual document for which previous empirical tests were limited to its first 180 pages, written only in Spanish.

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 8768
Deposited By: Alfons Juan
Deposited On: 21 February 2012