PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Document annotation by active learning techniques
Boris Chidlovskii and Loïc Lecerf
In: ACM DocEng 2006, 10-13 Oct, 2006, Amsterdam, The Netherlands.

Abstract

We present an integrated framework for converting documents from legacy formats to XML. We describe the LegDoc project, which aims to automate the conversion of layout annotations in layout-oriented formats such as PDF, PS and HTML into semantic-oriented annotations. A toolkit of components combines complementary techniques for logical document analysis and semantic annotation with machine learning methods. We report preliminary results of deploying active learning techniques for document annotation.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access
ID Code: 3044
Deposited By: Boris Chidlovskii
Deposited On: 16 September 2007