PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

LearningPinocchio: Adaptive Information Extraction for Real World Applications
Fabio Ciravegna and Alberto Lavelli
Journal of Natural Language Engineering Volume 10, Number 2, 2004.


The new frontier of research on Information Extraction (IE) from texts is portability that requires no knowledge of Natural Language Processing. The market potential is in principle very large, provided that a suitable, easy-to-use and effective methodology is available. In this paper we describe LearningPinocchio, a system for adaptive Information Extraction from texts that is enjoying good commercial and scientific success. Real world applications have been built, and evaluation licenses have been released to external companies for application development. We outline the basic algorithm behind the system and present a number of applications developed with LearningPinocchio. We then report on an evaluation performed by an independent company. Finally, we discuss the general suitability of this IE technology for real world applications and draw some conclusions.

EPrint Type: Article
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Information Retrieval & Textual Information Access
ID Code: 919
Deposited By: Fabio Ciravegna
Deposited On: 06 January 2005