PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Scaling Web-based Acquisition of Entailment Relations
Idan Szpektor, Hristo Tanev, Ido Dagan and Bonaventura Coppola
In: Empirical Methods in Natural Language Processing (EMNLP) 2004, July 2004, Barcelona, Spain.


Paraphrase recognition is a critical step for natural language interpretation. Accordingly, many NLP applications would benefit from high-coverage knowledge bases of paraphrases. However, the scalability of state-of-the-art paraphrase acquisition approaches is still limited. We present a fully unsupervised learning algorithm for Web-based extraction of entailment relations, an extended model of paraphrases. We focus on increased scalability and generality with respect to prior work, eventually aiming at a full-scale knowledge base. Our current implementation of the algorithm takes a verb lexicon as its input and, for each verb, searches the Web for related syntactic entailment templates. Experiments show promising results with respect to the ultimate goal, achieving much better scalability than prior Web-based methods.
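
The input/output shape described in the abstract (a verb lexicon in, a set of related templates per verb out) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the names `build_knowledge_base`, `toy_search`, `toy_extract` and `TOY_CORPUS` are hypothetical, the Web search is replaced by an in-memory list of sentences, and the "template" extraction is a deliberately naive string match rather than any syntactic analysis.

```python
from typing import Callable, Dict, Iterable, List, Set


def build_knowledge_base(
    verb_lexicon: Iterable[str],
    search: Callable[[str], List[str]],
    extract_templates: Callable[[str, List[str]], Set[str]],
) -> Dict[str, Set[str]]:
    """Map each input verb to the set of candidate templates found for it."""
    knowledge_base: Dict[str, Set[str]] = {}
    for verb in verb_lexicon:
        snippets = search(verb)  # assumption: stands in for a Web query
        knowledge_base[verb] = extract_templates(verb, snippets)
    return knowledge_base


# Toy stand-ins so the sketch runs offline; neither reflects the paper's method.
TOY_CORPUS = [
    "aspirin prevents heart attacks",
    "aspirin reduces the risk of heart attacks",
    "the firm acquired a startup",
]


def toy_search(verb: str) -> List[str]:
    # Return sentences that contain a token starting with the verb.
    return [
        sentence
        for sentence in TOY_CORPUS
        if any(token.startswith(verb) for token in sentence.split())
    ]


def toy_extract(verb: str, snippets: List[str]) -> Set[str]:
    # Emit "X <verb form> Y" for every token that starts with the verb.
    templates: Set[str] = set()
    for sentence in snippets:
        for token in sentence.split():
            if token.startswith(verb):
                templates.add(f"X {token} Y")
    return templates


kb = build_knowledge_base(["prevent", "acquire"], toy_search, toy_extract)
```

Passing the search and extraction steps in as callables keeps the per-verb loop separate from how templates are actually found, so either stub could be swapped for a real component without changing the driver.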

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 797
Deposited By: Ido Dagan
Deposited On: 30 December 2004