PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Acquiring entailment pairs across languages and domains: A Data Analysis
Manaal Faruqui and Sebastian Pado
In: IWCS 2011, Oxford, UK (2011).

Abstract

Entailment pairs are sentence pairs consisting of a premise and a hypothesis, where the premise textually entails the hypothesis. Such sentence pairs are important for the development of Textual Entailment systems. In this paper, we take a closer look at a prominent strategy for their automatic acquisition from newspaper corpora: pairing the first sentences of articles with their titles. We propose a simple logistic regression model that incorporates and extends this heuristic, and we investigate its robustness across three languages and three domains. We identify two predictors that predict entailment pairs with fairly high accuracy across all languages. However, we find that robustness across domains within a language is more difficult to achieve.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 7320
Deposited By: Sebastian Pado
Deposited On: 17 March 2011