Probabilistic Textual Entailment: Generic Applied Modeling of Language Variability
Ido Dagan and Oren Glickman
In: Learning Methods for Text Understanding and Mining, 26 - 29 January 2004, Grenoble, France.
A most prominent phenomenon of natural languages is variability: the same meaning can be stated in many different ways. Robust language processing applications – such as Information Retrieval (IR), Question Answering (QA), Information Extraction (IE), text summarization and machine translation – must recognize the different forms in which their inputs and requested outputs might be expressed. Today, practical systems often perform inferences about language variability at a "shallow" semantic level, because robust semantic interpretation into logic-based meaning-level representations is not feasible. However, there is as yet no generally applicable framework for modeling variability in an application-independent manner. Consequently, this problem is treated mostly independently within individual systems, and usually to quite a limited extent. In this paper we outline a proposal for a generic model for recognizing language variability at a shallow semantic level, its implementation as a practical engine to be leveraged within a variety of applications, and several learning tasks that it poses.
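To make the notion of a generic entailment engine concrete, the following is a minimal illustrative sketch (not the model proposed here) of the input/output shape such an engine exposes: given a text and a hypothesis, return a score reflecting how likely the text is to entail the hypothesis. The word-overlap heuristic used below is a deliberately naive stand-in for a real "shallow" semantic inference component.

```python
# Toy sketch of a textual-entailment engine's interface.
# The lexical-coverage heuristic is an assumption for illustration only,
# NOT the probabilistic model outlined in the paper.

def entailment_score(text: str, hypothesis: str) -> float:
    """Return the fraction of hypothesis words covered by the text.

    A score of 1.0 means every hypothesis word appears in the text;
    lower scores indicate weaker lexical support for entailment.
    """
    text_words = set(text.lower().split())
    hyp_words = set(hypothesis.lower().split())
    if not hyp_words:
        return 1.0  # an empty hypothesis is trivially entailed
    return len(hyp_words & text_words) / len(hyp_words)

# A QA system, for instance, could use such a score to test whether a
# candidate answer passage entails the question recast as a statement.
print(entailment_score("IBM bought Lotus in 1995", "IBM bought Lotus"))
```

An application would typically threshold or rank these scores rather than interpret them as calibrated probabilities; the learning tasks sketched in the paper concern estimating such entailment judgments from data rather than hand-built heuristics.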