A logic-based approach to relation extraction from texts
Tamas Horvath, Gerhard Paaß, Frank Reichartz and Stefan Wrobel
Inductive logic programming
In recent years, text mining has moved far beyond the classical problem of text classification with an increased interest in more sophisticated processing of large text corpora, such as, for example, evaluations of complex queries. This and several other tasks are based on the essential step of relation extraction. This problem becomes a typical application of learning logic programs by considering the dependency trees of sentences as relational structures and examples of the target relation as ground atoms of a target predicate. In this way, each example is represented by a definite first-order Horn-clause. We show that an adaptation of Plotkin's least general generalization (LGG) operator can effectively be applied to such clauses and propose a simple and effective divide-and-conquer algorithm for listing a certain set of LGGs. We use these LGGs to generate binary features and compute the hypothesis by applying SVM to the feature vectors obtained. Empirical results on theACE--2003 benchmark dataset indicate that the performance of our approach is comparable to state-of-the-art kernel methods.