PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Fast Theta-Subsumption with Constraint Satisfaction Algorithms
Jérôme Maloberti and Michele Sebag
Machine Learning Journal Volume 55, Number 2, pp. 137-174, 2004.


Relational learning and Inductive Logic Programming (ILP) commonly use the theta-subsumption test defined by Plotkin as their covering test. Based on a reformulation of theta-subsumption as a binary constraint satisfaction problem (CSP), this paper describes a novel theta-subsumption algorithm named Django, which combines well-known CSP procedures with theta-subsumption-specific data structures. Django is validated using the stochastic complexity framework developed for CSPs and imported into ILP by Giordana and Saitta. Principled and extensive experiments within this framework show that Django improves on earlier theta-subsumption algorithms by several orders of magnitude, and that different procedures perform best in different regions of the stochastic complexity landscape. These experiments make it possible to build a control layer over Django, termed Meta-Django, which selects the best procedures to use depending on the order parameters of the problem instance. The performance gains and good scalability of Django and Meta-Django are finally demonstrated on a real-world ILP task (emulating the search for frequent clauses in the mutagenesis domain), though the smaller size of these problems results in smaller gain factors (ranging from 2.5 to 30).
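
To make the covering test concrete, here is a minimal sketch of theta-subsumption viewed as a constraint satisfaction problem: clause C theta-subsumes clause D if some substitution theta maps every literal of C onto a literal of D. This is not Django itself, only a plain backtracking search with a static smallest-domain-first ordering. The literal encoding, the function names, the restriction to function-free terms, and the Prolog-style convention that variables start with an uppercase letter are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: a literal is (predicate, arguments), with flat terms only.
Literal = Tuple[str, Tuple[str, ...]]


def is_variable(term: str) -> bool:
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()


def theta_subsumes(c: List[Literal], d: List[Literal]) -> Optional[Dict[str, str]]:
    """Return a substitution theta such that every literal of c, once theta is
    applied, appears in d; return None if no such substitution exists."""
    # Each literal of c acts as a CSP variable. Its domain is the set of
    # literals of d that share its predicate symbol and arity.
    domains = [
        [lit for lit in d if lit[0] == pred and len(lit[1]) == len(args)]
        for pred, args in c
    ]
    # Static heuristic: assign the literals with the fewest candidates first.
    order = sorted(range(len(c)), key=lambda i: len(domains[i]))

    def extend(theta: Dict[str, str], args, target) -> Optional[Dict[str, str]]:
        # Try to map args onto target while staying consistent with theta.
        new = dict(theta)
        for a, t in zip(args, target):
            if is_variable(a):
                if new.setdefault(a, t) != t:
                    return None  # variable already bound to a different term
            elif a != t:
                return None  # constants must match exactly
        return new

    def search(k: int, theta: Dict[str, str]) -> Optional[Dict[str, str]]:
        if k == len(order):
            return theta  # every literal of c has been mapped
        i = order[k]
        for candidate in domains[i]:
            extended = extend(theta, c[i][1], candidate[1])
            if extended is not None:
                result = search(k + 1, extended)
                if result is not None:
                    return result
        return None  # no candidate works: backtrack

    return search(0, {})


# Usage: does p(X, Y), q(Y) subsume p(a, b), p(b, c), q(c)?
c = [("p", ("X", "Y")), ("q", ("Y",))]
d = [("p", ("a", "b")), ("p", ("b", "c")), ("q", ("c",))]
theta = theta_subsumes(c, d)
```

In this example the shared variable Y is the constraint linking the two literals of C: q(Y) forces Y = c, which rules out p(a, b) and leaves p(b, c) as the only consistent match.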

EPrint Type: Article
Subjects: Theory & Algorithms
ID Code: 609
Deposited By: Michele Sebag
Deposited On: 29 December 2004