PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Mining frequent closed rooted trees
José Balcázar, Albert Bifet and Antoni Lozano
Machine Learning Volume 78, Number 1-2, pp. 1-33, 2009. ISSN 0885-6125 (Print) 1573-0565 (Online)

Abstract

Many knowledge representation mechanisms are based on tree-like structures, thus symbolizing the fact that certain pieces of information are related in one sense or another. There exists a well-studied process of closure-based data mining in the itemset framework: we consider the extension of this process into trees. We focus mostly on the case where labels on the nodes are nonexistent or unreliable, and discuss algorithms for closure-based mining that only rely on the root of the tree and the link structure. We provide a notion of intersection that leads to a deeper understanding of the notion of support-based closure, in terms of an actual closure operator. We describe combinatorial characterizations and some properties of ordered trees, discuss their applicability to unordered trees, and rely on them to design efficient algorithms for mining frequent closed subtrees both in the ordered and the unordered settings. Empirical validations and comparisons with alternative algorithms are provided.
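
As a point of reference for the itemset framework that the abstract extends to trees, the following is a minimal sketch of the standard support-based closure on itemsets: the closure of an itemset is the intersection of all transactions that contain it, and an itemset is closed when it equals its own closure. This is not the authors' tree-mining algorithm. The function names, the toy transactions, and the handling of unsupported itemsets are assumptions made for illustration only.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def support(itemset: Itemset, transactions: List[Itemset]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset: Itemset, transactions: List[Itemset]) -> Itemset:
    """Intersection of all transactions containing `itemset`.

    Assumption for this sketch: an itemset with no supporting
    transaction is returned unchanged.
    """
    supporting = [t for t in transactions if itemset <= t]
    if not supporting:
        return itemset
    return frozenset.intersection(*supporting)


def is_closed(itemset: Itemset, transactions: List[Itemset]) -> bool:
    """An itemset is closed when no proper superset has the same support."""
    return closure(itemset, transactions) == itemset


# Hypothetical toy dataset, not taken from the paper.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "d"}),
    frozenset({"c"}),
]

a = frozenset({"a"})
ab = frozenset({"a", "b"})

print(support(a, transactions))             # 3
print(sorted(closure(a, transactions)))     # ['a', 'b']
print(is_closed(a, transactions))           # False
print(is_closed(ab, transactions))          # True
```

In this toy dataset every transaction containing "a" also contains "b", so {a} and {a, b} have the same support and only {a, b} is closed. The paper's contribution, per the abstract, is a notion of intersection on trees that plays the role set intersection plays here.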

EPrint Type: Article
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 5627
Deposited By: Albert Bifet
Deposited On: 08 March 2010