PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

MINI: Mining Informative Non-redundant Itemsets
Arianna Gallo, Tijl De Bie and Nello Cristianini
In: PKDD 2007, Warsaw, Poland (2007).

Abstract

Frequent itemset mining assists the data mining practitioner in searching for strongly associated items (and transactions) in large transaction databases. Since the number of frequent itemsets is usually extremely large and unmanageable for a human user, recent works have sought to define condensed representations of them, e.g. closed or maximal frequent itemsets. We argue that these methods not only often still fall short of sufficiently reducing the output size, but also output many redundant itemsets. In this paper we propose a philosophically new approach that resolves both of these issues in a computationally tractable way. We present and empirically validate a statistically founded approach called MINI, which compresses the set of frequent itemsets down to a list of informative and non-redundant itemsets.
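
To make the terms in the abstract concrete, the following is a minimal sketch of the standard definitions of frequent, closed and maximal itemsets on a toy database. It is not the MINI method, which the abstract does not specify. The transactions, the support threshold and all function names are assumptions chosen for illustration, and the brute-force enumeration is only practical for very small inputs.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
]
MIN_SUPPORT = 2  # absolute support threshold (an assumed value)


def support(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def frequent_itemsets(db, min_support):
    """Brute-force enumeration; fine for toy data, not for real databases."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = support(frozenset(combo), db)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result


def closed_itemsets(freq):
    """Frequent itemsets with no proper superset of equal support."""
    return {i: s for i, s in freq.items()
            if not any(i < j and s == freq[j] for j in freq)}


def maximal_itemsets(freq):
    """Frequent itemsets with no frequent proper superset."""
    return {i: s for i, s in freq.items() if not any(i < j for j in freq)}


freq = frequent_itemsets(transactions, MIN_SUPPORT)
closed = closed_itemsets(freq)
maximal = maximal_itemsets(freq)
print(len(freq), len(closed), len(maximal))
```

On this toy input the closed itemsets are {a}, {a, b} and {a, b, c}: fewer than the full frequent set, yet still heavily overlapping, which gives an informal feel for why a condensed representation can remain repetitive.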

EPrint Type: Conference or Workshop Item (Spotlight)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access
ID Code: 3822
Deposited By: Tijl De Bie
Deposited On: 25 February 2008