Implicit Enumeration of Patterns
In: KDID 2004, 20 Sep 2004, Pisa, Italy.
Condensed representations of pattern collections are recognized as important building blocks of inductive databases, a promising theoretical framework for data mining, and they have recently been studied actively. However, there has been little research on how condensed representations should actually be represented.
In this paper we study implicit enumeration of patterns, i.e., how to represent pattern collections by listing only the interestingness values of the patterns. The main problem is that pattern classes are typically huge compared to the collections of interesting patterns in them. We solve this problem by choosing an ordering in which to list the patterns of the class such that the ordering admits effective pruning and prediction of the interestingness values. This representation of interestingness values enables us to quantify how surprising a pattern is in the collection. Furthermore, the encoding of the interestingness values reflects our understanding of the pattern collection. Thus the size of the encoding can be used to evaluate the correctness of our assumptions about the pattern collection and the interestingness measure.
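As a rough illustration of the idea, the following minimal sketch assumes that the patterns are itemsets, that the interestingness measure is support (which is anti-monotone), and that the agreed ordering is levelwise and lexicographic. These choices, and the names candidates, encode and decode, are assumptions made for the example and are not taken from the paper. The encoder writes only a list of support values; the decoder regenerates the same ordering and therefore knows which itemset each value belongs to. Itemsets with an infrequent subset are never listed at all.

```python
from itertools import combinations


def candidates(items, level, frequent_prev):
    """Yield itemsets of size `level`, in lexicographic order, whose
    (level - 1)-subsets are all frequent; every other itemset is pruned."""
    for cand in combinations(items, level):
        if level == 1 or all(
            sub in frequent_prev for sub in combinations(cand, level - 1)
        ):
            yield cand


def encode(transactions, items, min_support):
    """Return only the support values, in the agreed enumeration order."""
    items = sorted(items)
    values = []
    frequent_prev = set()
    for level in range(1, len(items) + 1):
        frequent_now = set()
        for cand in candidates(items, level, frequent_prev):
            support = sum(1 for t in transactions if set(cand) <= t)
            values.append(support)  # the itemset itself is never written
            if support >= min_support:
                frequent_now.add(cand)
        if not frequent_now:
            break  # nothing frequent at this level, so no larger candidates
        frequent_prev = frequent_now
    return values


def decode(values, items, min_support):
    """Rebuild {itemset: support} for the frequent itemsets from the values."""
    items = sorted(items)
    stream = iter(values)
    result = {}
    frequent_prev = set()
    for level in range(1, len(items) + 1):
        frequent_now = set()
        for cand in candidates(items, level, frequent_prev):
            support = next(stream)  # same order as in encode()
            if support >= min_support:
                frequent_now.add(cand)
                result[cand] = support
        if not frequent_now:
            break
        frequent_prev = frequent_now
    return result
```

The sketch covers only the pruning part of the approach. Predicting each value from the values already listed, and encoding the prediction error compactly, would be a further step that is not shown here.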