PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Supervised Descriptive Rule Discovery: A Unifying Survey of Contrast Set, Emerging Pattern and Subgroup Mining
Petra Kralj, Nada Lavrac and Geoff Webb
Journal of Machine Learning Research Volume 10, Number 2, pp. 377-403, 2009.


This paper gives a survey of contrast set mining (CSM), emerging pattern mining (EPM), and subgroup discovery (SD) in a unifying framework named supervised descriptive rule discovery. While all these research areas aim at discovering patterns in the form of rules induced from labeled data, they use different terminology and task definitions, claim to have different goals, claim to use different rule learning heuristics, and use different means for selecting subsets of induced patterns. This paper contributes a novel understanding of these subareas of data mining by presenting a unified terminology, by explaining the apparent differences between the learning tasks as variants of a unique supervised descriptive rule discovery task, and by exploring the apparent differences between the approaches. It also shows that the various rule learning heuristics used in CSM, EPM and SD algorithms all aim at optimizing a trade-off between rule coverage and precision. The commonalities (and differences) between the approaches are showcased on a selection of the best-known variants of CSM, EPM and SD algorithms. The paper also provides a critical survey of existing supervised descriptive rule discovery visualization methods.
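
As a rough illustration of what a trade-off between coverage and precision can look like, the sketch below computes weighted relative accuracy (WRAcc), a heuristic commonly used in subgroup discovery. The abstract does not name any specific heuristic, so the choice of WRAcc, the function name `wracc`, and the example counts are assumptions made here for illustration only.

```python
def wracc(n_total, n_pos, n_covered, n_covered_pos):
    """Weighted relative accuracy of a rule "Cond -> Class" (illustrative sketch).

    n_total       -- number of examples in the data set
    n_pos         -- number of examples of the target class
    n_covered     -- number of examples satisfying the rule condition
    n_covered_pos -- number of covered examples that belong to the target class
    """
    if n_total == 0 or n_covered == 0:
        # A rule that covers nothing carries no information.
        return 0.0
    coverage = n_covered / n_total          # p(Cond)
    precision = n_covered_pos / n_covered   # p(Class | Cond)
    prior = n_pos / n_total                 # p(Class)
    # Coverage rewards general rules; (precision - prior) rewards rules
    # whose class distribution differs from the overall distribution.
    return coverage * (precision - prior)


# Hypothetical counts: 100 examples, 40 of the target class;
# the rule covers 20 examples, 16 of them of the target class.
score = wracc(n_total=100, n_pos=40, n_covered=20, n_covered_pos=16)
```

Under this measure a rule scores highly only if it is both reasonably general and markedly more precise than the class prior; a rule whose precision equals the prior scores zero regardless of how many examples it covers.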

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 5127
Deposited By: Jan Rupnik
Deposited On: 24 March 2009