PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Fast Discovery of Relevant Subgroups using a Reduced Search Space
Henrik Grosskreutz and Daniel Paurat
60 technical report 2010.

Abstract

We consider a modified version of the local pattern discovery task of subgroup discovery, where subgroups dominated by other subgroups are discarded. The advantage of this modified task, known as relevant subgroup discovery, is that it avoids redundancy in the outcome. Although it was considered in many applications, so far no efficient and exact algorithm for this task has been proposed. One particular problem is that the correctness is not guaranteed if the standard pruning approach is applied. In this paper, we devise a new algorithm based on two ideas: For one, we use the theory of closed sets for labeled data to reduce the candidate space; for another we introduce a special search space traversal which allows the use of optimistic estimate pruning while guaranteeing the correctness of the solution. We show that although our algorithm solves a more valuable task than other (classical) approaches, it outperforms all existing subgroup discovery algorithms.

PDF - PASCAL Members only - Requires Adobe Acrobat Reader or other PDF viewer.
EPrint Type:Article
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Theory & Algorithms
ID Code:8149
Deposited By:Michael Kamp
Deposited On:02 June 2011