Ranking Interesting Subgroups
In: 26th International Conference on Machine Learning (ICML 2009), 14-18 June 2009, Montreal, Canada.
Subgroup discovery is the task of identifying
the top k patterns in a database with the most
significant deviation in the distribution of a
target attribute Y. Subgroup discovery is a
popular approach for identifying interesting
patterns in data because it combines statistical
significance with an understandable representation
of patterns as logical formulas.
However, some subgroups, even when they are
statistically highly significant, are often
not interesting to the user.
We present an approach, based on the work on
ranking Support Vector Machines, that ranks
subgroups with respect to the user’s concept
of interestingness and thereby finds more
interesting subgroups. This approach can
significantly increase the quality of the subgroups.