Statistical Strategies for Pruning All the Uninteresting Association Rules

Abstract
We propose a general framework to formalize the problem of capturing the intensity of implication for association rules through statistical metrics. Within this framework we present properties that influence the interestingness of a rule, analyze the conditions under which a measure performs a perfect prune in a single step, and define a proper final order for sorting the surviving rules. We discuss why none of the currently employed measures can capture objective interestingness on its own, and why only a combination of some of them, applied in a multistep fashion, can be reliable. In contrast, we propose a new, simple modification of the Pearson coefficient that meets all the necessary requirements. We statistically infer a suitable cutoff threshold for this new metric by empirically describing its distribution function through simulation. Experiments show promising behaviour for our proposal.