Comparing classification results between N-ary and binary problems
Quality Measures in Data Mining
Many quality measures for rule discovery are binary: they are designed to rate binary rules (rules that separate the database examples into two categories, e.g. "it is a bird" vs. "it is not a bird") and cannot rate N-ary rules (rules that separate the database examples into N categories, e.g. "it is a bird", "it is an insect", or "it is a fish"). Many quality measures for classification problems are likewise binary: they are meant to be applied when the class variable has exactly two possible, mutually exclusive values.
This chapter gives the data analyst a practical tool for applying these quality measures to any rule or classification task, provided the outcome takes a known, finite number of possible values (fuzzy concepts are excluded). It also aims to help the data analyst during the delicate task of pre-processing before a classification experiment. When he or she is weighing different formulations of the task at hand, and more precisely the number of possible class values the classification problem should have, a clear indication of the relative difficulty of the resulting problems will be welcome.