PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Using a similarity measure for credible classification
Martin Anthony, Ersoy Subasi, Mine Subasi and Peter Hammer
(2005) Technical Report. RUTCOR, Piscataway, New Jersey.


This paper concerns classification by Boolean functions. We investigate the accuracy of standard classification techniques on unseen points (elements of the domain $\{0,1\}^n$, for some $n$) that are similar, in specific senses, to the points observed in training. Explicitly, we use a new measure of how similar a point $x \in \{0,1\}^n$ is to a set of such points to restrict the domain on which we offer a classification: points that are sufficiently dissimilar to the training observations receive no classification. We report experimental results, involving a number of standard data sets and classification techniques, which indicate that the classification accuracy on the resulting restricted domains is better than that obtained without restriction. We also compare these accuracies with those obtained when the domain is restricted using the Hamming distance instead.
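
As an illustration of the Hamming-distance restriction mentioned above, the following is a minimal sketch, not the method of the report. It assumes that "restricting the domain" means abstaining whenever the smallest Hamming distance from a point to the training set exceeds a threshold. The names `classify_with_abstention` and `max_distance`, the toy classifier, and the toy data are all hypothetical. The report's new similarity measure is not defined in this abstract, so it is not sketched here.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, ...]  # a point of {0,1}^n


def hamming(x: Point, y: Point) -> int:
    """Number of coordinates in which two Boolean vectors differ."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return sum(a != b for a, b in zip(x, y))


def min_hamming_to_set(x: Point, points: Sequence[Point]) -> int:
    """Smallest Hamming distance from x to any point in a non-empty set."""
    return min(hamming(x, p) for p in points)


def classify_with_abstention(
    x: Point,
    training_points: Sequence[Point],
    classifier: Callable[[Point], int],
    max_distance: int,  # hypothetical threshold, chosen by the user
) -> Optional[int]:
    """Return the classifier's label, or None if x is too far from the training set."""
    if min_hamming_to_set(x, training_points) > max_distance:
        return None  # abstain: x lies outside the restricted domain
    return classifier(x)


# Toy data and a toy classifier (both assumptions, for illustration only).
train = [(0, 0, 0, 0), (1, 1, 0, 0)]


def majority(x: Point) -> int:
    """Label 1 if at least half of the coordinates are 1, else 0."""
    return int(2 * sum(x) >= len(x))


near = classify_with_abstention((1, 0, 0, 0), train, majority, max_distance=1)
far = classify_with_abstention((0, 0, 1, 1), train, majority, max_distance=1)
```

Here `near` receives a label because the point is within distance 1 of a training point, while `far` is `None` because its nearest training point is at distance 2. Accuracy on the restricted domain would then be measured only over the points for which a label is returned.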

EPrint Type: Monograph (Technical Report)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 581
Deposited By: Martin Anthony
Deposited On: 28 November 2005