PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

PAC Classification based on PAC Estimates of Label Class Distributions
Nick Palmer and Paul Goldberg
(2006) Working Paper. working paper, Coventry, UK.

Abstract

A standard approach in pattern classification is to estimate the distributions of the label classes, and then to apply the Bayes classifier to the estimates of the distributions in order to classify unlabeled examples. As one might expect, the better our estimates of the label class distributions, the better the resulting classifier will be. In this paper we make this observation precise by identifying risk bounds of a classifier in terms of the quality of the estimates of the label class distributions. We show how PAC learnability relates to estimates of the distributions that have a PAC guarantee on their $L_1$ distance from the true distribution, and we bound the increase in negative log likelihood risk in terms of PAC bounds on the KL-divergence. We give an inefficient but general-purpose smoothing method for converting an estimated distribution that is good under the $L_1$ metric into a distribution that is good under the KL-divergence.

PDF - Requires Adobe Acrobat Reader or other PDF viewer.
Postscript - Requires a viewer, such as GhostView
EPrint Type:Monograph (Working Paper)
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Theory & Algorithms
ID Code:873
Deposited By:Paul Goldberg
Deposited On:06 January 2005