PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Generalization Bounds 
Mark Reid
(2010) Springer.

Abstract

In the theory of statistical machine learning, a generalization bound (or, more precisely, a generalization error bound) is a statement about the predictive performance of a learning algorithm or class of algorithms. Here, a learning algorithm is viewed as a procedure that takes a finite training sample of labelled instances as input and returns a hypothesis regarding the labels of all instances, including those that may not have appeared in the training sample. Assuming labelled instances are drawn from some fixed distribution, the quality of a hypothesis can be measured in terms of its risk, that is, its incompatibility with the distribution. The performance of a learning algorithm can then be expressed in terms of the expected risk of its hypotheses on randomly generated training samples. Under these assumptions, a generalization bound is a theorem that holds for any distribution and states that, with high probability, applying the learning algorithm to a randomly drawn sample will result in a hypothesis with risk no greater than some value. This bounding value typically depends on the size of the training sample, an empirical assessment of the risk of the hypothesis on the training sample, and the "richness" or "capacity" of the class of predictors that can be output by the learning algorithm.
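A standard example makes this form concrete. Assuming a loss bounded in [0, 1] and a learning algorithm that selects its hypothesis from a finite class H, Hoeffding's inequality combined with a union bound over H yields that, with probability at least 1 - \delta over the draw of a training sample of size m,

    R(h) \le \hat{R}(h) + \sqrt{\frac{\ln|\mathcal{H}| + \ln(1/\delta)}{2m}} \qquad \text{for every } h \in \mathcal{H},

where R(h) denotes the risk of h and \hat{R}(h) its empirical risk on the sample. All three ingredients named above are visible: the bound is anchored at the empirical risk, tightens as the sample size m grows, and widens with the capacity term \ln|\mathcal{H}|; for richer (e.g. infinite) hypothesis classes, this term is replaced by a capacity measure such as the VC dimension.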

EPrint Type: Book
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 7459
Deposited By: Wray Buntine
Deposited On: 17 March 2011