PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Generalization error estimation under covariate shift
Masashi Sugiyama and Klaus-Robert Müller
(2005) IBIS, Proceedings of Eighth Workshop on Information-Based Induction Sciences (IBIS2005), Tokyo, Japan, Nov. 9-11, pp. 1077-1104.


In supervised learning, it is almost always assumed that the training and test input points follow the same probability distribution. However, this assumption is violated in, e.g., interpolation, extrapolation, active learning, or classification with imbalanced data. In such situations, known as covariate shift, the cross-validation estimate of the generalization error is biased, which results in poor model selection. In this paper, we propose an alternative estimator of the generalization error which, under covariate shift, is exactly unbiased if the model includes the learning target function and asymptotically unbiased in general. We also show that, in addition to being unbiased, the proposed estimator can accurately estimate the difference in generalization error between different models, which is a desirable property in model selection. Numerical studies show that the proposed method compares favorably with cross-validation.
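
To make the bias of cross-validation under covariate shift concrete, here is a minimal sketch in Python. It does not implement the estimator proposed in the paper. The toy setup is an assumption chosen purely for illustration: a sinc target function, Gaussian input densities for training and test, the noise level, and a straight-line model. The helper names (`fit_linear`, `kfold_cv_mse`, `simulate`) are hypothetical. The conditional distribution of outputs given inputs is the same for both sets; only the input distribution differs.

```python
import numpy as np

def target(x):
    # Assumed learning target function (illustrative only).
    return np.sinc(x)

def fit_linear(x, y):
    # Ordinary least squares for the model y = a + b * x.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict(coef, x):
    return coef[0] + coef[1] * x

def kfold_cv_mse(x, y, k, rng):
    # Plain k-fold cross-validation estimate of the squared error.
    folds = np.array_split(rng.permutation(len(x)), k)
    sq_errs = []
    for fold in folds:
        mask = np.ones(len(x), dtype=bool)
        mask[fold] = False  # hold this fold out of the fit
        coef = fit_linear(x[mask], y[mask])
        sq_errs.append((predict(coef, x[fold]) - y[fold]) ** 2)
    return float(np.mean(np.concatenate(sq_errs)))

def simulate(seed=0, n_train=200, n_test=10000, noise=0.1):
    rng = np.random.default_rng(seed)
    # Training and test inputs follow different densities (covariate shift).
    x_tr = rng.normal(1.0, 0.5, n_train)
    x_te = rng.normal(2.0, 0.25, n_test)
    y_tr = target(x_tr) + rng.normal(0.0, noise, n_train)
    y_te = target(x_te) + rng.normal(0.0, noise, n_test)
    cv_err = kfold_cv_mse(x_tr, y_tr, k=10, rng=rng)
    coef = fit_linear(x_tr, y_tr)
    test_err = float(np.mean((predict(coef, x_te) - y_te) ** 2))
    return cv_err, test_err

if __name__ == "__main__":
    cv_err, test_err = simulate()
    print(f"10-fold CV estimate: {cv_err:.4f}")
    print(f"error on shifted test inputs: {test_err:.4f}")
```

In this sketch the cross-validation estimate only reflects the error in the region where training inputs are dense, so it underestimates the error measured on the shifted test inputs. A model selection procedure driven by that estimate can therefore prefer the wrong model.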

EPrint Type: Book
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 1892
Deposited By: Klaus-Robert Müller
Deposited On: 29 December 2005
