PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

No Unbiased Estimator of the Variance of K-Fold Cross-Validation
Yoshua Bengio and Yves Grandvalet
In: NIPS 2003, 09-11 Dec 2003, Vancouver, Canada.

Abstract

Most machine learning researchers perform quantitative experiments to estimate generalization error and compare algorithm performances. In order to draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the estimation of uncertainty around the K-fold cross-validation estimator. The main theorem shows that there exists no universal unbiased estimator of the variance of K-fold cross-validation. An analysis based on the eigendecomposition of the covariance matrix of errors helps to better understand the nature of the problem and shows that naive estimators may grossly underestimate variance, as confirmed by numerical experiments.
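
The underestimation mentioned in the abstract can be illustrated with a small Monte Carlo simulation. The sketch below is not taken from the paper. It assumes a toy setup chosen purely for illustration: standard Gaussian data, a learner that predicts the mean of its training set, squared-error loss, n = 12 examples and K = 3 folds. The function names `kfold_cv_errors` and `simulate` are hypothetical. The naive estimator treats the n per-example errors as independent and divides their sample variance by n. The reference value is the variance of the cross-validation estimate across many independently drawn data sets.

```python
import random
import statistics


def kfold_cv_errors(xs, k):
    """Per-example squared errors of a 'predict the training mean' learner
    under K-fold cross-validation. Assumes k divides len(xs)."""
    n = len(xs)
    m = n // k  # fold size
    errors = []
    for fold in range(k):
        test = xs[fold * m:(fold + 1) * m]
        train = xs[:fold * m] + xs[(fold + 1) * m:]
        mu_hat = statistics.fmean(train)  # toy learner: mean of the training set
        errors.extend((x - mu_hat) ** 2 for x in test)
    return errors


def simulate(n=12, k=3, reps=5000, seed=0):
    """Return (Monte Carlo variance of the CV estimate,
    average of the naive variance estimates) over `reps` data sets."""
    rng = random.Random(seed)
    cv_estimates = []
    naive_variances = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        errors = kfold_cv_errors(xs, k)
        cv_estimates.append(statistics.fmean(errors))
        # Naive estimator: pretends the n errors are independent draws.
        naive_variances.append(statistics.variance(errors) / n)
    return statistics.variance(cv_estimates), statistics.fmean(naive_variances)


true_var, naive_mean = simulate()
print(f"Monte Carlo variance of the CV estimate: {true_var:.4f}")
print(f"Average naive variance estimate:         {naive_mean:.4f}")
```

In this toy setting the errors are positively correlated, both within a test fold (they share one trained predictor) and across folds (the training sets overlap), so the naive estimate should on average fall below the Monte Carlo variance. The size of the gap depends on n, K and the data distribution, and this sketch says nothing about the paper's own experiments.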

EPrint Type: Conference or Workshop Item (Poster)
Additional Information: Long version published in JMLR, vol. 5
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 828
Deposited By: Yves Grandvalet
Deposited On: 01 January 2005