PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A data-dependent generalisation error bound for the AUC
Nicolas Usunier, Massih Amini and Patrick Gallinari
In: ROCML ICML 2005 Workshop, 8-12 August 2005, Bonn, Germany.

Abstract

In this paper, we study the generalisation properties of the Area Under the ROC Curve (AUC). Optimising the AUC has recently been proposed for learning ranking functions. However, estimating the AUC of a function, which depends on the true distribution of examples, from its empirical value computed on a training set is still an open problem. We present the first data-dependent generalisation error bound for the AUC. The bound has the advantage of being tight, and it allows practical conclusions to be drawn about learning algorithms that optimise the AUC. In particular, we show that, in the case of the AUC, kernel function classes have strong generalisation guarantees provided that the weights of the functions are small. This suggests that regularisation procedures which limit the norm of the weight vector may lead to better generalisation performance for algorithms that optimise the AUC.
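
For readers unfamiliar with the quantity being bounded, the empirical AUC mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `empirical_auc`, the label encoding (1 for positive, anything else for negative) and the convention of counting tied scores as one half are assumptions made for this example. The sketch computes the fraction of positive-negative pairs that a scoring function ranks in the correct order.

```python
from typing import Sequence


def empirical_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly by the scores.

    Assumptions for this illustration: label 1 marks a positive example,
    any other label marks a negative one, and a tied pair counts as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y != 1]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0   # positive scored above negative
            elif p == n:
                correct += 0.5   # tie: half credit (assumed convention)
    return correct / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy training set: three of the four positive-negative pairs are ordered correctly.
    print(empirical_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

A generalisation bound of the kind the abstract describes controls how far this training-set value can lie from the AUC under the true distribution of examples.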

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 1069
Deposited By: Massih Amini
Deposited On: 04 September 2005