PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Progressive mixture rules are deviation suboptimal
Jean-Yves Audibert
In: NIPS 2007, 3-6 Dec 2007, Vancouver, Canada.


We consider the learning task of predicting as well as the best function in a finite reference set $G$, up to the smallest possible additive term. Let $R(g)$ denote the generalization error of a prediction function $g$. Under reasonable assumptions on the loss function (typically satisfied by the least squares loss when the output is bounded), it is known that the progressive mixture rule satisfies $E\,R(\text{progressive mixture}) < \min_{g \in G} R(g) + \mathrm{Cst}\,\frac{\log |G|}{n}$, where $n$ denotes the size of the training set and $E$ denotes the expectation with respect to the training set distribution. This work shows that, surprisingly, for appropriate reference sets $G$, the deviation convergence rate of the progressive mixture rule is no better than $\mathrm{Cst}/\sqrt{n}$: it fails to achieve the expected $\mathrm{Cst}/n$. We also provide an algorithm which does not suffer from this drawback, and which is optimal in both deviation and expectation convergence rates.
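The abstract names the progressive mixture rule without defining it. As a rough illustration only, the sketch below follows the usual description of the rule: average, over $i = 0, \dots, n$, the exponential-weights (Gibbs) distributions on $G$ computed from the first $i$ training examples, starting from a uniform prior. The function name, the `losses` layout, and the temperature parameter `lam` are assumptions made for this example and are not taken from the paper, whose admissible temperature depends on the loss assumptions.

```python
import math


def progressive_mixture_weights(losses, lam):
    """Averaged exponential weights over a finite reference set G.

    losses[i][k] is the loss of the k-th function of G on the i-th training
    example (hypothetical layout; assumes at least one example).
    lam is an inverse-temperature parameter (assumed, not from the paper).
    """
    n = len(losses)
    m = len(losses[0])
    cumulative = [0.0] * m   # loss of each function on the first i examples
    averaged = [0.0] * m     # running average of the Gibbs weights

    for i in range(n + 1):
        # Gibbs weights after seeing i examples, with a uniform prior.
        # Subtracting the minimum keeps the exponentials numerically stable.
        base = min(cumulative)
        raw = [math.exp(-lam * (c - base)) for c in cumulative]
        total = sum(raw)
        for k in range(m):
            averaged[k] += raw[k] / total / (n + 1)
        # Fold in the next example, if there is one.
        if i < n:
            for k in range(m):
                cumulative[k] += losses[i][k]

    return averaged
```

Under these assumptions, the prediction at a new input $x$ would be the convex combination $\sum_k w_k\, g_k(x)$, where $w$ is the returned weight vector.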

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Computational, Information-Theoretic Learning with Statistics
Learning/Statistics & Optimisation
Theory & Algorithms
ID Code: 3153
Deposited By: Jean-Yves Audibert
Deposited On: 29 December 2007