PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Metric Entropy and Gaussian Bandits
Steffen Grunewalder, Jean-Yves Audibert, Manfred Opper and John Shawe-Taylor
In: Nonparametric Bayes Workshop at NIPS 2009, 12 Dec 2009, Whistler, Canada.

Abstract

Metric entropy and generic chaining methods are powerful tools from probability theory that can be used to study pathwise properties of stochastic processes. Despite this fact they have largely been ignored in machine learning. We demonstrate their power in this work in applying them to a bandit problem with a Gaussian process prior. The difficulty of the setting lies in the fact that we are dealing with a continuous space of arms and we need to control the supremum of a reward process on the arms. We apply the so-called Dudley integral to reduce the problem of controlling the supremum of a "difficult" stochastic process to the problem of bounding a canonical metric that is based solely on the covariance function (which is an analytical and thus "simple" object). We consider the scenario in which there is no noise in the observed reward. Our main result is to bound the regret experienced by algorithms relative to the a posteriori optimal strategy of playing the best arm throughout based on benign assumptions about the covariance function defining the Gaussian process.
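
For orientation, the two objects named in the abstract can be written in their standard textbook form. This is a sketch only, not the statement used in the paper. The notation is assumed here for illustration: X_t is a centered Gaussian process on the arm space T, k is its covariance function, N(T, d, ε) is the covering number of T at scale ε under the metric d, and C is an unspecified universal constant.

```latex
% Canonical metric induced by the covariance function k (assumed notation)
d(s,t) = \sqrt{\mathbb{E}\big[(X_s - X_t)^2\big]}
       = \sqrt{k(s,s) - 2\,k(s,t) + k(t,t)}

% Dudley's entropy integral bound on the expected supremum,
% with C a universal constant and N the covering number of T under d
\mathbb{E}\,\sup_{t \in T} X_t
  \;\le\; C \int_0^{\infty} \sqrt{\log N(T, d, \varepsilon)}\; d\varepsilon
```

The integrand vanishes once ε exceeds the diameter of T under d, because a single ball then covers T. Bounding the supremum therefore reduces to bounding covering numbers, which depend on the process only through k.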

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
Learning/Statistics & Optimisation
Theory & Algorithms
ID Code: 6912
Deposited By: Jean-Yves Audibert
Deposited On: 14 April 2010