PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Online learning in adversarial Lipschitz environments
Odalric-Ambrym Maillard and Rémi Munos
In: ECML 2010, Barcelona (2010).

Abstract

We consider the problem of online learning in an adversarial environment when the reward functions chosen by the adversary are assumed to be Lipschitz. This setting extends previous work on linear and convex online learning. We provide a class of algorithms with cumulative regret upper bounded by $O(\sqrt{dT \ln(\lambda)})$, where $d$ is the dimension of the search space, $T$ is the time horizon, and $\lambda$ is the Lipschitz constant. Efficient numerical implementations using particle methods are discussed. Applications include online supervised learning problems in both the full and partial (bandit) information settings, for a large class of non-linear regressors and classifiers, such as neural networks.
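
To make the notion of cumulative regret concrete, here is a minimal sketch in the full-information setting. It is not the algorithm from the paper: it runs a standard exponentially weighted forecaster over a finite grid of a one-dimensional search space, whereas the abstract refers to particle methods. The function name `exp_weights_on_grid`, the grid, the reward functions, and the learning rate are all assumptions chosen for illustration.

```python
import math


def exp_weights_on_grid(reward_fns, grid, eta):
    """Exponentially weighted forecaster over a finite grid (full information).

    reward_fns: one callable per round, mapping a grid point to a reward in [0, 1].
    Returns (expected cumulative reward of the learner,
             cumulative reward of the best fixed grid point in hindsight).
    """
    cum = [0.0] * len(grid)  # cumulative reward of every grid point so far
    learner_total = 0.0
    for f in reward_fns:
        # Sampling distribution proportional to exp(eta * cumulative reward);
        # subtracting the maximum keeps the exponentials numerically stable.
        m = max(cum)
        weights = [math.exp(eta * (c - m)) for c in cum]
        z = sum(weights)
        probs = [w / z for w in weights]
        # Full information: the whole reward function is revealed each round.
        rewards = [f(x) for x in grid]
        learner_total += sum(p * r for p, r in zip(probs, rewards))
        cum = [c + r for c, r in zip(cum, rewards)]
    return learner_total, max(cum)


# Hypothetical environment: 1-Lipschitz rewards on [0, 1] whose peak moves
# between two locations over the rounds.
T = 500
grid = [i / 50 for i in range(51)]
targets = [0.3 if t % 3 else 0.7 for t in range(T)]
reward_fns = [lambda x, c=c: 1.0 - abs(x - c) for c in targets]

# Standard learning-rate choice for N experts and rewards in [0, 1].
eta = math.sqrt(8 * math.log(len(grid)) / T)
learner, best = exp_weights_on_grid(reward_fns, grid, eta)
regret = best - learner  # regret against the best fixed grid point
```

With this learning rate, the classical analysis of exponential weights bounds the expected regret against the best grid point by sqrt(T ln(N) / 2) for N grid points. Comparing against the best point of the continuous space adds a discretization error that depends on the Lipschitz constant and the grid spacing, which is one way to see why both quantities would enter a regret bound.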

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 7407
Deposited By: Rémi Munos
Deposited On: 17 March 2011