PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Adaptive hedge
Tim van Erven, Steven de Rooij, Wouter Koolen and Peter Grünwald
Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS 2011), 2011.

Abstract

Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.
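
For readers unfamiliar with the Hedge algorithm mentioned in the abstract, the following is a minimal sketch of standard Hedge with a fixed learning rate. It is not the adaptive method proposed in the paper. The function names `hedge_weights` and `run_hedge`, and the assumption that per-round losses lie in [0, 1], are illustrative choices made here, not taken from the record.

```python
import math


def hedge_weights(cumulative_losses, eta):
    """Return exponential weights for each action.

    Each weight is proportional to exp(-eta * cumulative loss); the
    minimum loss is subtracted first for numerical stability, which
    does not change the normalised weights.
    """
    m = min(cumulative_losses)
    unnorm = [math.exp(-eta * (c - m)) for c in cumulative_losses]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def run_hedge(loss_rounds, eta):
    """Run Hedge with a fixed learning rate eta (an assumption of this sketch).

    loss_rounds is a list of rounds, each a list with one loss per action.
    Returns (hedge_loss, regret), where regret is the Hedge loss minus the
    cumulative loss of the best single action in hindsight.
    """
    k = len(loss_rounds[0])
    cumulative = [0.0] * k
    hedge_loss = 0.0
    for losses in loss_rounds:
        w = hedge_weights(cumulative, eta)
        # Hedge suffers the weighted average of the action losses.
        hedge_loss += sum(wi * li for wi, li in zip(w, losses))
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return hedge_loss, hedge_loss - min(cumulative)


if __name__ == "__main__":
    # An "easy" instance: action 0 always has loss 0, action 1 always loss 1.
    rounds = [[0.0, 1.0]] * 1000
    loss, regret = run_hedge(rounds, eta=1.0)
    print(f"Hedge loss: {loss:.4f}, regret: {regret:.4f}")
```

On this easy instance a large fixed learning rate keeps the regret bounded as the number of rounds grows, whereas a learning rate tuned for the worst case shrinks with the horizon and so yields regret that grows with it. That gap is the motivation the abstract describes.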

EPrint Type: Article
Subjects: Learning/Statistics & Optimisation
ID Code: 8573
Deposited By: Wouter Koolen
Deposited On: 12 February 2012