PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

On the loss version of the adversarial multi-armed bandit problem
Chamy Allenberg and Peter Auer
(2005) Working Paper. no.

Abstract

The Loss Bandit game is the loss variant of the adversarial multi-armed bandit problem. It is played over $T$ iterations. At the beginning of each iteration, an adversary assigns a loss from $[0,1]$ to each of the $K$ arms. Then, without knowing the adversary's assignments, we must select one of the $K$ arms and suffer the loss assigned to it. We compete against the optimal loss, which is the total loss of the best arm, that is, the minimum over all arms of the loss accumulated across the $T$ iterations. In this work we present an optimal upper bound on the regret of the Loss Bandit game. It is the first upper bound on the regret of the Loss Bandit game that is a function of the optimal loss.
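
The protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the function names (`play_loss_bandit`, `uniform_player`) are hypothetical, the player shown simply picks arms uniformly at random, and the loss table is assumed to be fixed in advance (an oblivious adversary), which the abstract does not state. The sketch only shows how the player's loss, the optimal loss, and the regret relate to one another.

```python
import random


def play_loss_bandit(losses, choose_arm, rng):
    """Run one Loss Bandit game and return (player_loss, optimal_loss, regret).

    losses: list of T rows, each holding K losses in [0, 1].
            Assumption: the table is fixed before play (oblivious adversary).
    choose_arm: hypothetical player callback (num_arms, history, rng) -> arm index.
    """
    num_arms = len(losses[0])
    cumulative = [0.0] * num_arms   # total loss of each arm, used only for scoring
    player_loss = 0.0
    history = []                    # the player sees only (arm, incurred loss) pairs

    for round_losses in losses:
        arm = choose_arm(num_arms, history, rng)
        incurred = round_losses[arm]        # bandit feedback: only this loss is revealed
        player_loss += incurred
        history.append((arm, incurred))
        for i, loss in enumerate(round_losses):
            cumulative[i] += loss

    optimal_loss = min(cumulative)          # total loss of the best arm in hindsight
    return player_loss, optimal_loss, player_loss - optimal_loss


def uniform_player(num_arms, history, rng):
    """Placeholder strategy: ignore the history and pick an arm uniformly at random."""
    return rng.randrange(num_arms)


if __name__ == "__main__":
    rng = random.Random(0)
    T, K = 1000, 5
    table = [[rng.random() for _ in range(K)] for _ in range(T)]
    print(play_loss_bandit(table, uniform_player, rng))
```

Keeping the player behind a callback means any strategy can be substituted for `uniform_player` without changing how the regret is scored.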

EPrint Type: Monograph (Working Paper)
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 1866
Deposited By: Peter Auer
Deposited On: 29 November 2005