PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Regret minimization under partial monitoring
Nicolò Cesa-Bianchi, Gábor Lugosi and Gilles Stoltz
Mathematics of Operations Research Volume 31, Number 3, pp. 562-580, 2006.

Abstract

We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives feedback generated by the combined choice of the two players. We study Hannan consistent players for these games; that is, randomized playing strategies whose per-round regret vanishes with probability one as the number n of game rounds goes to infinity. We prove a general lower bound of n^{-1/3} on the convergence rate of the regret, and exhibit a specific strategy that attains this rate for any game in which a Hannan consistent player exists.
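
For readers unfamiliar with the terminology, the two quantities named in the abstract can be written out as below. This is only a sketch: the notation (a loss function \ell, player actions I_t, opponent actions J_t, and N available player actions) is assumed for illustration and does not come from this record.

```latex
% Assumed notation (not from the record):
%   \ell(i, j) : loss of player action i against opponent action j
%   I_t, J_t   : player and opponent actions in round t
%   N          : number of player actions
% Per-round regret after n rounds:
\[
  R_n \;=\; \frac{1}{n}\sum_{t=1}^{n} \ell(I_t, J_t)
        \;-\; \min_{i=1,\dots,N} \frac{1}{n}\sum_{t=1}^{n} \ell(i, J_t)
\]
% Hannan consistency: the per-round regret vanishes almost surely.
\[
  \limsup_{n\to\infty} R_n \;\le\; 0 \quad \text{with probability one.}
\]
```

In this notation, the rate stated in the abstract corresponds to R_n decaying at order n^{-1/3}. Under partial monitoring the player does not see J_t directly, only the feedback, so the second sum cannot be computed by the player and has to be estimated.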

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 2231
Deposited By: Nicolò Cesa-Bianchi
Deposited On: 06 October 2006
