PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Regret minimization under partial monitoring
Nicolò Cesa-Bianchi, Gábor Lugosi and Gilles Stoltz
Submitted 2004.

There is a more recent version of this eprint available.


We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives feedback generated by the combined choice of the two players. We study Hannan consistent players for these games; that is, randomized playing strategies whose per-round regret vanishes with probability one as the number n of game rounds goes to infinity. We prove a general lower bound of n^{-1/3} on the convergence rate of the regret, and exhibit a specific strategy that attains this rate on any game for which a Hannan consistent player exists.
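
For readers unfamiliar with the terminology, the per-round regret and Hannan consistency mentioned above can be written out as follows. This is the standard textbook formulation rather than a quotation from the paper; the symbols (loss function \ell, player actions I_t chosen among N actions, opponent actions J_t) are notation assumed here for illustration.

```latex
% Assumed notation: \ell(i, j) is the loss of player action i against
% opponent action j; I_t and J_t are the actions played in round t.
\[
  R_n \;=\; \frac{1}{n}\left( \sum_{t=1}^{n} \ell(I_t, J_t)
        \;-\; \min_{i = 1, \dots, N} \sum_{t=1}^{n} \ell(i, J_t) \right),
  \qquad
  \limsup_{n \to \infty} R_n \;\le\; 0 \quad \text{with probability one.}
\]
```

In this notation, the rate in the abstract refers to how fast R_n approaches zero, namely at order n^{-1/3}.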

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 273
Deposited By: Nicolò Cesa-Bianchi
Deposited On: 23 November 2004

Available Versions of this Item