PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Toward a classification of finite partial-monitoring games
András Antos, Gábor Bartók, Dávid Pál and Csaba Szepesvári
Theoretical Computer Science, Volume 473, pp. 77-99, 2013. ISSN 0304-3975

Abstract

Partial-monitoring games constitute a mathematical framework for sequential decision making problems with imperfect feedback: The learner repeatedly chooses an action, the opponent responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his total cumulative loss. We make progress towards the classification of these games based on their minimax expected regret. Namely, we classify almost all games with two outcomes and a finite number of actions: We show that their minimax expected regret is either zero, $\tilde{\Theta}(\sqrt{T})$, $\Theta(T^{2/3})$, or $\Theta(T)$, and we give a simple and efficiently computable classification of these four classes of games. Our hope is that the result can serve as a stepping stone toward classifying all finite partial-monitoring games.
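
The interaction protocol described in the abstract can be illustrated with a minimal sketch. This is not code from the paper: the names `play_game` and `FixedLearner`, the matrix layout (rows are actions, columns are outcomes), and the example loss and feedback matrices are all assumptions made for illustration. The sketch computes the realized regret against the best fixed action in hindsight for one outcome sequence, whereas the paper studies the minimax expected regret.

```python
def play_game(loss, feedback, learner, outcomes):
    """Run the partial-monitoring protocol on a given outcome sequence.

    loss[i][j]     -- loss of action i under outcome j (never shown to the learner)
    feedback[i][j] -- signal revealed to the learner for action i under outcome j
    Returns (total learner loss, regret against the best fixed action in hindsight).
    """
    n_actions = len(loss)
    total = 0.0
    for t, j in enumerate(outcomes):
        i = learner.choose(t)                # learner picks an action
        total += loss[i][j]                  # loss is suffered silently
        learner.observe(i, feedback[i][j])   # only the feedback signal is revealed
    # Cumulative loss of the best single action, chosen with hindsight.
    best_fixed = min(sum(loss[i][j] for j in outcomes) for i in range(n_actions))
    return total, total - best_fixed


class FixedLearner:
    """Hypothetical baseline learner that always plays the same action."""

    def __init__(self, action):
        self.action = action
        self.signals = []

    def choose(self, t):
        return self.action

    def observe(self, action, signal):
        self.signals.append(signal)


# Hypothetical game with two outcomes and three actions (assumed values).
# Actions 0 and 1 give an uninformative signal; action 2 reveals the outcome.
LOSS = [[0, 1], [1, 0], [0.5, 0.5]]
FEEDBACK = [["a", "a"], ["a", "a"], ["x", "y"]]

total, regret = play_game(LOSS, FEEDBACK, FixedLearner(2), [0, 1, 0, 0])
print(total, regret)
```

In this assumed example the learner always plays the informative action 2, so it sees the outcome every round but pays a loss of 0.5 each time. Action 0 would have cost less on this sequence, and the difference is the learner's regret.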

EPrint Type: Article
Additional Information: Special Issue on ALT 2010. Available online 23 October 2012. Eds.: M. Hutter, F. Stephan, V. Vovk, T. Zeugmann.
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 7692
Deposited By: András Antos
Deposited On: 17 March 2011