PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Reinforcement Learning and the Bayesian Control Rule
Pedro Ortega, Daniel A. Braun and Simon Godsill
In: The Fourth Conference on Artificial General Intelligence, 3-6 Aug 2011, Mountain View, California, USA.

Abstract

We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intractable planning problem reduces to a simple multi-armed bandit problem, where each lever stands for a potentially arbitrarily complex policy. Furthermore, we use the Bayesian control rule to construct an adaptive bandit player that is universal with respect to a given class of optimal bandit players, thus indirectly constructing an adaptive agent that is universal with respect to a given class of policies.
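The reduction described in the abstract can be illustrated with a minimal sketch of a posterior-sampling bandit player. The abstract does not give the paper's actual construction, so everything here is an assumption made for illustration: each lever is taken to be a policy with an unknown Bernoulli success rate, the prior is Beta(1, 1), and the names `run_bandit` and `success_probs` are hypothetical. The rule used (sample a hypothesis from the posterior, then act as if it were true) is ordinary Thompson sampling, chosen as a stand-in for a Bayesian-control-rule style player.

```python
import random


def run_bandit(success_probs, steps, seed=0):
    """Posterior-sampling player over a fixed set of levers.

    Each lever stands for a whole policy: pulling it means running that
    policy once and observing success (1) or failure (0).
    `success_probs` are the hidden success rates (illustrative only).
    Returns how often each lever was pulled.
    """
    rng = random.Random(seed)
    n = len(success_probs)
    wins = [1] * n    # Beta(1, 1) prior: one pseudo-success per lever
    losses = [1] * n  # ... and one pseudo-failure per lever
    pulls = [0] * n
    for _ in range(steps):
        # Draw one plausible success rate per lever from its posterior,
        # then pull the lever that looks best under that draw.
        samples = [rng.betavariate(wins[i], losses[i]) for i in range(n)]
        lever = max(range(n), key=lambda i: samples[i])
        # Simulate running the chosen policy once.
        reward = 1 if rng.random() < success_probs[lever] else 0
        # Bayesian update of the chosen lever's Beta posterior.
        wins[lever] += reward
        losses[lever] += 1 - reward
        pulls[lever] += 1
    return pulls


if __name__ == "__main__":
    print(run_bandit([0.2, 0.5, 0.8], steps=2000))
```

The player never plans inside a policy; it only tracks how well each lever has performed, which is the sense in which the planning problem is separated from the I/O dynamics in this sketch.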

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation
ID Code: 8294
Deposited By: Pedro Ortega
Deposited On: 28 August 2011
