PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems
A. M. Farahmand, M. Ghavamzadeh, Csaba Szepesvari and S. Mannor
In: ACC-09, St. Louis, Missouri, USA (2009).

Abstract

Reinforcement learning with linear and non-linear function approximation has been studied extensively in the last decade. However, as opposed to other fields of machine learning such as supervised learning, the effect of finite sample sizes has not been thoroughly addressed within the reinforcement learning framework. In this paper we propose to use regularization in reinforcement learning and planning. More specifically, we control the complexity of the value-function approximation using L2 regularization. We consider the fitted Q-iteration algorithm and provide generalization bounds that account for small sample sizes. A realistic visual-servoing problem is used to illustrate the benefits of the regularized procedure.
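The core idea of the abstract can be sketched in code: fitted Q-iteration repeatedly regresses Bellman targets onto a function class, and L2 regularization turns each regression step into ridge regression. The following is a minimal illustrative sketch, not the paper's actual algorithm or experimental setup; the feature map, toy dynamics, and regularization coefficient `lam` are all hypothetical choices made for the example.

```python
import numpy as np

def features(s):
    # Hypothetical polynomial features of a scalar state.
    return np.array([1.0, s, s ** 2])

def ridge_fit(Phi, y, lam):
    # L2-regularized least squares: w = (Phi^T Phi + lam I)^{-1} Phi^T y
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

def fitted_q_iteration(transitions, n_actions, gamma=0.9, lam=1e-2, n_iter=50):
    # transitions: list of (state, action, reward, next_state) samples.
    d = len(features(0.0))
    W = np.zeros((n_actions, d))  # one weight vector per discrete action
    for _ in range(n_iter):
        W_new = np.zeros_like(W)
        for a in range(n_actions):
            batch = [t for t in transitions if t[1] == a]
            Phi = np.array([features(s) for s, _, _, _ in batch])
            # Regression targets: r + gamma * max_a' Q(s', a')
            y = np.array([
                r + gamma * max(features(sn) @ W[b] for b in range(n_actions))
                for _, _, r, sn in batch
            ])
            # The L2 penalty controls the complexity of each fitted Q-function.
            W_new[a] = ridge_fit(Phi, y, lam)
        W = W_new
    return W

# Toy usage: a made-up 1-D problem where the reward peaks at the origin,
# so the greedy policy should step toward zero.
rng = np.random.default_rng(0)
transitions = []
for _ in range(200):
    s = rng.uniform(-1, 1)
    a = int(rng.integers(2))
    step = 0.1 if a == 1 else -0.1
    s_next = float(np.clip(s + step + 0.01 * rng.standard_normal(), -1, 1))
    transitions.append((s, a, -s_next ** 2, s_next))

W = fitted_q_iteration(transitions, n_actions=2)
greedy = lambda s: int(np.argmax([features(s) @ W[a] for a in range(2)]))
```

With quadratic features the toy Q-functions are representable exactly, so the greedy policy recovered from `W` moves the state toward the high-reward region; in larger function classes the choice of `lam` trades off this fitting accuracy against overfitting on small samples, which is the trade-off the paper's bounds quantify.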

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
Learning/Statistics & Optimisation
Theory & Algorithms
ID Code: 6349
Deposited By: Csaba Szepesvari
Deposited On: 08 March 2010