PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

SPSA based Actor-Critic Algorithm by Using Deterministic Perturbation Sequences
CVL Raju
(2005) Technical Report. Centre for Discrete and Applicable Mathematics, London, UK.

Abstract

We develop a simulation-based actor-critic algorithm for infinite-horizon Markov decision processes with finite state and action spaces under a discounted cost criterion. The algorithm performs gradient search in the space of randomized policies, using simultaneous deterministic perturbation stochastic approximation (SDPSA) gradient estimates. It combines the features of two-timescale actor-critic algorithms with those of the gradient-search-based SDPSA technique.
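
The abstract does not give the perturbation construction, so the following is only a minimal sketch of an SDPSA-style gradient estimate, not the report's algorithm. It assumes the deterministic perturbations are taken from the columns of a Sylvester Hadamard matrix and that a two-sided finite difference is used. The function names (hadamard, perturbation_cycle, sdpsa_gradient) and the step size delta are hypothetical. The actor-critic parts (critic recursion, two-timescale step sizes, randomized-policy parameterization) are omitted.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def perturbation_cycle(dim):
    """Return a deterministic cycle of +/-1 perturbation vectors of length dim.

    Assumption: perturbations are rows of a Hadamard matrix with the
    all-ones first column dropped, so the kept columns are mutually
    orthogonal and each sums to zero over the cycle.
    """
    order = 2
    while order < dim + 1:
        order *= 2
    return [row[1:dim + 1] for row in hadamard(order)]


def sdpsa_gradient(f, theta, delta=0.1):
    """Average two-sided perturbation estimates over one full cycle."""
    cycle = perturbation_cycle(len(theta))
    grad = [0.0] * len(theta)
    for pert in cycle:
        # Perturb all coordinates simultaneously along the +/-1 vector.
        plus = [t + delta * d for t, d in zip(theta, pert)]
        minus = [t - delta * d for t, d in zip(theta, pert)]
        diff = (f(plus) - f(minus)) / (2.0 * delta)
        for i, d in enumerate(pert):
            # Dividing by d (= +/-1) attributes the change to coordinate i.
            grad[i] += diff / d / len(cycle)
    return grad
```

Because the kept columns are orthogonal, the cross terms between coordinates cancel over a full cycle, which is what lets a fixed sequence stand in for random perturbations in this sketch. In an actor-critic setting, f would be replaced by a simulation-based estimate of the discounted cost, and theta by the policy parameters.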

EPrint Type: Monograph (Technical Report)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 1452
Deposited By: Martin Anthony
Deposited On: 28 November 2005