PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Distributed Asynchronous Policy Iteration in Dynamic Programming
Dimitri Bertsekas and Huizhen Yu
In: The 48th Allerton Conference on Communication, Control and Computing (2010).

Abstract

We consider the distributed solution of dynamic programming (DP) problems by policy iteration. We envision a network of processors, each asynchronously updating a local policy and a local cost function defined on a portion of the state space. The computed values are communicated asynchronously between processors and are used to perform the local policy and cost updates. The natural algorithm of this type can fail even under favorable circumstances, as shown by Williams and Baird [WiB93]. We propose an alternative algorithm, almost as simple, that converges to the optimum under the most general conditions, including asynchronous updating by multiple processors using outdated local cost functions of other processors.
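
For orientation, the classical single-processor policy iteration that the distributed scheme starts from can be sketched as below. This is neither the algorithm proposed in the paper nor the "natural" asynchronous variant that can fail. The two-state problem, the discount factor, the iterative evaluation with a fixed number of sweeps, and all names (`P`, `G`, `evaluate`, `improve`) are assumptions made purely for illustration.

```python
# Classical (synchronous) policy iteration on a tiny discounted-cost problem.
# All data below is hypothetical and chosen only to keep the example small.

GAMMA = 0.9  # discount factor (assumed)

# P[s][a] = list of (next_state, probability); G[s][a] = one-stage cost.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
G = {
    0: {0: 2.0, 1: 1.0},
    1: {0: 3.0, 1: 0.0},
}


def q_value(s, a, J):
    """Cost of taking action a in state s, then incurring cost-to-go J."""
    return G[s][a] + GAMMA * sum(p * J[t] for t, p in P[s][a])


def evaluate(policy, sweeps=500):
    """Policy evaluation: approximate the cost function of a fixed policy
    by repeatedly applying its Bellman operator to all states at once."""
    J = {s: 0.0 for s in P}
    for _ in range(sweeps):
        J = {s: q_value(s, policy[s], J) for s in P}
    return J


def improve(J):
    """Policy improvement: pick a cost-minimizing action in every state."""
    return {s: min(P[s], key=lambda a: q_value(s, a, J)) for s in P}


def policy_iteration():
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = {s: 0 for s in P}
    while True:
        J = evaluate(policy)
        new_policy = improve(J)
        if new_policy == policy:
            return policy, J
        policy = new_policy
```

In the setting of the abstract, each processor would instead hold only the entries of `policy` and `J` for its own portion of the state space and would update them using possibly outdated values received from other processors. The sketch above deliberately leaves that part out.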

EPrint Type: Conference or Workshop Item (Talk)
Project Keyword: UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 8065
Deposited By: Huizhen Yu
Deposited On: 17 March 2011