PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Concurrent Probabilistic Temporal Planning with Policy-Gradients
Douglas Aberdeen and Olivier Buffet
In: ICAPS Digital Library Proceedings of the Seventeenth International Conference on Automated Planning and Scheduling (2007) The AAAI Press, Menlo Park, California, pp. 10-17. ISBN 978-1-57735-344-7

Abstract

We present an any-time concurrent probabilistic temporal planner that handles continuous and discrete uncertainties and metric functions. Our approach is a direct policy search that attempts to optimise a parameterised policy using gradient ascent. Low memory use, function approximation methods, and factorisation of the policy allow us to scale to challenging domains. This Factored Policy Gradient (FPG) Planner also attempts to optimise both steps to goal and the probability of success. We compare the FPG planner to other planners on concurrent probabilistic temporal planning (CPTP) domains, and on simpler but better-studied probabilistic non-temporal domains.
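
The idea of optimising a factored, parameterised policy by gradient ascent can be illustrated with a minimal sketch. This is not the FPG planner's implementation. It assumes one independent logistic sub-policy per action that decides whether to start that action, a plain likelihood-ratio (REINFORCE-style) update that may differ from the estimator used in the paper, and a made-up two-action toy reward. The names `start_probability`, `grad_log_prob`, and `train` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def start_probability(theta, obs):
    """Probability that one action's sub-policy chooses to start, given obs."""
    return sigmoid(sum(w * x for w, x in zip(theta, obs)))


def grad_log_prob(theta, obs, started):
    """Gradient of log pi(started | obs) for a Bernoulli-logistic sub-policy."""
    p = start_probability(theta, obs)
    return [(float(started) - p) * x for x in obs]


def train(num_steps=2000, step_size=0.1, seed=0):
    """Gradient ascent on a toy problem (an assumption, not a CPTP domain).

    Two actions share one constant observation feature. Starting action 0
    earns +1 and starting action 1 earns -1, so a good policy starts
    action 0 and leaves action 1 alone.
    """
    rng = random.Random(seed)
    obs = [1.0]
    # Factored policy: each action owns a separate parameter vector.
    thetas = [[0.0], [0.0]]
    for _ in range(num_steps):
        # Each sub-policy samples its own start/no-start decision.
        started = [rng.random() < start_probability(t, obs) for t in thetas]
        reward = float(started[0]) - float(started[1])
        # Likelihood-ratio update applied independently to each factor.
        for theta, a in zip(thetas, started):
            g = grad_log_prob(theta, obs, a)
            for j in range(len(theta)):
                theta[j] += step_size * reward * g[j]
    return thetas
```

Because each action keeps only its own small parameter vector and no table of states is stored, memory use in this sketch grows with the number of actions and features rather than with the size of the state space.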

EPrint Type: Book Section
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 4045
Deposited By: S V N Vishwanathan
Deposited On: 25 February 2008