PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Natural Actor-Critic for Road Traffic Optimisation
Silvia Richter, Douglas Aberdeen and Jin Yu
In: Advances in Neural Information Processing Systems (2007) MIT Press, Cambridge, MA, pp. 1169-1176. ISBN 0262195682


Current road-traffic optimisation practice around the world is a combination of hand-tuned policies with a small degree of automatic adaptation. Even state-of-the-art research controllers need good models of the road traffic, which cannot be obtained directly from existing sensors. We use a policy-gradient reinforcement learning approach to directly optimise the traffic signals, mapping currently deployed sensor observations to control signals. Our trained controllers are (theoretically) compatible with the traffic system used in Sydney and many other cities around the world. We apply two policy-gradient methods: (1) the recent natural actor-critic algorithm, and (2) a vanilla policy-gradient algorithm for comparison. Along the way we extend natural actor-critic approaches to work for distributed and online infinite-horizon problems.
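
The vanilla policy-gradient idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-approach intersection, the `features` encoding, the arrival probability, the step size `alpha` and the trace discount `beta` are all hypothetical choices made for illustration. The update shown is an online, eligibility-trace style gradient step for a softmax policy that maps sensor-like observations directly to a signal phase.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def features(queues):
    """Hypothetical observation: a bias term plus one 'longer queue' flag per approach."""
    ns, ew = queues
    return [1.0, 1.0 if ns > ew else 0.0, 1.0 if ew > ns else 0.0]


def step(queues, action, rng):
    """Toy dynamics (an assumption): the green approach discharges up to 2 cars,
    then each approach gains one car with probability 0.4."""
    queues = list(queues)
    queues[action] = max(0, queues[action] - 2)
    for i in range(2):
        if rng.random() < 0.4:
            queues[i] = min(20, queues[i] + 1)
    reward = -float(sum(queues))  # fewer waiting cars is better
    return queues, reward


def train(steps=20000, alpha=0.001, beta=0.9, seed=0):
    """Online vanilla policy gradient with a discounted eligibility trace."""
    rng = random.Random(seed)
    n_actions, n_feats = 2, 3
    theta = [[0.0] * n_feats for _ in range(n_actions)]
    trace = [[0.0] * n_feats for _ in range(n_actions)]
    queues = [0, 0]
    for _ in range(steps):
        phi = features(queues)
        probs = softmax([sum(w * f for w, f in zip(row, phi)) for row in theta])
        action = 0 if rng.random() < probs[0] else 1
        queues, reward = step(queues, action, rng)
        for a in range(n_actions):
            indicator = 1.0 if a == action else 0.0
            for j in range(n_feats):
                # Gradient of log pi(action | observation) for a linear softmax policy.
                grad_log = (indicator - probs[a]) * phi[j]
                trace[a][j] = beta * trace[a][j] + grad_log
                theta[a][j] += alpha * reward * trace[a][j]
    return theta
```

Calling `train()` returns the learned weight table, one row per signal phase. A natural actor-critic variant would additionally precondition the gradient direction with the inverse Fisher information matrix; that step is not shown here.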

EPrint Type: Book Section
Additional Information: Pre-proceedings version supplied.
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 4042
Deposited By: S V N Vishwanathan
Deposited On: 25 February 2008