PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Risk Sensitive Path Integral Control
Bart Broek and Bert Kappen
In: BNAIC 2010, 25 Oct - 26 Oct 2010, Luxembourg.

Abstract

1 Introduction

The objective in conventional stochastic optimal control is to minimize an expected cost-to-go. Risk sensitive optimal control generalizes this objective by minimizing an expected exponentiated cost-to-go. Depending on its risk parameter θ, the expected exponentiated cost-to-go puts more emphasis on the mode of the distribution of the cost-to-go, or on its tail, and in that way allows modelling of more risk seeking (θ < 0) or risk averse (θ > 0) behaviour. Conventional optimal control can be viewed as a special case of risk sensitive optimal control with a risk neutral parameter θ = 0.

Risk sensitive control was first considered in continuous space in the LEQG problem [1], which is the risk sensitive analogue of the Linear Quadratic Gaussian (LQG) problem. Relations with other fields such as differential games and robust control have generated considerable interest in risk sensitive control.

The dynamic programming (DP) principle provides a well-known approach to a global solution in stochastic optimal control. In the continuous time and state setting that we will consider, it follows from the DP principle that the solution to the control problem satisfies the so-called Hamilton-Jacobi-Bellman (HJB) equation, which is a second order nonlinear partial differential equation. If the dynamics are linear and the cost is quadratic in both state and control, the HJB equation can be solved exactly, both for LQG and LEQG.

Recently, a path integral formalism has been developed to solve the HJB equation. This formalism is applicable if (1) both the noise and the control are additive to the (nonlinear) dynamics, (2) the cost is quadratic in the control (but arbitrary in the state), and (3) the noise satisfies certain additional conditions. Under these conditions the nonlinear HJB equation can be transformed into a linear one, which can be solved by forward integration of a diffusion process [2]. This formalism contains LQG control as a special case.

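As an illustration of solving the linearized HJB equation by forward integration of a diffusion process, the following is a minimal sketch for the risk neutral case only. It assumes a one-dimensional problem with zero drift, no state cost, and a quadratic end cost, and it uses the commonly stated path integral form J(x, t) = -λ log E[exp(-φ(x_T)/λ)], where the expectation is over the uncontrolled diffusion. The function name, the parameters (nu for the noise variance, lam for λ, alpha for the end-cost weight) and the test problem are hypothetical choices made for this sketch, not taken from the paper.

```python
import math
import random

def path_integral_cost_to_go(x0, horizon, nu, lam, end_cost,
                             n_paths=20000, n_steps=20, seed=0):
    """Monte Carlo estimate of J = -lam * log E[exp(-end_cost(x_T) / lam)].

    Paths follow the uncontrolled diffusion dx = dxi with noise variance nu
    per unit time (assumption: zero drift and no running state cost).
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    step_sd = math.sqrt(nu * dt)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_sd)  # Euler step of the uncontrolled process
        total += math.exp(-end_cost(x) / lam)
    return -lam * math.log(total / n_paths)

# Hypothetical test problem: end cost 0.5 * alpha * x^2.
alpha, nu, lam, x0, horizon = 1.0, 1.0, 1.0, 1.0, 1.0
estimate = path_integral_cost_to_go(x0, horizon, nu, lam,
                                    lambda x: 0.5 * alpha * x * x)

# Closed form for this Gaussian case, since x_T ~ N(x0, nu * horizon).
s2 = nu * horizon
exact = (0.5 * lam * math.log(1.0 + alpha * s2 / lam)
         + 0.5 * alpha * x0 ** 2 * lam / (lam + alpha * s2))
print(estimate, exact)
```

The closed form is included only as a sanity check on the sampling estimate; for nonlinear dynamics or non-quadratic state costs no such expression is generally available, which is where sampling becomes useful.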
In our full paper [3] we show how path integral control generalizes to risk sensitive control problems. The required conditions to apply path integral control in the risk sensitive case are the same as those in the risk neutral setting. As a consequence, characteristics of path integral control, such as superposition of controls, symmetry breaking and approximate inference, carry over to the setting of risk sensitive control.
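To make the role of the risk parameter concrete, here is a small numerical sketch of an exponentiated cost criterion evaluated on sampled costs. It assumes the common normalization (1/θ) log E[exp(θC)], which may differ from the exact convention used in the full paper. The function name and the Gaussian cost samples are hypothetical and serve only as an illustration.

```python
import math
import random

def risk_sensitive_cost(costs, theta):
    """Sample estimate of (1/theta) * log E[exp(theta * C)].

    theta = 0 is treated as the risk neutral limit, i.e. the plain mean.
    """
    n = len(costs)
    if theta == 0.0:
        return sum(costs) / n
    # Log-sum-exp shift for numerical stability.
    shift = max(theta * c for c in costs)
    log_mean = shift + math.log(sum(math.exp(theta * c - shift) for c in costs) / n)
    return log_mean / theta

# Hypothetical cost samples: Gaussian with mean 1.0 and standard deviation 0.5.
rng = random.Random(0)
costs = [rng.gauss(1.0, 0.5) for _ in range(20000)]

averse = risk_sensitive_cost(costs, 1.0)    # theta > 0 penalizes high-cost outcomes
neutral = risk_sensitive_cost(costs, 0.0)   # theta = 0 recovers the expected cost
seeking = risk_sensitive_cost(costs, -1.0)  # theta < 0 rewards low-cost outcomes
print(seeking, neutral, averse)
```

For Gaussian costs with mean μ and variance σ², this criterion equals μ + θσ²/2, so the three values should lie near 0.875, 1.0 and 1.125. The ordering seeking < neutral < averse holds for any non-degenerate sample by Jensen's inequality.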

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Machine Vision; User Modelling for Computer Human Interaction; Learning/Statistics & Optimisation
ID Code: 7040
Deposited By: Bert Kappen
Deposited On: 03 February 2011