Hierarchical POMDP Controller Optimization by Likelihood Maximization
M Toussaint, L Charlin and P Poupart
In: Uncertainty in Artificial Intelligence(2008).
Planning can often be simplified by decomposing
the task into smaller tasks arranged
hierarchically. Charlin et al.  recently
showed that the hierarchy discovery problem
can be framed as a non-convex optimization
problem. However, the inherent computational
difficulty of solving such an optimization
problem makes it hard to scale to realworld
problems. In another line of research,
Toussaint et al.  developed a method
to solve planning problems by maximumlikelihood
estimation. In this paper, we show
how the hierarchy discovery problem in partially
observable domains can be tackled using
a similar maximum likelihood approach.
Our technique first transforms the problem
into a dynamic Bayesian network through
which a hierarchical structure can naturally
be discovered while optimizing the policy.
Experimental results demonstrate that this
approach scales better than previous techniques
based on non-convex optimization.