A Game-Theoretic Procedure for Learning Hierarchically Structured
This paper addresses the problem of acquiring a hierarchically structured robotic skill in a nonstationary environment. This is achieved through a combination of learning primitive strategies from observation of an expert, and autonomously synthesising composite strategies from that basis. Both aspects of this problem are approached from a game theoretic viewpoint, building on prior work in the area of multiplicative weights learning algorithms. The utility of this procedure is demonstrated through simulation experiments motivated by the problem of autonomous driving. We show that this procedure allows the agent to come to terms with two forms of uncertainty in the world – continually varying goals (due to oncoming traffic) and nonstationarity of optimisation criteria (e.g., driven by changing navigability of the road). We argue that this type of factored task specification and learning is a necessary ingredient for robust autonomous behaviour in a “large-world” setting.