Learning Complex Motions by Sequencing Simpler Motion Templates
Abstraction of complex, longer motor tasks into simpler elemental movements enables humans and animals to exhibit motor skills which have not yet been matched by robots. We intuitively decompose complex motions into smaller, simpler segments. For example, when describing a simple movement like drawing a triangle with a pen, we can easily name the basic steps of this movement. Surprisingly, such abstractions have rarely been used in artificial motor-skill learning algorithms. In the standard setting, the agent has to choose actions (such as a torque or a force) at a fast time scale. As a result, both the policy and the temporal credit-assignment problem become unnecessarily complex, often beyond the reach of current machine learning methods. We introduce a new framework for temporal abstraction in reinforcement learning (RL), i.e., RL with motion templates. This setup is demanding for reinforcement learning because precise, continuous-valued decisions have to be made. We propose a new algorithm which meets this requirement and facilitates learning at an abstract level. The algorithm is an extension of the Locally Advantage-Weighted Regression (LAWER) algorithm and can learn high-quality policies by making only a few abstract decisions.