A Practical and Conceptual Framework for Learning in Control
Marc Deisenroth and Carl Edward Rasmussen
University of Washington.
We propose a fully Bayesian approach for efficient reinforcement
learning (RL) in Markov decision processes with continuous-valued
state and action spaces when no expert knowledge is available. Our
framework is based on well-established ideas from statistics and
machine learning, and it learns quickly because it carefully models,
quantifies, and incorporates available knowledge when making
decisions. The key ingredient of our framework is a probabilistic
model, implemented as a Gaussian process (GP), a
distribution over functions. In the context of dynamic systems, the
GP models the transition function. By considering all plausible
transition functions simultaneously, we reduce model bias, a problem
that frequently arises when a single deterministic model is used.
Due to its generality and efficiency, our RL framework can be
considered a conceptual and practical approach to learning models
and controllers when expert knowledge is difficult to obtain or
simply unavailable, which makes classical system identification hard.