Adaptive State Aggregation for Reinforcement Learning
Reinforcement Learning (RL) is a promising computational approach for constructing autonomous systems that improve their performance with experience. Its applications range from robotics, through industrial manufacturing and scheduling, to combinatorial search problems such as board games. For small problems there exist efficient RL algorithms with formal guarantees and polynomial learning rates. However, these algorithms become infeasible when the state and/or action spaces are very large or infinite, since their time and space complexity is typically polynomial in the size of the space. On the other hand, most existing algorithms for solving large problems are heuristic in nature and come without formal guarantees. In this thesis we propose new algorithms aimed at solving the online, continuous state space reinforcement learning problem with provably efficient exploration of the state space. The proposed algorithms use an adaptive state aggregation approach, going from coarse to fine grids over the state space, which makes it possible to use finer resolution in the important areas of the state space and coarser resolution elsewhere. We consider an online learning approach, in which these important areas are discovered online using a confidence interval exploration technique. Polynomial learning rates (in terms of sample complexity) are established for these algorithms.
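To make the coarse-to-fine aggregation idea concrete, the following is a minimal sketch over a one-dimensional state space [0, 1). All names, thresholds, and the specific split rule here are illustrative assumptions for exposition, not the thesis's actual algorithms: cells keep an empirical value estimate with a UCB-style confidence bonus for exploration, and frequently visited (hence "important") cells are split in half while rarely visited areas stay coarse.

```python
import math


class AdaptiveAggregator:
    """Toy sketch of adaptive state aggregation with optimistic
    (confidence-interval) exploration over the state space [0, 1).

    This is a simplified illustration, not the thesis's algorithm:
    the split rule, bonus constant, and thresholds are all assumptions.
    """

    def __init__(self, split_after=50, min_width=1e-3):
        # each cell: [lo, hi, visit_count, running_mean_value]
        self.cells = [[0.0, 1.0, 0, 0.0]]
        self.split_after = split_after  # visits before a cell is refined
        self.min_width = min_width      # finest allowed resolution

    def cell_of(self, s):
        """Index of the cell containing state s."""
        for i, (lo, hi, _, _) in enumerate(self.cells):
            if lo <= s < hi:
                return i
        return len(self.cells) - 1  # handles s == 1.0

    def optimistic_value(self, s):
        """Empirical mean plus a Hoeffding-style exploration bonus."""
        _, _, n, mean = self.cells[self.cell_of(s)]
        if n == 0:
            return float("inf")  # unvisited cells are maximally attractive
        return mean + math.sqrt(2.0 * math.log(1000) / n)

    def observe(self, s, value):
        """Record an observed value at state s; refine the cell if needed."""
        i = self.cell_of(s)
        lo, hi, n, mean = self.cells[i]
        n += 1
        mean += (value - mean) / n  # incremental running mean
        self.cells[i] = [lo, hi, n, mean]
        # coarse-to-fine refinement: a frequently visited ("important")
        # cell is split in half; rarely visited areas stay coarse
        if n >= self.split_after and hi - lo > self.min_width:
            mid = (lo + hi) / 2.0
            self.cells[i] = [lo, mid, 0, 0.0]
            self.cells.insert(i + 1, [mid, hi, 0, 0.0])
```

In this sketch the grid adapts online: repeatedly observing states near 0.25 splits the interval around 0.25 into ever-finer cells, while the rest of [0, 1) keeps its coarse resolution and a large exploration bonus.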