Regularization in Reinforcement Learning

Abstract

We develop regularized counterparts of the standard Approximate Value Iteration and Approximate Policy Iteration algorithms. Our statistical analysis shows that these methods achieve an almost optimal finite-sample convergence rate for value function estimation.