Boosting Active Learning to Optimality: A tractable Monte-Carlo based Approach.
Olivier Teytaud, Michele Sebag and Philippe Rolet
In: ECML 2009 (2009).
This paper focuses on Active Learning with a limited number of queries;
in application domains such as Numerical Engineering, the size of the
training set might be limited to a few dozen or hundred examples due to
computational constraints. Active Learning under bounded resources is
formalized as a finite horizon Reinforcement Learning problem, where the
sampling strategy aims at minimizing the expectation of the
generalization error. A tractable approximation of the optimal
(intractable) policy is presented, the Bandit-based Active Learner (BAAL)
algorithm. Viewing Active Learning as a single-player game, BAAL combines
UCT, the tree structured multi-armed bandit algorithm proposed by Kocsis
and Szepesvari (2006), and billiard algorithms. A proof of principle of
the approach demonstrates its good empirical convergence toward an
optimal policy and its ability to incorporate prior AL criteria. Its
hybridization with the Query-by-Committee approach is found to improve on
both stand-alone BAAL and stand-alone QbC.
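
To make the bandit component concrete, the sketch below shows the standard UCB1 selection rule that UCT applies at each tree node. It is only an illustration and not the BAAL implementation: the function name `ucb1_select`, the exploration constant `c`, and the reading of each arm as a candidate query with a reward in [0, 1] (for instance, one minus an estimated generalization error) are assumptions not taken from the abstract.

```python
import math


def ucb1_select(counts, totals, c=math.sqrt(2)):
    """Return the index of the arm (here: a candidate query) to try next.

    counts[i] is the number of times arm i has been tried so far.
    totals[i] is the sum of rewards observed for arm i.
    Both the reward definition and the constant c are assumptions
    made for this illustration.
    """
    # Try every arm once before the confidence bound is defined.
    for i, n in enumerate(counts):
        if n == 0:
            return i

    total_visits = sum(counts)
    # Score = empirical mean reward + exploration bonus that shrinks
    # as the arm is tried more often.
    scores = [
        totals[i] / counts[i]
        + c * math.sqrt(math.log(total_visits) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal visit counts the rule picks the arm with the higher mean reward, while a rarely tried arm can still be selected because its exploration bonus is large.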