Weighted Sampling for Large-Scale Boosting
Zdenek Kalal, Jiri Matas and Krystian Mikolajczyk
This paper addresses the problem of learning from very large
databases where batch learning is impractical or even infeasible.
Bootstrap is a popular technique applicable in such situations. We
show that the sampling strategy used for bootstrapping has a significant
impact on the resulting classifier performance. We design a new
general sampling strategy, "quasi-random weighted sampling +
trimming" (QWS+), that includes well-established strategies as
special cases. The QWS+ approach minimizes the variance of the
hypothesis error estimate and leads to a significant improvement in
performance compared to standard sampling techniques. The superior
performance is demonstrated on several problems, including profile
and frontal face detection.
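
The abstract names the two ingredients of QWS+ but does not spell out the procedure, so the following is only an illustrative sketch and not the authors' implementation. It assumes that "trimming" means discarding the lowest-weight examples that together carry a small fraction of the total weight, and that "quasi-random weighted sampling" can be approximated by placing evenly spaced points, with a single random offset, on the cumulative weight distribution. The function names and the keep_mass parameter are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def trim(weights, keep_mass=0.9):
    """Return indices of the heaviest examples holding at least keep_mass
    of the total weight (assumed meaning of "trimming")."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    threshold = keep_mass * sum(weights)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += weights[i]
        if mass >= threshold:
            break
    return kept


def quasi_random_weighted_sample(weights, n, rng=random):
    """Draw n indices with probability proportional to weight, using n evenly
    spaced points and one shared random offset instead of n independent
    draws (an assumed stand-in for a quasi-random sequence)."""
    cum = list(accumulate(weights))
    step = cum[-1] / n
    offset = rng.random() * step
    last = len(cum) - 1
    # bisect_right finds the first cumulative weight strictly above the point;
    # min() guards against floating-point overshoot at the upper end.
    return [min(bisect_right(cum, offset + k * step), last) for k in range(n)]
```

With this scheme, an example whose expected count is c is selected either floor(c) or ceil(c) times, whereas independent weighted draws let the count vary more widely. This illustrates, under the stated assumptions, why a quasi-random scheme can lower the variance of estimates computed from the sample.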