Optimization framework for learning a hierarchical shape vocabulary for object class detection
This paper proposes a stochastic optimization framework for unsupervised learning of a hierarchical vocabulary of object shape intended for object class detection. We build on the approach by , which has two drawbacks: (1) learning is performed strictly bottom-up; and (2) vocabulary shapes are selected solely on the basis of their frequency of appearance. This makes the method prone to overfitting to certain parts of object shape while losing the more discriminative shape information. The idea of this paper is to cast vocabulary learning as an optimization problem that iteratively improves the hierarchy as a whole. The optimization has two phases: a bottom-up phase that learns and selects the vocabulary of shapes at each layer, and a top-down phase that extends and improves that vocabulary using feedback from the higher layers. The algorithm loops between the two learning phases several times. We have evaluated the proposed learning approach for object class detection on 11 diverse object classes taken from standard recognition data sets. Compared to the original approach , we obtain a 3 times more compact vocabulary, 2.5 times faster inference, and 10% higher detection performance, at the expense of 5 times longer training time (25 min vs. 5 min). The approach attains detection performance competitive with the current state of the art, with both faster inference and shorter training times.
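The alternation between the two phases can be illustrated with a small toy sketch. This is not the paper's algorithm: it is deterministic rather than stochastic, and all names and data in it (`bottom_up`, `top_down`, `learn`, the shapes `a`-`d`, `X`, `Y`, and the frequency and discriminativeness scores) are hypothetical, chosen only to show how feedback from a higher layer can recover a discriminative shape that frequency-based bottom-up selection alone would miss.

```python
# Toy sketch only: hypothetical names and data, not the paper's method.
# Layer 0 holds simple shapes; layer 1 shapes are compositions of layer-0 parts.
FREQ = [
    {"a": 9, "b": 8, "c": 3, "d": 2},   # layer 0: frequency of appearance
    {"X": 5, "Y": 4},                   # layer 1: frequency of appearance
]
PARTS = {"X": {"a", "b"}, "Y": {"c", "d"}}   # parts each layer-1 shape needs
DISC = {"X": 0.2, "Y": 0.9}                  # assumed discriminativeness scores


def bottom_up(freq, parts, vocab, k):
    """Per layer, add the k most frequent shapes whose parts exist one layer below."""
    for l, layer_freq in enumerate(freq):
        usable = [s for s in layer_freq if l == 0 or parts[s] <= vocab[l - 1]]
        usable.sort(key=lambda s: layer_freq[s], reverse=True)
        vocab[l].update(usable[:k])


def top_down(freq, parts, disc, vocab):
    """Each higher layer requests the parts of its most discriminative candidate."""
    for l in range(len(freq) - 1, 0, -1):
        best = max(freq[l], key=lambda s: disc[s])
        vocab[l - 1].update(parts[best])


def learn(freq, parts, disc, k=2, rounds=2):
    """Alternate the bottom-up and top-down phases for a fixed number of rounds."""
    vocab = [set() for _ in freq]
    for _ in range(rounds):
        bottom_up(freq, parts, vocab, k)
        top_down(freq, parts, disc, vocab)
    return vocab


# Baseline: a single, purely bottom-up pass selected by frequency alone.
baseline = [set() for _ in FREQ]
bottom_up(FREQ, PARTS, baseline, k=2)

looped = learn(FREQ, PARTS, DISC, k=2, rounds=2)
print("bottom-up only:", baseline)
print("with feedback: ", looped)
```

In this toy setting the purely bottom-up pass never selects `Y`, because its parts `c` and `d` are too infrequent to enter layer 0. After one round of top-down feedback those parts are added, and the next bottom-up pass can select `Y`. The sketch does not model the compactness gain reported above, since it never prunes shapes.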