PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Weak hypotheses and boosting for generic object detection and recognition
Andreas Opelt, Michael Fussenegger, Axel Pinz and Peter Auer
In: ECCV 2004, 11-14 May 2004, Prague, Czech Republic.

There is a more recent version of this eprint available.


In this paper we describe the first stage of a new learning system for object detection and recognition. We propose Boosting [5] as the underlying learning technique. This allows very diverse sets of visual features to be used in the learning process within a common framework: Boosting, together with a weak hypotheses finder, may choose very inhomogeneous features as most relevant for combination into a final hypothesis. As a further advantage, the weak hypotheses finder may search the weak hypotheses space without explicitly calculating all available hypotheses, which reduces computation time. This contrasts with the related work of Agarwal and Roth [1], where Winnow was used as the learning algorithm and all weak hypotheses were calculated explicitly. In our first empirical evaluation we use four types of local descriptors: two basic ones, consisting of a set of grayvalues and intensity moments, and two high-level descriptors, moment invariants [8] and SIFTs [12]. The descriptors are calculated from local patches detected by an interest point operator. The weak hypotheses finder selects one of the local patches and one type of local descriptor, and efficiently searches for the most discriminative similarity threshold. This differs from other work on Boosting for object recognition, where simple rectangular hypotheses [22] or complex classifiers [20] have been used. In relatively simple images, where the objects are prominent, our approach yields results comparable to the state of the art [3]. We also obtain very good results on more complex images, where the objects appear in arbitrary positions, poses, and scales. These results indicate that our flexible approach, which also allows the inclusion of features from segmented regions and even spatial relationships, takes us a significant step towards generic object recognition.
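The weak hypotheses finder described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single descriptor type, Euclidean distance as the similarity measure, labels in {+1, -1}, and boosting weights that sum to 1, and all function names are hypothetical. Unlike the finder in the paper, it enumerates every candidate patch exhaustively. Only the threshold search is done in one pass over sorted distances.

```python
import math


def min_distance(ref, descriptors):
    """Distance from a reference descriptor to the closest descriptor of one image."""
    return min(math.dist(ref, d) for d in descriptors)


def best_threshold(dists, labels, weights):
    """Return (threshold, weighted_error) for the rule: predict +1 if dist <= threshold.

    Distances are scanned in increasing order, so each threshold is
    evaluated by updating the error of the previous one.
    """
    order = sorted(range(len(dists)), key=lambda i: dists[i])
    # A threshold below every distance predicts -1 everywhere, so the
    # error is the total weight of the positive examples.
    err = sum(w for w, y in zip(weights, labels) if y == +1)
    best_err, best_thr = err, -math.inf
    for pos, i in enumerate(order):
        # Raising the threshold to dists[i] flips image i to a +1 prediction.
        err += -weights[i] if labels[i] == +1 else weights[i]
        # Tied distances cannot be separated; evaluate only after the last tie.
        last_of_ties = pos + 1 == len(order) or dists[order[pos + 1]] > dists[i]
        if last_of_ties and err < best_err:
            best_err, best_thr = err, dists[i]
    return best_thr, best_err


def find_weak_hypothesis(images, labels, weights):
    """Pick the (reference descriptor, threshold) pair with lowest weighted error.

    images: list of images, each a list of descriptor vectors.
    Assumption: candidate patches are taken from positive images only.
    """
    best = None
    for img, y in zip(images, labels):
        if y != +1:
            continue
        for ref in img:
            dists = [min_distance(ref, other) for other in images]
            thr, err = best_threshold(dists, labels, weights)
            if best is None or err < best[2]:
                best = (ref, thr, err)
    return best
```

In a boosting loop, `find_weak_hypothesis` would be called once per round with the current example weights, and the returned hypothesis would be added to the final weighted combination. Handling several descriptor types would amount to repeating the search per type and keeping the best result.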

EPrint Type: Conference or Workshop Item (Paper)
Additional Information: Available also from
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Machine Vision; Learning/Statistics & Optimisation
ID Code: 133
Deposited By: Peter Auer
Deposited On: 25 November 2004
