Past the limits of bag-of-features
PhD thesis, Institut Polytechnique de Grenoble.
This dissertation explores and extends the limits of bag-of-features based visual recognition. A state-of-the-art recognition framework based on local features and non-linear kernel methods is developed and evaluated, then possible extensions are explored.
We start with a large-scale evaluation of a bag-of-features framework for image and video classification. The features most useful for visual recognition in real-world conditions are carefully selected, and different orderless or locally orderless feature distributions are evaluated. State-of-the-art texture and object recognition results are achieved by combining different feature channels. To address the recognition of natural human actions in diverse and realistic video settings, we discuss the problem of learning from automatically annotated data.
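To make the terms "orderless representation" and "combining feature channels" concrete, the following is a minimal sketch, not the implementation used in the thesis. It assumes a precomputed visual vocabulary, and it assumes a chi-square distance with an exponential multi-channel kernel as one common non-linear choice; the abstract does not name the kernel. All function names and the `scales` parameter are hypothetical.

```python
import numpy as np


def bof_histogram(descriptors, vocabulary):
    """Assign each local descriptor to its nearest visual word and
    return an L1-normalized histogram of word counts (orderless)."""
    # Squared Euclidean distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()


def chi2_distance(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms (assumed choice)."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))


def multichannel_kernel(channels_a, channels_b, scales):
    """Combine several feature channels into one kernel value.

    channels_a, channels_b: lists of histograms, one per channel.
    scales: one positive normalization constant per channel (hypothetical;
    in practice often set from the mean channel distance on training data).
    """
    total = sum(
        chi2_distance(a, b) / s
        for a, b, s in zip(channels_a, channels_b, scales)
    )
    return np.exp(-total)
```

Because the histogram only counts word occurrences, shuffling the descriptors of an image leaves its representation unchanged, which is what "orderless" refers to. The resulting kernel values could be passed to any kernel classifier that accepts a precomputed kernel matrix.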
To tackle object class recognition in difficult, real-world conditions, we study the influence of background correlations and clutter on bag-of-features based methods. Subsequently, weak geometrical constraints on the orderless representation are proposed to reduce the influence of background clutter and improve classification performance. These constraints are based on shape masks demarcating object extents and can also be used to localize objects with approximate outlines. Localization results show that the proposed method handles multiple object views, articulations and occlusions well.
Finally, we study class hierarchies in the context of object recognition. First, the construction of class hierarchies from visual data is evaluated. We show that current approaches incorporate separability assumptions that are unlikely to hold for a large number of object categories. An appropriate relaxation which avoids this assumption is proposed. Second, we investigate the extraction of semantic class hierarchies from lexical networks. We add semantic awareness to the recognition process and describe the new perspectives it opens in generic object recognition.
|EPrint Type:||Thesis (PhD)|
|Deposited By:||Marcin Marszalek|
|Deposited On:||24 March 2009|