Region matching techniques for spatial bag of visual words based image category recognition
Histograms of local features, known as bags of visual words (BoV), have proven to be powerful representations in image categorisation and object detection. BoV representations have been usefully extended in the spatial dimension by taking the spatial distribution of the features into account. In this paper we describe region matching strategies to be used in conjunction with such extensions. Of these, rigid region matching is the most commonly used. Here we present an alternative based on the Integrated Region Matching (IRM) technique, which loosens the constraint of geometrical rigidity of the images. After describing the techniques, we evaluate them in image category detection experiments that utilise 5000 photographic images taken from the PASCAL VOC Challenge 2007 benchmark. The experiments show that rigid region matching performs slightly better for many image categories, whereas IRM is significantly more accurate for some categories. As a consequence, we did not observe a significant difference on average. The best results were obtained by combining the two schemes.
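To make the contrast between the two matching strategies concrete, the following is a minimal sketch rather than the implementation evaluated in the paper. It assumes a regular grid tiling of the image, an L1 distance between region histograms, and uniform region significance weights. These choices, along with the function names `region_histograms`, `rigid_distance` and `irm_distance`, are illustrative assumptions. The IRM part follows the generic greedy "most similar highest priority" assignment.

```python
import numpy as np


def region_histograms(words, xy, vocab_size, grid=2):
    """Per-region visual word histograms for a grid x grid tiling.

    words: integer visual word index of each local feature.
    xy:    feature positions scaled to the unit square, shape (n, 2).
    The regular grid is an assumption made for this sketch.
    """
    cells = np.minimum((xy * grid).astype(int), grid - 1)
    cell_index = cells[:, 1] * grid + cells[:, 0]  # row-major region index
    hists = np.zeros((grid * grid, vocab_size))
    np.add.at(hists, (cell_index, words), 1.0)  # unbuffered accumulation
    return hists


def rigid_distance(ha, hb):
    """Rigid matching: region k of one image is compared only with region k of the other."""
    return float(np.abs(ha - hb).sum(axis=1).mean())


def irm_distance(ha, hb):
    """Greedy IRM-style matching with uniform region weights (an assumption).

    Region pairs are visited in order of increasing distance. Each pair
    receives as much of the remaining weight as both regions still have.
    """
    wa = np.full(len(ha), 1.0 / len(ha))
    wb = np.full(len(hb), 1.0 / len(hb))
    # Pairwise L1 distances between all regions of the two images.
    d = np.abs(ha[:, None, :] - hb[None, :, :]).sum(axis=2)
    total = 0.0
    for flat in np.argsort(d, axis=None):
        i, j = np.unravel_index(flat, d.shape)
        s = min(wa[i], wb[j])
        if s <= 0.0:
            continue  # one of the two regions is already fully matched
        total += s * d[i, j]
        wa[i] -= s
        wb[j] -= s
    return float(total)


# Toy example: image B has the same region contents as image A,
# but the regions appear at different grid positions.
ha = np.array([[3.0, 0.0], [0.0, 3.0], [1.0, 1.0], [2.0, 0.0]])
hb = ha[[1, 0, 3, 2]]
rigid = rigid_distance(ha, hb)
irm = irm_distance(ha, hb)
```

In this toy example the rigid distance is positive because corresponding grid cells differ, while the IRM-style distance is zero because every region finds an identical partner elsewhere in the other image. This illustrates the sense in which IRM loosens the constraint of geometrical rigidity. Note that the greedy assignment is not guaranteed to yield a smaller distance than rigid matching in general.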