PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Fast Realistic Multi-Action Recognition using Mined Dense Spatio-temporal Features
Andrew Gilbert, John Illingworth and Richard Bowden
In: 12th International Conference on Computer Vision (ICCV), 27 Sept - 04 Oct, Kyoto, Japan.


Within the field of action recognition, features and descriptors are often engineered to be sparse and invariant to transformation. While sparsity makes the problem tractable, it is not necessarily optimal in terms of class separability and classification. This paper proposes a novel approach that uses very dense corner features that are spatially and temporally grouped in a hierarchical process to produce an overcomplete compound feature set. Frequently reoccurring patterns of features are then found through data mining, designed for use with large data sets. The novel use of the hierarchical classifier allows real-time operation, and the approach is demonstrated to handle camera motion, scale, human appearance variations, occlusions and background clutter. Classification performance exceeds that of other state-of-the-art action recognition algorithms on three datasets: KTH, multi-KTH, and real-world movie sequences containing broad actions. Multiple-action localisation is performed without ground-truth localisation data, using only weak supervision in the form of class labels for each training sequence. The real-world movie dataset contains complex, realistic actions from movies; the approach outperforms the published accuracy on this dataset while also achieving real-time performance.
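The idea of finding frequently reoccurring patterns of features through data mining can be illustrated with a minimal sketch. The abstract does not specify the mining algorithm, so the code below is only a generic level-wise frequent-itemset count with Apriori-style pruning, not the authors' implementation. The function name `frequent_itemsets`, its parameters, and the feature codes such as "c1" are hypothetical; each transaction is assumed to stand for the set of quantised feature codes found in one local spatio-temporal neighbourhood.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset tuple: count} for itemsets found in >= min_support transactions.

    Illustrative only: `transactions` is an iterable of sets of hypothetical
    feature codes; this is not the method described in the paper.
    """
    transactions = [set(t) for t in transactions]
    frequent = {}
    # Items still eligible to appear in a frequent itemset; start with all of them.
    current_items = {item for t in transactions for item in t}
    for size in range(1, max_size + 1):
        counts = Counter()
        for t in transactions:
            # Apriori-style pruning: drop items that were in no frequent
            # itemset at the previous level.
            kept = sorted(t & current_items)
            for combo in combinations(kept, size):
                counts[combo] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        if not level:
            break
        frequent.update(level)
        current_items = {item for combo in level for item in combo}
    return frequent
```

For example, given the four transactions {"c1", "c2", "c3"}, {"c1", "c2"}, {"c1", "c2", "c4"} and {"c3", "c4"} with `min_support=3`, the sketch reports "c1", "c2" and the pair ("c1", "c2") as frequent, each with a count of 3. In a recognition setting such frequent combinations would then be associated with class labels, a step this sketch leaves out.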

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Machine Vision
ID Code: 6835
Deposited By: Teo de Campos
Deposited On: 08 March 2010