A time series kernel for action recognition
Adrien Gaidon, Zaid Harchaoui and Cordelia Schmid
In: A time series kernel for action recognition, 29 Aug - 02 Sep 2011, United Kingdom.
We address the problem of action recognition by describing actions as time series of frames and introduce a new kernel to compare their dynamical aspects. Action recognition in realistic videos has been successfully addressed using kernel methods like SVMs. Most existing approaches average local features over video volumes and compare the resulting vectors using kernels on bags of features. In contrast, we model actions as time series of per-frame representations and propose a kernel specifically tailored for the purpose of action recognition. Our main contributions are the following: (i) we provide a new principled way to compare the dynamics and temporal structure of actions by computing the distance between their auto-correlations, (ii) we derive a practical formulation to compute this distance in any feature space derived from a base kernel between frames, and (iii) we report experimental results on recent action recognition datasets showing that it provides useful complementary information to the average distribution of frames, as used in state-of-the-art models based on bag-of-features.
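The idea in contributions (i) and (ii) can be illustrated with a minimal sketch. It is not the authors' exact formulation: it assumes an uncentered, single-lag empirical auto-correlation and uses the identity that the inner product of two rank-one terms, one pairing frame x_t with x_{t+1} and the other pairing y_s with y_{s+1}, equals k(x_t, y_s) k(x_{t+1}, y_{s+1}). The squared distance between auto-correlations can then be computed from base-kernel evaluations between frames alone. The function names, the `lag` and `gamma` parameters, and the choice of base kernels are all assumptions made for illustration.

```python
import numpy as np


def linear_kernel(X, Y):
    """Base kernel between frames: plain dot product (illustrative choice)."""
    return X @ Y.T


def rbf_kernel(X, Y, gamma=1.0):
    """Base kernel between frames: Gaussian RBF (illustrative choice)."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def autocorr_inner(X, Y, kernel, lag=1):
    """Inner product between the lagged auto-correlations of two sequences.

    X has shape (Tx, d) and Y has shape (Ty, d): one row per frame.
    Only base-kernel values between frames are needed.
    """
    K = kernel(X, Y)                        # K[t, s] = k(x_t, y_s)
    prod = K[:-lag, :-lag] * K[lag:, lag:]  # k(x_t, y_s) * k(x_{t+lag}, y_{s+lag})
    return prod.mean()                      # average over all (t, s) pairs


def autocorr_distance_sq(X, Y, kernel, lag=1):
    """Squared distance between auto-correlations, clipped at zero."""
    d2 = (autocorr_inner(X, X, kernel, lag)
          + autocorr_inner(Y, Y, kernel, lag)
          - 2.0 * autocorr_inner(X, Y, kernel, lag))
    return max(float(d2), 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 3))   # toy sequence: 6 frames, 3-dim features
    X_rev = X[::-1]               # same frames, played backwards
    print(autocorr_distance_sq(X, X, rbf_kernel))
    print(autocorr_distance_sq(X, X_rev, rbf_kernel))
```

In this toy usage, a sequence and its time reversal contain exactly the same frames, so any average over frames is identical for both, whereas the auto-correlation distance can still differ. This is one way to read the claim that the kernel carries information complementary to the average distribution of frames.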