Monocular human motion capture with a mixture of regressors
William Triggs and Ankur Agarwal
In: CVPR 2005, 20-26 June 2005, San Diego, California.
We address 3D human motion capture from monocular images, taking a learning-based
approach to construct a probabilistic pose estimation model from a set of
labelled human silhouettes. To compensate for ambiguities in the pose
reconstruction problem, our model explicitly calculates several possible pose
hypotheses. It uses locality on a manifold in the input space and connectivity
in the output space to identify regions of multi-valuedness in the mapping from
silhouette to 3D pose. This information is used to fit a mixture of regressors
on the input manifold, giving us a global model capable of predicting the
possible poses with corresponding probabilities. These are then used in a
dynamical-model-based tracker that automatically detects tracking failures and
re-initializes in a probabilistically correct manner. The system is trained on
motion capture data from optical sensors, using the corresponding real human
silhouettes supplemented with artificially synthesized silhouettes from several
different models for improved robustness to inter-person variations.
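
To make the mixture-of-regressors stage concrete, the following Python sketch shows
one plausible prediction-side realization, assuming Gaussian gates on the input
manifold and linear experts. It is an illustration only, not the paper's
implementation, and all identifiers are hypothetical.

    # Illustrative sketch (not the paper's code): a mixture of K linear
    # regressors gated by Gaussians on the input manifold, returning several
    # candidate 3D poses together with their probabilities.
    import numpy as np

    class MixtureOfRegressors:
        def __init__(self, gate_means, gate_covs, gate_priors, weights, biases):
            self.gate_means = gate_means    # (K, d_in)  gate centres on the input manifold
            self.gate_covs = gate_covs      # (K, d_in, d_in)  gate covariances
            self.gate_priors = gate_priors  # (K,)  mixing proportions
            self.weights = weights          # (K, d_out, d_in)  per-expert linear maps
            self.biases = biases            # (K, d_out)  per-expert offsets

        def _responsibilities(self, x):
            # Gaussian gating: p(k | x) proportional to prior_k * N(x; mean_k, cov_k).
            # Constant terms are omitted since they cancel after normalization.
            logps = []
            for m, c, p in zip(self.gate_means, self.gate_covs, self.gate_priors):
                d = x - m
                _, logdet = np.linalg.slogdet(c)
                logps.append(np.log(p) - 0.5 * (d @ np.linalg.solve(c, d) + logdet))
            logps = np.array(logps)
            r = np.exp(logps - logps.max())
            return r / r.sum()

        def predict(self, x):
            # Each expert proposes a pose; the gate responsibilities act as
            # the probabilities of the competing hypotheses.
            probs = self._responsibilities(x)
            poses = np.einsum('kij,j->ki', self.weights, x) + self.biases
            return poses, probs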
Static pose estimation is illustrated on a variety of silhouettes. The
robustness of the method is demonstrated by tracking on a real image sequence
requiring multiple automatic re-initializations.
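
The tracking stage can be sketched in the same spirit, under explicit assumptions
(a constant-velocity dynamical model and a simple consistency test; the paper's
actual mechanism may differ). The model argument is the mixture-of-regressors
object from the previous sketch.

    # Hypothetical tracking loop: a constant-velocity dynamical model predicts
    # the next pose, the regressor hypotheses are checked against the
    # prediction, and the tracker re-initializes from the regressor's most
    # probable output when no hypothesis is consistent. The gating threshold
    # and the failure test are assumptions made for this example.
    import numpy as np

    def track(model, descriptors, init_pose, gate_threshold=5.0):
        pose = np.asarray(init_pose, float)
        prev_pose = pose.copy()
        trajectory = []
        for x in descriptors:
            hypotheses, probs = model.predict(x)     # multiple pose candidates
            predicted = pose + (pose - prev_pose)    # dynamical prediction
            dists = np.linalg.norm(hypotheses - predicted, axis=1)
            if dists.min() > gate_threshold:
                # Tracking failure: no hypothesis agrees with the dynamics,
                # so re-initialize from the most probable regressor output.
                best = int(np.argmax(probs))
            else:
                # Otherwise pick the hypothesis supported by both the dynamics
                # and the regressor probabilities.
                best = int(np.argmax(probs * np.exp(-0.5 * dists ** 2)))
            prev_pose, pose = pose, hypotheses[best]
            trajectory.append(pose)
        return trajectory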