Monocular 3D Reconstruction of Human Motion in Long Action Sequences
Gareth Loy, Josephine Sullivan and Stefan Carlsson
Springer Lecture Notes in Computer Science
Abstract. A novel algorithm is presented for the 3D reconstruction of human action in long (> 30 second) monocular image sequences. A sequence is represented by a small set of automatically found representative keyframes. The skeletal joint positions are manually located in each keyframe and mapped to all other frames in the sequence. For each keyframe a 3D key pose is created, and interpolation between these 3D body poses, together with the incorporation of limb length and symmetry constraints, provides a smooth initial approximation of the 3D motion. This is then fitted to the image data to generate a realistic 3D reconstruction. The degree of manual input required is controlled by the diversity of the sequence's content. Sports footage is ideally suited to this approach as it frequently contains a limited number of repeated actions. Our method is demonstrated on a long (36 second) sequence of a woman playing tennis filmed with a non-stationary camera. This sequence required manual initialisation on < 1.5% of the frames, and demonstrates that the system can deal with very rapid motion, severe self-occlusions, motion blur and clutter occurring over several concurrent frames. The monocular 3D reconstruction is verified by synthesising a view from the perspective of a ground truth reference camera, and the result is seen to provide a qualitatively accurate 3D reconstruction of the motion.
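As a rough illustration of the interpolation step mentioned in the abstract, the Python sketch below linearly blends two 3D key poses and then rescales each limb to a fixed length. It is not the authors' implementation: the four-joint chain `BONES`, the function names, the linear blend and the rescaling scheme are all assumptions made for illustration, and the subsequent fitting to image data is omitted.

```python
import numpy as np

# Hypothetical 4-joint chain as (parent, child) pairs, ordered from the root
# outwards. A real body model would use a full skeleton tree.
BONES = [(0, 1), (1, 2), (2, 3)]


def enforce_limb_lengths(pose, lengths, bones=BONES):
    """Rescale each bone to its target length while keeping its direction.

    pose    : (n_joints, 3) array of 3D joint positions
    lengths : one target length per entry in `bones`
    """
    fixed = pose.copy()
    for (parent, child), length in zip(bones, lengths):
        direction = pose[child] - pose[parent]
        norm = np.linalg.norm(direction)
        if norm > 0:
            direction = direction / norm
        # A zero-length bone has no direction; the child then collapses
        # onto its parent (degenerate case, not handled further here).
        fixed[child] = fixed[parent] + length * direction
    return fixed


def interpolate_poses(pose_a, pose_b, n_frames, lengths, bones=BONES):
    """Linearly blend two key poses, restoring limb lengths in every frame."""
    frames = []
    for t in np.linspace(0.0, 1.0, n_frames):
        blended = (1.0 - t) * pose_a + t * pose_b
        frames.append(enforce_limb_lengths(blended, lengths, bones))
    return np.stack(frames)


# Usage: a straight chain along x blended into a straight chain along y.
pose_a = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]], dtype=float)
pose_b = np.array([[0, 0, 0], [0, 1, 0], [0, 2, 0], [0, 3, 0]], dtype=float)
motion = interpolate_poses(pose_a, pose_b, n_frames=5, lengths=[1.0, 1.0, 1.0])
```

Plain linear blending shortens limbs in the intermediate frames, which is why the lengths are restored after each blend. Under the same assumptions, a symmetry constraint could be approximated by passing identical target lengths for corresponding left and right limbs.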