Identifying individuals in video by combining generative and discriminative head models
Mark Everingham and Andrew Zisserman
In: ICCV 2005, 17-20 Oct 2005, Beijing, China.
The objective of this work is automatic detection and identification of individuals in unconstrained consumer video, given a minimal number of labelled faces as training data. Whilst much work has been done on (mainly frontal) face detection and recognition, current methods are not sufficiently robust to deal with the wide variations in pose and appearance found in such video. These include variations in scale, illumination, expression, partial occlusion, motion blur, etc.
We describe two areas of innovation: the first is to capture the 3-D appearance of the entire head, rather than just the face region, so that visual features such as the hairline can be exploited; the second is to combine discriminative and "generative" approaches for detection and recognition. Images rendered using the head model are used to train a discriminative tree-structured classifier, giving efficient detection and pose estimation over a very wide pose range with three degrees of freedom. Subsequent verification of the identity is obtained using the head model in a "generative" framework. We demonstrate excellent performance in detecting and identifying three characters and their poses in a TV situation comedy.
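The two-stage architecture described above can be sketched in miniature. The following toy Python code is not the authors' implementation; the class names, the one-dimensional "yaw" pose, and the L1 verification score are all hypothetical stand-ins, chosen only to illustrate how a coarse-to-fine tree of pose classifiers can feed a generative verification step.

```python
# Illustrative sketch (NOT the paper's code): a tree-structured
# discriminative classifier gives a coarse-to-fine pose estimate, and a
# generative model then verifies identity by scoring how closely a
# rendered view matches the observed appearance.

from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

@dataclass
class PoseNode:
    """A node tests whether the head pose lies in its range (1-DoF toy)."""
    pose_range: Tuple[float, float]          # (min_yaw, max_yaw)
    accepts: Callable[[float], bool]
    children: List["PoseNode"] = field(default_factory=list)

def detect_pose(node: PoseNode, yaw: float) -> Optional[Tuple[float, float]]:
    """Walk the tree coarse-to-fine; return the finest accepted pose range."""
    if not node.accepts(yaw):
        return None
    for child in node.children:
        result = detect_pose(child, yaw)
        if result is not None:
            return result
    return node.pose_range

def verify_identity(rendered: List[float], observed: List[float],
                    threshold: float = 0.5) -> bool:
    """Generative verification: accept the hypothesised identity if the
    rendering of that person's head model is close enough to the observed
    appearance (here a toy mean-L1 residual stands in for the real score)."""
    residual = sum(abs(r - o) for r, o in zip(rendered, observed))
    return residual / len(rendered) < threshold

def in_range(lo: float, hi: float) -> Callable[[float], bool]:
    return lambda yaw: lo <= yaw <= hi

# Toy tree over yaw: the root covers [-90, 90] and its children split it,
# mirroring the coarse-to-fine structure of the detector.
root = PoseNode((-90.0, 90.0), in_range(-90.0, 90.0), [
    PoseNode((-90.0, 0.0), in_range(-90.0, 0.0)),
    PoseNode((0.0, 90.0), in_range(0.0, 90.0)),
])

pose = detect_pose(root, 30.0)                   # finest range containing 30
ok = verify_identity([0.2, 0.4], [0.25, 0.35])   # small residual -> accept
```

In the paper, of course, the tree nodes are classifiers trained on images rendered from the 3-D head model rather than simple interval tests, and verification compares a full rendered head against the detected image region.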
EPrint Type: Conference or Workshop Item (Poster)
Deposited By: Mudigonda Pawan Kumar
Deposited On: 20 October 2005