PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Multi-person visual focus of attention from head pose and meeting contextual cues
Sileye Ba and Jean-Marc Odobez
IEEE Trans. on Pattern Analysis and Machine Intelligence, accepted for publication 2009.


This paper introduces a novel contextual model for recognizing people’s visual focus of attention (VFOA) in meetings from audio-visual perceptual cues. More specifically, instead of independently recognizing the VFOA of each meeting participant from his own head pose, we propose to jointly recognize the participants’ visual attention in order to introduce context-dependent interaction models that relate to group activity and the social dynamics of communication. Meeting contextual information is represented by the location of people, conversational events identifying floor-holding patterns, and a presentation activity variable. By modeling the interactions between the different contexts and their combined and sometimes contradictory impact on the gazing behavior, our model makes it possible to handle VFOA recognition in difficult task-based meetings involving artifacts, presentations, and moving people. We validated our model through rigorous evaluation on a publicly available and challenging dataset of 12 real meetings (five hours of data). The results demonstrated that integrating the presentation and conversation dynamical context using our model can lead to significant performance improvements.

EPrint Type: Article
Subjects: User Modelling for Computer Human Interaction; Multimodal Integration
ID Code: 6209
Deposited By: Jean-Marc Odobez
Deposited On: 08 March 2010