Machine Learning for Multimodal Interaction: Second International Workshop, MLMI'2005
Steve Renals and Samy Bengio, ed.
Lecture Notes in Computer Science
, Volume 3869
This book contains a selection of refereed papers presented at the Second Workshop on Machine Learning for Multimodal Interaction (MLMI 2005), held in Edinburgh, Scotland, during 11-13 July 2005. The workshop was organized and sponsored jointly by two European integrated projects, three European Networks of Excellence and a Swiss national research network: AMI, CHIL, HUMAINE, PASCAL, SIMILAR, and IM2. In addition to the main workshop, MLMI 2005 hosted the NIST (US National Institute of Standards and Technology) Meeting Recognition Workshop. This workshop (the third such sponsored by NIST) was centered on the Rich Transcription 2005 Spring Meeting Recognition (RT-05) evaluation of speech technologies within the meeting domain. Building on the success of the RT-04 spring evaluation, the RT-05 evaluation continued the speech-to-text and speaker diarization evaluation tasks and added two new evaluation tasks: speech activity detection and source localization. Given the multiple links between the above projects and several related research areas, and the success of the first MLMI 2004 workshop, it was decided to organize once again a joint workshop bringing together researchers working around the common theme of advanced machine learning algorithms for processing and structuring multimodal human interaction. The motivation for creating such a forum, which could be perceived as a number of papers from different research disciplines, evolved from an actual need that arose from these projects and the strong motivation of their partners for such a multidisciplinary workshop. The areas covered included: Human-human communication modeling, Speech and visual processing, Multimodal processing, fusion and fission, Multimodal dialog modeling, Human-human interaction modeling, Multimodal data structuring and presentation, Multimedia indexing and retrieval, Meeting structure analysis, Meeting summarizing, Multimodal meeting annotation, and Machine learning applied to the above.