PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Enhancing Audio Speech using Visual Speech Features
Ibrahim Almajai and Ben Milner
In: Interspeech 2009, 6-10 Sep 2009, Brighton, UK.


This work presents a novel approach to speech enhancement that exploits the bimodality of speech and the correlation between audio and visual speech features. For speech enhancement, a visually-derived Wiener filter is developed. Clean speech statistics are obtained from visual features by modelling the joint density of audio and visual features and making a maximum a posteriori estimate of clean audio from the visual speech features. Noise statistics for the Wiener filter are obtained using an audio-visual voice activity detector, which classifies input audio as speech or non-speech, enabling a noise model to be updated. Analysis shows the estimation of speech and noise statistics to be effective, and human listening tests measure the effectiveness of the resulting Wiener filter.
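
As a rough illustration of how the two sets of statistics in the abstract could combine, the sketch below applies a textbook per-bin Wiener gain, H = S / (S + N), to one frame. It is not the authors' implementation. The function names, the `alpha` smoothing factor, the `floor` constant and the recursive-averaging noise update are all assumptions made for the example. In the paper the clean speech estimate comes from the visual features and the speech/non-speech decision from the audio-visual voice activity detector; here both are simply passed in as inputs.

```python
def wiener_gain(speech_psd, noise_psd, floor=1e-12):
    """Textbook per-bin Wiener gain H = S / (S + N).

    speech_psd: estimated clean-speech power per frequency bin
                (assumed to be supplied by some external estimator).
    noise_psd:  estimated noise power per frequency bin.
    floor:      hypothetical guard against division by zero.
    """
    return [s / max(s + n, floor) for s, n in zip(speech_psd, noise_psd)]


def update_noise_psd(noise_psd, frame_psd, is_speech, alpha=0.9):
    """Update the noise model only in frames labelled non-speech.

    Recursive averaging with factor alpha is an assumed, commonly used
    update rule; the source does not state which rule it uses.
    """
    if is_speech:
        return list(noise_psd)  # leave the noise model unchanged
    return [alpha * n + (1.0 - alpha) * x for n, x in zip(noise_psd, frame_psd)]


def enhance_frame(noisy_spectrum, speech_psd, noise_psd):
    """Scale each bin of the noisy magnitude spectrum by the Wiener gain."""
    gain = wiener_gain(speech_psd, noise_psd)
    return [g * x for g, x in zip(gain, noisy_spectrum)]
```

In use, a caller would run `update_noise_psd` on every frame with the detector's decision, then call `enhance_frame` with the current speech and noise estimates.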

EPrint Type: Conference or Workshop Item (Oral)
Project Keyword: Project Keyword UNSPECIFIED
Multimodal Integration
ID Code: 5616
Deposited By: Ibrahim Almajai
Deposited On: 08 March 2010