PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Model-driven detection of clean speech patches in noise
Jonathan Laidler, Martin Cooke and Neil Lawrence
In: Interspeech 2007, 27-31 Aug 2007, Antwerp, Belgium.


Listeners may be able to recognise speech in adverse conditions by glimpsing time-frequency regions where the target speech is dominant. Previous computational attempts to identify such regions have been source-driven, using primitive cues. This paper describes a model-driven approach in which the likelihood of spectro-temporal patches of a noisy mixture representing speech is given by a generative model. The focus is on patch size and patch modelling. Small patches lead to a lack of discrimination, while large patches are more likely to contain contributions from other sources. A cleanness measure reveals that a good patch size is one which extends over a quarter of the speech frequency range and lasts for 40 ms. Gaussian mixture models are used to represent patches. A compact representation based on a 2D discrete cosine transform leads to reasonable speech/background discrimination.
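As a rough illustration of the compact patch representation mentioned in the abstract, the sketch below cuts a rectangular spectro-temporal patch from a spectrogram, applies a 2D discrete cosine transform, and keeps only the low-order coefficients as a feature vector. It is not the authors' implementation. The function names, the orthonormal DCT-II scaling, the toy spectrogram and the number of retained coefficients are all assumptions made for the example. The abstract does not state the filterbank size or frame shift, so the 4-channel by 4-frame patch is purely illustrative. In a full system, feature vectors of this kind would be the input to the Gaussian mixture models.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence (scaling is an assumption)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(patch):
    """Separable 2D DCT: transform each row (time), then each column (frequency)."""
    rows = [dct_1d(r) for r in patch]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to freq x time


def extract_patch(spec, f0, t0, n_freq, n_frames):
    """Cut an n_freq x n_frames patch whose corner is channel f0, frame t0."""
    return [row[t0:t0 + n_frames] for row in spec[f0:f0 + n_freq]]


def compact_features(patch, keep_freq, keep_time):
    """Keep only the lowest keep_freq x keep_time DCT coefficients."""
    coeffs = dct_2d(patch)
    return [coeffs[i][j] for i in range(keep_freq) for j in range(keep_time)]


# Toy spectrogram: 8 frequency channels x 10 frames (hypothetical values).
spec = [[float(f + t) for t in range(10)] for f in range(8)]
patch = extract_patch(spec, f0=2, t0=3, n_freq=4, n_frames=4)
features = compact_features(patch, keep_freq=2, keep_time=2)
```

Truncating to the low-order coefficients keeps the smooth spectro-temporal shape of the patch and discards fine detail, which is one plausible reading of "compact representation" here.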

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
ID Code: 3808
Deposited By: Neil Lawrence
Deposited On: 25 February 2008