PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Mask estimation and sparse imputation for missing data speech recognition in multisource reverberant environments
Heikki Kallasjoki, Sami Keronen, Guy J. Brown, Jort F. Gemmeke, Ulpu Remes and Kalle Palomäki
In: CHiME 2011 Workshop on Machine Listening in Multisource Environments (2011).


This work presents an automatic speech recognition system that uses a missing data approach to compensate for environmental noise. The missing, noise-corrupted components are identified using either binaural features or a support vector machine (SVM) classifier. To perform speech recognition on the partially observed data, the missing components are replaced with clean speech estimates calculated using sparse imputation. Evaluated on the CHiME reverberant multisource environment corpus, the missing data approach significantly improved keyword recognition accuracy in moderate and poor SNR conditions. The best results were achieved when the missing components were identified using the binaural features and the clean speech estimates were associated with observation uncertainty estimates.
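
The imputation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name sparse_impute, the parameters lam and n_iter, the random toy dictionary, and the use of a plain ISTA solver for the L1-regularised fit are all assumptions made for illustration. The idea shown is that a sparse combination of clean speech dictionary atoms is fitted to the reliable components only, and the same combination is then used to fill in the components the mask marks as missing.

```python
import numpy as np

def sparse_impute(y, mask, D, lam=0.01, n_iter=500):
    """Fill unreliable entries of y from a sparse combination of atoms in D.

    y    : observed (noisy) feature vector, shape (n_features,)
    mask : 1 where a component is reliable, 0 where it is missing
    D    : dictionary of clean speech exemplars, shape (n_features, n_atoms)
    lam, n_iter : hypothetical solver settings, not taken from the paper
    """
    reliable = np.asarray(mask).astype(bool)
    Dr = D[reliable]   # dictionary rows for the reliable components
    yr = y[reliable]   # reliable observations

    # ISTA for: min_x 0.5 * ||Dr x - yr||^2 + lam * ||x||_1
    # (a stand-in for whichever L1 solver the original system used)
    step = 1.0 / (np.linalg.norm(Dr, 2) ** 2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (Dr.T @ (Dr @ x - yr))
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

    # Keep reliable components as observed, impute the rest from D @ x.
    estimate = D @ x
    return np.where(reliable, y, estimate), x

# Toy usage with made-up data: one atom plus noise on two masked components.
rng = np.random.default_rng(0)
D = rng.random((8, 5))
clean = D[:, 2].copy()
mask = np.array([1, 1, 1, 0, 1, 1, 0, 1])
noisy = clean.copy()
noisy[mask == 0] += 1.0            # corrupt the components marked missing
imputed, coeffs = sparse_impute(noisy, mask, D)
print(np.round(imputed, 3))
```

In the system described above the mask would come from the binaural features or the SVM classifier; here it is simply given by hand.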

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: Project Keyword UNSPECIFIED
ID Code: 8943
Deposited By: Kalle Palomäki
Deposited On: 21 February 2012