PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A High-Dimensional Subband Speech Representation and SVM Framework for Robust Speech Recognition
J K Yousafzai, Z Cvetkovic, P Sollich and M Ager
IEEE Transactions on Signal Processing 2011.


This work proposes a novel, robust front-end for automatic speech recognition (ASR), based on support vector machines (SVMs), that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. Two key issues are addressed: selecting appropriate SVM kernels for classification in frequency subbands, and combining the individual subband classifiers using ensemble methods. The proposed front-end is compared with state-of-the-art ASR front-ends in terms of robustness to additive noise and linear filtering. Experiments on the TIMIT phoneme classification task demonstrate the benefits of the proposed subband-based SVM representation: it outperforms the standard cepstral front-end in the presence of noise and linear filtering at signal-to-noise ratios (SNRs) below 12 dB. Combining the proposed front-end with a conventional representation such as mel-frequency cepstral coefficients (MFCC) yields further improvements over the individual front-ends across the full range of noise levels.
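The robustness results above are stated in terms of SNR under additive noise. As a minimal sketch of what a given SNR means for a test signal, the snippet below scales a noise sequence so that the mixture reaches a target SNR in dB. The function names `mix_at_snr` and `measure_snr_db` are hypothetical, and the toy sine and Gaussian sequences are illustrative only; the abstract does not describe the noise-mixing procedure actually used in the paper.

```python
import math
import random


def measure_snr_db(signal, noise):
    """SNR in dB: 10 * log10(mean signal power / mean noise power)."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def mix_at_snr(signal, noise, target_snr_db):
    """Scale `noise` and add it to `signal` so the mixture has the target SNR.

    Returns (mixture, gain). Illustrative helper, not the authors' code.
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_signal == 0 or p_noise == 0:
        raise ValueError("signal and noise must both have non-zero power")
    # Solve target = 10*log10(p_signal / (gain**2 * p_noise)) for gain.
    gain = math.sqrt(p_signal / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    mixture = [s + gain * n for s, n in zip(signal, noise)]
    return mixture, gain


# Toy stand-ins for a speech waveform and a noise recording (assumptions).
rng = random.Random(0)
clean = [math.sin(0.1 * i) for i in range(1000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
noisy, gain = mix_at_snr(clean, noise, 12.0)
```

Measuring the SNR of `clean` against the scaled noise `[gain * n for n in noise]` with `measure_snr_db` recovers the 12 dB target up to floating-point error.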

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation
ID Code: 7686
Deposited By: Jibran Yousafzai
Deposited On: 17 March 2011