PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

On the Potential for Robust ASR with Combined Subband-Waveform and Cepstral Features
J K Yousafzai, Z Cvetkovic and P Sollich
In: IEEE International Symposium on Information Theory 2011 (2011).

Abstract

This work explores the potential for robust classification of phonemes in the presence of additive noise and linear filtering using high-dimensional features in the subbands of acoustic waveforms. The proposed technique is compared with state-of-the-art automatic speech recognition (ASR) front-ends on the TIMIT phoneme classification task using support vector machines (SVMs). Two key issues are addressed: selecting appropriate SVM kernels for classification in frequency subbands, and combining the individual subband classifiers using ensemble methods. Experiments demonstrate the benefits of classification in the subbands of acoustic waveforms: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for all signal-to-noise ratios (SNRs) below a crossover point between 12 dB and 6 dB. Combining the subband-waveform and cepstral classifiers achieves further performance improvements over both individual classifiers.
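The two combination steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which ensemble rule is used, so the weighted averaging of per-class scores, the convex mixing parameter `lam`, and all function names below are assumptions made only for illustration. Each classifier is assumed to output one score per phoneme class.

```python
def combine_subbands(subband_scores, weights=None):
    """Weighted average of per-class scores across subband classifiers.

    subband_scores: list of score lists, one list per subband classifier.
    weights: optional per-subband weights (hypothetical; uniform if omitted).
    """
    n_bands = len(subband_scores)
    n_classes = len(subband_scores[0])
    if weights is None:
        weights = [1.0 / n_bands] * n_bands
    total = sum(weights)
    return [
        sum(w * scores[c] for w, scores in zip(weights, subband_scores)) / total
        for c in range(n_classes)
    ]


def combine_with_cepstral(waveform_scores, cepstral_scores, lam):
    """Convex combination of two classifiers' per-class scores.

    lam = 1.0 uses only the subband-waveform scores, lam = 0.0 only the
    cepstral scores. The choice of lam is an assumption, not from the source.
    """
    return [
        lam * w + (1.0 - lam) * c
        for w, c in zip(waveform_scores, cepstral_scores)
    ]


def predict(scores):
    """Index of the class with the highest combined score."""
    return max(range(len(scores)), key=lambda c: scores[c])


# Toy scores for three subband classifiers over three classes (made-up values).
subband_scores = [
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.8, 0.1],
]
cepstral_scores = [0.6, 0.3, 0.1]

waveform_scores = combine_subbands(subband_scores)
combined = combine_with_cepstral(waveform_scores, cepstral_scores, lam=0.5)
label = predict(combined)
```

In such a scheme, `lam` would plausibly be chosen per noise condition, leaning toward the subband-waveform scores at low SNR, where the abstract reports they outperform the cepstral front-end.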

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation; Speech
ID Code: 7688
Deposited By: Jibran Yousafzai
Deposited On: 17 March 2011