Combined Waveform-Cepstral Representation for Robust Speech Recognition
Matthew Ager, Zoran Cvetkovic and Peter Sollich
In: 2011 IEEE International Symposium on Information Theory, 31 July - 5 Aug 2011, Russia.
High-dimensional acoustic waveform representations are studied as a front-end for noise-robust automatic speech recognition using generative methods, in particular Gaussian mixture models and hidden Markov models. The proposed representations are compared with standard cepstral features on phoneme classification and recognition tasks. While cepstral features achieve lower error rates at very low noise levels, the models using acoustic waveform representations are more robust to additive noise. A combination of acoustic waveforms and cepstral features achieves better results than either individual representation across all noise levels.
|EPrint Type:||Conference or Workshop Item (Paper)|
|Deposited By:||Matthew Ager|
|Deposited On:||17 March 2011|