PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

High-Dimensional Linear Representations for Robust Speech Recognition
Matthew Ager, Zoran Cvetkovic and Peter Sollich
In: Information Theory and Applications Workshop, 31 Jan - 05 Feb 2010, San Diego, USA.

Abstract

Phoneme classification is investigated in linear feature domains with the aim of improving robustness to additive noise. Linear feature domains allow for exact noise adaptation and so should result in more accurate classification than representations involving non-linear processing and dimensionality reduction. We develop a generative framework for phoneme classification using linear features. We first show results for representations consisting of f concatenated frames taken from the centre of the phoneme. As no single f is optimal for all phonemes, we further average over models with a range of values of f. Next, we improve results by including information from the entire phoneme. In the presence of additive noise, classification in this framework performs better than an analogous PLP classifier, adapted to noise using cepstral mean and variance normalisation, at SNRs below 18 dB.
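
The claim that linear feature domains allow exact noise adaptation can be illustrated with a minimal sketch. It is not taken from the paper: it assumes each phoneme class is modelled by a single Gaussian over the linear features and that the noise is white, Gaussian and independent of the speech. Under those assumptions the noisy-speech model is obtained simply by adding the noise variance to the diagonal of the clean covariance. The function names (adapt_to_noise, log_gaussian, classify) and the toy class models are hypothetical.

```python
import numpy as np


def adapt_to_noise(mean, cov, noise_var):
    """Turn a clean-speech Gaussian into a noisy-speech Gaussian.

    Assumption: x ~ N(mean, cov) and n ~ N(0, noise_var * I) are independent,
    so y = x + n ~ N(mean, cov + noise_var * I) holds exactly.
    """
    return mean, cov + noise_var * np.eye(len(mean))


def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + maha)


def classify(x, class_models, noise_var):
    """Pick the class whose noise-adapted Gaussian gives x the highest likelihood."""
    scores = {
        label: log_gaussian(x, *adapt_to_noise(mean, cov, noise_var))
        for label, (mean, cov) in class_models.items()
    }
    return max(scores, key=scores.get)


# Toy two-class example in two dimensions (illustrative values only).
toy_models = {
    "a": (np.array([0.0, 0.0]), np.eye(2)),
    "b": (np.array([3.0, 3.0]), np.eye(2)),
}
predicted = classify(np.array([0.2, -0.1]), toy_models, noise_var=0.5)
```

In a non-linear domain such as cepstral features, additive noise no longer combines with the speech by simple addition, so no comparably exact update of the model parameters is available; that contrast is what the sketch is meant to show.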

EPrint Type: Conference or Workshop Item (Talk)
Subjects: Speech
ID Code: 6107
Deposited By: Matthew Ager
Deposited On: 08 March 2010