High-Dimensional Linear Representations for Robust Speech Recognition
Matthew Ager, Zoran Cvetkovic and Peter Sollich
In: Information Theory and Applications Workshop, 31 Jan - 05 Feb 2010, San Diego, USA.
Phoneme classification is investigated in linear feature domains with the aim of improving the robustness to additive noise. Linear feature domains allow for exact noise adaptation and so should result in more accurate classification than representations involving non-linear processing and dimensionality reduction. We develop a generative framework for phoneme classification using linear features. We first show results for a representation consisting of concatenated frames from the centre of the phoneme, each containing f frames. As no single f is optimal for all phonemes, we further average over models with a range of values of f. Next we improve results by including information from the entire phoneme. In the presence of additive noise, classification in this framework performs better than an analogous PLP classifier, adapted to noise using cepstral mean and variance normalisation, below 18 dB SNR.
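
The claim that linear feature domains allow exact noise adaptation can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, for illustration only, that each phoneme class is modelled by a single Gaussian over a linear feature vector and that the noise is zero-mean, white, Gaussian, and independent of the speech. Under those assumptions the noisy features are again Gaussian, with the noise variance added to the diagonal of the clean covariance. The function names and toy values below are hypothetical.

```python
import numpy as np


def adapt_gaussian_to_noise(mean, cov, noise_var):
    """Adapt a clean-speech Gaussian to additive noise.

    Assumption (not from the abstract): the noise is zero-mean, white,
    Gaussian, and independent of the speech, so the noisy feature is
    distributed as N(mean, cov + noise_var * I).
    """
    dim = mean.shape[0]
    return mean, cov + noise_var * np.eye(dim)


def gaussian_log_likelihood(x, mean, cov):
    """Log-density of a multivariate Gaussian evaluated at x."""
    dim = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (dim * np.log(2.0 * np.pi) + logdet + mahalanobis)


# Hypothetical toy class model in a 3-dimensional linear feature domain.
clean_mean = np.zeros(3)
clean_cov = np.diag([1.0, 2.0, 3.0])

# Adapt the model to noise of variance 0.5, then score a noisy observation.
noisy_mean, noisy_cov = adapt_gaussian_to_noise(clean_mean, clean_cov, noise_var=0.5)
score = gaussian_log_likelihood(np.zeros(3), noisy_mean, noisy_cov)
```

No comparably simple closed-form update exists once the features have passed through a non-linear transform such as a logarithm, which is the contrast the abstract draws with PLP-style representations.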
EPrint Type: Conference or Workshop Item (Talk)
Project Keyword: Unspecified
Deposited By: Matthew Ager
Deposited On: 08 March 2010