LDA based feature estimation methods for LVCSR
In: Interspeech 2006, 17-21 Sep 2006, Pittsburgh PA, USA.
Features that model temporal aspects of phonemes are important
in speech recognition. One method is to use linear discriminant analysis
(LDA) to find discriminative features from a spectro-temporal input
formed by concatenating consecutive frames of short-time spectrum
features. Other approaches use, for example, neural networks to process
longer-span spectral segments to improve recognition accuracy. Still,
the most widely used method for including temporal cues is to augment
the short-time spectral features with simple time derivatives.
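
The LDA approach described above, in which consecutive frames are concatenated and a discriminative projection is then estimated, can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: it assumes NumPy, per-frame class labels (for example from a forced alignment), and a standard multi-class LDA solved by whitening the within-class scatter. The function names `splice_frames` and `lda_projection`, the edge padding, and the small ridge term are assumptions made for this example.

```python
import numpy as np


def splice_frames(feats, context):
    """Concatenate each frame with `context` neighbours on either side.

    feats: array of shape (T, d). Edge frames are padded by repetition
    (an assumption of this sketch). Returns shape (T, (2*context+1)*d).
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + num_frames] for i in range(2 * context + 1)])


def lda_projection(X, labels, n_components):
    """Estimate an LDA projection matrix of shape (dim, n_components).

    Solves Sb w = lambda * Sw w, where Sw and Sb are the within-class
    and between-class scatter matrices of the labelled frames.
    """
    dim = X.shape[1]
    global_mean = X.mean(axis=0)
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    for c in np.unique(labels):
        Xc = X[labels == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        diff = (class_mean - global_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)

    # Small ridge (assumed) keeps Sw positive definite for the Cholesky step.
    Sw += 1e-6 * (np.trace(Sw) / dim) * np.eye(dim)

    # Whiten Sw so the problem becomes an ordinary symmetric eigenproblem.
    L = np.linalg.cholesky(Sw)
    L_inv = np.linalg.inv(L)
    eigvals, eigvecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)

    # Keep the directions with the largest between/within scatter ratio.
    order = np.argsort(eigvals)[::-1][:n_components]
    return L_inv.T @ eigvecs[:, order]
```

Given a matrix `feats` of short-time spectral features and a matching label vector, `splice_frames(feats, 2) @ lda_projection(splice_frames(feats, 2), labels, k)` would yield `k`-dimensional features per frame. With C classes, at most C - 1 directions carry discriminative information.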
In this paper, a new feature estimation method based on pairwise
linear discriminants is presented. We compare it and some of
its variants to traditional MFCC features and to LDA-estimated
features in a large vocabulary continuous speech recognition (LVCSR)
task. The features obtained with the new estimation method show
significant improvements in recognition accuracy over MFCC and