Comparison of Subspace Methods for Gaussian Mixture Models in Speech Recognition
Matti Varjokallio and Mikko Kurimo
In: Interspeech 2007, 27-31 Aug 2007, Antwerp, Belgium.
Speech recognizers typically use high-dimensional feature vectors
to capture the essential cues for speech recognition purposes.
The acoustics are then commonly modeled with a Hidden
Markov Model with Gaussian Mixture Models as observation
probability density functions. Using unrestricted Gaussian
parameters can lead to intolerable model costs, both in
evaluation and in storage, which limits their practical use
to high-end systems. The classical approach to tackling
these problems is to assume independent features and constrain
the covariance matrices to be diagonal. This can be
thought of as constraining the second-order parameters to lie in a
fixed subspace consisting of rank-1 terms. In this paper we discuss
the differences between recently proposed subspace methods
for GMMs with emphasis placed on the applicability of the
models to a practical LVCSR system.
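As a minimal sketch of the cost argument above (not taken from the paper; the feature dimension 39 is an assumed, typical MFCC-plus-derivatives front-end size), the diagonal constraint replaces the d(d+1)/2 free covariance parameters of a full-covariance Gaussian with d variances, and the log-density then factorizes into d univariate terms, making evaluation O(d) rather than O(d^2):

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance.

    With independent features the d-dimensional density factorizes
    into d univariate Gaussians, so evaluation costs O(d) per mixture
    component instead of the O(d^2) quadratic form needed for a full
    covariance matrix.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

# Parameter counts per Gaussian component (illustrative dimension).
d = 39                               # assumed feature dimension
full_cov_params = d * (d + 1) // 2   # symmetric full covariance: 780
diag_cov_params = d                  # diagonal covariance: 39
```

The rank-1 view mentioned in the abstract corresponds to writing the diagonal (inverse) covariance as a fixed linear combination of the d rank-1 basis matrices e_i e_i^T; the subspace methods compared in the paper generalize this by learning the basis instead of fixing it.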