PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Comparison of Subspace Methods for Gaussian Mixture Models in Speech Recognition
Matti Varjokallio and Mikko Kurimo
In: Interspeech 2007, 27-31 Aug 2007, Antwerp, Belgium.

Abstract

Speech recognizers typically use high-dimensional feature vectors to capture the essential cues for speech recognition. The acoustics are then commonly modeled with a Hidden Markov Model, with Gaussian Mixture Models as the observation probability density functions. Using unrestricted Gaussian parameters may lead to intolerable model costs, both in evaluation and in storage, which limits their practical use to some high-end systems. The classical approach to tackling these problems is to assume independent features and constrain the covariance matrices to be diagonal. This can be thought of as constraining the second-order parameters to lie in a fixed subspace consisting of rank-1 terms. In this paper we discuss the differences between recently proposed subspace methods for GMMs, with emphasis placed on the applicability of the models to a practical LVCSR system.
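As an illustration of the subspace view mentioned in the abstract, the sketch below (not from the paper; the dimension, coefficients, and data are arbitrary assumptions) shows how a diagonal precision matrix can be written as a linear combination of fixed rank-1 basis terms e_i e_i^T, so the second-order parameters of each Gaussian lie in a d-dimensional subspace of symmetric matrices.

```python
import numpy as np

# Illustrative sketch: a diagonal precision matrix expressed as a
# combination of fixed rank-1 basis terms S_i = e_i e_i^T.
d = 5
rng = np.random.default_rng(0)

# Fixed rank-1 basis built from the standard basis vectors e_i.
basis = [np.outer(e, e) for e in np.eye(d)]

# Per-Gaussian coefficients (inverse variances) chosen in this subspace.
coeffs = rng.uniform(0.5, 2.0, size=d)

# Precision matrix as a linear combination of the rank-1 basis terms.
precision = sum(c * S for c, S in zip(coeffs, basis))
assert np.allclose(precision, np.diag(coeffs))  # exactly diagonal

# Log-density of a zero-mean Gaussian with this precision; the diagonal
# structure makes both the determinant and the quadratic form cheap.
x = rng.standard_normal(d)
log_det = np.sum(np.log(coeffs))
quad = x @ precision @ x
log_pdf = 0.5 * (log_det - d * np.log(2 * np.pi) - quad)
print(log_pdf)
```

The subspace methods compared in the paper generalize this idea by allowing basis terms other than the fixed rank-1 matrices of the diagonal case.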

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Natural Language Processing; Speech
ID Code: 3716
Deposited By: Mikko Kurimo
Deposited On: 14 February 2008