Estimating VTLN warping factors by distribution matching
In: Interspeech 2007, 27-31 Aug 2007, Antwerp, Belgium.
Several methods exist for estimating the warping factors for vocal
tract length normalization (VTLN), most of which rely on an exhaustive
search over the warping factors to maximize the likelihood of the
adaptation data. This paper presents a method for warping factor
estimation that is based on matching Gaussian distributions by
Kullback-Leibler divergence. It is computationally more efficient than
most maximum likelihood methods and, more importantly, it allows
speaker normalization to be incorporated very early in the training
process, which can greatly simplify and speed up training. The
estimation method is compared to the baseline maximum likelihood
method in three large vocabulary continuous speech recognition
tasks. The results confirm that the method performs well in a variety
of tasks and configurations.
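
To make the idea of distribution matching concrete, the following is a minimal sketch and not the paper's algorithm. It assumes diagonal-covariance Gaussians, a hypothetical small set of candidate warping factors whose warped-feature statistics are already available, and the divergence direction KL(speaker || reference); the abstract specifies none of these. The names `kl_diag_gauss` and `pick_warping_factor` are illustrative only.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians.

    Each argument is a sequence with one entry per feature dimension.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def pick_warping_factor(candidates, reference):
    """Return the candidate factor whose Gaussian is closest to the reference.

    candidates: dict mapping a warping factor to the (mean, var) statistics
                of the speaker's features warped with that factor
                (assumed to be precomputed; hypothetical input format).
    reference:  (mean, var) of the target Gaussian.
    """
    ref_mean, ref_var = reference
    return min(
        candidates,
        key=lambda alpha: kl_diag_gauss(*candidates[alpha], ref_mean, ref_var),
    )


# Toy one-dimensional statistics, invented purely for illustration.
toy_candidates = {
    0.9: ([1.0], [1.0]),
    1.0: ([0.1], [1.0]),
    1.1: ([-0.8], [2.0]),
}
toy_reference = ([0.0], [1.0])
best = pick_warping_factor(toy_candidates, toy_reference)
```

In this toy setup the selection needs only Gaussian statistics rather than a likelihood evaluation over the adaptation data, which is one plausible reading of why such matching could be cheaper than a maximum likelihood search.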