PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Growing an n-gram model
Vesa Siivola and Bryan L. Pellom
In: Interspeech 2005 (2005).


Traditionally, when building an n-gram model, we decide the span of the model history, collect the relevant statistics, and estimate the model. The model can then be pruned to a smaller size by manipulating the statistics or the estimated model. This paper shows how an n-gram model can be built by adding suitable sets of n-grams to a unigram model until the desired complexity is reached. Very high-order n-grams can be used in the model, since the proposed technique eliminates the need to handle the full unpruned model. We compare our growing method to entropy-based pruning. In Finnish speech recognition tests, the models trained by the growing method outperform the entropy-pruned models of similar size.
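
The growing idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how "suitable sets of n-grams" are chosen, so everything specific here is an assumption made for illustration, not the authors' method: a "set" is taken to be all n-grams sharing one history, a set is accepted when its training-data log-likelihood gain over the shorter history exceeds a fixed cost per added n-gram, and probabilities are unsmoothed maximum-likelihood estimates. The names grow_ngram_model, model_size, cost_per_ngram and max_ngrams are hypothetical.

```python
import math
from collections import Counter, defaultdict


def model_size(model):
    """Number of stored n-grams (one per history/word pair)."""
    return sum(len(counts) for counts in model.values())


def grow_ngram_model(tokens, max_order=3, cost_per_ngram=1.0, max_ngrams=None):
    """Grow a model from unigrams by adding per-history n-gram sets.

    Assumed criterion (not from the paper): accept a set when its
    log-likelihood gain exceeds cost_per_ngram times the set size.
    """
    # The model maps a history tuple to next-word counts; () is the unigram level.
    model = {(): Counter(tokens)}
    for order in range(2, max_order + 1):
        # Count the words that follow every history of length order - 1.
        candidates = defaultdict(Counter)
        for i in range(order - 1, len(tokens)):
            history = tuple(tokens[i - order + 1:i])
            candidates[history][tokens[i]] += 1

        added = False
        for history, counts in candidates.items():
            lower = model.get(history[1:])
            if lower is None:
                continue  # grow only from histories already in the model
            total = sum(counts.values())
            lower_total = sum(lower.values())
            # Gain of the longer history over backing off to the shorter one.
            gain = sum(
                c * (math.log(c / total) - math.log(lower[w] / lower_total))
                for w, c in counts.items()
            )
            if gain <= cost_per_ngram * len(counts):
                continue
            if max_ngrams is not None and model_size(model) + len(counts) > max_ngrams:
                return model  # desired complexity reached
            model[history] = counts
            added = True
        if not added:
            break  # nothing worth adding at this order, so stop growing
    return model


if __name__ == "__main__":
    corpus = "a b a b a b a b c a b".split()
    grown = grow_ngram_model(corpus, max_order=3, cost_per_ngram=0.1)
    print(model_size(grown), sorted(grown))
```

Because higher orders are collected one order at a time and only for histories whose shorter history was accepted, the sketch never stores a full unpruned high-order model, which is the property the abstract emphasizes. A usable model would also need smoothing and a principled selection criterion.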

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 1772
Deposited By: Vesa Siivola
Deposited On: 28 November 2005