Growing an n-gram model
Vesa Siivola and Bryan L. Pellom
In: Interspeech 2005 (2005).
Traditionally, when building an n-gram model, we decide the span of
the model history, collect the relevant statistics, and estimate the
model. The model can then be pruned down to a smaller size by
manipulating the statistics or the estimated model. This paper
shows how an n-gram model can instead be built by adding suitable
sets of n-grams to a unigram model until the desired complexity is
reached. Very high-order n-grams can be used in the model, since the
proposed technique eliminates the need to handle the full unpruned
model. We compare our growing method to entropy-based pruning. In
Finnish speech recognition tests, the models trained by the growing
method outperform entropy-pruned models of a similar size.
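The growing idea lends itself to a compact sketch. The Python below is a
minimal illustration, not the authors' implementation: starting from a
unigram base, it grows context distributions one order at a time and
accepts a set of n-grams only when its training log-likelihood gain pays
for the added parameters. The penalty parameter and the MDL-flavoured
acceptance rule are assumptions standing in for the paper's actual
selection criterion, and smoothing is omitted entirely.

    from collections import Counter, defaultdict
    import math

    def grow_ngram_model(corpus, max_order=4, penalty=3.0):
        """Grow an n-gram count model upward from unigrams.

        `corpus` is a list of token lists. `penalty` is a hypothetical
        per-parameter cost trading model size against training
        likelihood; the paper's criterion and smoothing are more refined.
        """
        # Start from the unigram model: the empty context predicts
        # every word in the corpus.
        counts = {(): Counter(w for sent in corpus for w in sent)}

        for order in range(2, max_order + 1):
            # Collect counts only for contexts whose backoff context
            # (the last order-2 words) was already accepted, so the
            # candidate set never covers the full unpruned model.
            cand = defaultdict(Counter)
            for sent in corpus:
                for i in range(len(sent) - order + 1):
                    ctx = tuple(sent[i:i + order - 1])
                    if ctx[1:] in counts:
                        cand[ctx][sent[i + order - 1]] += 1

            for ctx, dist in cand.items():
                # Log-likelihood of the context's data under its own
                # ML distribution versus under the backoff distribution.
                n = sum(dist.values())
                own = sum(c * math.log(c / n) for c in dist.values())
                back = counts[ctx[1:]]
                bn = sum(back.values())
                backoff = sum(c * math.log(back[w] / bn)
                              for w, c in dist.items())
                # Accept this set of n-grams only if the likelihood
                # gain pays for the extra parameters.
                if own - backoff > penalty * len(dist):
                    counts[ctx] = dist
        return counts

Growing only contexts whose backoff context was already accepted keeps
the candidate set small at every order, which is what allows very high
orders without ever materializing the full unpruned model.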