Methods for combining language models in speech recognition

Abstract

Statistical language models play a vital part in contemporary speech recognition systems, where their task is to score word hypotheses based on linguistic knowledge. Many language models have been presented in the literature, and the best results have been achieved when different language models are used together. Several combination methods have been proposed, but they have not been investigated thoroughly. This work studies combination methods that have been used with language models and presents a new approach based on estimating the likelihood density function with histograms. In addition to the theoretical treatment, four combination methods are evaluated in speech recognition experiments and in perplexity calculations, which measure the quality of the language models. The test data consist of Finnish news articles. Four language models, described in the work, serve as the component models. In the perplexity experiments, all combination methods produced a statistically significant improvement over the 4-gram model that served as the baseline. The best result, a 46% improvement over the 4-gram model, was achieved by combining several language models with the new bin estimation method. In the speech recognition experiments, a 4% reduction in word error and a 7% reduction in phoneme error were achieved.