PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A Word-Based Naïve Bayes Classifier for Confidence Estimation in Speech Recognition
Alberto Sanchis, Alfons Juan and Enrique Vidal
IEEE Transactions on Audio, Speech, and Language Processing, Volume 20, Number 2, pp. 565-574, 2012. ISSN 1558-7916


Confidence estimation has been widely used in speech recognition to detect words in the recognized sentence that are likely to have been misrecognized. It can be seen as a conventional pattern classification problem in which a set of features is obtained for each hypothesized word in order to classify it as either correct or incorrect. We propose a smoothed naïve Bayes classification model to profitably combine these features. The model itself is a combination of word-dependent (specific) and word-independent (generalized) naïve Bayes models. As in statistical language modeling, the purpose of the generalized model is to smooth the (class posterior) estimates given by the specific models. Our classification model is empirically compared with confidence estimation based on posterior probabilities computed on word graphs. Empirical results clearly show that the already good performance of word graph-based posterior probabilities can be improved by using the naïve Bayes combination of features.
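
The combination of specific and generalized models described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete feature values, add-alpha smoothing of the counts, and a fixed linear interpolation weight `lam` between the word-dependent and word-independent class posteriors, whereas the paper's actual smoothing scheme may differ. The class name `SmoothedNaiveBayes` and all parameter names are hypothetical.

```python
class SmoothedNaiveBayes:
    """Illustrative word-based naive Bayes confidence estimator.

    Assumptions (not taken from the paper): discrete features with
    `n_values` possible values each, add-alpha count smoothing, and
    linear interpolation with a fixed weight `lam`.
    Classes: 0 = incorrect word, 1 = correct word.
    """

    def __init__(self, lam=0.5, alpha=1.0, n_values=2):
        self.lam = lam            # weight of the word-dependent model
        self.alpha = alpha        # add-alpha smoothing constant
        self.n_values = n_values  # number of values a feature can take
        self.class_counts = {}    # key -> [count of class 0, count of class 1]
        self.feat_counts = {}     # key -> {(cls, feature index, value): count}

    def fit(self, samples):
        """samples: iterable of (word, feature tuple, class label)."""
        for word, feats, cls in samples:
            # Update the word-dependent model and the word-independent
            # model, which is stored under the key None.
            for key in (word, None):
                self.class_counts.setdefault(key, [0, 0])[cls] += 1
                table = self.feat_counts.setdefault(key, {})
                for d, v in enumerate(feats):
                    table[(cls, d, v)] = table.get((cls, d, v), 0) + 1
        return self

    def _posterior(self, key, feats):
        """P(correct | feats) under the naive Bayes model stored at `key`."""
        n = self.class_counts.get(key, [0, 0])
        table = self.feat_counts.get(key, {})
        total = n[0] + n[1]
        scores = []
        for cls in (0, 1):
            # Smoothed class prior times smoothed feature likelihoods.
            p = (n[cls] + self.alpha) / (total + 2 * self.alpha)
            for d, v in enumerate(feats):
                count = table.get((cls, d, v), 0)
                p *= (count + self.alpha) / (n[cls] + self.alpha * self.n_values)
            scores.append(p)
        return scores[1] / (scores[0] + scores[1])

    def confidence(self, word, feats):
        """Interpolated posterior probability that `word` is correct."""
        general = self._posterior(None, feats)
        if word not in self.class_counts:
            # Unseen word: fall back to the generalized model alone.
            return general
        specific = self._posterior(word, feats)
        return self.lam * specific + (1.0 - self.lam) * general
```

A hypothesized word would then be tagged as incorrect when `confidence(word, feats)` falls below a chosen decision threshold. For words with few training samples the specific posterior is unreliable, which is why the generalized estimate is mixed in.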

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 9231
Deposited By: Alfons Juan
Deposited On: 21 February 2012