Mobile Texting: Can Post-ASR Correction Solve the Issues? An Experimental Study on Gain vs. Costs
Michael Feld, Saaedeh Momtazi, Farina Freigang, Dietrich Klakow and Christian Müller
In: IUI 2012, 14 Feb - 17 Feb 2012, Lisbon, Portugal.
The next big step in embedded, mobile speech recognition
will be to allow completely free input, as is needed for messaging
such as SMS or email. However, unconstrained dictation
remains error-prone, especially when the environment is
noisy. In this paper, we compare different methods for improving
a given free-text dictation system used to enter text-based
messages in embedded mobile scenarios, where distraction,
interaction cost, and hardware limitations impose
stricter constraints than in traditional scenarios. We present a
corpus-based evaluation that measures the trade-off between
word error rate improvement and the interaction steps
required under various parameter settings. Results show that
by post-processing the output of a “black box” speech recognizer
(e.g. a web-based speech recognition service), a relative reduction
of word error rate by 55% (10.3% absolute) can be obtained.
For further error reduction, however, a richer representation
of the original hypotheses (e.g. a lattice) is necessary.
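
To make the reported metric concrete, the following minimal Python sketch computes word error rate as word-level edit distance divided by reference length, and relates the relative and absolute reductions quoted above. The function names and the example sentences are hypothetical and are not taken from the paper. The baseline figure is not stated in the abstract; it is only what the two quoted numbers imply arithmetically (10.3 / 0.55, roughly 18.7%), and rounding in the quoted values makes it approximate.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def implied_baseline(relative_reduction: float, absolute_reduction: float) -> float:
    """Baseline WER implied by a relative and an absolute reduction.

    Since absolute = relative * baseline, baseline = absolute / relative.
    """
    return absolute_reduction / relative_reduction


if __name__ == "__main__":
    # Hypothetical dictation example: one substituted word out of four.
    print(word_error_rate("send the message now", "send a message now"))

    # Figures quoted in the abstract: 55% relative, 10.3 points absolute.
    baseline = implied_baseline(0.55, 10.3)
    print(round(baseline, 1), round(baseline - 10.3, 1))
```

Under these assumptions, the quoted figures correspond to a word error rate falling from roughly 18.7% to roughly 8.4% after post-processing.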