Speech transcription and spoken document retrieval in Finnish
Mikko Kurimo, Ville Turunen and Inger Ekman
MLMI'04: Proceedings of the Workshop on Machine Learning for Multimodal Interaction
Lecture Notes in Computer Science
This paper presents a baseline spoken document retrieval system in Finnish
that is based on unlimited vocabulary continuous speech recognition.
Due to its agglutinative structure, Finnish speech cannot be adequately
transcribed using the standard large vocabulary continuous speech
recognition approach.
The definition of a sufficient lexicon and the training of the statistical
language models are difficult, because the words appear transformed
by many inflections and compounds.
In this work we apply a recently developed language model that enables
n-gram modeling over morpheme-like subword units
discovered in an unsupervised manner.
In addition to word-based indexing, we also propose an indexing based on
the subword units provided directly by our speech recognizer,
and a combination of both.
In an initial evaluation on Finnish news reading,
we obtained a fairly low recognition error rate and
average document retrieval precisions close to what can be obtained
from human reference transcripts.
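
The three indexing options named above (words, subword units, and their
combination) can be illustrated with a minimal sketch. Everything in it is
an assumption made for illustration: the tf-idf scoring, the linear
interpolation weight, the function names, and the hand-made segmentation of
the toy Finnish words are hypothetical and are not taken from the system
described in this paper.

```python
from collections import Counter, defaultdict
import math


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}.

    docs maps a document id to its list of index terms, which may be
    whole words or subword units.
    """
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term, tf in Counter(terms).items():
            index[term][doc_id] = tf
    return index


def score(index, n_docs, query_terms):
    """Score documents with a simple tf-idf sum (an assumed weighting)."""
    scores = defaultdict(float)
    for term in query_terms:
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return scores


def combined_search(word_index, morph_index, n_docs,
                    query_words, query_morphs, weight=0.5):
    """Rank documents by a linear mix of word and subword scores.

    weight is a hypothetical interpolation parameter, not a value
    reported in the paper.
    """
    w = score(word_index, n_docs, query_words)
    m = score(morph_index, n_docs, query_morphs)

    def mixed(doc_id):
        return weight * w.get(doc_id, 0.0) + (1 - weight) * m.get(doc_id, 0.0)

    # Sort by descending mixed score, breaking ties by document id.
    return sorted(set(w) | set(m), key=lambda d: (-mixed(d), d))


# Toy transcripts: "talossa" (in the house) and "auto" (car).
# The subword segmentation below is written by hand for illustration.
word_docs = {"d1": ["talossa"], "d2": ["auto"]}
morph_docs = {"d1": ["talo", "ssa"], "d2": ["auto"]}

word_index = build_index(word_docs)
morph_index = build_index(morph_docs)

# Query "talon" (of the house): a different inflection of the same stem.
ranked = combined_search(word_index, morph_index, 2,
                         ["talon"], ["talo", "n"])
```

In this toy case the word index alone finds nothing for the query, because
the inflected forms differ, while the shared stem unit lets the subword
index and the combined ranking retrieve document d1.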