PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Statistical framework for a Spanish spoken dialogue corpus
C. D. Martínez, José Miguel Benedí and Ramón Granell
Speech Communication, Volume 50, Number 11-12, pp. 992-1008, 2008. ISSN 0167-6393

Abstract

Dialogue systems are one of the most interesting applications of speech and language technologies. There have recently been some attempts to build dialogue systems in Spanish, and some corpora have been acquired and annotated. Using these corpora, statistical machine learning methods can be applied to problems in spoken dialogue systems. In this paper, two statistical models based on the maximum likelihood assumption are presented, and two main applications of these models to a Spanish dialogue corpus are shown: labelling and decoding. The labelling application is useful for annotating new dialogue corpora. The decoding application is useful for implementing dialogue strategies in dialogue systems. Both applications centre on unsegmented dialogue turns. The results show that the proposed statistical models, although limited, are appropriate for these applications.

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Natural Language Processing; Speech
ID Code: 4554
Deposited By: Alfons Juan
Deposited On: 24 March 2009