PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Improving unsegmented dialogue turns annotation with n-gram transducers.
Carlos David Martínez-Hinarejos, Vicent Tamarit and José Miguel Benedí
In: the 23rd Pacific Asia Conference on Language, Information and Computation, 03-05 Dec 2009, Hong Kong.


The statistical models used in dialogue systems need annotated data (dialogues) to estimate their parameters. Dialogues are usually annotated in terms of Dialogue Acts (DA). The annotation problem can be addressed with statistical models, which avoids annotating the dialogues from scratch. Most previous work on automatic statistical annotation assumes that the dialogue turns are segmented into the corresponding meaningful units. However, this segmentation is not usually available. More recent work attempted to annotate unsegmented turns using an extension of the models used in the segmented case, but showed a dramatic decrease in performance. In this work we propose an enhanced annotation technique based on N-gram transducers that is more accurate than the classical HMM-based model for annotation and segmentation of unsegmented turns.

EPrint Type: Conference or Workshop Item (Oral)
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 5657
Deposited By: Alfons Juan
Deposited On: 08 March 2010