PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Speaker indexing in audio archives using test utterance Gaussian mixture modeling
Hagai Aronowitz, David Burshtein and Amihood Amir
In: ICSLP 2004, 4-8 Oct 2004, Jeju, South Korea.

Abstract

Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems in both accuracy and efficiency. The major reason for the drawbacks of existing solutions is the use of inaccurate anchor models. The contribution of this paper is twofold. On the theoretical side, a new method is developed for simulating GMM scoring. This makes it possible to fit a GMM not only to every target speaker but also to every test utterance, and then to compute the likelihood of the test call using these GMMs instead of the original data. The second contribution is in harnessing this GMM simulation to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE corpus show that our approach maintains the accuracy of the conventional GMM algorithm.
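
The idea of scoring a test utterance through a GMM fitted to it, rather than through its raw frames, can be illustrated with a rough sketch. The code below is not the algorithm from the paper, whose details the abstract does not give. It assumes diagonal-covariance GMMs stored as (weights, means, variances) tuples, and it uses a generic closed-form approximation chosen only for illustration: each test component contributes the log-sum over target components of the expected log density under that test component. All function names are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian, summed over the last axis."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def gmm_avg_loglik(frames, weights, means, variances):
    """Conventional scoring: average per-frame log-likelihood under a GMM."""
    # comp[t, j] = log w_j + log N(frame_t; mean_j, var_j)
    comp = np.log(weights) + log_gauss_diag(frames[:, None, :], means, variances)
    return float(np.mean(np.logaddexp.reduce(comp, axis=1)))

def expected_log_gauss(m, s, mean, var):
    """Closed form of E[log N(x; mean, var)] for x ~ N(m, diag(s))."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (s + (m - mean) ** 2) / var, axis=-1)

def approx_score_from_test_gmm(test, target):
    """Approximate the average log-likelihood using only the test GMM parameters.

    Illustrative approximation (an assumption, not the paper's method):
    sum_g w_g * log sum_j w_j * exp(E_g[log N_j]).
    """
    tw, tm, tv = test
    w, mu, v = target
    # e[g, j] = expected log density of target component j under test component g
    e = expected_log_gauss(tm[:, None, :], tv[:, None, :], mu, v)
    per_component = np.logaddexp.reduce(np.log(w) + e, axis=1)
    return float(np.sum(tw * per_component))

# Hypothetical two-component target speaker model in two dimensions.
target = (
    np.array([0.4, 0.6]),
    np.array([[0.0, 0.0], [2.0, 1.0]]),
    np.array([[1.0, 1.0], [0.5, 2.0]]),
)
# Hypothetical test-utterance model: two components summarising many frames.
test = (
    np.array([0.5, 0.5]),
    np.array([[0.1, -0.2], [1.8, 1.1]]),
    np.array([[0.3, 0.3], [0.2, 0.4]]),
)
score = approx_score_from_test_gmm(test, target)
```

In this sketch the cost of scoring depends on the number of Gaussians in the two models rather than on the number of frames, and only the test GMM parameters need to be stored. That mirrors the kind of saving in search time and index size that the abstract describes, though the paper's actual formulation may differ.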

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Speech
ID Code: 347
Deposited By: Hagai Aronowitz
Deposited On: 16 December 2004