PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Speaker indexing in audio archives using Gaussian mixture scoring simulation
Hagai Aronowitz, David Burshtein and Amihood Amir
In: Workshop on Machine Learning for Multimodal Interaction, 21-23 Jun 2004, Martigny, Switzerland.

Abstract

Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems in both accuracy and efficiency. This paper presents an efficient method for simulating GMM scoring. Simulation is done by fitting a GMM not only to every target speaker but also to every test utterance, and then computing the likelihood of the test utterance from these GMMs instead of from the original data. GMM simulation is used to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE and NIST-2004 speaker evaluation corpora show that our approach maintains and sometimes exceeds the accuracy of the conventional GMM algorithm and achieves efficient indexing capabilities: 6000 times faster than a conventional GMM with 1% overhead in storage.
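
The idea of scoring a test utterance from its fitted GMM rather than from its frames can be illustrated with a minimal sketch. This is not the authors' formula: the abstract does not give one, so the sketch assumes diagonal-covariance GMMs and a max-component approximation to the expected log-likelihood. The function names and the (weights, means, variances) tuple layout are hypothetical.

```python
import numpy as np

def expected_log_gauss(m, s2, mu, sigma2):
    """Closed form of E_{x ~ N(m, diag(s2))}[log N(x; mu, diag(sigma2))]."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * sigma2) + (s2 + (m - mu) ** 2) / sigma2)

def simulated_gmm_score(test_gmm, target_gmm):
    """Approximate the average per-frame log-likelihood of the test data
    under target_gmm, using only the parameters of the test GMM.

    Each GMM is a tuple (weights[K], means[K, D], variances[K, D]).
    """
    test_w, test_mu, test_var = test_gmm
    tgt_w, tgt_mu, tgt_var = target_gmm
    score = 0.0
    for w_i, m_i, s2_i in zip(test_w, test_mu, test_var):
        # Assumption of this sketch: each test component is scored against
        # its best-matching target component only (max-component approximation).
        best = max(
            np.log(w_j) + expected_log_gauss(m_i, s2_i, mu_j, sigma2_j)
            for w_j, mu_j, sigma2_j in zip(tgt_w, tgt_mu, tgt_var)
        )
        score += w_i * best
    return float(score)

# Toy 1-D example: a test GMM scored against a matching and a distant target.
test = (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]]))
near = (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]]))
far = (np.array([1.0]), np.array([[5.0]]), np.array([[1.0]]))
print(simulated_gmm_score(test, near) > simulated_gmm_score(test, far))
```

Under these assumptions the cost of one score depends only on the number of mixture components, not on the number of frames in the utterance, and an index needs to hold only the GMM parameters of each utterance.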

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Speech
ID Code: 345
Deposited By: Hagai Aronowitz
Deposited On: 16 December 2004