Speaker indexing in audio archives using test utterance Gaussian mixture modeling
Hagai Aronowitz, David Burshtein and Amihood Amir
In: ICSLP 2004, 4-8 Oct 2004, Jeju, South Korea.
Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems in both accuracy and efficiency. The major reason for the drawbacks of existing solutions is the use of inaccurate anchor models. The contribution of this paper is two-fold. On the theoretical side, a new method is developed for simulating GMM scoring. This makes it possible to fit a GMM not only to every target speaker but also to every test utterance, and then to compute the likelihood of the test call using these GMMs instead of the original data. The second contribution is in harnessing this GMM simulation to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE corpus show that our approach maintains the accuracy of the conventional GMM algorithm.
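To make the idea of scoring a test utterance through its own GMM more concrete, here is a minimal sketch. It is not the method of the paper, whose details are not given in this abstract. It assumes diagonal-covariance GMMs stored as lists of `(weight, means, variances)` tuples, and it approximates the expected log-likelihood of the test GMM under a target GMM with a Jensen-type lower bound built from the closed-form expectation for a pair of Gaussians. The function names `expected_loglik_gauss` and `simulated_gmm_score` are hypothetical.

```python
import math


def expected_loglik_gauss(mu_q, var_q, mu_p, var_p):
    """Closed-form E_{x ~ N(mu_q, diag(var_q))}[log N(x; mu_p, diag(var_p))]."""
    total = len(mu_q) * math.log(2 * math.pi)
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        total += math.log(vp) + (vq + (mq - mp) ** 2) / vp
    return -0.5 * total


def simulated_gmm_score(test_gmm, target_gmm):
    """Approximate the expected log-likelihood of the test GMM under the target GMM.

    Assumption (not from the paper): for each test component, the expectation of
    the log of the target mixture is replaced by the log-sum-exp of the
    per-component expectations, which is a lower bound by Jensen's inequality.
    No frame-level data is touched, only GMM parameters.
    """
    score = 0.0
    for w_q, mu_q, var_q in test_gmm:
        terms = [
            math.log(w_p) + expected_loglik_gauss(mu_q, var_q, mu_p, var_p)
            for w_p, mu_p, var_p in target_gmm
        ]
        peak = max(terms)  # subtract the maximum for a numerically stable log-sum-exp
        score += w_q * (peak + math.log(sum(math.exp(t - peak) for t in terms)))
    return score


# Toy one-dimensional example with made-up parameters.
test_utterance = [(0.5, [0.0], [1.0]), (0.5, [2.0], [1.0])]
speaker_a = [(0.5, [0.1], [1.0]), (0.5, [1.9], [1.0])]
speaker_b = [(0.5, [6.0], [1.0]), (0.5, [9.0], [1.0])]

score_a = simulated_gmm_score(test_utterance, speaker_a)
score_b = simulated_gmm_score(test_utterance, speaker_b)
```

In this toy setup, `score_a` exceeds `score_b` because `speaker_a` lies closer to the test utterance model. The cost of one score depends only on the number of mixture components, not on the length of the utterance, which is the kind of property that makes model-based scoring attractive for indexing.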
EPrint Type: Conference or Workshop Item (Paper)
Deposited By: Hagai Aronowitz
Deposited On: 16 December 2004