PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Enhancing Speaker Recognition with Virtual Examples
Yosef Solewicz and Hagai Aronowitz
In: IBM Speech Technologies Seminar 2008, 02-Jul-2008, Israel.


Support vector machines (SVMs) combined with Gaussian mixture models (GMMs) using universal background models (UBMs) have recently emerged as the state-of-the-art approach to speaker recognition. Typically, linear kernel SVMs are defined in a space in which speakers are represented by supervectors. A supervector is formed by stacking the Maximum-A-Posteriori (MAP) adapted means of the UBM, given the speaker data, so that a whole speaker conversation is condensed into a single point in the supervector space. Because target data is scarce relative to impostor data, this framework leads to highly imbalanced training. Virtual examples (VEs) refer to artificial examples generated from the original labeled ones, and are one of the proposed solutions to alleviate the imbalanced training problem. They have been successfully applied in tasks such as text and handwriting recognition. In this work, we present preliminary results obtained using VEs in the context of 4-wire and 2-wire speaker recognition.
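The supervector construction described above can be sketched as follows. This is an illustrative, simplified Python/NumPy implementation (unit-variance components, means-only adaptation); the function names, toy dimensions, and the relevance factor value are assumptions for illustration, not details from the poster.

```python
import numpy as np

def map_adapt_means(ubm_means, ubm_weights, frames, relevance=16.0):
    """MAP-adapt the UBM means to a speaker's feature frames.

    Simplified sketch: diagonal unit covariances are assumed so that
    posteriors depend only on squared distances to the component means.
    ubm_means: (C, D), ubm_weights: (C,), frames: (N, D).
    """
    # Log-likelihood of each frame under each unit-variance component.
    diff = frames[:, None, :] - ubm_means[None, :, :]      # (N, C, D)
    log_lik = -0.5 * np.sum(diff ** 2, axis=2)             # (N, C)

    # Posterior responsibilities (softmax over components).
    log_post = np.log(ubm_weights)[None, :] + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                # (N, C)

    # Zero- and first-order sufficient statistics.
    n_c = post.sum(axis=0)                                 # (C,)
    f_c = post.T @ frames                                  # (C, D)

    # Classic relevance-MAP interpolation between data and UBM means.
    alpha = (n_c / (n_c + relevance))[:, None]
    posterior_means = f_c / np.maximum(n_c[:, None], 1e-10)
    return alpha * posterior_means + (1.0 - alpha) * ubm_means

def supervector(adapted_means):
    """Stack the adapted means: one conversation -> one point."""
    return adapted_means.reshape(-1)

# Toy example: a 4-component, 3-dimensional UBM and 50 speaker frames.
rng = np.random.default_rng(0)
ubm_means = rng.normal(size=(4, 3))
ubm_weights = np.full(4, 0.25)
frames = rng.normal(size=(50, 3))

sv = supervector(map_adapt_means(ubm_means, ubm_weights, frames))
print(sv.shape)  # (12,) = C * D
```

The resulting supervectors would then feed a linear-kernel SVM, with the single target supervector opposed to many impostor supervectors, which is exactly the imbalance that virtual examples aim to alleviate.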

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 4681
Deposited By: Yosef Solewicz
Deposited On: 24 March 2009