PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Algorithms for source separation - with cocktail party applications
Rasmus Olsson
(2007) PhD thesis, Techn. Univ. of Denmark.

Abstract

In the thesis, a number of possible solutions to source separation are suggested. Although they differ significantly in shape and intent, they share a heavy reliance on prior domain knowledge. Most of the developed algorithms are intended for speech applications, and hence structural features of speech have been incorporated. Single-channel separation of speech is a particularly challenging signal processing task, where the purpose is to extract a number of speech signals from a single observed mixture. I present a few methods to obtain separation, which rely on the sparsity and structure of speech in time-frequency representations. My own contributions are based on learning dictionaries to separate a mixture. The sparse decompositions required for the separation are computed using non-negative matrix factorization as well as basis pursuit. In my work on the multi-channel problem, I have focused on convolutive mixtures, which are the appropriate model in acoustic setups. We have been successful in incorporating a harmonic speech model into a greater probabilistic formulation. Furthermore, we have presented several learning schemes for the parameters of such models, more specifically the expectation-maximization (EM) algorithm and stochastic and Newton-type gradient optimization.
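As a rough illustration of the dictionary-based idea mentioned in the abstract, the sketch below learns one non-negative dictionary per source from training data, decomposes a mixture over the concatenated dictionaries, and splits the mixture with soft masks. It is a minimal sketch under stated assumptions, not the method of the thesis: the multiplicative updates are the standard Euclidean-cost NMF rules, no explicit sparsity penalty is used, random non-negative matrices stand in for magnitude spectrograms, and all names (`nmf`, `separate`, `EPS`) are hypothetical.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero (illustrative choice)


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate non-negative V (freq x time) as W @ H.

    Uses multiplicative updates for the Euclidean cost. If W is given,
    it is kept fixed and only the activations H are estimated.
    """
    rng = np.random.default_rng(seed)
    learn_W = W is None
    if learn_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if learn_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(V_mix, W1, W2, n_iter=200):
    """Split a mixture using two pre-learned dictionaries W1 and W2."""
    W = np.hstack([W1, W2])
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W1.shape[1]
    S1 = W1 @ H[:k]          # part of the mixture explained by source 1
    S2 = W2 @ H[k:]          # part of the mixture explained by source 2
    total = S1 + S2 + EPS
    # Soft masks distribute each mixture entry between the two sources.
    return V_mix * S1 / total, V_mix * S2 / total


# Toy data: random non-negative matrices standing in for spectrograms.
rng = np.random.default_rng(1)
B1, B2 = rng.random((16, 3)), rng.random((16, 3))
train1 = B1 @ rng.random((3, 50))
train2 = B2 @ rng.random((3, 50))

W1, _ = nmf(train1, rank=3)
W2, _ = nmf(train2, rank=3)

V_mix = B1 @ rng.random((3, 20)) + B2 @ rng.random((3, 20))
est1, est2 = separate(V_mix, W1, W2)
```

Because the two masks sum to (almost exactly) one, the two estimates add back up to the mixture; how well each estimate matches its true source depends on how distinct the learned dictionaries are.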

EPrint Type: Thesis (PhD)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Speech
ID Code: 3492
Deposited By: Jan Larsen
Deposited On: 11 February 2008