Learning manifolds with the Parametrized Self-Organizing Map and Unsupervised Kernel Regression
This thesis presents several new developments in the field of manifold learning and nonlinear dimensionality reduction. The main text is divided into three parts. The first part presents a smoothness-based regularizer that is specifically tuned to the Parametrized Self-Organizing Map (PSOM). The regularization approach makes it possible to deal with noisy or missing data in a principled manner, and it facilitates the construction of PSOMs from data that are not organized in a grid topology. In the second part, the manifold learning algorithm Unsupervised Kernel Regression (UKR) is introduced as an unsupervised counterpart to the classical Nadaraya-Watson estimator. UKR requires very few parameters to be chosen a priori: in its simplest form, a UKR model is fully specified by the dimensionality of latent space and the choice of a density kernel, and it can be regularized automatically by leave-one-out cross-validation at no additional computational cost. The low-dimensional coordinates (latent variables), together with a mapping from latent space to data space, are obtained by minimizing a reconstruction error criterion. The third part presents four possible extensions to UKR, specifically 1) a more general cross-validation scheme, aimed at avoiding unsmooth manifolds, 2) the inclusion of loss functions beyond the usual squared error, which can enhance the robustness to outliers, and by which UKR can be tuned to specific noise levels, 3) a "landmark" variant which helps to reduce the computational cost, and 4) Unsupervised Local Polynomial Regression, where the Nadaraya-Watson estimator is replaced by local linear or local quadratic regression models, the latter showing less bias in the presence of curvature.
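To make the UKR idea concrete, the following is a minimal sketch of the leave-one-out reconstruction error for a given set of latent points. It is not the implementation used in the thesis. The function names `gaussian_kernel` and `ukr_loo_error`, the Gaussian kernel with fixed unit bandwidth, and the plain-list data layout are all assumptions made for illustration. The sketch shows why leave-one-out cross-validation can come at no extra cost in this setting: it amounts to setting each point's own kernel weight to zero before normalizing the Nadaraya-Watson weights.

```python
import math


def gaussian_kernel(sq_dist):
    """Unnormalized Gaussian density kernel, evaluated on a squared distance.

    Assumption: unit bandwidth. The scale of the latent points then takes
    over the role that a bandwidth parameter would otherwise play.
    """
    return math.exp(-0.5 * sq_dist)


def ukr_loo_error(latent, data):
    """Mean squared leave-one-out reconstruction error (illustrative sketch).

    latent: list of latent points (tuples of floats), one per observation
    data:   list of observed points (tuples of floats), same length
    Each observation is reconstructed as a Nadaraya-Watson weighted average
    of all OTHER observations, weighted by kernel values in latent space.
    """
    n = len(latent)
    dim = len(data[0])
    total = 0.0
    for i in range(n):
        weights = []
        for j in range(n):
            if j == i:
                # Leave-one-out: a point must not reconstruct itself.
                weights.append(0.0)
            else:
                sq = sum((a - b) ** 2 for a, b in zip(latent[i], latent[j]))
                weights.append(gaussian_kernel(sq))
        norm = sum(weights)  # assumed nonzero: latent points not too far apart
        recon = [
            sum(w * y[d] for w, y in zip(weights, data)) / norm
            for d in range(dim)
        ]
        total += sum((r - y) ** 2 for r, y in zip(recon, data[i]))
    return total / n


# Toy data on a straight line in 2-D.
data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
ordered = [(0.0,), (1.0,), (2.0,), (3.0,)]    # latent order matches the data
scrambled = [(0.0,), (2.0,), (1.0,), (3.0,)]  # two latent points swapped

err_ordered = ukr_loo_error(ordered, data)
err_scrambled = ukr_loo_error(scrambled, data)
```

In this toy setting the latent configuration that preserves the ordering of the data yields the smaller leave-one-out error. A full UKR fit would minimize such an error over the latent coordinates with a gradient-based optimizer, which this sketch omits.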