## Abstract

Extracting knowledge from a small number of samples of very complex, high-dimensional data is impossible if the data has no particular structure. This presentation considers the case in which the high-dimensional data lies on an unknown low-dimensional manifold. In this situation one can estimate the dimension of the manifold and avoid the curse of dimensionality by using algorithms based on appropriate neighborhood graphs. The graph Laplacians of such graphs are used successfully in several machine learning methods, such as transductive learning, dimensionality reduction, and spectral clustering.

We will first present a method for estimating the dimension d of the manifold supporting the data from a finite number n of samples. The method uses the nearest-neighbor structure of the dataset, is guaranteed to predict the correct dimension with exponentially high probability as soon as n^{1/d} is sufficiently large, and is indeed accurate in this case. We will then present the pointwise limits of the three different graph Laplacians used in the literature as the sample size increases and the neighborhood size approaches zero. We show that for a uniform measure on the submanifold all graph Laplacians have the same limit up to constants; for a nonuniform measure on the submanifold, however, only the so-called random walk graph Laplacian converges to the weighted Laplace-Beltrami operator. Finally, we will illustrate graph-Laplacian-based transductive learning on two computer vision tasks: segmenting an image into regions consistent with user-supplied seeds (e.g., a sparse set of broad brush strokes), and interactive image search in a large image database.
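The specific dimension estimator of the talk is not reproduced here. As an illustrative stand-in built from the same ingredient, the nearest-neighbor distances of the sample, the sketch below uses a standard maximum-likelihood estimator in the style of Levina and Bickel; the function name, the neighborhood size `k`, and the brute-force distance computation are assumptions for the example, not the authors' method.

```python
import numpy as np

def mle_dimension(X, k=10):
    """Estimate the intrinsic dimension d of the manifold supporting X.

    X : (n, D) array of samples assumed to lie near a d-dimensional manifold.
    k : number of nearest neighbors used at each point (an assumed parameter).

    Uses the Levina-Bickel-style MLE: at each point, d is estimated from the
    log-ratios of the k-th nearest-neighbor distance to the closer ones, and
    the local estimates are averaged.
    """
    # Brute-force pairwise Euclidean distances (fine for small n).
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    dist = np.sqrt(d2)
    dist.sort(axis=1)                  # row i: sorted distances from point i
    T = dist[:, 1:k + 1]               # drop the zero self-distance
    # Local MLE at each point: (k-1) / sum_j log(T_k / T_j), then average.
    local = (k - 1) / np.sum(np.log(T[:, -1:] / T[:, :-1]), axis=1)
    return local.mean()
```

On data sampled from a 2-dimensional patch embedded in a higher-dimensional space, the estimate concentrates near 2 as n grows, matching the abstract's claim that accuracy improves once n^{1/d} is large.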
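The three graph Laplacians referred to above are conventionally the unnormalized, the random walk, and the symmetric normalized Laplacian of a weighted neighborhood graph. A minimal sketch follows, assuming a Gaussian kernel with bandwidth `eps` as the edge weights (the kernel choice and names are illustrative, not prescribed by the abstract):

```python
import numpy as np

def graph_laplacians(X, eps):
    """Return the three graph Laplacians of a Gaussian-weighted graph on X.

    X   : (n, D) array of sample points.
    eps : kernel bandwidth (an assumed parameter of this sketch).
    """
    # Gaussian edge weights W_ij = exp(-|x_i - x_j|^2 / eps).
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / eps)
    deg = W.sum(axis=1)                       # vertex degrees
    D = np.diag(deg)
    I = np.eye(len(X))
    L_un = D - W                              # unnormalized Laplacian
    L_rw = I - W / deg[:, None]               # random walk: I - D^{-1} W
    s = 1.0 / np.sqrt(deg)
    L_sym = I - s[:, None] * W * s[None, :]   # symmetric: I - D^{-1/2} W D^{-1/2}
    return L_un, L_rw, L_sym
```

The random walk Laplacian `L_rw` is the one that, per the abstract, converges pointwise to the weighted Laplace-Beltrami operator for a nonuniform sampling measure; for a uniform measure all three agree in the limit up to constants.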
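Graph-Laplacian transductive learning of the kind used in the seeded-segmentation task can be sketched with the standard harmonic-function formulation: seed nodes are clamped to their labels, and the labels of the remaining nodes are obtained by solving a linear system in the graph Laplacian. The API and the binary-label setup below are assumptions for illustration, not the talk's exact procedure.

```python
import numpy as np

def harmonic_labels(W, seeds):
    """Propagate seed labels over a weighted graph harmonically.

    W     : (n, n) symmetric weight matrix of the neighborhood graph.
    seeds : dict {node_index: label in [0.0, 1.0]} of user-supplied seeds.

    Solves L_uu f_u = -L_ul f_l for the unlabeled nodes, where L = D - W
    is the unnormalized graph Laplacian. Returns a soft label per node.
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    labeled = np.array(sorted(seeds))
    unlabeled = np.array([i for i in range(n) if i not in seeds])
    f = np.zeros(n)
    f[labeled] = [seeds[i] for i in labeled]
    # Harmonic solution: each unlabeled node takes the weighted average of
    # its neighbors, with the seeds held fixed.
    f[unlabeled] = np.linalg.solve(
        L[np.ix_(unlabeled, unlabeled)],
        -L[np.ix_(unlabeled, labeled)] @ f[labeled],
    )
    return f
```

On a path graph with the two endpoints seeded 0 and 1, the harmonic solution interpolates linearly between the seeds, which is the behavior that makes the propagated labels consistent with sparse brush strokes in image segmentation.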