PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Toward Manifold-Adaptive Learning
Amir-massoud Farahmand, Csaba Szepesvári and Jean-Yves Audibert
In: NIPS 2007, Dec 2007, Whistler, Canada.


Inputs coming from high-dimensional spaces are common in many real-world problems, such as robot control with visual inputs. Yet learning in such cases is in general difficult, a fact often referred to as the "curse of dimensionality". In particular, in regression or classification, algorithms are known to require a number of samples exponential in the input dimension in order to achieve a given accuracy in the worst case [1]. The exponential dependence on the input dimension forces us to develop methods that are efficient in exploiting regularities of the data. Classically, smoothness is the best-known example of such a regularity. In this abstract we outline two methods, for two different problems, that efficiently exploit the situation in which the data points lie on a low-dimensional submanifold of the input space. Specifically, we consider the case when the data points lie on a manifold M of dimension d, which is embedded in the higher-dimensional input space of dimension D. A method is called manifold-adaptive if its sample complexity can be bounded by a quantity whose exponent depends only on d and not on D. Thus a manifold-adaptive method may enjoy a considerably better sample complexity whenever d is much smaller than D. Although there are many learning methods that are designed to be manifold adaptive (or manifold friendly), they more often than not lack a rigorous proof of this property (one exception is the recent work of Scott and Nowak on dyadic decision trees in a classification context, cf. [2]). The first method, proposed by us earlier in [3], concerns the problem of estimating the dimension of a manifold based on points sampled from it. The second method is the classical k-nearest neighbor regressor. We find it intriguing that this method was not specifically designed to be manifold adaptive, yet it is relatively simple to prove that it possesses this property.
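To make the two methods concrete, below is a minimal sketch in plain NumPy. The dimension estimator uses the nearest-neighbor distance-ratio idea, d_hat(x) = ln 2 / ln(r_2k(x) / r_k(x)) with r_j(x) the distance from x to its j-th nearest neighbor; the exact estimator and its aggregation in [3] may differ, so treat this form, the median aggregation, and all function names (`estimate_dimension`, `knn_regress`) as our illustrative assumptions, not the paper's definitions. The regressor is the classical k-nearest neighbor average. The toy data is a 1-dimensional manifold (a circle) embedded in a D = 10 ambient space.

```python
import numpy as np

def estimate_dimension(X, k=10):
    # Per-point intrinsic-dimension estimate from nearest-neighbor
    # distance ratios (assumed form, in the spirit of [3]):
    #   d_hat(x) = ln 2 / ln(r_2k(x) / r_k(x)).
    estimates = []
    for x in X:
        dists = np.sort(np.linalg.norm(X - x, axis=1))  # dists[0] == 0: x itself
        r_k, r_2k = dists[k], dists[2 * k]
        if r_2k > r_k > 0:
            estimates.append(np.log(2.0) / np.log(r_2k / r_k))
    # Aggregate the noisy per-point estimates by their median (our choice).
    return int(round(float(np.median(estimates))))

def knn_regress(X_train, y_train, x, k=5):
    # Classical k-nearest-neighbor regression: average the responses
    # of the k training points closest to the query x.
    idx = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    return float(np.mean(y_train[idx]))

# Toy data: points on a circle (d = 1) embedded in D = 10 dimensions.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, size=500)
X = np.zeros((500, 10))
X[:, 0], X[:, 1] = np.cos(t), np.sin(t)
y = np.cos(t)  # a smooth function of the manifold coordinate

d_hat = estimate_dimension(X)  # tracks the intrinsic d = 1, not D = 10
query = np.zeros(10)
query[0], query[1] = np.cos(0.5), np.sin(0.5)
y_hat = knn_regress(X, y, query)  # close to cos(0.5)
```

Note that both routines only ever look at pairwise distances in the ambient space; this is exactly why their behavior can be governed by the intrinsic dimension d rather than D.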

EPrint Type: Conference or Workshop Item (Talk)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
          Learning/Statistics & Optimisation
          Theory & Algorithms
ID Code: 3176
Deposited By: Jean-Yves Audibert
Deposited On: 03 January 2008