A tree-based regressor that adapts to intrinsic dimension. ## AbstractWe consider the problem of nonparametric regression, consisting of learning an arbitrary mapping f : X to Y from a data set of (x; y) pairs in which the y values are corrupted by noise of mean zero. This statistical task is known to to be subject to a severe curse of dimensionality: if X in RD, and if the only smoothness assumption on f is that it satisfies a Lipschitz condition, it is known that any estimator based on n data points will have an error rate (risk) of n^{-2/(2+D)}. Here we present a tree-based regressor whose risk depends only on the doubling dimension of X, not on D. This notion of dimension generalizes two cases of contemporary interest: when X is a low-dimensional manifold, and when X is sparse. The tree is built using random hyperplanes as splitting criteria, building upon recent work of Dasgupta and Freund [DF08]; and we show that axis-parallel splits cannot achieve the same finite-sample rate of convergence.
[Edit] |