PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Scrambled objects for least-squares regression
Odalric-Ambrym Maillard and Rémi Munos
Advances in Neural Information Processing Systems Volume 23, pp. 1549-1557, 2010.


We consider least-squares regression using a randomly generated subspace G_P \subset F of finite dimension P, where F is a function space of infinite dimension, e.g. L_2([0, 1]^d). G_P is defined as the span of P random features that are linear combinations of the basis functions of F weighted by i.i.d. Gaussian random coefficients. In particular, we consider multi-resolution random combinations, at all scales, of a given mother function such as a hat function or a wavelet. In the wavelet case, the resulting Gaussian objects are called scrambled wavelets, and we show that they make it possible to approximate functions in Sobolev spaces H^s([0, 1]^d). As a result, given N data points, the least-squares estimate \hat g built from P scrambled wavelets has excess risk ||f^* − \hat g||^2_{\mathcal P} = O(||f^*||^2_{H^s([0,1]^d)} (\log N)/P + P (\log N)/N) for target functions f^* \in H^s([0, 1]^d) of smoothness order s > d/2. An interesting aspect of the resulting bounds is that they do not depend on the distribution \mathcal P from which the data are generated, which is important in the statistical regression setting considered here: randomization makes it possible to adapt to any distribution. We conclude by describing an efficient numerical implementation using lazy expansions, with numerical complexity \tilde O(2^d N^{3/2} \log N + N^2), where d is the dimension of the input space.
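To make the construction concrete, here is a minimal one-dimensional sketch (d = 1) of least-squares regression in a random subspace spanned by Gaussian combinations of multi-scale hat functions. It is an illustration under stated assumptions, not the authors' implementation: the expansion is truncated at a finite level `max_level` instead of using lazy expansions, the level weights 2^{-js} and the 1/sqrt(P) scaling of the Gaussian matrix are illustrative choices, and the names `hat`, `multiscale_hats` and `fit_random_subspace` are hypothetical.

```python
import numpy as np

def hat(u):
    """Mother hat function, supported on [-1, 1]."""
    return np.maximum(0.0, 1.0 - np.abs(u))

def multiscale_hats(x, max_level, s=1.0):
    """Evaluate dilated and translated hats on [0, 1] at levels 0..max_level.

    Returns an array of shape (len(x), F). The 2**(-j*s) level weights are
    an illustrative assumption, not the paper's exact normalisation.
    """
    cols = []
    for j in range(max_level + 1):
        for l in range(2**j + 1):
            cols.append(2.0 ** (-j * s) * hat(2**j * x - l))
    return np.stack(cols, axis=1)

def fit_random_subspace(x, y, P, max_level=6, s=1.0, seed=0):
    """Least-squares fit in the span of P random Gaussian feature combinations."""
    rng = np.random.default_rng(seed)
    Phi = multiscale_hats(x, max_level, s)            # (N, F) basis evaluations
    # i.i.d. Gaussian mixing matrix; the 1/sqrt(P) scaling is an assumption.
    A = rng.standard_normal((Phi.shape[1], P)) / np.sqrt(P)
    Psi = Phi @ A                                     # (N, P) random features
    coef, *_ = np.linalg.lstsq(Psi, y, rcond=None)    # minimise ||Psi c - y||^2

    def predict(x_new):
        return multiscale_hats(x_new, max_level, s) @ A @ coef

    return predict

# Usage on synthetic data (the target function is made up for the example).
rng = np.random.default_rng(1)
x_train = rng.uniform(0.0, 1.0, size=200)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(200)
g_hat = fit_random_subspace(x_train, y_train, P=20)
train_mse = np.mean((g_hat(x_train) - y_train) ** 2)
```

In this sketch P controls the trade-off that appears in the bound: a larger P enlarges the random subspace and reduces approximation error, while increasing the estimation term that grows with P/N.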

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 7518
Deposited By: Odalric-Ambrym Maillard
Deposited On: 17 March 2011