Learning the kernel function via regularization
Massimiliano Pontil and Charles Micchelli
We study the problem of finding an optimal kernel from
a prescribed convex set of kernels $\mathcal{K}$ for learning
a real-valued function by regularization. We establish for
a wide variety of regularization functionals that this leads
to a convex optimization problem and, for square loss
regularization, we characterize the solution of this problem.
We show that, although $\mathcal{K}$ may be an uncountable set,
the optimal kernel is always obtained as a convex combination
of at most $m+1$ kernels, where $m$ is the number of data examples.
In particular, our results apply to learning the optimal radial kernel or
the optimal dot product kernel.
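
As a minimal numerical sketch of the convexity claim for square loss, and not the method of the paper, the following evaluates the regularization functional along a segment between two Gaussian kernels. It relies on the standard identity that, for a Gram matrix $K$ and regularization parameter $\mu > 0$, the minimum over $f$ of $\sum_i (y_i - f(x_i))^2 + \mu \|f\|_K^2$ equals $\mu\, y^\top (K + \mu I)^{-1} y$. The data, the two bandwidths, the value of `mu`, and the helper names `gaussian_gram` and `square_loss_value` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_gram(x, sigma):
    """Gram matrix of a Gaussian (radial) kernel on 1-D inputs (illustrative helper)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def square_loss_value(K, y, mu):
    """Minimum of sum_i (y_i - f(x_i))^2 + mu * ||f||_K^2 over the RKHS of K.

    By the representer theorem this equals mu * y^T (K + mu I)^{-1} y.
    """
    m = len(y)
    return mu * (y @ np.linalg.solve(K + mu * np.eye(m), y))

# Synthetic data (assumption: any small regression sample would do).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 12)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.size)
mu = 0.1

# Two kernels from a hypothetical convex set; bandwidths chosen arbitrarily.
K_a = gaussian_gram(x, sigma=0.1)
K_b = gaussian_gram(x, sigma=1.0)

# Evaluate the functional at convex combinations lam * K_a + (1 - lam) * K_b.
lambdas = np.linspace(0.0, 1.0, 21)
values = np.array([
    square_loss_value(lam * K_a + (1.0 - lam) * K_b, y, mu) for lam in lambdas
])

# Discrete second differences are nonnegative when the function is convex in lam.
second_diff = values[:-2] - 2.0 * values[1:-1] + values[2:]
best_lambda = lambdas[np.argmin(values)]
```

Because the functional is convex in the kernel, a one-dimensional search over the mixing weight is enough in this two-kernel sketch; searching over a larger convex set would need a proper convex solver, which this example does not attempt.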