A DC-Programming Algorithm for Kernel Selection
Andreas Argyriou, Raphael Hauser, Charles A. Micchelli and Massimiliano Pontil
In: 23rd International Conference on Machine Learning, June 25-29, 2006, Pittsburgh, USA.
We address the problem of learning a kernel for a given supervised learning task. Our approach consists in searching within the convex hull of a prescribed set
of basic kernels for one which minimizes a convex regularization functional. A unique feature of this approach compared to others in the literature is that
the number of basic kernels can be infinite. We only require that they are
continuously parameterized. For example, the basic kernels could be isotropic
Gaussians with variance in a prescribed interval or even Gaussians parameterized
by multiple continuous parameters. Our work builds upon a formulation
involving a minimax optimization problem and a recently proposed greedy
algorithm for learning the kernel. Although this optimization problem is not
convex, it belongs to the larger class of DC (difference of convex functions) programs.
Therefore, we apply recent results from DC optimization theory to create a new algorithm for
learning the kernel. Our experimental results on benchmark data sets show that
this algorithm outperforms a previously proposed method.
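
As a minimal illustration of the search space described above, the following Python sketch forms a convex combination of isotropic Gaussian kernels and evaluates its Gram matrix. It is not the algorithm of the paper. The function names, the parameterization by a standard deviation sigma, and the finite grid of three widths are assumptions made for the example. The approach described above allows a continuously parameterized, possibly infinite, set of basic kernels.

```python
import math


def gaussian_kernel(x, t, sigma):
    """Isotropic Gaussian kernel exp(-||x - t||^2 / (2 * sigma^2)).

    sigma is a standard deviation (assumed parameterization).
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def combined_kernel(x, t, sigmas, weights):
    """Convex combination of Gaussian kernels with the given widths.

    The weights must be nonnegative and sum to one, so the result
    lies in the convex hull of the basic kernels.
    """
    if len(sigmas) != len(weights):
        raise ValueError("sigmas and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to one")
    return sum(w * gaussian_kernel(x, t, s) for w, s in zip(weights, sigmas))


def gram_matrix(points, sigmas, weights):
    """Gram matrix of the combined kernel on a list of points."""
    return [[combined_kernel(p, q, sigmas, weights) for q in points]
            for p in points]


# Hypothetical data: three points in the plane and a finite grid of
# widths sampled from a prescribed interval [0.5, 2.0].
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
sigmas = [0.5, 1.0, 2.0]
weights = [0.2, 0.5, 0.3]
K = gram_matrix(points, sigmas, weights)
```

Because each Gaussian kernel is positive semidefinite and the weights are nonnegative, the combined kernel is again a valid kernel. A kernel learning method would then choose the weights (and, in the continuous setting, the widths) by minimizing a regularization functional over this convex hull.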