Graph Kernels by Spectral Transforms
Xiaojin Zhu, Jaz Kandola, John Lafferty and Zoubin Ghahramani
Many graph-based semi-supervised learning methods can be viewed as imposing smoothness conditions on the target function with respect to a graph representing the data points to be labeled. The smoothness properties of the functions are encoded in terms of Mercer kernels over the graph. The central quantity in such regularization is the spectral decomposition of the graph Laplacian, a matrix derived from the graph's edge weights. The eigenvectors with small eigenvalues are smooth, and ideally represent large cluster structures within the data. The eigenvectors with large eigenvalues are rugged and are considered noise. Different weightings of the eigenvectors of the graph Laplacian lead to different measures of smoothness. Such weightings can be viewed as spectral transforms, that is, as transformations of the standard eigenspectrum that lead to different regularizers over the graph. Familiar kernels, such as the diffusion kernel obtained by solving a discrete heat equation on the graph, can be seen as simple parametric spectral transforms. The question naturally arises whether one can obtain effective spectral transforms automatically. In this paper we develop an approach to searching over a nonparametric family of spectral transforms by using convex optimization to maximize kernel alignment to the labeled data. Order constraints are imposed to encode a preference for smoothness with respect to the graph structure. This results in a flexible family of kernels that is more data-driven than the standard parametric spectral transforms. Our approach relies on a quadratically constrained quadratic program (QCQP), and is computationally practical for large datasets.
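To make the notion of a spectral transform concrete, the following is a minimal sketch, not the method developed in the paper. It builds a kernel K = sum_i r(lambda_i) phi_i phi_i^T from the eigenpairs (lambda_i, phi_i) of the graph Laplacian and measures its alignment with a label vector. Several choices here are assumptions made for illustration: the combinatorial Laplacian L = D - W, the diffusion-style transform r(lambda) = exp(-sigma^2 lambda / 2), the toy graph of two triangles joined by a weak edge, and all function names. The QCQP that searches over nonparametric transforms is omitted; the sketch only evaluates one fixed parametric transform.

```python
import numpy as np

def graph_laplacian(W):
    """Combinatorial Laplacian L = D - W for a symmetric weight matrix W (assumed form)."""
    return np.diag(W.sum(axis=1)) - W

def diffusion_transform(sigma):
    """Parametric transform r(lambda) = exp(-sigma^2 * lambda / 2) (assumed parametrization)."""
    return lambda lam: np.exp(-0.5 * sigma**2 * lam)

def spectral_kernel(W, transform):
    """Return K = sum_i r(lambda_i) phi_i phi_i^T together with the spectrum."""
    lam, phi = np.linalg.eigh(graph_laplacian(W))  # eigenvalues in ascending order
    mu = transform(lam)                            # transformed eigenvalues
    K = (phi * mu) @ phi.T                         # scale each eigenvector column by mu_i
    return K, lam, mu

def alignment(K, y):
    """Empirical alignment <K, y y^T>_F / (||K||_F ||y y^T||_F)."""
    T = np.outer(y, y)
    return np.sum(K * T) / (np.linalg.norm(K) * np.linalg.norm(T))

# Hypothetical toy graph: two triangles (nodes 0-2 and 3-5) joined by one weak edge.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
                (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])  # one label per cluster

K, lam, mu = spectral_kernel(W, diffusion_transform(sigma=1.0))
score = alignment(K, y)
print("transformed spectrum:", np.round(mu, 4))
print("alignment with labels:", round(float(score), 4))
```

Because this transform is decreasing in lambda, the transformed eigenvalues are non-increasing when the eigenvalues are sorted in ascending order, so the smoothest eigenvectors receive the largest weights. This is the same preference that the order constraints encode in the nonparametric setting, where the weights are chosen by optimization rather than by a fixed formula.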