PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Efficient Graphlet Kernels for Large Graph Comparison
Nino Shervashidze, S.V.N. Vishwanathan, Tobias H. Petri, Kurt Mehlhorn and Karsten M. Borgwardt
In: AISTATS 2009, 16-18 Apr 2009, Clearwater Beach, Florida, USA.

Abstract

State-of-the-art graph kernels do not scale to large graphs with hundreds of nodes and thousands of edges. In this article we attempt to rectify this situation. We compare graphs by counting common graphlets, i.e., subgraphs with k nodes where k ∈ {3, 4, 5}. Exhaustive enumeration of all graphlets is prohibitively expensive, scaling as O(n^k), where n is the number of nodes. We propose two theoretically grounded speedups. First, by bounding the deviation of the empirical estimates of the distribution of these graphlets from their true distribution, we show that it suffices to sample a fixed number of graphlets, independent of the size of the input graphs. Second, we show that for graphs of bounded degree, as is often the case with large sparse real-world graphs, exhaustively enumerating graphlets takes only O(nd^{k-1}) time, where d ≪ n is the maximum degree. In our experimental evaluation, our novel kernels allow us to efficiently compare large graphs that cannot be tackled by existing graph kernels.
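
The sampling idea in the abstract can be illustrated with a minimal Python sketch for the smallest case, k = 3. It is not the authors' implementation. It assumes an undirected graph stored as a dictionary of neighbour sets, and it uses the fact that the four 3-node graphlets are distinguished by their number of edges (0, 1, 2 or 3). The function names, the default sample size of 2000, and the use of a plain dot product between the estimated frequency vectors as the kernel value are all assumptions made for illustration; the sample-size bound derived in the paper is not reproduced here.

```python
import random
from itertools import combinations


def make_adj(n, edges):
    """Build an undirected adjacency map {node: set(neighbours)} (hypothetical helper)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def sample_graphlet_histogram(adj, num_samples, rng=None):
    """Estimate the 3-node graphlet distribution from uniformly sampled node triples.

    Entry i of the result is the fraction of sampled triples that induce
    exactly i edges. The graph must have at least 3 nodes.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed, purely for reproducibility
    nodes = list(adj)
    counts = [0, 0, 0, 0]
    for _ in range(num_samples):
        trio = rng.sample(nodes, 3)  # 3 distinct nodes, chosen uniformly
        n_edges = sum(1 for u, v in combinations(trio, 2) if v in adj[u])
        counts[n_edges] += 1
    return [c / num_samples for c in counts]


def graphlet_kernel(adj_a, adj_b, num_samples=2000, rng=None):
    """Dot product of the two estimated graphlet frequency vectors (assumed kernel form)."""
    if rng is None:
        rng = random.Random(0)
    hist_a = sample_graphlet_histogram(adj_a, num_samples, rng)
    hist_b = sample_graphlet_histogram(adj_b, num_samples, rng)
    return sum(x * y for x, y in zip(hist_a, hist_b))


if __name__ == "__main__":
    complete = make_adj(5, list(combinations(range(5), 2)))  # every triple is a triangle
    path = make_adj(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
    print(sample_graphlet_histogram(path, 2000))
    print(graphlet_kernel(complete, path))
```

The cost of this estimate depends on the number of samples rather than on n, which is the point of the first speedup: exhaustive enumeration would instead visit all O(n^3) triples.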

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 4545
Deposited By: Nino Shervashidze
Deposited On: 10 March 2009