PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Randomization techniques for graphs
Sami Hanhijärvi, Gemma Garriga and Kai Puolamäki
SIAM Data Mining 2009.

Abstract

Mining graph data is an active research area. Several data mining methods and algorithms have been proposed to identify structures in graphs, yet the evaluation of their results is still lacking. Randomization is an important approach for assessing the statistical significance of data mining results. Within the framework of statistical hypothesis testing, this paper focuses on randomization techniques for unweighted undirected graphs. Given an input graph, our randomization method samples graphs from the class of graphs that share certain structural properties with the input graph. We describe three alternative algorithms based on local edge swapping and Metropolis sampling. We test our framework with various graph data sets and mining algorithms in two applications: graph clustering and frequent subgraph mining.
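
The idea of local edge swapping can be illustrated with a minimal sketch. It is not one of the paper's three algorithms: it assumes the preserved structural property is the degree of every node, it omits the Metropolis acceptance step entirely, and the names swap_randomize, degrees and empirical_p_value are hypothetical. The p-value helper uses the common (1 + count) / (1 + n) convention, which is also an assumption here.

```python
import random
from collections import Counter


def swap_randomize(edges, num_swaps, seed=None):
    """Randomize an undirected simple graph with degree-preserving edge swaps.

    Illustrative sketch only: picks two edges (a, b) and (c, d) and tries to
    rewire them to (a, d) and (c, b), rejecting swaps that would create a
    self-loop or a duplicate edge.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    if len(edge_list) < 2:
        return edge_list
    for _ in range(num_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Flip the second edge at random so both rewirings are reachable.
        if rng.random() < 0.5:
            c, d = d, c
        # All four endpoints must be distinct, otherwise a self-loop appears.
        if len({a, b, c, d}) < 4:
            continue
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        # Reject swaps that would duplicate an existing edge.
        if new1 in edge_set or new2 in edge_set:
            continue
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.update((new1, new2))
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


def degrees(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def empirical_p_value(observed, randomized_stats):
    """One-sided empirical p-value: share of randomized values >= observed."""
    at_least = sum(1 for s in randomized_stats if s >= observed)
    return (1 + at_least) / (1 + len(randomized_stats))


# Toy input: a 5-cycle with one chord, nodes labelled by integers.
original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
randomized = swap_randomize(original, num_swaps=100, seed=0)
```

In a significance test along these lines, a statistic of interest (for example, a clustering score) would be computed on the input graph and on many randomized graphs, and the two sets of values compared with empirical_p_value.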

EPrint Type: Article
Subjects: Theory & Algorithms
ID Code: 4486
Deposited By: Gemma Garriga
Deposited On: 13 March 2009