Randomization techniques for graphs
Sami Hanhijärvi, Gemma Garriga and Kai Puolamäki
SIAM Data Mining
Mining graph data is an active research area. Several
data mining methods and algorithms have been proposed to identify structures
from graphs; still, the evaluation of those results is lacking.
Within the framework of statistical hypothesis testing, we focus in this paper
on randomization techniques for unweighted undirected graphs.
Randomization is an important approach to assess the statistical significance of
data mining results. Given an input graph, our randomization method
samples data from the class of graphs that share certain structural
properties with the input graph. Here we describe three alternative
algorithms based on local edge swapping and Metropolis sampling.
We test our framework with various graph data sets and mining
algorithms for two applications, namely graph clustering and
frequent subgraph mining.
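
To make the idea of local edge swapping concrete, the following is a minimal sketch and not a reproduction of any of the three algorithms in the paper. It assumes that the structural property to preserve is the degree of every node, and that the graph is simple (no self-loops, no parallel edges). The names swap_randomize and degrees are hypothetical helpers introduced only for this illustration. A Metropolis-style variant would additionally accept or reject each proposed swap according to how it changes a chosen graph statistic; that step is omitted here.

```python
import random
from collections import Counter


def degrees(edges):
    """Count the degree of every node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def swap_randomize(edges, n_swaps, seed=None):
    """Randomize an undirected simple graph by repeated local edge swaps.

    Each attempt picks two edges (a, b) and (c, d) and proposes replacing
    them with (a, d) and (c, b). Every node keeps its degree. Proposals
    that would create a self-loop or a duplicate edge are skipped.
    """
    rng = random.Random(seed)
    # Store each undirected edge in a canonical (sorted) orientation.
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)

    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Flip the second edge at random so both rewirings are reachable.
        if rng.random() < 0.5:
            c, d = d, c
        if a == d or c == b:
            continue  # proposal would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # proposal would create a parallel edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(new1)
        edge_set.add(new2)
        edge_list[i] = new1
        edge_list[j] = new2

    return edge_list


if __name__ == "__main__":
    original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
    randomized = swap_randomize(original, n_swaps=100, seed=1)
    print(sorted(randomized))
    print(degrees(randomized) == degrees(original))
```

In a significance test, one would draw many such randomized graphs, run the mining algorithm on each, and compare the statistic obtained on the input graph against the resulting empirical distribution.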