PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Mining frequent closed graphs on evolving data streams
Albert Bifet, Geoff Holmes, Bernhard Pfahringer and Ricard Gavaldà
In: KDD 2011, August 21-24, 2011, San Diego, USA.

Abstract

Graph mining is a challenging task by itself, and even more so when processing data streams that evolve in real time. Data stream mining faces hard constraints on processing time and space, and must also provide for concept drift detection. In this paper we present a framework for studying graph pattern mining on time-varying streams. Three new methods for mining frequent closed subgraphs are presented. All methods work on coresets of closed subgraphs, which are compressed representations of graph sets, and maintain these sets in a batch-incremental manner, but use different approaches to address potential concept drift. An evaluation study on datasets comprising up to four million graphs explores the strengths and limitations of the proposed methods. To the best of our knowledge, this is the first work on mining frequent closed subgraphs in non-stationary data streams.

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Theory & Algorithms
ID Code: 8649
Deposited By: Albert Bifet
Deposited On: 17 February 2012