Discovery of Serial Episodes from Streams of Events
In: SSDBM 2004, 21-23 Jun 2004, Santorini Island, Greece.
A very important problem in data mining is finding patterns from sequential data. There is a vast number of sources for sequential data such as biological sequences, text documents, telecommunication alarm sequences, click streams, market basket data, web logs, and other time series. One of the most popular patterns mined from sequential data are the episodes, i.e., directed acyclic graphs with labeled nodes. An important subclass of episodes are the serial episodes, which are essentially sequences. Serial episodes are useful in many applications, including network monitoring and molecular biology.
Currently, there are many situations were so much sequential data is produced that it cannot even be stored without great difficulties. That kind of sequential sources are called data streams. In this paper we focus on finding serial episodes from data streams. To the best of our knowledge the problem of mining serial episodes from data streams has been studied in depth only for length-1 episodes.