Artemis: Assessing the Similarity of Event-Interval Sequences
In several application domains, such as sign language, medi- cine, and sensor networks, events are not necessarily instantaneous but they can have a time duration. Sequences of interval-based events may contain useful domain knowledge; thus, searching, indexing, and mining such sequences is crucial. We introduce two distance measures for com- paring sequences of interval-based events which can be used for several data mining tasks such as classification and clustering. The first measure maps each sequence of interval-based events to a set of vectors that hold information about all concurrent events. These sets are then compared using an existing dynamic programming method. The second method, called Artemis, finds correspondence between intervals by mapping the two sequences into a bipartite graph. Similarity is inferred by employing the Hungarian algorithm. In addition, we present a linear-time lower- bound for Artemis. The performance of both measures is tested on data from three domains: sign language, medicine, and sensor networks. Ex- periments show the superiority of Artemis in terms of robustness to high levels of artificially introduced noise.