PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A Subsequence Matching with Gaps-Range-Tolerances Framework: A Query-By-Humming Application
Alexios Kotsifakos, Panagiotis Papapetrou, Jaakko Hollmen and Dimitrios Gunopulos
Proceedings of the Very Large DataBases Endowment, Volume 4, Number 11, pp. 761-771, 2011.


We propose a novel subsequence matching framework that allows for gaps in both the query and target sequences, variable matching tolerance levels efficiently tuned for each query and target sequence, and also constrains the maximum match length. Using this framework, a space- and time-efficient dynamic programming method is developed: given a short query sequence and a large database, our method identifies the subsequence of the database that best matches the query, and further bounds the number of consecutive gaps in both sequences. In addition, it allows the user to constrain the minimum number of matching elements between a query and a database sequence. We show that the proposed method is highly applicable to music retrieval. Music pieces are represented by 2-dimensional time series, where each dimension holds information about the pitch and duration of each note, respectively. At runtime, the query song is transformed to the same 2-dimensional representation. We present an extensive experimental evaluation using synthetic and hummed queries on a large music database. Our method outperforms, in terms of accuracy, several DP-based subsequence matching methods (with the same time complexity) and a probabilistic model-based method.
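The kind of matching described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified dynamic program written for illustration, assuming notes are encoded as (pitch, duration) pairs, that tolerances are fixed absolute thresholds per dimension, and that only the number of consecutive skipped elements in each sequence is bounded. The maximum match length, the minimum number of matching elements, and the per-query tolerance tuning mentioned above are left out. All names (best_gapped_match, pitch_tol, dur_tol, max_gap_q, max_gap_t) are hypothetical.

```python
from typing import List, Tuple

Note = Tuple[float, float]  # assumed encoding: (pitch, duration)


def notes_match(a: Note, b: Note, pitch_tol: float, dur_tol: float) -> bool:
    """Two notes match if each dimension differs by at most its tolerance."""
    return abs(a[0] - b[0]) <= pitch_tol and abs(a[1] - b[1]) <= dur_tol


def best_gapped_match(
    query: List[Note],
    target: List[Note],
    pitch_tol: float,
    dur_tol: float,
    max_gap_q: int,
    max_gap_t: int,
) -> Tuple[int, int, int]:
    """Return (score, start, end): the number of matched notes in the best
    chain and the inclusive index range it spans in the target.
    Returns (0, -1, -1) if no notes match."""
    n, m = len(query), len(target)
    # score[i][j]: longest chain of matched pairs ending at (query[i], target[j])
    score = [[0] * m for _ in range(n)]
    # start[i][j]: target index where that chain begins
    start = [[-1] * m for _ in range(n)]
    best = (0, -1, -1)
    for i in range(n):
        for j in range(m):
            if not notes_match(query[i], target[j], pitch_tol, dur_tol):
                continue
            score[i][j], start[i][j] = 1, j
            # Extend an earlier matched pair, skipping at most max_gap_q
            # query notes and at most max_gap_t target notes in between.
            for pi in range(max(0, i - max_gap_q - 1), i):
                for pj in range(max(0, j - max_gap_t - 1), j):
                    if score[pi][pj] + 1 > score[i][j]:
                        score[i][j] = score[pi][pj] + 1
                        start[i][j] = start[pi][pj]
            if score[i][j] > best[0]:
                best = (score[i][j], start[i][j], j)
    return best


# Toy data: the target contains the query melody with one extra note inserted.
query = [(60, 1.0), (62, 1.0), (64, 2.0)]
target = [(50, 1.0), (60, 1.0), (70, 1.0), (62, 1.1), (64, 2.0), (55, 1.0)]
print(best_gapped_match(query, target, 0.5, 0.2, max_gap_q=0, max_gap_t=1))
# prints (3, 1, 4)
```

With max_gap_t=0 the inserted note at target index 2 can no longer be skipped, and the best chain in this toy example shrinks to the last two notes. The sketch runs in time proportional to the product of the two sequence lengths and the two gap bounds, which is acceptable for illustration but makes no claim about the complexity of the published method.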

EPrint Type: Article
Project Keyword: Project Keyword UNSPECIFIED
Theory & Algorithms
ID Code: 8921
Deposited By: Panagiotis Papapetrou
Deposited On: 21 February 2012