PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

State-space Dynamics for Clustering Sequential Data
Darío García-García, Emilio Parrado-Hernandez and Fernando Díaz-de-María
Pattern Recognition Volume 44, Number 5, pp. 1014-1022, 2011.

Abstract

This paper proposes a novel similarity measure for clustering sequential data. We first construct a common state-space by training a single probabilistic model on all the sequences, which yields a unified representation of the dataset. Distances are then computed from the transition matrices that each sequence induces in that state-space. This approach addresses some of the usual overfitting and scalability issues of existing semi-parametric techniques, which rely on training a separate model for each sequence. Empirical studies on both synthetic and real-world datasets illustrate the advantages of the proposed similarity measure for clustering sequences.
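
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sequence has already been mapped to hard state labels in the shared state-space (for example by decoding with the single model trained on all sequences), and it uses the Frobenius norm between transition matrices as a stand-in distance, since the abstract does not specify one. The function names, the smoothing constant and the toy sequences are hypothetical.

```python
import numpy as np

def transition_matrix(states, n_states, smoothing=1e-3):
    """Estimate a row-stochastic transition matrix from one state sequence.

    `states` holds integer state labels in the shared state-space
    (assumed to be given). `smoothing` is a small pseudo-count that
    keeps rows for unvisited states well defined.
    """
    counts = np.full((n_states, n_states), smoothing)
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def pairwise_distances(sequences, n_states):
    """Distance matrix between sequences, based on their transition matrices.

    The Frobenius norm is an assumption made for illustration only.
    """
    mats = [transition_matrix(s, n_states) for s in sequences]
    n = len(mats)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = np.linalg.norm(mats[i] - mats[j])
    return dist

# Toy data: two slowly switching sequences and one rapidly switching one.
seqs = [
    [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
]
D = pairwise_distances(seqs, n_states=2)
```

The resulting matrix `D` could be passed to any distance-based clustering method. Because only one model defines the state-space, each sequence contributes just a transition matrix rather than a separately trained model, which is the scalability argument the abstract makes.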

EPrint Type: Article
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 7552
Deposited By: Darío García-García
Deposited On: 17 March 2011