PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Cluster Characterization through a Representativity Measure
MJ Lesot and B Bouchon-Meunier
In: Flexible Queries Answering Systems, 24-26 June 2004, Lyon, France.


Clustering is an unsupervised learning task that decomposes a dataset into subgroups which summarize the initial data and give information about its structure. We propose to enrich this result with a numerical coefficient that describes the representativity of each cluster, indicating the extent to which it is characteristic of the whole dataset. The coefficient is defined for a specific clustering algorithm, the Outlier Preserving Clustering Algorithm (OPCA), which detects clusters associated with major trends as well as with marginal behaviors, in order to offer a complete description of the initial dataset. The proposed representativity measure exploits the iterative process of OPCA to compute the typicality of each identified cluster.

EPrint Type: Conference or Workshop Item (Talk)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 550
Deposited By: Marie-Jeanne Lesot
Deposited On: 25 December 2004