PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Clustering of contingency table and mixture model
Gérard Govaert and Mohamed Nadif
Global Optimization 2005.


Basing cluster analysis on mixture models has become a classical and powerful approach. It allows some classical criteria, such as the well-known k-means criterion, to be explained. To classify the rows or the columns of a contingency table, an adapted version of k-means known as Mndki2, which uses the chi-square distance, can be used. Unfortunately, this simple and effective method, which can be used jointly with correspondence analysis because both are based on the same representation of the data, cannot be associated with a mixture model in the same way as the classical k-means algorithm. In this paper we show that the Mndki2 algorithm can be viewed as an approximation of a classifying version of the EM algorithm for a mixture of multinomial distributions. The algorithms arising in this context are compared experimentally using Monte Carlo simulations.
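
To make the idea of k-means with the chi-square distance concrete, here is a minimal sketch that clusters the rows of a contingency table. It is an illustration under stated assumptions, not the authors' Mndki2 implementation: the function name chi2_kmeans, its parameters (including the hypothetical init_rows argument), and the example table are all invented here. It assumes every row and every column of the table has a positive total. Rows are represented by their profiles, weighted by their marginal frequencies, and compared with the chi-square distance, which divides each squared difference by the column marginal frequency.

```python
import numpy as np

def chi2_kmeans(table, n_clusters, init_rows=None, n_iter=50, seed=0):
    """Illustrative k-means on row profiles with the chi-square distance.

    Assumes all row and column totals of `table` are strictly positive.
    `init_rows` (hypothetical parameter) lists the rows used as initial centers;
    if omitted, rows are drawn at random.
    """
    table = np.asarray(table, dtype=float)
    freq = table / table.sum()            # relative frequencies f_ij
    row_w = freq.sum(axis=1)              # row margins f_i.
    col_w = freq.sum(axis=0)              # column margins f_.j
    profiles = freq / row_w[:, None]      # row profiles f_ij / f_i.

    if init_rows is None:
        rng = np.random.default_rng(seed)
        init_rows = rng.choice(len(profiles), n_clusters, replace=False)
    centers = profiles[np.asarray(init_rows)].copy()

    labels = np.full(len(profiles), -1)
    for _ in range(n_iter):
        # Chi-square distance from every profile to every center.
        diff = profiles[:, None, :] - centers[None, :, :]
        dist = (diff ** 2 / col_w).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                          # assignments are stable
        labels = new_labels
        # Each center is the weighted mean profile of its cluster
        # (weights are the row margins); empty clusters keep their center.
        for k in range(n_clusters):
            members = labels == k
            if members.any():
                centers[k] = freq[members].sum(axis=0) / row_w[members].sum()
    return labels, centers

# Made-up 4 x 4 table: the first two rows and the last two rows have similar profiles.
example = [[30, 10, 2, 1],
           [25, 12, 1, 2],
           [2, 1, 20, 30],
           [1, 3, 25, 22]]
labels, centers = chi2_kmeans(example, n_clusters=2, init_rows=[0, 2])
print(labels)
```

Like ordinary k-means, this sketch only reaches a local optimum that depends on the initial centers, so in practice it would be run from several starting points.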

EPrint Type: Article
Subjects: Computational, Information-Theoretic Learning with Statistics
ID Code: 1943
Deposited By: Gérard Govaert
Deposited On: 30 December 2005