PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

An EM algorithm for the Block Mixture Model of contingency table
Mohamed Nadif and Gérard Govaert
In: 3rd International Association for Statistical Computing (IASC) world conference on Computational Statistics & Data Analysis, 28-31 Oct 2005, Limassol, Cyprus.

Abstract

Although many clustering procedures aim to construct an optimal partition of objects or, sometimes, of variables, other methods, called block clustering methods, consider the two sets simultaneously and organize the data into homogeneous blocks. Recently, we proposed a new mixture model, called the block mixture model, which takes this situation into account. This model allows one to embed simultaneous clustering of objects and variables in a mixture approach. Setting this model in the maximum likelihood framework, we proposed an EM algorithm, called block EM, that uses a variational approximation. We studied its performance on binary data in both the estimation and clustering contexts. Methods of this kind have practical importance in a wide variety of applications, such as text and market basket data analysis. Typically, the data arising in these applications are arranged as a two-way contingency table. Recently, using Poisson distributions, we proposed a block mixture model for such data and, setting it under the classification maximum likelihood approach, we proposed a block CEM algorithm. In this work, we extend the block EM algorithm to this model. We evaluate its performance and compare it to block CEM and to a simple use of EM applied separately to the rows and columns of the contingency table. We present detailed experimental results on simulated and real data.
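
For illustration only, the alternating structure of a variational block EM on a contingency table can be sketched as below. This is a minimal sketch under stated assumptions, not the implementation evaluated in the talk: the abstract does not give the model equations, so the code assumes the common parameterization x_ij ~ Poisson(mu_i * nu_j * gamma_kl), with mu_i and nu_j fixed to the row and column totals. The function name `block_em_poisson`, its arguments, and the random initialization are all hypothetical.

```python
import numpy as np

def _normalize_log(log_w):
    # Turn unnormalized log-weights into row-wise probabilities (stable softmax).
    log_w = log_w - log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)

def block_em_poisson(x, g, m, n_iter=50, seed=0, eps=1e-10):
    """Sketch of a variational block EM for an ASSUMED Poisson block mixture.

    x : (n, d) table of counts; g, m : numbers of row and column clusters.
    Returns soft row memberships s (n, g), soft column memberships t (d, m),
    mixing proportions pi and rho, and block parameters gamma (g, m).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    mu, nu = x.sum(axis=1), x.sum(axis=0)  # margins used as fixed effects (assumption)
    s = rng.dirichlet(np.ones(g), size=n)  # random soft initialization (assumption)
    t = rng.dirichlet(np.ones(m), size=d)

    def m_step(s, t):
        pi, rho = s.mean(axis=0), t.mean(axis=0)
        num = s.T @ x @ t                   # expected counts per block
        den = np.outer(s.T @ mu, t.T @ nu)  # expected margin product per block
        gamma = np.maximum(num, eps) / np.maximum(den, eps)
        return pi, rho, gamma

    for _ in range(n_iter):
        # Row step: update row memberships with column memberships held fixed.
        pi, rho, gamma = m_step(s, t)
        log_s = (np.log(pi + eps) + (x @ t) @ np.log(gamma).T
                 - np.outer(mu, gamma @ (t.T @ nu)))
        s = _normalize_log(log_s)
        # Column step: update column memberships with row memberships held fixed.
        pi, rho, gamma = m_step(s, t)
        log_t = (np.log(rho + eps) + (x.T @ s) @ np.log(gamma)
                 - np.outer(nu, (s.T @ mu) @ gamma))
        t = _normalize_log(log_t)

    pi, rho, gamma = m_step(s, t)
    return s, t, pi, rho, gamma
```

Hard row and column partitions can be read off with `s.argmax(axis=1)` and `t.argmax(axis=1)`. Replacing the soft memberships by such hard assignments at every step would give a classification-style variant in the spirit of block CEM, again only as a sketch.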

EPrint Type: Conference or Workshop Item (Invited Talk)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
ID Code: 1953
Deposited By: Gérard Govaert
Deposited On: 30 December 2005