PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Statistical Models for Partial Membership
Katherine Heller, Sinead Williamson and Zoubin Ghahramani
In: ICML 2008, Helsinki, Finland (2008).

Abstract

We present a principled Bayesian framework for modeling partial memberships of data points to clusters. Unlike a standard mixture model, which assumes that each data point belongs to one and only one mixture component, or cluster, a partial membership model allows data points to have fractional membership in multiple clusters. Algorithms which assign data points partial memberships to clusters can be useful for tasks such as clustering genes based on microarray data (Gasch & Eisen, 2002). Our Bayesian Partial Membership Model (BPM) uses exponential family distributions to model each cluster, and a product of these distributions, with weighted parameters, to model each data point. Here the weights correspond to the degree to which the data point belongs to each cluster. All parameters in the BPM are continuous, so we can use Hybrid Monte Carlo to perform inference and learning. We discuss relationships between the BPM and Latent Dirichlet Allocation, Mixed Membership models, Exponential Family PCA, and fuzzy clustering. Lastly, we show some experimental results and discuss nonparametric extensions to our model.
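As a rough illustration of the "product of exponential family distributions with weighted parameters" described in the abstract, the sketch below works through the one-dimensional Gaussian case. For exponential family clusters, raising each cluster density to the power of its membership weight and multiplying gives a distribution of the same family whose natural parameters are the weighted sum of the cluster natural parameters. The function name `partial_membership_gaussian`, the choice of 1-D Gaussian clusters, and the requirement that weights be non-negative and sum to one are assumptions made for this example; it is not the authors' implementation and it does not cover the Hybrid Monte Carlo inference step.

```python
def partial_membership_gaussian(means, variances, weights):
    """Combine 1-D Gaussian clusters via a weighted product of densities.

    Hypothetical helper for illustration only. Each cluster k is N(mean_k, var_k)
    and weights[k] is the degree of membership of one data point in cluster k.
    Returns the (mean, variance) of the resulting Gaussian for that data point.
    """
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")

    # Natural parameters of N(mu, s2): eta1 = mu / s2, eta2 = -1 / (2 * s2).
    # The weighted product of densities has natural parameters equal to the
    # weighted sum of the cluster natural parameters.
    eta1 = sum(w * m / v for w, m, v in zip(weights, means, variances))
    eta2 = sum(w * (-0.5 / v) for w, v in zip(weights, variances))

    # Convert back from natural parameters to mean and variance.
    variance = -0.5 / eta2
    mean = eta1 * variance
    return mean, variance


# A point that belongs half to each of two unit-variance clusters sits midway.
mean, variance = partial_membership_gaussian([0.0, 4.0], [1.0, 1.0], [0.5, 0.5])
print(mean, variance)
```

With all weight on a single cluster the function returns that cluster's own mean and variance, which corresponds to the ordinary mixture-model case of full membership in one component.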

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
          Learning/Statistics & Optimisation
ID Code: 6735
Deposited By: Katherine Heller
Deposited On: 08 March 2010