PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Variational Inference for Nonparametric Multiple Clustering
Y Guan, J G Dy, D Niu and Zoubin Ghahramani
In: KDD10 Workshop on Discovering, Summarizing and Using Multiple Clusterings, 25-28 July 2010, Washington, USA.


Abstract: Most clustering algorithms produce a single clustering solution. Similarly, feature selection for clustering tries to find one feature subset in which one interesting clustering solution resides. However, a single data set may be multi-faceted and can be grouped and interpreted in many different ways, especially when the data are high-dimensional and feature selection is typically needed. Moreover, different clustering solutions are interesting for different purposes. Instead of committing to one clustering solution, in this paper we introduce a probabilistic nonparametric Bayesian model that simultaneously discovers several possible clustering solutions and the feature subset (view) that generates each cluster partitioning. We provide a variational inference approach to learn the features and clustering partitions in each view. Our model allows us to learn not only the multiple clusterings and views but also, automatically, the number of views and the number of clusters in each view.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 7799
Deposited By: Zoubin Ghahramani
Deposited On: 17 March 2011