PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Tree-Based Inference for Dirichlet Process Mixtures
Yang Xu, Katherine Heller and Zoubin Ghahramani
In: AISTATS 2009, Florida, USA (2009).

Abstract

The Dirichlet process mixture (DPM) is a widely used model for clustering and for general nonparametric Bayesian density estimation. Unfortunately, as in many statistical models, exact inference in a DPM is intractable, and approximate methods are needed to perform efficient inference. While most attention in the literature has been placed on Markov chain Monte Carlo (MCMC) [1, 2, 3], variational Bayesian (VB) [4] and collapsed variational methods [5], [6] recently introduced a novel class of approximation for DPMs based on Bayesian hierarchical clustering (BHC). These tree-based combinatorial approximations efficiently sum over exponentially many ways of partitioning the data and offer a novel lower bound on the marginal likelihood of the DPM [6]. In this paper we make the following contributions: (1) We show empirically that the BHC lower bounds are substantially tighter than the bounds given by VB [4] and by collapsed variational methods [5] on synthetic and real datasets. (2) We also show that BHC offers more accurate predictive performance on these datasets. (3) We further improve the tree-based lower bounds with an algorithm that efficiently sums contributions from alternative trees. (4) We present a fast approximate method for BHC. Our results suggest that our combinatorial approximate inference methods and lower bounds may be useful not only in DPMs but in other models as well.
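
To make the idea of a tree-based lower bound concrete, the following is a minimal brute-force sketch, not the BHC algorithm of the paper. It assumes a toy one-dimensional Gaussian likelihood with known variance and a Gaussian prior on each cluster mean, a Chinese restaurant process prior over partitions with concentration alpha, and a hand-picked binary tree. All function names, the data, and the parameter values are hypothetical. Because every term in the sum over partitions is non-negative, restricting the sum to the partitions consistent with a tree can only decrease it, which gives a lower bound on the exact marginal likelihood.

```python
import math


def log_cluster_marginal(xs, sigma2=1.0, tau2=4.0):
    """Log marginal likelihood of one cluster: x_i ~ N(mu, sigma2), mu ~ N(0, tau2)."""
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    quad = (ss - tau2 / (sigma2 + n * tau2) * s * s) / sigma2
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - 0.5 * math.log(1 + n * tau2 / sigma2)
            - 0.5 * quad)


def log_crp_prior(partition, alpha):
    """Log probability of a partition under a Chinese restaurant process."""
    n = sum(len(block) for block in partition)
    lp = len(partition) * math.log(alpha)
    lp += sum(math.lgamma(len(block)) for block in partition)
    lp -= sum(math.log(alpha + i) for i in range(n))
    return lp


def all_partitions(items):
    """Yield every set partition of a list (feasible only for tiny inputs)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in all_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part


def leaves(tree):
    """Leaves of a binary tree written as nested 2-tuples."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


def tree_partitions(tree):
    """Yield the partitions consistent with the tree: merge all, or split at the root."""
    yield [leaves(tree)]
    if isinstance(tree, tuple):
        for left in tree_partitions(tree[0]):
            for right in tree_partitions(tree[1]):
                yield left + right


def log_marginal(data, partitions, alpha=1.0):
    """Log of the sum over the given partitions of prior times likelihood."""
    terms = []
    for part in partitions:
        lp = log_crp_prior(part, alpha)
        lp += sum(log_cluster_marginal([data[i] for i in block]) for block in part)
        terms.append(lp)
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Hypothetical toy data and tree over the data indices.
data = [-2.1, -1.9, 0.1, 2.0, 2.2]
tree = (((0, 1), 2), (3, 4))

exact = log_marginal(data, list(all_partitions(list(range(len(data))))))
lower = log_marginal(data, list(tree_partitions(tree)))
print("exact log marginal:     ", exact)
print("tree-based lower bound: ", lower)
```

The exhaustive enumeration here is only for checking the bound on a handful of points. The abstract's claim is that the tree-based approximation sums over its exponentially many partitions efficiently, which this sketch does not attempt to reproduce.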

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Computational, Information-Theoretic Learning with Statistics
Learning/Statistics & Optimisation
ID Code: 6739
Deposited By: Katherine Heller
Deposited On: 08 March 2010