Large Scale Nonparametric Bayesian Inference: Data Parallelisation in the Indian Buffet Process
Finale Doshi, Shakir Mohamed, David Knowles and Zoubin Ghahramani
In: Neural Information Processing Systems 22, 7-12 Dec 2009, Vancouver, Canada.
Nonparametric Bayesian models provide a framework for flexible probabilistic modelling of complex datasets. Unfortunately, the high-dimensional averages required for Bayesian methods can be slow to compute, especially with the unbounded representations used by nonparametric models. We address the challenge of scaling Bayesian inference to the increasingly large datasets found in real-world applications. We focus on parallelisation of inference in the Indian Buffet Process (IBP), which allows data points to have an unbounded number of sparse latent features. Our novel MCMC sampler divides a large dataset between multiple processors and uses message passing to compute the global likelihoods and posteriors. This algorithm, the first parallel inference scheme for IBP-based models, scales to datasets orders of magnitude larger than has previously been possible.
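The data-parallel idea in the abstract can be illustrated with a minimal sketch. It assumes a linear-Gaussian likelihood (X ≈ ZA plus noise, with Z a binary feature-assignment matrix), which the abstract does not specify. Under that assumption, the global quantities needed for the feature posterior depend on the data only through the sums Z^T Z and Z^T X, so each processor can compute these on its own shard and pass them as a message. The function names `shard_stats`, `add_stats` and `split` are hypothetical, the "processors" are simulated in a single process, and the sketch covers only the aggregation step, not the sampler itself.

```python
from functools import reduce


def shard_stats(Z, X):
    """Local sufficient statistics (Z^T Z, Z^T X) for one shard of rows."""
    K, D = len(Z[0]), len(X[0])
    ztz = [[sum(z[j] * z[k] for z in Z) for k in range(K)] for j in range(K)]
    ztx = [[sum(z[k] * x[d] for z, x in zip(Z, X)) for d in range(D)]
           for k in range(K)]
    return ztz, ztx


def add_stats(a, b):
    """Combine two messages by elementwise addition of their matrices."""
    return tuple(
        [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(m_a, m_b)]
        for m_a, m_b in zip(a, b)
    )


def split(rows, n_shards):
    """Assign rows to shards round-robin (an arbitrary choice for this sketch)."""
    return [rows[i::n_shards] for i in range(n_shards)]


# Toy data: 4 data points, 2 latent features, 2 observed dimensions.
Z = [[1, 0], [1, 1], [0, 1], [1, 0]]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# Each simulated processor summarises its own shard ...
messages = [shard_stats(z, x) for z, x in zip(split(Z, 2), split(X, 2))]
# ... and the messages are summed to recover the global statistics.
global_stats = reduce(add_stats, messages)
```

Because the statistics are additive over data points, the summed messages match what a single processor would compute on the full dataset, and the message size depends on the number of features and dimensions rather than on the number of data points.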