Bayesian CCA via Group Sparsity
Seppo Virtanen, Arto Klami and Samuel Kaski
In: The 28th International Conference on Machine Learning, 28 Jun - 02 Jul 2011, Bellevue, US.
Bayesian treatments of Canonical Correlation Analysis (CCA)-type latent variable models have recently been proposed for coping with overfitting at small sample sizes, as well as for factorizing the data sources into correlated and non-shared effects. However, all current implementations of Bayesian CCA and its extensions are computationally inefficient for high-dimensional data and, as shown in this paper, break down completely for high-dimensional sources with a low sample count. Furthermore, they cannot reliably separate the correlated effects from the non-shared ones. We propose a new Bayesian CCA variant that is computationally efficient and works for high-dimensional data, while also learning the factorization more accurately. The improvements are gained by introducing a group sparsity assumption and an improved variational approximation. The method is demonstrated to work well on multi-label prediction tasks and in analyzing brain correlates of naturalistic audio stimulation.
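The abstract does not spell out the model, so the following is only an illustrative sketch of what a group sparsity assumption can look like in a two-view linear latent variable model. It assumes that each latent component has one block of loadings per data source and that an entire block is either active or zero; a component active in both sources then models a correlated effect, and a component active in one source models a non-shared effect. The function names, the binary mask, and all parameter values are hypothetical, and the sketch only generates data. It does not implement the paper's prior or its variational inference.

```python
import numpy as np

def sample_group_sparse_cca(n=50, dims=(20, 30),
                            active=((1, 1), (1, 0), (0, 1)),
                            noise_std=0.1, seed=0):
    """Draw views x_m = z @ W_m + noise with group-wise sparse loadings.

    active[k][m] == 1 lets latent component k load on view m; a 0 zeroes
    the whole block of loadings for that (component, view) pair.
    This hard binary mask is an assumption made for illustration.
    """
    rng = np.random.default_rng(seed)
    mask = np.asarray(active, dtype=float)       # shape (K, M)
    n_components = mask.shape[0]
    z = rng.standard_normal((n, n_components))   # latent variables shared by all views
    views, loadings = [], []
    for m, d in enumerate(dims):
        # Multiply each component's row of loadings by its on/off flag for view m.
        W = rng.standard_normal((n_components, d)) * mask[:, [m]]
        x = z @ W + noise_std * rng.standard_normal((n, d))
        views.append(x)
        loadings.append(W)
    return views, loadings, mask

def component_type(mask_row):
    """Label a component by the number of views in which it is active."""
    n_active = int(np.sum(mask_row))
    if n_active == 0:
        return "off"
    return "shared" if n_active == len(mask_row) else "view-specific"

views, loadings, mask = sample_group_sparse_cca()
labels = [component_type(row) for row in mask]
```

With the default mask, the first component is shared by both views and the other two are each specific to one view. Recovering such an activity pattern from data, rather than fixing it in advance, is the kind of factorization the abstract refers to.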