Optimally-Weighted Herding is Bayesian Quadrature
Herding and kernel herding are deterministic methods of choosing samples which summarise a probability distribution. A related task is choosing samples for estimating integrals using Bayesian quadrature. We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature. We then show that sequential Bayesian quadrature can be viewed as a weighted version of kernel herding which achieves performance superior to any other weighted herding method. We demonstrate empirically a rate of convergence faster than O(1/N). Our results also imply an upper bound on the empirical error of the Bayesian quadrature estimate.
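The relationship described above can be illustrated with a small sketch: greedy kernel herding selects samples by matching the kernel mean of the target distribution, and Bayesian quadrature then attaches weights w = K⁻¹z to those samples, where z holds the kernel mean evaluated at each sample. All concrete choices below (RBF kernel, length-scale, standard-normal target, the empirical approximation of the kernel mean) are our own illustrative assumptions, not prescribed by the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target distribution p: standard normal. We approximate its kernel mean
# map with a large sample pool (an assumption of this sketch).
pool = rng.standard_normal(2000)

def k(a, b, ell=1.0):
    """RBF kernel between two 1-D point sets (length-scale is illustrative)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

# Empirical kernel mean mu_p(x) = E_{x'~p}[k(x, x')] on a candidate grid.
candidates = np.linspace(-3.0, 3.0, 301)
mu_p = k(candidates, pool).mean(axis=1)

# Greedy kernel herding: each new sample maximises the kernel mean minus
# a penalty for similarity to the samples already chosen.
samples = []
for n in range(10):
    if samples:
        penalty = k(candidates, np.array(samples)).sum(axis=1) / (n + 1)
    else:
        penalty = 0.0
    samples.append(candidates[np.argmax(mu_p - penalty)])
samples = np.array(samples)

# Bayesian-quadrature weights: w = K^{-1} z, with z_i = E_p[k(x_i, x)].
K = k(samples, samples) + 1e-8 * np.eye(len(samples))
z = k(samples, pool).mean(axis=1)
w = np.linalg.solve(K, z)

# Weighted estimate of E_p[f] for f(x) = x^2 (true value 1 under N(0,1)).
est = w @ samples**2
print(est)
```

Replacing the uniform 1/N herding weights with the solved BQ weights w is exactly the "optimally-weighted" step the abstract refers to: the samples are the same, but the weights minimise the posterior variance of the integral estimate.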