PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Computing the NML for Bayesian Forests via Matrices and Generating Polynomials
Tommi Mononen and Petri Myllymäki
In: 2008 IEEE Information Theory Workshop, 5-9 May 2008, Porto, Portugal.


Minimum Description Length (MDL) is an information-theoretic principle that can be used for model selection and other statistical inference tasks. One way to implement this principle in practice is to compute the Normalized Maximum Likelihood (NML) distribution for a given parametric model class. Unfortunately, this is computationally infeasible for many model classes of practical importance. In this paper we present a fast algorithm for computing the NML for the model class of Bayesian forests, which are graphical dependency models for multi-dimensional domains with the constraint that each node (variable) has at most one predecessor. The resulting algorithm has time complexity O(n^(2K+L-3)), where n is the number of data vectors, and K and L are the maximal numbers of values (alphabet sizes) of different types of variables in the model.
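To make the NML definition concrete, the following minimal sketch computes it for the simplest possible case, a single binary (Bernoulli) variable, where the NML probability of a sequence is its maximized likelihood divided by the sum of maximized likelihoods over all sequences of the same length. This is not the Bayesian forest algorithm of the paper; the function names are hypothetical and chosen only for illustration. The brute-force sum over all 2^n sequences shows why the normalizer is expensive in general, and the grouped-by-counts version shows how exploiting structure reduces the cost.

```python
from itertools import product
from math import comb


def max_likelihood(seq):
    """Maximized Bernoulli likelihood of a non-empty binary sequence."""
    n = len(seq)
    h = sum(seq)  # number of ones
    # Python evaluates 0.0 ** 0 as 1.0, matching the usual 0^0 = 1 convention.
    return (h / n) ** h * ((n - h) / n) ** (n - h)


def normalizer_brute_force(n):
    """Sum of maximized likelihoods over all 2^n binary sequences (n >= 1)."""
    return sum(max_likelihood(s) for s in product((0, 1), repeat=n))


def normalizer_by_counts(n):
    """Same normalizer, grouping sequences by their number of ones (n >= 1)."""
    return sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )


def nml_probability(seq):
    """NML probability of a binary sequence under the Bernoulli model class."""
    return max_likelihood(seq) / normalizer_by_counts(len(seq))


if __name__ == "__main__":
    print(normalizer_brute_force(8), normalizer_by_counts(8))
    print(nml_probability((1, 1, 0, 1, 0)))
```

The two normalizer functions return the same value, but the first takes time exponential in n while the second is linear in n. For richer model classes such as the forests described above, finding a comparable shortcut is the hard part.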

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
ID Code: 4185
Deposited By: Tommi Mononen
Deposited On: 23 October 2008