PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Monte Carlo estimation of minimax regret with an application to MDL model selection
Teemu Roos
In: IEEE Information Theory Workshop 2008, 5-9 May 2008, Porto, Portugal.


Minimum description length (MDL) model selection, in its modern NML formulation, involves a model complexity term which is equivalent to minimax/maximin regret. The term is a logarithm of a sum of maximized likelihoods over all possible data-sets. Because the sum has an exponential number of terms, its evaluation is in many cases intractable. In the continuous case, the sum is replaced by an integral for which a closed form is available in only a few cases. We present an approach based on Monte Carlo sampling, which works for all model classes, and gives strongly consistent estimators of the minimax regret. The estimates convergence almost surely to the correct value with increasing number of iterations. For the important class of Markov models, one of the presented estimators is particularly efficient: in empirical experiments, accuracy that is sufficient for model selection is usually achieved already on the first iteration, even for long sequences.

PDF - Requires Adobe Acrobat Reader or other PDF viewer.
EPrint Type:Conference or Workshop Item (Paper)
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Computational, Information-Theoretic Learning with Statistics
ID Code:3927
Deposited By:Teemu Roos
Deposited On:25 February 2008