PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Computing the regret table for multinomial data
Petri Kontkanen and Petri Myllymäki
(2005) Technical Report. HIIT, Helsinki, Finland.


Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as model selection or data clustering. In the case of multinomial data, computing the modern version of stochastic complexity, defined as the Normalized Maximum Likelihood (NML) criterion, requires computing a sum with an exponential number of terms. Furthermore, in order to apply NML in practice, one often needs to compute a whole table of these exponential sums. In our previous work, we were able to compute this table by a recursive algorithm. The purpose of this paper is to significantly improve the time complexity of this algorithm. The techniques used here are based on the discrete Fourier transform and the convolution theorem.
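
As an illustration only, and not the algorithm of the report itself, the following Python sketch shows why a convolution is a natural fit for this problem. It assumes the standard form of the multinomial NML normalizing sum for K outcome values and n data points, C(K, n) = sum over counts h_1 + ... + h_K = n of n!/(h_1! ... h_K!) * prod_k (h_k/n)^(h_k), which is not stated on this page. Under that assumption the scaled sequence a_K(n) = (n^n / n!) * C(K, n) satisfies a_(K+1) = a_K * a_1, where * denotes discrete convolution. The function names regret_sum_bruteforce and regret_table are hypothetical, and the convolution is evaluated directly here; an FFT-based convolution could replace the inner sum.

```python
from itertools import product
from math import factorial


def term(r):
    """Scaling factor r^r / r!, using Python's convention 0**0 == 1."""
    return (r ** r) / factorial(r)


def regret_sum_bruteforce(K, n):
    """Assumed multinomial NML sum, enumerated over all count vectors.

    Exponential in K; intended only as a reference for tiny inputs.
    """
    if n == 0:
        return 1.0
    total = 0.0
    for counts in product(range(n + 1), repeat=K):
        if sum(counts) != n:
            continue
        coef = factorial(n)
        prob = 1.0
        for h in counts:
            coef //= factorial(h)      # builds the multinomial coefficient
            prob *= (h / n) ** h       # maximized likelihood factor
        total += coef * prob
    return total


def regret_table(K_max, n_max):
    """Return table[K][n] for K = 1..K_max and n = 0..n_max.

    Uses the convolution a_(K+1)(n) = sum_r a_K(r) * a_1(n - r),
    evaluated directly (quadratic in n_max per value of K).
    """
    base = [term(r) for r in range(n_max + 1)]   # a_1, since C(1, r) = 1
    scaled = base[:]
    table = {1: [1.0] * (n_max + 1)}
    for K in range(2, K_max + 1):
        scaled = [
            sum(scaled[r] * base[n - r] for r in range(n + 1))
            for n in range(n_max + 1)
        ]
        # Undo the scaling to recover C(K, n) = a_K(n) * n! / n^n.
        table[K] = [scaled[n] / base[n] for n in range(n_max + 1)]
    return table
```

For small inputs the two routines can be checked against each other, for example regret_table(3, 5)[3][5] against regret_sum_bruteforce(3, 5). Floating-point arithmetic is used throughout, so for large n a log-domain or arbitrary-precision variant would likely be needed.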

EPrint Type: Monograph (Technical Report)
Subjects: Computational, Information-Theoretic Learning with Statistics
ID Code: 1813
Deposited By: Petri Myllymäki
Deposited On: 28 November 2005