PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

MDL Histogram Density Estimation
Petri Kontkanen and Petri Myllymäki
In: Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS 2007), 21-24 Mar 2007, San Juan, Puerto Rico.


We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied to tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this framework can be applied to learning generic, irregular (variable-width bin) histograms, and how to compute the NML model selection criterion efficiently. We also derive a dynamic programming algorithm that finds both the MDL-optimal bin count and the cut point locations in polynomial time. Finally, we demonstrate our approach via simulation tests.
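
The dynamic programming idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes candidate cut points on an equal-width grid, scores a K-bin histogram as negative log-likelihood plus the log of the multinomial NML normalizer plus a log-binomial term for the choice of cut points, and computes the normalizer with a simple recurrence. The names mdl_histogram, multinomial_complexity, grid and max_bins are hypothetical and chosen for illustration only.

```python
import math


def multinomial_complexity(n, max_k):
    """Return c where c[k] is the multinomial NML normalizer for k bins, n points.

    Assumption: uses the recurrence c[k+2] = c[k+1] + (n / k) * c[k],
    with c[1] = 1 and c[2] computed by direct summation in log space.
    """
    c2 = 0.0
    for h in range(n + 1):
        log_term = math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
        if h > 0:
            log_term += h * math.log(h / n)
        if h < n:
            log_term += (n - h) * math.log((n - h) / n)
        c2 += math.exp(log_term)
    c = [0.0, 1.0, c2]  # c[0] is unused padding
    for k in range(1, max_k - 1):
        c.append(c[k + 1] + n / k * c[k])
    return c[: max_k + 1]


def mdl_histogram(data, lo, hi, grid=20, max_bins=10):
    """Pick bin count and bin edges minimizing an MDL-style score (sketch)."""
    n = len(data)
    cell = (hi - lo) / grid
    counts = [0] * grid
    for x in data:
        counts[min(int((x - lo) / cell), grid - 1)] += 1
    prefix = [0]
    for cnt in counts:
        prefix.append(prefix[-1] + cnt)

    def seg_ll(i, j):
        # Log-likelihood contribution of one bin covering grid cells i..j-1.
        m = prefix[j] - prefix[i]
        return 0.0 if m == 0 else m * math.log(m / (n * (j - i) * cell))

    neg_inf = float("-inf")
    max_bins = min(max_bins, grid)
    # best[k][j]: max log-likelihood of the first j cells split into k bins.
    best = [[neg_inf] * (grid + 1) for _ in range(max_bins + 1)]
    back = [[0] * (grid + 1) for _ in range(max_bins + 1)]
    best[0][0] = 0.0
    for k in range(1, max_bins + 1):
        for j in range(k, grid + 1):
            for i in range(k - 1, j):
                if best[k - 1][i] == neg_inf:
                    continue
                value = best[k - 1][i] + seg_ll(i, j)
                if value > best[k][j]:
                    best[k][j] = value
                    back[k][j] = i

    comp = multinomial_complexity(n, max_bins)
    scores = {
        k: -best[k][grid] + math.log(comp[k]) + math.log(math.comb(grid - 1, k - 1))
        for k in range(1, max_bins + 1)
    }
    k_best = min(scores, key=scores.get)

    # Backtrack to recover the left edge of every bin, then append hi.
    edges = []
    j = grid
    for k in range(k_best, 0, -1):
        j = back[k][j]
        edges.append(lo + j * cell)
    return k_best, sorted(edges) + [hi]


if __name__ == "__main__":
    sample = [0.05] * 100 + [0.95] * 100
    print(mdl_histogram(sample, 0.0, 1.0, grid=10, max_bins=5))
```

The triple loop costs O(max_bins * grid^2) evaluations of seg_ll, which is polynomial in the number of candidate cut points; a finer grid trades running time for resolution.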

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation
ID Code: 2983
Deposited By: Petri Kontkanen
Deposited On: 19 April 2007