MDL Histogram Density Estimation
Petri Kontkanen and Petri Myllymäki
In: Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS 2007), 21-24 Mar 2007, San Juan, Puerto Rico.
We regard histogram density estimation as a model selection problem.
Our approach is based on the information-theoretic minimum description
length (MDL) principle, which can be applied to tasks
such as data clustering, density estimation, image denoising and
model selection in general. MDL-based model selection is formalized via
the normalized maximum likelihood (NML) distribution, which has
several desirable optimality properties.
We show how this framework can be applied to learning generic,
irregular (variable-width bin) histograms, and how to compute the
NML model selection criterion efficiently. We also derive a dynamic
programming algorithm for finding both the MDL-optimal bin count and
the cut point locations in polynomial time. Finally, we demonstrate our
approach via simulation tests.
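To make the idea of searching bin counts and cut points by dynamic programming concrete, here is a minimal Python sketch. It is not the paper's algorithm or its exact criterion: it assumes a user-supplied finite list of candidate cut points, and it scores a K-bin histogram by its negative log-likelihood plus the log of the multinomial NML normalizer, which is a simplification. The function names `multinomial_regret` and `mdl_histogram` and their parameters are hypothetical.

```python
import bisect
import math


def multinomial_regret(n, max_k):
    """Return [C(1, n), ..., C(max_k, n)], the multinomial NML normalizers.

    Uses direct summation for C(2, n) and the known recurrence
    C(k, n) = C(k-1, n) + n / (k-2) * C(k-2, n). Intended for small n >= 1.
    """
    c = [1.0]  # C(1, n) = 1
    if max_k >= 2:
        c2 = 0.0
        for h in range(n + 1):
            c2 += math.comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        c.append(c2)
    for k in range(3, max_k + 1):
        c.append(c[-1] + n / (k - 2) * c[-2])
    return c


def mdl_histogram(data, cuts, max_bins):
    """Pick a bin count and cut points from `cuts` (sorted candidates).

    Assumption: cuts[0] < min(data) and cuts[-1] >= max(data).
    Returns (number_of_bins, list_of_chosen_cut_points_including_ends).
    """
    xs = sorted(data)
    n, m = len(xs), len(cuts) - 1
    # cum[j] = number of data points <= cuts[j]
    cum = [bisect.bisect_right(xs, c) for c in cuts]

    def cost(i, j):
        # Negative log-likelihood of the bin (cuts[i], cuts[j]].
        h = cum[j] - cum[i]
        if h == 0:
            return 0.0
        return -h * math.log(h / (n * (cuts[j] - cuts[i])))

    # best[k][j]: minimal cost of covering (cuts[0], cuts[j]] with k bins.
    best = [[math.inf] * (m + 1) for _ in range(max_bins + 1)]
    back = [[0] * (m + 1) for _ in range(max_bins + 1)]
    best[0][0] = 0.0
    for k in range(1, max_bins + 1):
        for j in range(k, m + 1):
            for i in range(k - 1, j):
                v = best[k - 1][i] + cost(i, j)
                if v < best[k][j]:
                    best[k][j], back[k][j] = v, i

    # Simplified score: fit term plus log of the multinomial normalizer.
    regret = multinomial_regret(n, max_bins)
    scores = {k: best[k][m] + math.log(regret[k - 1])
              for k in range(1, min(max_bins, m) + 1)}
    k_best = min(scores, key=scores.get)

    # Walk the back-pointers to recover the chosen cut points.
    chosen, j = [], m
    for k in range(k_best, 0, -1):
        chosen.append(cuts[j])
        j = back[k][j]
    chosen.append(cuts[0])
    return k_best, chosen[::-1]
```

With m candidate intervals and at most K bins, the triple loop runs in O(K m^2) time, which illustrates why restricting cut points to a finite candidate set makes a polynomial-time search possible.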