PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Algorithmic statistics and Kolmogorov's Structure Functions
Paul M.B. Vitanyi
In: P.M.B. Vitanyi, Algorithmic statistics and Kolmogorov's Structure Functions. In: Advances in Minimum Description Length: Theory and Applications (2004) MIT Press.


Naively speaking, statistics deals with gathering data, ordering and representing data, and using the data to determine the process that causes the data. That this viewpoint is a little too simplistic is immediately clear: suppose that the true cause of a sequence of outcomes of coin flips is a "fair" coin, where both sides come up with equal probability. It is possible that the sequence consists of "heads" only. Suppose that our statistical inference method succeeds in identifying the true cause (fair coin flips) from these data. Such a method is clearly at fault: from an all-heads sequence a good inference should conclude that the cause is a coin with a heavy bias toward "heads", irrespective of what the true cause is. That is, a good inference method must assume that the data is "typical" for the cause; we do not aim at finding the "true" cause, but at finding a cause for which the data is as "typical" as possible. Such a cause is called a model for the data. But what if the data consists of a sequence of precise alternations "head-tail"? This is as unlikely an outcome for a fair coin flip as the all-heads sequence. Yet, within the coin-type models we have no alternative but to choose a fair coin, even though we know very well that the true cause must be different. For some data it may not even make sense to ask for a "true cause". This suggests that truth is not our goal; rather, within given constraints on the model class, we try to find the model for which the data is most "typical" in an appropriate sense, the model that best "fits" the data. Considering the available model class as a magnifying glass, finding the best-fitting model for the data corresponds to finding the position of the magnifying glass that best brings the object into focus.
In the coin-flipping example we presented, it is possible that the data have no sharply focused model, but within the allowed resolution (here ignoring the order of the outcomes and only counting the number of "heads" in the total) we find the best model.
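
The coin example can be made concrete with a small sketch. It is not the construction used in the chapter: typicality in the algorithmic sense rests on Kolmogorov complexity, which is not computable, so the sketch substitutes the maximum-likelihood bias as a stand-in for "the coin for which the data is most typical". The function names `best_coin_bias` and `code_length_bits` and the sequence length of 20 are assumptions made for illustration.

```python
import math


def best_coin_bias(flips):
    """Maximum-likelihood probability of heads for a string of 'H'/'T'.

    Only the number of heads is used; the order of outcomes is ignored,
    which is the limited "resolution" of the coin model class.
    """
    return flips.count("H") / len(flips)


def code_length_bits(flips, p):
    """Negative log2-probability of this exact sequence under a coin with bias p."""
    heads = flips.count("H")
    tails = len(flips) - heads
    bits = 0.0
    for count, prob in ((heads, p), (tails, 1.0 - p)):
        if count:
            if prob == 0.0:
                # The coin assigns probability zero to an outcome that occurred.
                return math.inf
            bits -= count * math.log2(prob)
    return bits


all_heads = "H" * 20      # hypothetical data: heads only
alternating = "HT" * 10   # hypothetical data: precise alternation

for name, flips in (("all heads", all_heads), ("alternating", alternating)):
    p = best_coin_bias(flips)
    print(f"{name}: best bias = {p}, "
          f"bits under best coin = {code_length_bits(flips, p)}, "
          f"bits under fair coin = {code_length_bits(flips, 0.5)}")
```

For the all-heads data the best coin is fully biased and describes the sequence in 0 bits instead of the 20 bits a fair coin needs. For the alternating data the best coin in the class is the fair one, which still needs 20 bits: the regularity lies in the order of the outcomes, which no coin model can see.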

EPrint Type: Book Section
Subjects: Computational, Information-Theoretic Learning with Statistics
Learning/Statistics & Optimisation
ID Code: 814
Deposited By: Paul Vitányi
Deposited On: 01 January 2005