PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Discovering the truth by conducting experiments
Wouter Koolen
(2006) Master's thesis, University of Amsterdam and CWI.

Abstract

Paul Vitanyi's 2003 Kolmogorov complexity lecture included a computer exercise in which a polynomial relation had to be learnt from samples. The following data were provided: a sequence of pairs of numbers (h_1, d_1), (h_2, d_2), ..., (h_n, d_n), supposedly noisy measurements of a classical urn, h_i being the height from the floor and d_i being the diameter of the urn at the height h_i. The goal was to infer a polynomial that represented the relation between height and diameter. For a given degree, this can easily be done using linear algebra. The crux of the exercise was finding the best degree. To me, learning from given data is only part of a more general concept of learning, and I started to wonder whether the techniques that I learnt during my studies could be adapted to an interactive setting, allowing the learner to perform experiments. For example, when learning polynomials, the learner could be allowed to choose a point, and she would then receive the value of the polynomial at that point. For this thesis, I started working on the interactive polynomial learning problem, but it turned out to be much too hard. I then devised the balance scale problem, a toy problem that conserves the important features of the polynomial learning problem: it is interactive, probabilistic, model-based, but finite. I had by then developed a slight aversion to subjective Bayesian methods, for my initial work on the polynomial learning problem suggested that they are not robust. It seemed that a subjective Bayesian learner can be tricked into assigning high posterior probability to a certain proposition while this proposition is false, and additionally, great confidence in this proposition leads to great confidence in the usefulness of experiments that in fact do not help to determine that this proposition is false. With this in mind, I decided to perform a worst-case analysis of the balance scale problem, and of similar problems in general.
This problem naturally decomposed into the truth-finding problem, where we want to find the true model from given data, and the experiment-design problem, where experiments have to be selected, whose outcomes subsequently serve as the data for truth finding. I have yet to solve the balance scale problem completely. But I have already learned and discovered much more than I could initially imagine. I hope that this thesis will provide inspiration to others.
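
As an illustration of the remark that, for a given degree, the polynomial can be found with linear algebra, the following is a minimal sketch of an ordinary least-squares fit. It is not taken from the thesis or from the lecture exercise: the function name fit_polynomial, the use of the normal equations, and the sample data are all assumptions made for this example. The sketch assumes there are at least degree + 1 distinct heights, and it does not address the harder question of choosing the degree.

```python
def fit_polynomial(heights, diameters, degree):
    """Least-squares fit of a polynomial of the given degree (hypothetical helper).

    Returns coefficients [c0, c1, ..., c_degree], lowest power first.
    """
    m = degree + 1
    # Normal equations (V^T V) c = V^T d, where V[i][j] = heights[i] ** j.
    a = [[sum(h ** (r + c) for h in heights) for c in range(m)] for r in range(m)]
    b = [sum(d * h ** r for h, d in zip(heights, diameters)) for r in range(m)]

    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


# Made-up noisy (height, diameter) measurements, for illustration only.
sample_heights = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
sample_diameters = [2.1, 3.0, 3.4, 3.1, 2.4, 1.2]
print(fit_polynomial(sample_heights, sample_diameters, 2))
```

The normal equations keep the sketch dependency-free but become ill-conditioned as the degree grows; a QR-based or library least-squares solver would likely be preferable in practice.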

EPrint Type: Thesis (Masters)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation
ID Code: 3440
Deposited By: Wouter Koolen
Deposited On: 11 February 2008