PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Can Gaussian Process Regression Be Made Robust Against Model Mismatch?
Peter Sollich
In: Lecture Notes in Computer Science, Springer, 2005.


Learning curves for Gaussian process (GP) regression can be strongly affected by a mismatch between the "student" model and the "teacher" (the true data-generating process), exhibiting e.g. multiple overfitting maxima and logarithmically slow learning. I investigate whether GPs can be made robust against such effects by adapting student model hyperparameters to maximize the evidence (data likelihood). An approximation for the average evidence is derived and used to predict the optimal hyperparameter values and the resulting generalization error. For large input space dimension, where the approximation becomes exact, Bayes-optimal performance is obtained at the evidence maximum, but the actual hyperparameters (e.g. the noise level) do not necessarily reflect the properties of the teacher. Also, the theoretically achievable evidence maximum cannot always be reached with the chosen set of hyperparameters, and maximizing the evidence in such cases can actually make generalization performance worse rather than better. In lower-dimensional learning scenarios, the theory predicts---in excellent qualitative and good quantitative accord with simulations---that evidence maximization eliminates logarithmically slow learning and recovers the optimal scaling of the decrease of generalization error with training set size.
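The evidence-maximization procedure the abstract refers to can be illustrated with a minimal sketch (not the paper's own code): for GP regression, the log evidence log p(y|X, hyperparameters) has a closed form, and a mismatched student hyperparameter (here the noise level) can be tuned by maximizing it. The RBF kernel, length scale, grid search, and data-generating teacher below are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    # Squared-exponential (RBF) covariance between two sets of 1-D inputs.
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def log_evidence(x, y, noise_var, length_scale=1.0):
    # Log marginal likelihood log p(y | x, hyperparameters) of GP regression:
    # -0.5 y^T K^{-1} y - 0.5 log|K| - (n/2) log(2 pi), with K = K_f + noise_var I.
    n = len(x)
    K = rbf_kernel(x, x, length_scale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)                       # stable inversion via Cholesky
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))            # 0.5 log|K|
            - 0.5 * n * np.log(2 * np.pi))

# Hypothetical teacher: a smooth function plus Gaussian noise of variance 0.01.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

# Adapt the student's noise hyperparameter by maximizing the evidence
# (a simple grid search stands in for gradient-based optimization).
noise_grid = np.logspace(-4, 1, 60)
best_noise = max(noise_grid, key=lambda s: log_evidence(x, y, s))
```

Note that, as the abstract stresses, the evidence-maximizing `best_noise` need not coincide with the teacher's true noise level when the student model is misspecified; it is only guaranteed to be optimal in the large-dimension limit studied in the paper.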

EPrint Type: Book Section
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 957
Deposited By: Peter Sollich
Deposited On: 07 March 2005