Relative loss bounds for predicting almost as well as any function in a union of Gaussian reproducing kernel spaces with varying widths
Mark Herbster
In: Mathematical Foundations of Learning Theory, 18-23 June 2004, Barcelona, Spain.

## Abstract

We consider a two-layer network algorithm. The first layer consists of an uncountable number of linear units. Each linear unit is an {\em LMS} algorithm whose inputs are first ``kernelized.'' Each unit is indexed by the value of a parameter corresponding to a parameterized reproducing kernel, here an isotropic Gaussian kernel parameterized by its width. The first-layer outputs are then connected to an {\em Exponential Weights} algorithm which combines them to produce the final output. As a guarantee of performance, we give a {\em relative loss bound} for this online algorithm. By online, we refer to the fact that learning proceeds in trials: on each trial the algorithm first receives a pattern, then makes a prediction, after which it receives the true outcome, and finally incurs a loss on that trial measuring the discrepancy between its prediction and the true outcome. By relative loss bound, we refer to the fact that on any trial the cumulative loss of the algorithm can be bounded by the cumulative loss of any predictor in a comparison class of predictors plus an additive term. Hence the goal is that the performance of the algorithm be almost as good as that of any predictor in the class; therefore we desire a small additive term. Often these bounds may be given without any probabilistic assumptions. In this note the comparison class is the set of functions obtained from a union of reproducing kernel spaces formed by isotropic Gaussian kernels of varying widths.
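To make the architecture concrete, the following is a minimal sketch of the two-layer scheme, not the paper's algorithm itself: it replaces the uncountable family of widths with a small finite grid, runs a kernelized LMS learner per width, and combines their predictions with Exponential Weights under the square loss. All class names, learning rates, and the grid of widths are illustrative assumptions.

```python
import numpy as np

def gauss_kernel(x, z, width):
    # Isotropic Gaussian kernel k(x, z) = exp(-||x - z||^2 / (2 * width^2)).
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * width ** 2))

class KernelLMS:
    """Online LMS in the RKHS of a Gaussian kernel of one fixed width.

    The hypothesis is f(x) = sum_i alpha_i k(c_i, x); each trial appends
    the new pattern as a center with coefficient lr * (y - f(x)).
    """
    def __init__(self, width, lr=0.2):
        self.width, self.lr = width, lr
        self.centers, self.alphas = [], []

    def predict(self, x):
        return sum(a * gauss_kernel(c, x, self.width)
                   for c, a in zip(self.centers, self.alphas))

    def update(self, x, y):
        err = y - self.predict(x)          # prediction error on this trial
        self.centers.append(np.array(x))
        self.alphas.append(self.lr * err)  # stochastic-gradient step in the RKHS

class ExpWeightsCombiner:
    """Second layer: Exponential Weights over a finite grid of widths."""
    def __init__(self, widths, eta=0.5, lr=0.2):
        self.eta = eta
        self.experts = [KernelLMS(w, lr) for w in widths]
        self.logw = np.zeros(len(widths))  # unnormalized log-weights

    def predict(self, x):
        preds = np.array([e.predict(x) for e in self.experts])
        w = np.exp(self.logw - self.logw.max())
        w /= w.sum()
        return float(w @ preds), preds

    def update(self, x, y):
        # One online trial: predict, observe outcome, then update both layers.
        yhat, preds = self.predict(x)
        self.logw -= self.eta * (preds - y) ** 2  # exponential weights, square loss
        for e in self.experts:
            e.update(x, y)
        return yhat
```

In the paper the comparison class ranges over a continuum of widths; the discretization above is only a way to see how the two layers interact on each trial.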

EPrint Type: Conference or Workshop Item (Poster)
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
Deposited by: Mark Herbster, 27 December 2004