## Abstract

We consider a two-layer network algorithm. The first layer consists of an uncountable number of linear units, each an {\em LMS} algorithm whose inputs are first ``kernelized.'' Each unit is indexed by the value of a parameter of a parameterized reproducing kernel, here an isotropic Gaussian kernel parameterized by its width. The first-layer outputs are then fed to an {\em Exponential Weights} algorithm, which combines them to produce the final output. As a performance guarantee, we give a {\em relative loss bound} for this online algorithm. By online, we mean that learning proceeds in trials: on each trial the algorithm first receives a pattern, then makes a prediction, then receives the true outcome, and finally incurs a loss on that trial measuring the discrepancy between its prediction and the true outcome. By relative loss bound, we mean that on any trial the cumulative loss of the algorithm can be bounded by the cumulative loss of any predictor in a comparison class of predictors plus an additive term. The goal is thus for the algorithm to perform almost as well as any predictor in the class, so we desire a small additive term. Often these bounds can be given without any probabilistic assumptions. In this note the comparison class is the set of functions obtained as a union of the reproducing kernel spaces formed by isotropic Gaussian kernels of varying widths.
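The two-layer scheme described above can be sketched in code. This is a hedged illustration, not the paper's algorithm: the paper's first layer is an uncountable continuum of units, which we approximate here by a finite grid of kernel widths, and the per-unit update shown is a generic functional-gradient (kernelized LMS-style) step on squared loss whose details may differ from the paper's. The class names `KernelLMS` and `ExpWeightsCombiner`, the learning rates `eta`, and the width grid `sigmas` are all illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(x, xp, sigma):
    # Isotropic Gaussian kernel of width sigma.
    return np.exp(-np.linalg.norm(x - xp) ** 2 / (2 * sigma ** 2))


class KernelLMS:
    """One first-layer unit: an LMS-style online learner whose inputs are
    'kernelized' via a Gaussian kernel of fixed width (illustrative sketch)."""

    def __init__(self, sigma, eta=0.1):
        self.sigma, self.eta = sigma, eta
        self.centers, self.coeffs = [], []  # kernel expansion of the hypothesis

    def predict(self, x):
        # Hypothesis is a weighted sum of kernels centered at past patterns.
        return sum(a * gaussian_kernel(x, c, self.sigma)
                   for a, c in zip(self.coeffs, self.centers))

    def update(self, x, y):
        # LMS-style step: append a kernel term scaled by the prediction error.
        err = y - self.predict(x)
        self.centers.append(x)
        self.coeffs.append(self.eta * err)


class ExpWeightsCombiner:
    """Second layer: Exponential Weights over a finite grid of widths
    (the paper uses a continuum; a grid is used here for illustration)."""

    def __init__(self, sigmas, eta=1.0):
        self.experts = [KernelLMS(s) for s in sigmas]
        self.logw = np.zeros(len(sigmas))  # log-weights, for numerical stability
        self.eta = eta

    def trial(self, x, y):
        # Online protocol: receive pattern, predict, receive outcome, incur loss.
        preds = np.array([e.predict(x) for e in self.experts])
        w = np.exp(self.logw - self.logw.max())
        w /= w.sum()
        yhat = float(w @ preds)  # weighted-average combined prediction
        for e in self.experts:
            e.update(x, y)
        # Exponentially down-weight each unit by its squared loss on this trial.
        self.logw -= self.eta * (preds - y) ** 2
        return yhat
```

A short usage run on a toy target shows the trial loop: feed patterns one at a time, collect the combined prediction, and compare it to the outcome revealed afterward.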