Different paradigms for choosing sequential reweighting algorithms
Analyses of the success of ensemble methods in classification
have pointed out the important role played by
the ``margin'' distribution function on the training and test sets.
While it is acknowledged that one should generally aim for high
margins on the training set, the precise shape of the
empirical margin distribution function one should favor
in practice remains a point on which approaches differ.
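For concreteness, a standard definition of the margin under a weighted ensemble vote, and of the resulting empirical margin distribution function, is the following (these formulas are the usual ones from the boosting literature and are not spelled out in the abstract itself):

```latex
% Normalized margin of a labeled example (x, y), y \in \{-1, +1\},
% under the combined vote f(x) = \sum_t \alpha_t h_t(x), \alpha_t \ge 0:
\[
  \operatorname{margin}(x, y) \;=\; \frac{y \sum_t \alpha_t h_t(x)}{\sum_t \alpha_t}.
\]
% Empirical margin distribution function over a training set of size n:
\[
  F_n(\theta) \;=\; \frac{1}{n} \sum_{i=1}^{n}
    \mathbf{1}\!\left\{ \operatorname{margin}(x_i, y_i) \le \theta \right\}.
\]
```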
We first present two competing philosophies for choosing the
empirical margin profile, one we call the ``minimax margin paradigm''
and the other the ``mean and variance paradigm''. The best-known
representative of the first paradigm is the AdaBoost algorithm,
and several other authors have shown this philosophy to be
closely related to the principle underlying the SVM. We show that
the second paradigm, on the other hand, is very close in
spirit to Fisher's linear discriminant (in a feature space).
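As a rough illustration of how the two paradigms can rank margin profiles differently, here is a minimal sketch, assuming the minimax criterion is the smallest training margin and the mean-and-variance criterion is the ratio of the mean margin to its standard deviation (the paper's exact criteria may differ):

```python
import numpy as np

def minimax_objective(margins):
    """Minimax margin paradigm: score a margin profile by its
    smallest margin (the quantity AdaBoost-style methods push up)."""
    return np.min(margins)

def mean_variance_objective(margins):
    """Mean and variance paradigm: favor a high mean margin relative
    to its spread, in the spirit of Fisher's linear discriminant."""
    return np.mean(margins) / np.std(margins)

# Two hypothetical margin profiles: profile_a has the larger worst-case
# margin, profile_b the better mean-to-spread ratio, so the two
# paradigms rank them oppositely.
profile_a = np.array([0.25, 0.60, 0.30, 0.55, 0.30])  # min 0.25, mean/std ~ 2.8
profile_b = np.array([0.15, 0.45, 0.42, 0.44, 0.44])  # min 0.15, mean/std ~ 3.3

for name, m in [("profile_a", profile_a), ("profile_b", profile_b)]:
    print(name, minimax_objective(m), mean_variance_objective(m))
```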
We construct two boosting-type
algorithms, very similar in form, each dedicated to one of the two
philosophies. By interpolating between them, we then derive a very
simple family of iterative reweighting algorithms that can be
understood as realizing different tradeoffs between the two paradigms
above, and we argue from experiments that this allows a suitable
adaptivity to different classification problems, particularly in the
presence of noise and/or excessive complexity of the base classifiers.
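Since the abstract does not reproduce the paper's update rules, the following is only an illustrative sketch of what such an interpolated reweighting loop could look like. The stump learner, the two endpoint weightings (exponential weights for the minimax end, weights linear in the margin deficit as a stand-in for the mean-and-variance end), and the parameter `lam` are all assumptions for illustration, not the paper's actual algorithms:

```python
import numpy as np

def fit_stump(X, y, w):
    """Hypothetical base learner: weighted one-feature threshold stump
    (illustrative only; any weak learner accepting sample weights works)."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] > thr, 1, -1)
                err = float(np.sum(w * (pred != y)))
                if err < best[0]:
                    best = (err, j, thr, sign)
    _, j, thr, sign = best
    return lambda Z: sign * np.where(Z[:, j] > thr, 1, -1)

def interpolated_boost(X, y, rounds=20, lam=0.5):
    """lam = 0: exponential weights, AdaBoost-style (minimax margin end).
    lam = 1: weights linear in the margin deficit (assumed stand-in for
    the mean-and-variance end). Intermediate lam mixes the two."""
    n = len(y)
    f = np.zeros(n)                  # current combined score on the training set
    ensemble = []
    for _ in range(rounds):
        margins = y * f
        exp_w = np.exp(-margins)                # minimax-margin endpoint
        lin_w = np.maximum(1.0 - margins, 0.0)  # mean/variance endpoint (illustrative)
        w = (1.0 - lam) * exp_w + lam * lin_w
        if w.sum() <= 0:                        # all margins >= 1 already (lam = 1 case)
            break
        w = w / w.sum()                         # normalize to a distribution
        h = fit_stump(X, y, w)
        pred = h(X)
        err = float(np.sum(w * (pred != y)))
        if err >= 0.5:                          # base learner no better than chance
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        f += alpha * pred
        ensemble.append((alpha, h))
    return ensemble

# Toy usage on a linearly separable problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
ensemble = interpolated_boost(X, y, rounds=20, lam=0.5)
scores = sum(a * h(X) for a, h in ensemble)
print("training accuracy:", np.mean(np.sign(scores) == y))
```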