PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Leveraging the Margin More Carefully
Nir Krause and Yoram Singer
The Twenty-First International Conference on Machine Learning 2004.

Abstract

Boosting is a popular approach for building accurate classifiers. Despite the initial popular belief, boosting algorithms do exhibit overfitting and are sensitive to label noise. Part of the sensitivity of boosting algorithms to outliers and noise can be attributed to the unboundedness of the margin-based loss functions that they employ. In this paper we describe two leveraging algorithms that build on boosting techniques and employ a bounded loss function of the margin. The first algorithm interleaves the expectation maximization (EM) algorithm with boosting steps. The second algorithm decomposes a non-convex loss into a difference of two convex losses. We prove that both algorithms converge to a stationary point. We also analyze the generalization properties of the algorithms using the Rademacher complexity. We describe experiments with both synthetic data and natural data (OCR and text) that demonstrate the merits of our framework, in particular robustness to outliers.

Postscript - Requires a viewer, such as GhostView
EPrint Type:Article
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Learning/Statistics & Optimisation
Theory & Algorithms
ID Code:53
Deposited By:Yoram Singer
Deposited On:14 May 2004