PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Learning from Noisy Data using Hyperplane Sampling and Sample Averages
Guillaume Stempfel, Liva Ralaivola and François Denis
(2007) Technical Report. HAL - CNRS, France.

Abstract

We present a new classification algorithm capable of learning from data corrupted by class-dependent uniform classification noise. The algorithm produces a linear classifier and works seamlessly with kernels. It relies on sampling random hyperplanes, which are used to build new training examples whose correct classes are known; a linear classifier (e.g., an SVM) is then learned from these examples and output by the algorithm. The new examples are sample averages computed from the data at hand over regions of the space defined by the random hyperplanes and the target hyperplane. We provide a statistical analysis of the properties of these sample averages, together with results from numerical simulations on synthetic datasets. These simulations show that both the linear and kernelized versions of our algorithm are effective for learning from noise-free as well as noisy data.
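
To make the noise model concrete, the following is a minimal sketch of class-dependent uniform classification noise on linearly separable synthetic data: each label is flipped independently, with a flip probability that depends only on the true class. It illustrates the noise setting only and is not the report's algorithm. The function names, the flip rates (0.3 and 0.1), the target weights, and the uniform sampling of points are all assumptions chosen for illustration.

```python
import random


def sample_linear_data(n, w, b=0.0, seed=0):
    """Draw n points uniformly from [-1, 1]^d and label them with the
    hyperplane (w, b). Hypothetical helper, not taken from the report."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1.0, 1.0) for _ in w] for _ in range(n)]
    labels = [
        1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
        for x in points
    ]
    return points, labels


def flip_labels(labels, eta_pos, eta_neg, seed=0):
    """Return a noisy copy of +1/-1 labels: a +1 label is flipped with
    probability eta_pos, a -1 label with probability eta_neg."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        rate = eta_pos if y == 1 else eta_neg
        noisy.append(-y if rng.random() < rate else y)
    return noisy


def flip_rate(clean, noisy, cls):
    """Empirical fraction of flipped labels among examples of true class cls."""
    idx = [i for i, y in enumerate(clean) if y == cls]
    return sum(clean[i] != noisy[i] for i in idx) / len(idx)


# Assumed illustrative setting: 2-D data, noise rates 0.3 (+1) and 0.1 (-1).
points, clean = sample_linear_data(2000, w=[1.0, -0.5], seed=1)
noisy = flip_labels(clean, eta_pos=0.3, eta_neg=0.1, seed=2)

print("empirical flip rate, class +1:", flip_rate(clean, noisy, 1))
print("empirical flip rate, class -1:", flip_rate(clean, noisy, -1))
```

With enough samples, the empirical flip rates should be close to the two chosen noise rates, which is what distinguishes this setting from noise applied at a single rate to both classes.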

EPrint Type: Monograph (Technical Report)
Project Keyword: UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 3564
Deposited By: Liva Ralaivola
Deposited On: 11 February 2008