Learning from Noisy Data using Hyperplane Sampling and Sample Averages
Guillaume Stempfel, Liva Ralaivola and François Denis
HAL - CNRS, France.
We present a new classification algorithm capable of learning from data corrupted by class-dependent uniform classification noise. The algorithm outputs a linear classifier and works seamlessly with kernels. It samples random hyperplanes and uses them to build new training examples whose correct classes are known; a linear classifier (e.g. an SVM) is then learned from these examples and returned. The new examples are sample averages computed from the available data over regions of the space delimited by the random hyperplanes and the target hyperplane. We provide a statistical analysis of the properties of these sample averages, together with results from numerical simulations on synthetic datasets. These simulations show that both the linear and kernelized versions of the algorithm are effective for learning from noise-free as well as noisy data.
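To make the noise model named in the abstract concrete, here is a minimal sketch of class-dependent uniform classification noise: each label is flipped independently, with a flip rate that depends only on the example's true class. The function name `flip_labels`, the parameters `eta_pos` and `eta_neg`, and the {-1, +1} label encoding are assumptions made for illustration. They are not taken from the report, and the sketch does not reproduce the authors' learning algorithm.

```python
import random


def flip_labels(labels, eta_pos, eta_neg, seed=0):
    """Return a noisy copy of `labels` (values in {-1, +1}).

    A +1 label is flipped with probability eta_pos and a -1 label with
    probability eta_neg, independently of the example's position.
    All names here are hypothetical, chosen for this illustration only.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    noisy = []
    for y in labels:
        eta = eta_pos if y == 1 else eta_neg
        # rng.random() is uniform on [0, 1), so the flip occurs with probability eta
        noisy.append(-y if rng.random() < eta else y)
    return noisy


if __name__ == "__main__":
    clean = [1] * 5 + [-1] * 5
    print(flip_labels(clean, eta_pos=0.3, eta_neg=0.1))
```

Setting both rates to 0 recovers the noise-free case, and setting them equal gives ordinary (class-independent) uniform classification noise.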
|EPrint Type:||Monograph (Technical Report)|
|Subjects:||Theory & Algorithms|
|Deposited By:||Liva Ralaivola|
|Deposited On:||11 February 2008|