Agnostically Learning Halfspaces
We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult {\em agnostic} framework of Kearns, Schapire, \& Sellie, where a learner is given access to labeled examples drawn from a distribution, without restriction on the labels (e.g., adversarial noise). The algorithm constructs a hypothesis whose error rate on future examples is within an additive $\eps$ of that of the optimal halfspace, in time poly$(n)$ for any constant $\eps>0$, under the uniform distribution over $\bits^n$ or the unit sphere in $\Reals^n$, as well as under any log-concave distribution over $\Reals^n$. It also agnostically learns Boolean disjunctions in time $2^{\tilde{O}\left(\sqrt{n}\right)}$ with respect to {\em any} distribution. The new algorithm, essentially {\em $L_1$ polynomial regression}, is a noise-tolerant, arbitrary-distribution generalization of the ``low-degree'' Fourier algorithm of Linial, Mansour, \& Nisan. We also give a new algorithm for PAC learning halfspaces under the uniform distribution on the unit sphere with the current best bounds on the tolerable rate of malicious noise.
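
As a rough sketch of what ``$L_1$ polynomial regression'' can look like (the sample size $m$, degree bound $d$, and threshold $t$ are notation assumed here for illustration; the abstract itself does not fix them): given labeled examples $(x^1,y^1),\dots,(x^m,y^m)$ with labels $y^i \in \{-1,1\}$, one fits a polynomial of degree at most $d$ by minimizing the empirical absolute loss and then thresholds it,
\[
p \;=\; \arg\min_{\deg(q)\le d}\ \frac{1}{m}\sum_{i=1}^{m}\bigl|q(x^i)-y^i\bigr|,
\qquad
h(x) \;=\; \mathrm{sgn}\bigl(p(x)-t\bigr),
\]
where $t\in[-1,1]$ is chosen to minimize the empirical error of $h$. Under these assumptions the minimization can be written as a linear program over the $n^{O(d)}$ coefficients of $q$, so a degree bound $d$ that depends only on $\eps$ would be consistent with the poly$(n)$ running time stated above for constant $\eps$.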