PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Linear classification and selective sampling under low noise conditions
Giovanni Cavallanti, Nicolò Cesa-Bianchi and Claudio Gentile
In: Advances in Neural Information Processing Systems 22 (NIPS 2008), Vancouver, Canada (2009).


We provide a new analysis of an efficient margin-based algorithm for selective sampling in classification problems. Using the so-called Tsybakov low noise condition to parametrize the instance distribution, we show bounds on the convergence rate to the Bayes risk of both the fully supervised and the selective sampling versions of the basic algorithm. Our analysis reveals that, excluding logarithmic factors, the average risk of the selective sampler converges to the Bayes risk at rate N^{−(1+a)(2+a)/(2(3+a))}, where N denotes the number of queried labels and a > 0 is the exponent in the low noise condition. For all a > √3 − 1 ≈ 0.73 this convergence rate is asymptotically faster than the rate N^{−(1+a)/(2+a)} achieved by the fully supervised version of the same classifier, which queries all labels, and for a → ∞ the two rates exhibit an exponential gap. Experiments on textual data reveal that simple variants of the proposed selective sampler perform much better than popular and similarly efficient competitors.
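
The threshold a > √3 − 1 can be checked directly from the two exponents quoted in the abstract. The short derivation below is a reader's sketch, not taken from the paper; it assumes the selective-sampling exponent is read as (1+a)(2+a)/(2(3+a)) and ignores the logarithmic factors the abstract sets aside.

```latex
% Selective sampling beats full supervision when its exponent is larger:
\[
\frac{(1+a)(2+a)}{2(3+a)} > \frac{1+a}{2+a}
\;\Longleftrightarrow\;
(2+a)^2 > 2(3+a)
\;\Longleftrightarrow\;
a^2 + 2a - 2 > 0
\;\Longleftrightarrow\;
a > \sqrt{3} - 1 \approx 0.73,
\]
% using a > 0, so that 1+a, 2+a and 3+a are all positive
% and only the positive root of a^2 + 2a - 2 = 0 matters.

% Behaviour of the two exponents as a grows:
\[
\lim_{a \to \infty} \frac{1+a}{2+a} = 1,
\qquad
\frac{(1+a)(2+a)}{2(3+a)} \sim \frac{a}{2} \to \infty .
\]
```

Under this reading, the fully supervised exponent never exceeds 1, while the selective-sampling exponent grows roughly like a/2, which fits the widening gap between the two rates described for large a.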

EPrint Type: Conference or Workshop Item (Poster)
Subjects: Theory & Algorithms
ID Code: 4706
Deposited By: Nicolò Cesa-Bianchi
Deposited On: 24 March 2009