Datum-Wise Classification: A Sequential Approach to Sparsity
We propose a novel classification technique whose aim is to select an appropriate representation for each datapoint, in contrast to the usual approach of selecting a representation encompassing the whole dataset. This datum-wise representation is found by using a sparsity-inducing empirical risk, which is a relaxation of the standard L0-regularized risk. The classification problem is modeled as a sequential decision process that sequentially chooses, for each datapoint, which features to use before classifying. Datum-Wise Classification extends naturally to multi-class tasks, and we describe a specific case where our inference has equivalent complexity to a traditional linear classifier, while still using a variable number of features. We compare our classifier to classical L1-regularized linear models (L1-SVM and LARS) on a set of common binary and multi-class datasets and show that, for an equal average number of features used, our method achieves improved performance.
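To give a concrete feel for the datum-wise idea, here is a minimal toy sketch, not the paper's sequential decision process: a linear classifier that, for each datapoint, acquires features one at a time (most informative first) and stops as soon as its running score is confident enough. The weights, acquisition order, and the threshold `tau` are illustrative assumptions, not part of the original method.

```python
import numpy as np

# Illustrative sketch only: a per-datum early-stopping linear classifier.
# Each datapoint uses a *variable* number of features, mimicking the
# datum-wise flavor described in the abstract.

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = np.array([3.0, -2.0, 1.5] + [0.0] * (d - 3))  # sparse ground truth
y = np.sign(X @ w_true)

# Assume weights have already been learned (here we reuse the true ones)
# and rank features by absolute weight so informative ones are tried first.
w = w_true
order = np.argsort(-np.abs(w))
tau = 2.0  # hypothetical confidence margin: stop acquiring once exceeded

def classify_datum_wise(x):
    """Return (prediction, number of features actually used for this datum)."""
    score = 0.0
    for k, j in enumerate(order, start=1):
        score += w[j] * x[j]          # acquire one more feature
        if abs(score) > tau:          # confident enough: classify now
            return np.sign(score), k
    return np.sign(score), d          # fell through: used all features

preds, used = zip(*(classify_datum_wise(x) for x in X))
acc = np.mean(np.array(preds) == y)
avg_features = np.mean(used)
print(f"accuracy={acc:.2f}, avg features used={avg_features:.1f}/{d}")
```

The point of the sketch is the trade-off the abstract describes: lowering `tau` uses fewer features per datum at some cost in accuracy, while each datapoint still gets its own representation size.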