Multiclass Learnability and the ERM principle
Amit Daniely, Sivan Sabato, Shai Ben-David and Shai Shalev-Shwartz
In: COLT 2011, June 2011, Budapest, Hungary.
Multiclass learning is an area of growing practical relevance, for which the currently available theory is still far from providing satisfactory understanding. We study the learnability of multiclass prediction, and derive upper and lower bounds on the sample complexity of multiclass hypothesis classes in different learning models: batch/online, realizable/unrealizable, full information/bandit feedback. Our analysis reveals a surprising
phenomenon: In the multiclass setting, in sharp contrast to binary classification, not all
Empirical Risk Minimization (ERM) algorithms are equally successful. We show that there
exist hypothesis classes for which some ERM learners have lower sample complexity than
others. Furthermore, there are classes that are learnable by some ERM learners, while
other ERM learners fail to learn them. We propose a principle for designing good ERM
learners, and use this principle to prove tight bounds on the sample complexity of learning symmetric multiclass hypothesis classes (that is, classes that are invariant under any
permutation of label names). We demonstrate the relevance of the theory by analyzing
the sample complexity of two widely used hypothesis classes: generalized linear multiclass
models and reduction trees. We also obtain some practically relevant conclusions.
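
The claim that two ERM learners can behave differently on the same class can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the ten-point domain, the toy class in which each hypothesis h_A labels points of A with the set A itself and all other points with "*", and the names erm_good and erm_bad. Both learners return a hypothesis with zero empirical error, so both are ERM learners, but they break ties differently when the sample contains only "*" labels.

```python
# Toy setup (hypothetical, for illustration only).
DOMAIN = frozenset(range(10))
STAR = "*"

def hypothesis(A):
    """h_A labels x with the set A itself if x is in A, and with '*' otherwise."""
    A = frozenset(A)
    return lambda x: A if x in A else STAR

def empirical_error(h, sample):
    """Number of sample points that h mislabels."""
    return sum(h(x) != y for x, y in sample)

def erm_good(sample):
    """ERM that returns h_A if some label reveals A, else the all-'*' hypothesis."""
    for _, y in sample:
        if y != STAR:
            return hypothesis(y)
    return hypothesis(frozenset())

def erm_bad(sample):
    """ERM that, seeing only '*' labels, guesses A = every point not yet seen."""
    for _, y in sample:
        if y != STAR:
            return hypothesis(y)
    seen = {x for x, _ in sample}
    return hypothesis(DOMAIN - seen)

def true_error(h, target):
    """Error under the uniform distribution on DOMAIN."""
    return sum(h(x) != target(x) for x in DOMAIN) / len(DOMAIN)

# Realizable case: the target is the all-'*' hypothesis (A is empty).
target = hypothesis(frozenset())
sample = [(x, target(x)) for x in (0, 1, 2)]

good, bad = erm_good(sample), erm_bad(sample)
print(empirical_error(good, sample), empirical_error(bad, sample))
print(true_error(good, target), true_error(bad, target))
```

In this run both learners fit the three sample points perfectly, yet erm_bad is wrong on every unseen point while erm_good makes no error. The sketch only shows the qualitative gap between two ERM rules; it does not reproduce the paper's bounds.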