PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Optimized Cutting Plane Algorithm for Large-Scale Risk Minimization
Vojtech Franc and Sören Sonnenburg
Journal of Machine Learning Research Volume 10, pp. 2157-2192, 2009.

Abstract

We have developed an optimized cutting plane algorithm (OCA) for solving large scale risk minimization problems. We prove that the number of iterations OCA requires to converge to a epsilon-precise solution is approximately linear in the sample size. We derive OCAS, an OCA based linear binary Support Vector Machine (SVM) solver, and OCAM, a linear multi-class SVM solver. In an extensive empirical evaluation we show that OCAS outperforms current state of the art SVM solvers, like SVMlight, SVMperf and BMRM, achieving speedups of over 1,200 on some datasets over SVMlight and 29 over SVMperf, while obtaining the same precise Support Vector solution. OCAS even in the early optimization steps shows often faster convergence than the so far in this domain prevailing approximative methods SGD and Pegasos. In addition, our proposed linear multi-class SVM solver OCAM achieves speedups of up to 10 compared to SVMmulti-class . Finally, we apply OCAS and OCAM to two real-world applications, the problem of human acceptor splice site detection and malware detection. Effectively parallelizing OCAS we achieve state-of-the art results on the acceptor splice site recognition problem only by being able to learn on all the available 50 million examples in a 12 million dimensional feature space.

PDF - Requires Adobe Acrobat Reader or other PDF viewer.
EPrint Type:Article
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Learning/Statistics & Optimisation
ID Code:3276
Deposited By:Sören Sonnenburg
Deposited On:08 April 2010