Quadratic programming and learning: large scale and parsimony
Chapter 5, written by Gaëlle Loosli and Stéphane Canu, deals with an aspect of optimization currently encountered in signal and image processing, for example in shape recognition: learning. More precisely, the chapter focuses on the formulation of learning as a large-scale convex quadratic programming problem (several million variables and constraints). This formulation arises from the “kernel methods”, which emerged about a decade after neural networks. Its main appeal lies in the fact that the solution in question is often “parsimonious”, i.e. more than 99% of the unknown variables are zero. Exploiting this feature enables learning algorithms to solve large-scale quadratic programming problems within a reasonable amount of time. The best-performing methods, known as “active-set” methods, work “off-line”: they determine the exact solution of a given problem provided they are supplied in advance with all the data used in the learning process. To carry out “online” learning, an iterative stochastic optimization method is used instead, which yields an approximate solution. This chapter describes one such method belonging to the “support vector machine” (SVM) family. The efficiency of this technique is illustrated by experimental results on the problem of recognizing handwritten digits.
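The parsimony property mentioned above can be observed directly with any off-the-shelf SVM trainer. The following sketch (using scikit-learn as an assumed stand-in, not the solver developed in the chapter) fits an SVM on a toy two-class problem and counts the support vectors, i.e. the training points whose dual variables are nonzero in the underlying quadratic program:

```python
# Illustrative sketch only: scikit-learn's SVC is used here as a generic
# SVM trainer; the chapter's own large-scale/online algorithm is not shown.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated Gaussian clusters: an easy binary problem.
X, y = make_blobs(n_samples=500, centers=2, cluster_std=1.0, random_state=0)

# Train a kernel SVM; internally this solves a convex quadratic program
# with one dual variable per training point.
clf = SVC(kernel="rbf", C=1.0).fit(X, y)

# Only the support vectors carry nonzero dual variables: the solution
# is "parsimonious" in the sense described above.
n_sv = clf.support_vectors_.shape[0]
print(f"{n_sv} support vectors out of {len(X)} training points")
```

On such a separable problem, the support vectors are typically a small fraction of the training set, which is exactly the structure that active-set and online SVM solvers exploit.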