# Optimal parameter selection in support vector machines

## Abstract

This paper applies a nonlinear programming algorithm to compute the kernel and related parameters of a support vector machine (SVM) by a two-level approach. The available training data are split into two groups: one set for formulating a quadratic SVM with L2-soft margin, and another for minimizing the generalization error, into which the optimal SVM variables are inserted. Subsequently, the SVM is solved again, now for the entire set of training data, and the total generalization error is evaluated on a separate set of test data. Derivatives of the functions that define the optimization problem are evaluated analytically, exploiting an existing Cholesky decomposition needed for solving the quadratic SVM. The approach is implemented and tested on a couple of standard data sets with up to 4,800 patterns. The results show a significant reduction of the generalization error, an increase of the margin, and a reduction of the number of support vectors in all cases where the data sets are sufficiently large. In a second set of test runs, kernel parameters are assigned to individual features; redundant attributes are identified and suitable relative weighting factors are computed.
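
The two-level scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and several pieces are assumptions made only for illustration: a coarse grid search stands in for the nonlinear programming algorithm with analytical derivatives, the plain misclassification rate stands in for the generalization error function, the bias term is absorbed into the kernel so that the dual has only bound constraints, and projected coordinate descent replaces the Cholesky-based quadratic solver. All function names, the grids, and the synthetic data are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_l2_svm(X, y, C, gamma, sweeps=100):
    """Approximately solve the L2-soft-margin dual
         min 0.5 * a^T H a - sum(a)   subject to a >= 0,
       with H = (y y^T) * (K + 1) + I / C.
       Assumption: the bias is absorbed into the kernel (K + 1), which removes
       the equality constraint of the usual dual."""
    n = len(y)
    H = np.outer(y, y) * (rbf_kernel(X, X, gamma) + 1.0) + np.eye(n) / C
    a = np.zeros(n)
    for _ in range(sweeps):              # fixed number of sweeps, not a convergence test
        for i in range(n):
            grad_i = H[i] @ a - 1.0      # partial derivative of the dual objective
            a[i] = max(0.0, a[i] - grad_i / H[i, i])  # exact 1-D step, projected to a_i >= 0
    return a

def error_rate(X_fit, y_fit, a, gamma, X_eval, y_eval):
    """Fraction of misclassified points (stand-in for the generalization error)."""
    scores = (rbf_kernel(X_eval, X_fit, gamma) + 1.0) @ (a * y_fit)
    pred = np.where(scores >= 0.0, 1.0, -1.0)
    return float(np.mean(pred != y_eval))

# Synthetic two-class data (hypothetical; the paper uses standard benchmark sets).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (60, 2)), rng.normal(2.0, 1.0, (60, 2))])
y = np.concatenate([-np.ones(60), np.ones(60)])
perm = rng.permutation(len(y))
X, y = X[perm], y[perm]

# Split: first group formulates the SVM, second group measures the error,
# third group is the separate test set.
X_svm, y_svm = X[:40], y[:40]
X_val, y_val = X[40:80], y[40:80]
X_test, y_test = X[80:], y[80:]

C_GRID = [0.1, 1.0, 10.0, 100.0]
GAMMA_GRID = [0.01, 0.1, 1.0, 10.0]

# Upper level: choose (C, gamma) minimizing the error on the second group,
# where the lower-level SVM solution for the first group is inserted.
best_C, best_gamma, best_val_error = None, None, float("inf")
for C in C_GRID:
    for gamma in GAMMA_GRID:
        a = fit_l2_svm(X_svm, y_svm, C, gamma)
        val_error = error_rate(X_svm, y_svm, a, gamma, X_val, y_val)
        if val_error < best_val_error:
            best_C, best_gamma, best_val_error = C, gamma, val_error

# Solve the SVM again on the entire training set, then evaluate on the test set.
X_train, y_train = X[:80], y[:80]
a_final = fit_l2_svm(X_train, y_train, best_C, best_gamma)
test_error = error_rate(X_train, y_train, a_final, best_gamma, X_test, y_test)
num_support_vectors = int(np.sum(a_final > 1e-8))

print(best_C, best_gamma, best_val_error, test_error, num_support_vectors)
```

A gradient-based optimizer over continuous parameters, as the abstract describes, would replace the two nested grid loops; the grid is used here only because it needs no derivative information.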