PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Kernel-based Learning of Hierarchical Multilabel Classification Models
Juho Rousu, Craig Saunders, Sandor Szedmak and John Shawe-Taylor
Journal of Machine Learning Research 2005.


We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Markov tree equipped with an exponential family defined on the edges. We present an efficient optimization algorithm based on incremental conditional gradient ascent in single-example subspaces spanned by the marginal dual variables. The optimization is facilitated with a dynamic programming based algorithm that computes best update directions in the feasible set. Experiments show that the algorithm can feasibly optimize training sets of thousands of examples and classification hierarchies consisting of hundreds of nodes. In our tests, training of the full hierarchical model was in fact faster than training inpedendent SVM classifiers for each node. The algorithm's predictive accuracy was found to be competitive with other recently introduced hierarchical multi-category or multilabel classification learning algorithms.

PDF - PASCAL Members only - Requires Adobe Acrobat Reader or other PDF viewer.
EPrint Type:Article
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Learning/Statistics & Optimisation
Theory & Algorithms
Information Retrieval & Textual Information Access
ID Code:1840
Deposited By:Juho Rousu
Deposited On:29 November 2005