PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Text classification with active learning
Blaz Novak, Dunja Mladenić and Marko Grobelnik
In: 29th Annual Conference of the German Classification Society, 9-11 March 2005, Magdeburg, Germany.

Abstract

In many real-world machine learning tasks, labeled training examples are expensive to obtain, while large numbers of unlabeled examples are readily available. Text classification is one such class of learning problems. Active learning strives to reduce the required labeling effort while retaining accuracy by intelligently selecting the examples to be labeled. However, very few comparisons exist between different active learning methods. The effect of the ratio of positive to negative examples on the accuracy of such algorithms has also received very little attention. This paper presents a comparison of two of the most promising methods and their performance on a range of categories from the Reuters Corpus Vol. 1 news article dataset.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access
ID Code: 1423
Deposited By: Dunja Mladenić
Deposited On: 28 November 2005