PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Learning from Relevant Tasks Only
Samuel Kaski and Jaakko Peltonen
In: 18th European Conference on Machine Learning (ECML 2007), 17-21 Sep 2007, Warsaw, Poland.

Abstract

We introduce a problem called relevant subtask learning, a variant of multi-task learning. The goal is to build a classifier for a task-of-interest having too little data. We also have data for other tasks but only some are relevant, meaning they contain samples classified in the same way as in the task-of-interest. The problem is how to utilize this "background data" to improve the classifier in the task-of-interest. We show how to solve the problem for logistic regression classifiers, and show that the solution works better than a comparable multi-task learning model. The key is to assume that data of all tasks are mixtures of relevant and irrelevant samples, and model the irrelevant part with a sufficiently flexible model such that it does not distort the model of relevant data.

©2007 Springer-Verlag. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
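The mixture assumption stated in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the binary-label setting, the per-task mixing weight `pi_relevant`, and the names `w_shared`, `w_task`, `mixture_prob`, and `log_likelihood` are assumptions introduced here for illustration only. Each background task is treated as a mixture of a shared logistic regression model (the "relevant" part, common with the task-of-interest) and a task-specific logistic regression model (the "irrelevant" part). The task-of-interest is represented by setting its mixing weight to 1.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mixture_prob(x, y, w_shared, w_task, pi_relevant):
    """P(y | x) for one sample, with y in {0, 1}.

    Assumed form: pi_relevant weights the shared ("relevant") model,
    and (1 - pi_relevant) weights the task-specific ("irrelevant") model.
    """
    p_relevant = sigmoid(dot(w_shared, x))
    p_irrelevant = sigmoid(dot(w_task, x))
    p_one = pi_relevant * p_relevant + (1.0 - pi_relevant) * p_irrelevant
    return p_one if y == 1 else 1.0 - p_one


def log_likelihood(tasks, w_shared, w_tasks, pis):
    """Sum of log mixture probabilities over all tasks.

    tasks   : list of tasks, each a list of (x, y) pairs
    w_tasks : one task-specific weight vector per task
    pis     : one mixing weight per task (1.0 for the task-of-interest)
    """
    total = 0.0
    for samples, w_task, pi in zip(tasks, w_tasks, pis):
        for x, y in samples:
            total += math.log(mixture_prob(x, y, w_shared, w_task, pi))
    return total
```

In this sketch, maximizing `log_likelihood` over `w_shared`, `w_tasks`, and `pis` would let the task-specific models absorb samples that disagree with the shared model, so that those samples need not distort `w_shared`. How the paper actually parameterizes and optimizes the model should be taken from the paper itself.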

EPrint Type: Conference or Workshop Item (Paper)
Additional Information: http://www.cis.hut.fi/projects/mi/abstracts/ecml07.html
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 3028
Deposited By: Jaakko Peltonen
Deposited On: 30 August 2007