PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Leveraging Sequence Classification by Taxonomy-based Multitask Learning
Christian Widmer, Jose Leiva-Murillo, Yasemin Altun and Gunnar Raetsch
In: NIPS 09 Workshop on Transfer Learning on Structured Data, 12 Dec 2009, Whistler, Canada.

Abstract

In a previous publication at last year’s NIPS [2], we compared a number of recent domain adaptation algorithms in a scenario that assumes one source domain with abundant data and one target domain with only a little training data. As the prediction problem, we considered the supervised classification task of mRNA splice site recognition, which is representative of many other prediction tasks in sequence biology. We observed that considerable improvements over baseline methods are possible, which encouraged us to pursue this direction of research further. In our current research, we therefore move from one source and one target organism to a scenario in which we consider transfer learning between a greater number of organisms, whose relationship to each other is given by a hierarchical structure or phylogeny. We explore several extensions of domain adaptation algorithms that allow the exploitation of hierarchical task relations for transfer learning. These algorithms were designed with large-scale applications in mind, allowing for a great number of training examples. The performance of the presented methods is demonstrated in an experiment where we combine splice-site data from 15 eukaryotic genomes. In general, we argue that transfer learning is well suited for applications in computational biology, as different organisms can be regarded as different domains, which enables us to cast a wide range of prediction problems into the transfer learning framework.

EPrint Type: Conference or Workshop Item (Talk)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
Learning/Statistics & Optimisation
ID Code: 6509
Deposited By: Jose Leiva-Murillo
Deposited On: 08 March 2010