Functional Inference in FunCat through the Combination of Hierarchical Ensembles with Data Fusion Methods
Nicolò Cesa-Bianchi, Matteo Re and Giorgio Valentini
In: ICML Workshop on learning from Multi-Label Data MLD'10, 25 Jun 2010, Haifa, Israel.
The multi-label hierarchical prediction of gene functions at the genome-wide and ontology-wide level is a central problem in bioinformatics, and raises challenging questions from a machine learning standpoint. In this context, multi-label hierarchical ensemble methods that take into account the hierarchical relationships between functional classes have recently been proposed. Various studies have also shown that integrating multiple data sources is key to significantly improving gene function prediction. We propose an integrated approach that combines local data fusion strategies with global hierarchical multi-label methods.
The label imbalance that typically occurs in gene functional classes is taken into account through the use of cost-sensitive techniques.
Ontology-wide results on the yeast model organism, using the FunCat taxonomy, show the effectiveness of the proposed approach.