An experimental comparison of Hierarchical Bayes and True Path Rule ensembles for protein function prediction
Matteo Re and Giorgio Valentini
Ninth International Workshop on Multiple Classifier Systems (MCS 2010)
Lecture Notes in Computer Science
The computational genome-wide annotation of gene functions requires the prediction of hierarchically structured functional classes and can be formalized as a multiclass, multilabel, multipath hierarchical classification problem characterized by highly unbalanced classes. We recently proposed two hierarchical protein function prediction methods, the Hierarchical Bayes (hbayes) and True Path Rule (tpr) ensemble methods. Both can reconcile the predictions of component classifiers trained locally at each term of the ontology and control the overall precision-recall trade-off. In this contribution, we focus on the experimental comparison of the hbayes and tpr hierarchical gene function prediction methods and their cost-sensitive variants, using the model organism S. cerevisiae and the FunCat taxonomy. The results show that the cost-sensitive variants of these methods achieve comparable results and significantly outperform both flat methods and their non-cost-sensitive hierarchical counterparts.