Hierarchical cost-sensitive algorithms for genome-wide gene function prediction
Nicolò Cesa-Bianchi and Giorgio Valentini
Journal of Machine Learning Research, W&C Proceedings
In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in two ways: each ensemble node is trained “locally”, according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so as to minimize a global loss function defined over the hierarchy. We also address the sparsity of annotations by introducing a cost-sensitive parameter that controls the precision-recall trade-off. Experiments with the model organism S. cerevisiae, using the FunCat taxonomy and seven biomolecular data sets, reveal a significant advantage of our techniques over “flat” and cost-insensitive hierarchical ensembles.
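
To make the role of a single trade-off parameter concrete, the following minimal sketch shows hierarchy-consistent prediction with a decision threshold. It is not the method proposed in this work: it substitutes a simple top-down thresholding rule for the loss-minimizing evaluation described above, and the function names, class codes, probabilities, and threshold values are all hypothetical toy choices made for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def predict_top_down(
    parent: Dict[str, Optional[str]],
    prob: Dict[str, float],
    threshold: float,
) -> Set[str]:
    """Predict a class only if its local score reaches the threshold
    and its parent (if any) was also predicted.

    Assumption: plain top-down thresholding, used here as a stand-in
    for a hierarchical ensemble; it is not the authors' algorithm.
    """

    def depth(node: str) -> int:
        d = 0
        while parent[node] is not None:
            node = parent[node]
            d += 1
        return d

    predicted: Set[str] = set()
    # Visit shallow nodes first so every parent is decided before its children.
    for node in sorted(parent, key=depth):
        p = parent[node]
        if (p is None or p in predicted) and prob[node] >= threshold:
            predicted.add(node)
    return predicted


def precision_recall(predicted: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall for one gene's annotations."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 1.0
    recall = tp / len(truth) if truth else 1.0
    return precision, recall


# Toy hierarchy with FunCat-style codes (made-up scores, not real data).
parent = {"01": None, "02": None, "01.01": "01", "01.02": "01"}
prob = {"01": 0.9, "02": 0.3, "01.01": 0.4, "01.02": 0.2}
truth = {"01", "01.01"}

for threshold in (0.5, 0.25):
    predicted = predict_top_down(parent, prob, threshold)
    print(threshold, sorted(predicted), precision_recall(predicted, truth))
```

In this toy case, lowering the threshold from 0.5 to 0.25 adds the true child class "01.01" but also the false class "02", so recall rises while precision falls. A cost-sensitive parameter plays an analogous role when positive annotations are sparse.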