Inferring biological networks with output kernel trees
Pierre Geurts, Nizar Touleimat, Marie Dutreix and Florence d'Alché-Buc
Background: Elucidating biological networks between
proteins is one of the most important challenges in systems biology. Computational approaches to this problem are needed to complement high-throughput technologies and to help biologists design new experiments. In this work, we focus on the
completion of a biological network from various sources of experimental data.
Results: We propose a new machine learning approach for
the supervised inference of biological networks, which is based on a kernelization of the output of regression trees. It inherits several features of tree-based algorithms, such as interpretability, robustness to irrelevant variables, and input scalability. We applied this method to the inference of
two kinds of networks: a protein-protein interaction network and an enzyme network. In both cases, we obtain results competitive with existing approaches. Our method also provides insights into the input data, indicating which of them are potentially related to the existence of interactions. Finally, we show
the biological validity of our predictions in the context of an analysis of gene expression data.
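As an illustration of what kernelizing the output of a regression tree can look like, the following minimal sketch scores a candidate split by the reduction of output variance, computed in the kernel feature space from Gram-matrix entries only. It assumes a variance-reduction split criterion; the function names (`kernel_variance`, `best_split`), the single input feature, and the toy Gram matrix are hypothetical and are not taken from this abstract.

```python
def kernel_variance(K, idx):
    """Output variance in kernel feature space for the samples in idx.

    Only Gram-matrix entries are needed:
    var = mean_i K[i][i] - mean_{i,j} K[i][j]
    """
    n = len(idx)
    if n == 0:
        return 0.0
    diag = sum(K[i][i] for i in idx) / n
    full = sum(K[i][j] for i in idx for j in idx) / (n * n)
    return diag - full


def best_split(x, K):
    """Return (threshold, gain) maximizing kernelized variance reduction
    for a single input feature x (one value per sample)."""
    n = len(x)
    all_idx = list(range(n))
    total = kernel_variance(K, all_idx)
    best_thr, best_gain = None, 0.0
    values = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [i for i in all_idx if x[i] < thr]
        right = [i for i in all_idx if x[i] >= thr]
        gain = (total
                - len(left) / n * kernel_variance(K, left)
                - len(right) / n * kernel_variance(K, right))
        if gain > best_gain:
            best_thr, best_gain = thr, gain
    return best_thr, best_gain


# Hypothetical toy data: four proteins, one input feature, and an output
# Gram matrix in which proteins {0, 1} and {2, 3} form two similar groups.
x = [0.1, 0.2, 0.8, 0.9]
K = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]

thr, gain = best_split(x, K)
print(thr, gain)  # prints: 0.5 0.5
```

In this toy case the selected threshold separates the two groups that the output kernel treats as similar, so both children have zero kernelized variance. A full tree would apply the same scoring recursively over all input features.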
Conclusions: The output kernel tree method is a simple
and efficient technique for the inference of biological networks from experimental data. Its main strengths are its simplicity and interpretability, which should make it of great value to biologists.