Exploiting long-range dependencies in protein beta-sheet secondary structure prediction
Yizhao Ni and Mahesan Niranjan
In: Proceedings of the 5th IAPR International Conference on Pattern Recognition in Bioinformatics, Nijmegen, The Netherlands (2010).
We investigate whether interactions of longer range than those typically considered in local protein secondary structure prediction methods can be captured in a simple machine learning framework to improve the prediction of beta-sheets. Using support vector machines and recursive feature elimination, we show that the small signals available in long-range interactions can indeed be exploited. The improvement is small but statistically significant on the benchmark datasets we used. We also show that feature selection within a long window and over amino acids at specific positions typically selects amino acids that have been shown to be more relevant to the initiation and termination of beta-sheet formation.
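
To make "amino acids at specific positions" within a window concrete, here is a minimal sketch of one common way to encode a residue window as position-specific one-hot features. It is an illustration under assumptions, not the encoding used in the paper: the function names `window_features` and `to_vector`, the half-width parameter, the example sequence, and the choice to leave out-of-range positions as all zeros are all hypothetical. Each feature is indexed by a (window offset, amino acid) pair, which is the kind of feature a selection procedure such as recursive feature elimination could rank or discard.

```python
# Illustrative sketch only: the encoding, names, and window size are assumptions,
# not details taken from the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def window_features(sequence, center, half_width):
    """Return {(offset, amino_acid): 1} for residues around `center`.

    `offset` runs from -half_width to +half_width. Positions that fall
    outside the sequence, or that hold a non-standard residue, add nothing.
    """
    features = {}
    for offset in range(-half_width, half_width + 1):
        pos = center + offset
        if 0 <= pos < len(sequence) and sequence[pos] in AMINO_ACIDS:
            features[(offset, sequence[pos])] = 1
    return features


def to_vector(features, half_width):
    """Flatten the sparse features into a dense 0/1 list.

    The vector has (2 * half_width + 1) * 20 entries, one block of 20
    per window position, ordered from the leftmost offset to the rightmost.
    """
    n_aa = len(AMINO_ACIDS)
    vector = [0] * ((2 * half_width + 1) * n_aa)
    for (offset, aa), value in features.items():
        vector[(offset + half_width) * n_aa + AMINO_ACIDS.index(aa)] = value
    return vector


# Hypothetical example sequence; a "long" window would use a larger half_width.
example_sequence = "MKVLAAGIV"
example_features = window_features(example_sequence, center=4, half_width=2)
example_vector = to_vector(example_features, half_width=2)
```

Vectors of this form could then be passed to a linear support vector machine wrapped in a feature elimination loop (for instance scikit-learn's `RFE` with `LinearSVC`, which is an assumption about tooling rather than something the abstract specifies). Widening `half_width` is what would expose longer-range positions to the selection step.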