PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Pattern analysis for the prediction of fungal pro-peptide cleavage sites
Sureyya Ozogur, John Shawe-Taylor, G.W. Weber and Z.B. Ögel
Discrete Applied Mathematics 2008. ISSN 0166-218X


Support vector machines (SVMs) have many applications in the analysis of biological data, from gene expression arrays to EEG signals of sleep stages. In this paper, we develop an application that supports the prediction of the pro-peptide cleavage site of fungal extracellular proteins, which mostly display a monobasic or dibasic processing site. Many secretory proteins and peptides are synthesized as inactive precursors and become active after post-translational processing. A collection of fungal proprotein sequences is used as the training data set. A specifically designed kernel is expressed as an application of the well-known Gaussian kernel via feature spaces defined for our problem. Rather than fixing the kernel parameters by cross-validation or other methods, we introduce a novel approach that performs model selection simultaneously with the testing of accuracy and of confidence levels. This leads to higher accuracy at significantly reduced training times. The results of this study are compared with those of the ProP1.0 server, which predicts pro-peptide cleavage sites. A similar mathematical approach may be adapted to pro-peptide cleavage prediction in other eukaryotes.
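As a rough illustration of what "a Gaussian kernel applied via a problem-specific feature space" can look like, the sketch below encodes a fixed-length residue window around a candidate cleavage site and evaluates a Gaussian kernel on the encoded vectors. The one-hot feature map, the window length, the function names and the value of sigma are all assumptions made for this example; the abstract does not specify the feature spaces actually used in the paper.

```python
import math

# The 20 standard amino acids, in a fixed order used for encoding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_window(window):
    """Hypothetical feature map: one-hot encode each residue of a window.

    This is an illustrative stand-in, not the feature space from the paper.
    Residues outside AMINO_ACIDS are encoded as all zeros.
    """
    features = []
    for residue in window:
        features.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return features


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def window_kernel(w1, w2, sigma=1.0):
    """Gaussian kernel evaluated on the encoded residue windows."""
    if len(w1) != len(w2):
        raise ValueError("windows must have the same length")
    return gaussian_kernel(one_hot_window(w1), one_hot_window(w2), sigma)


# Two made-up windows that differ in one position; "KR" stands in for a
# dibasic site. The sequences are invented for illustration only.
similarity = window_kernel("AKRS", "AKRA", sigma=1.0)
```

In this sketch sigma is simply fixed by hand. The paper instead describes selecting the kernel parameters as part of a combined model selection and testing procedure, which is not reproduced here.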

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 4776
Deposited By: Sureyya Ozogur
Deposited On: 24 March 2009