Counting-based Output Prediction for Orphan Screening
We investigate orphan screening, the search for small-molecule ligands of proteins for which no binding ligands are known in advance. Predicting interactions between biologically active molecules is an important step towards effective drug discovery. We propose novel classification and ranking algorithms for orphan screening that are based on counting feature combinations in molecular fingerprints. For training, we use only positive examples together with additional knowledge about the proteins and ligands under consideration. This knowledge is available in the form of protein similarity values in a database of molecular compounds. Our algorithms run in time linear in the number of unlabelled examples.