PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Computing Positional Oligomer Importance Matrices (POIMs)
Alexander Zien, Petra Philips and Sören Sonnenburg
(2007) Technical Report. Fraunhofer Institute, Berlin, Germany.

Abstract

We show how to efficiently compute Positional Oligomer Importance Matrices (POIMs), a novel and powerful way to extract, rank, and visualize higher-order (i.e., oligonucleotide) compositional information for nucleotide sequences. Given a scoring function for nucleotide sequences that is linear with respect to position-wise occurrences of oligomers, POIMs quantify the increase (or decrease) of the expected score caused by information about each k-mer at each position. We demonstrate how to obtain a recursive algorithm that computes POIMs efficiently by using string index data structures. This is especially useful for scoring functions whose linear weighting is sparse, as is the case for the scoring functions produced by string kernel classifiers.
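
To make the quantity described in the abstract concrete, the following is a minimal brute-force sketch, not the recursive algorithm of the report. It assumes sequences drawn uniformly at random over the alphabet ACGT (the abstract does not specify the distribution), and it represents the linear scoring function as a hypothetical dictionary `weights` that maps (oligomer, position) pairs to weights. The names `score` and `poim_brute_force` are illustrative only. For each k-mer and position, the sketch computes the expected score over sequences containing that k-mer at that position, minus the overall expected score.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA alphabet


def score(seq, weights):
    """Linear score: sum the weights of all (oligomer, position) pairs that occur in seq."""
    return sum(w for (oligo, pos), w in weights.items()
               if seq[pos:pos + len(oligo)] == oligo)


def poim_brute_force(weights, length, k):
    """Enumerate all sequences of the given length (feasible only for tiny lengths).

    Returns a dict mapping (kmer, position) to
    E[score | kmer occurs at position] - E[score],
    with expectations taken under the uniform distribution (an assumption).
    """
    seqs = ["".join(s) for s in product(ALPHABET, repeat=length)]
    scores = {s: score(s, weights) for s in seqs}
    mean = sum(scores.values()) / len(seqs)

    poim = {}
    for pos in range(length - k + 1):
        for kmer in ("".join(t) for t in product(ALPHABET, repeat=k)):
            matching = [scores[s] for s in seqs if s[pos:pos + k] == kmer]
            poim[(kmer, pos)] = sum(matching) / len(matching) - mean
    return poim


# Hypothetical sparse weighting: "GA" at position 1 is rewarded, "T" at position 0 is penalized.
weights = {("GA", 1): 2.0, ("T", 0): -1.0}
poim = poim_brute_force(weights, length=4, k=2)
```

The enumeration grows as 4^length, which is why it is usable only as an illustration on toy inputs; the report's contribution is precisely to avoid this cost through recursion over string index data structures when the weighting is sparse.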

EPrint Type: Monograph (Technical Report)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 3275
Deposited By: Sören Sonnenburg
Deposited On: 05 February 2008