Computing Positional Oligomer Importance Matrices (POIMs)
Alexander Zien, Petra Philips and Sören Sonnenburg
Fraunhofer Institute, Berlin, Germany.
We show how to efficiently compute Positional Oligomer Importance Matrices (POIMs), a novel and powerful way to extract, rank, and visualize higher-order (i.e. oligonucleotide) compositional information for nucleotide sequences. Given a scoring function for nucleotide sequences that is linear w.r.t. positionwise occurrences of oligomers, POIMs quantify the increase (or decrease) in the expected score caused by knowing that a given k-mer occurs at a given position. We derive a recursive algorithm that computes POIMs efficiently by using string index data structures. This is especially useful for scoring functions whose linear weighting is sparse, as is the case for the scoring functions produced by string kernel classifiers.
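To make the quantity concrete, the following minimal sketch computes a POIM by brute-force enumeration for a toy scoring function. It is not the recursive, string-index-based algorithm described above; it only illustrates the definition, namely the expected score given that a k-mer occurs at a position, minus the unconditional expected score. The uniform i.i.d. distribution over sequences, the names `score` and `poim_brute_force`, and the representation of the sparse weights as a dictionary keyed by (oligomer, position) are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGT"


def score(seq, weights):
    """Linear score: sum the weights of all (oligomer, position) pairs that occur in seq."""
    return sum(
        w
        for (oligomer, pos), w in weights.items()
        if seq[pos:pos + len(oligomer)] == oligomer
    )


def poim_brute_force(weights, seq_len, k):
    """Return {(kmer, pos): E[score | kmer at pos] - E[score]}.

    Assumes sequences are uniformly distributed (an illustrative assumption).
    Enumerates all 4**seq_len sequences, so it is only usable for tiny seq_len.
    """
    seqs = ["".join(s) for s in product(ALPHABET, repeat=seq_len)]
    scores = {s: score(s, weights) for s in seqs}
    mean_score = sum(scores.values()) / len(seqs)

    poim = {}
    for pos in range(seq_len - k + 1):
        for kmer in map("".join, product(ALPHABET, repeat=k)):
            # Scores of all sequences that carry this k-mer at this position.
            matching = [scores[s] for s in seqs if s[pos:pos + k] == kmer]
            poim[(kmer, pos)] = sum(matching) / len(matching) - mean_score
    return poim


# Toy sparse weighting: a single dimer "AC" at position 0 with weight 1.
toy_weights = {("AC", 0): 1.0}
poim_1mers = poim_brute_force(toy_weights, seq_len=3, k=1)
poim_2mers = poim_brute_force(toy_weights, seq_len=3, k=2)
```

In this toy case, 1-mers that are compatible with the weighted dimer ("A" at position 0, "C" at position 1) receive positive importance, incompatible 1-mers at those positions receive negative importance, and every 1-mer at position 2 receives zero because the score does not depend on that position. The exponential cost of the enumeration is what an efficient algorithm has to avoid.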