Identifying interface elements implied in protein-protein interactions using statistical tests and Frequent Item Sets
Christine Martin and Antoine Cornuéjols
In: BIBM-08 (The IEEE International Conference on Bioinformatics and Biomedicine), 3-5 Nov 2008, Philadelphia, USA.
Understanding the characteristics of protein-protein interfaces is at the core of numerous applications. This paper introduces a method in which proteins are described by geometrical elements of their surface. Starting from a database of known interfaces, the method identifies the elements, and combinations of elements, that are characteristic of the interfaces. It does so by combining a frequent item set technique with statistical tests that ensure a marked difference from a null hypothesis. Compared with techniques that operate as "black boxes", this approach yields results that are easy to interpret. Furthermore, it is naturally suited to discovering disjunctive concepts, i.e. concepts that arise from different underlying processes. Results obtained on a set of 459 protein-protein interfaces from the PDB are consistent with current knowledge about protein-protein interfaces.
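The general idea of combining frequent item sets with a statistical test can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each interface is assumed to be a set of element labels, the labels "A" to "D" and the null frequencies are invented toy values, and a one-sided exact binomial test stands in for whatever tests the paper actually uses. The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def frequent_item_sets(transactions, min_support, max_size=2):
    """Count item sets of size 1..max_size and keep those that occur
    in at least min_support transactions."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for item_set in combinations(items, size):
                counts[item_set] = counts.get(item_set, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}


def binomial_upper_tail(k, n, p0):
    """Exact P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def characteristic_item_sets(transactions, null_freq, min_support, alpha=0.05):
    """Keep frequent item sets whose observed count is significantly
    higher than expected under the null frequency (assumed given)."""
    n = len(transactions)
    result = {}
    for item_set, count in frequent_item_sets(transactions, min_support).items():
        p0 = null_freq.get(item_set)
        if p0 is None:
            continue  # no null estimate available for this item set
        p_value = binomial_upper_tail(count, n, p0)
        if p_value < alpha:
            result[item_set] = (count, p_value)
    return result


# Toy data (assumption): six interfaces described by hypothetical element labels.
interfaces = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"A", "B", "D"},
    {"B", "D"},
    {"A", "B"},
]
# Hypothetical null frequencies, e.g. estimated from non-interface surface.
null_freq = {("A",): 0.3, ("B",): 0.3, ("A", "B"): 0.09}

selected = characteristic_item_sets(interfaces, null_freq, min_support=3)
for item_set, (count, p_value) in sorted(selected.items()):
    print(item_set, count, p_value)
```

Because every retained item set is an explicit combination of elements with its own count and p-value, the output can be read directly, and several unrelated item sets can be retained side by side, which is one way to picture the disjunctive concepts mentioned in the abstract.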