Identifying interface elements implied in protein-protein interactions using statistical tests and Frequent Item Sets

Abstract

Understanding the characteristics of protein-protein interfaces is at the core of numerous applications. This paper introduces a method in which proteins are described by geometrical elements of their surface. Starting from a database of known interfaces, the method identifies the elements, and combinations thereof, that are characteristic of the interfaces. It does so with a frequent item set technique, combined with statistical tests that ensure a marked difference from a null hypothesis. Unlike techniques that operate as "black boxes", this approach yields results that are easy to interpret. Furthermore, it is naturally suited to discovering disjunctive concepts, i.e. different underlying processes. Results obtained on a set of 459 protein-protein interfaces from the Protein Data Bank (PDB) are consistent with current knowledge about protein-protein interfaces.
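
The combination of frequent item sets and statistical testing summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which test or null model is used, so the one-sided binomial test against a background frequency is an assumption, as are the element labels, the toy data, and all function names. No multiple-testing correction is applied.

```python
from itertools import combinations
from math import comb


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset of up to max_size elements and keep those
    occurring in at least min_support transactions (brute force)."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}


def support(itemset, transactions):
    """Number of transactions that contain every element of itemset."""
    return sum(1 for t in transactions if set(itemset) <= set(t))


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def characteristic_itemsets(interfaces, background, min_support, alpha=0.05):
    """Keep frequent itemsets that are over-represented in the interfaces
    relative to their frequency in the background (assumed null model)."""
    n = len(interfaces)
    results = []
    for itemset, k in frequent_itemsets(interfaces, min_support).items():
        p0 = support(itemset, background) / len(background)
        p_value = binomial_upper_tail(k, n, p0)
        if p_value < alpha:
            results.append((itemset, k, p_value))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: each set lists the surface elements of one patch.
interfaces = [
    {"hydrophobic", "concave"},
    {"hydrophobic", "concave"},
    {"hydrophobic", "concave", "charged"},
    {"hydrophobic", "concave"},
    {"charged", "concave"},
    {"hydrophobic", "concave"},
]
background = [
    {"concave"}, {"charged"}, {"hydrophobic"}, {"charged", "concave"},
    {"concave", "hydrophobic"}, {"charged"}, {"concave"}, {"hydrophobic"},
]

results = characteristic_itemsets(interfaces, background, min_support=4)
for itemset, count, p_value in results:
    print(itemset, count, f"{p_value:.4g}")
```

Each retained item set is a readable combination of elements, which is what makes this kind of output easier to interpret than a black-box score. Several unrelated item sets can be retained at once, which is how disjunctive concepts would show up in such a sketch.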