Correlations and co-occurrences of taxa: the role of temporal, geographic and taxonomic restrictions
Aleksi Kallio, Kai Puolamäki, Mikael Fortelius and Heikki Mannila
Correlation between occurrences of taxa is a fundamental concept in the analysis of presence-absence data. Such correlations can result from ecologically relevant processes, such as the existence and evolution of species communities. Correlations are typically quantified by a similarity index based on co-occurrence counts. We make two arguments. First, individual values of a similarity index are not useful as such; rather, one has to be able to estimate the statistical significance of the index value. Second, before computing the correlations one has to carefully select the underlying base set of locations over which the co-occurrence counts, similarity indices, and their significance are computed. We demonstrate base set selection with synthetic examples and conclude with an analysis of real data from a large database of fossil land mammals.
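
To make the two arguments concrete, the following is a minimal sketch under stated assumptions, not the method used in the paper. It assumes the Jaccard index as the similarity index and a simple permutation null model in which one taxon's occurrences are shuffled uniformly over the base set; the data and the helper names (`jaccard`, `permutation_p_value`, `restrict`) are hypothetical. Two taxa occur only within the first 10 of 40 locations. The index value is the same for both base sets, but its estimated significance is not.

```python
import random

def jaccard(a, b):
    """Jaccard index of two 0/1 presence vectors over the same locations."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def permutation_p_value(a, b, n_perm=2000, seed=0):
    """One-sided p-value for the observed Jaccard index.

    Assumed null model (illustrative only): the occurrences of taxon b
    are placed uniformly at random among the locations of the base set.
    """
    rng = random.Random(seed)
    observed = jaccard(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if jaccard(a, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

def restrict(a, b, base):
    """Keep only the locations whose indices are in the base set."""
    return [a[i] for i in base], [b[i] for i in base]

# Hypothetical data: 40 locations, both taxa confined to the first 10
# (for example, the only locations of the right age or region).
a_full = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0] + [0] * 30
b_full = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0] + [0] * 30
base = range(10)
a_base, b_base = restrict(a_full, b_full, base)

index_full = jaccard(a_full, b_full)
index_base = jaccard(a_base, b_base)
p_full = permutation_p_value(a_full, b_full)
p_base = permutation_p_value(a_base, b_base)

print("Jaccard, all locations:", index_full, "p =", p_full)
print("Jaccard, restricted base set:", index_base, "p =", p_base)
```

Under these assumptions, the co-occurrence looks significant when all 40 locations form the base set, because shared absences outside the taxa's common range make any overlap seem unlikely by chance. Restricting the base set to the 10 relevant locations leaves the index unchanged but makes the same overlap unremarkable.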