PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

On a new correlation coefficient, its orthogonal decomposition and associated tests of independence
Wicher Bergsma
Annals of Statistics, pp. 1-45, 2006.


A possible drawback of the ordinary correlation coefficient $\rho$ for two real random variables $X$ and $Y$ is that zero correlation does not imply independence. In this paper we introduce a new correlation coefficient $\rho^*$ which assumes values between zero and one, equalling zero iff the two variables are independent and equalling one iff the two variables are linearly related. The coefficients $\rho^*$ and $\rho^2$ are shown to be closely related algebraically, and they coincide for distributions on a $2\times 2$ contingency table. We derive an orthogonal decomposition of $\rho^*$ as a positively weighted sum of squared ordinary correlations between certain marginal eigenfunctions. Estimation of $\rho^*$ and its component correlations and their asymptotic distributions are discussed, and we develop visual tools for assessing the nature of a possible association in a bivariate data set. The paper includes consideration of grade (rank) versions of $\rho^*$ as well as the use of $\rho^*$ for contingency table analysis. As a special case a new generalization of the Cramér-von Mises test to $K$ ordered samples is obtained.
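The drawback named in the first sentence of the abstract can be checked on a small textbook example. The sketch below is not taken from the paper: the distribution ($X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$) and the names `joint`, `expect`, `cov` and `independent` are illustrative assumptions. It uses exact rational arithmetic to show that the covariance, and hence $\rho$, is zero even though $Y$ is a function of $X$. It does not compute $\rho^*$.

```python
from fractions import Fraction
from itertools import product

# Illustrative assumption (not from the paper): X is uniform on {-1, 0, 1}
# and Y = X**2, so Y is completely determined by X.
support = [-1, 0, 1]
joint = {(x, x * x): Fraction(1, 3) for x in support}


def expect(f):
    """Exact expectation of f(X, Y) under the joint distribution."""
    return sum(prob * f(x, y) for (x, y), prob in joint.items())


# Covariance E[XY] - E[X]E[Y]; the ordinary correlation is zero iff this is zero.
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Marginal distributions of X and Y.
px, py = {}, {}
for (x, y), prob in joint.items():
    px[x] = px.get(x, 0) + prob
    py[y] = py.get(y, 0) + prob

# Independence requires P(X=x, Y=y) == P(X=x) * P(Y=y) for every pair (x, y).
independent = all(
    joint.get((x, y), 0) == px[x] * py[y] for x, y in product(px, py)
)

print("covariance:", cov)
print("independent:", independent)
```

Here the covariance vanishes because $E[X^3] = 0$ and $E[X] = 0$, while the factorisation check fails, for example at $(x, y) = (0, 0)$. According to the abstract, $\rho^*$ would be strictly positive for such a pair.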

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
Theory & Algorithms
ID Code: 2888
Deposited By: Wicher Bergsma
Deposited On: 23 November 2006