PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Datasets of the Causation and Prediction Challenge
Isabelle Guyon, Constantin Aliferis, Greg Cooper, Andre Elisseeff, Jean-Philippe Pellet, Peter Spirtes and Alexander Statnikov
(2008) Technical Report. Clopinet, Berkeley, California, USA.


We prepared four datasets for the first causality challenge, which we organized for the World Congress on Computational Intelligence (WCCI 2008). The challenge, entitled “Causation and Prediction”, focused on evaluating causal modeling techniques that aim to predict the effect of “interventions” performed by an external agent. Examples of this problem arise in medicine, where one wants to predict the effect of a drug before administering it, and in econometrics, where one wants to predict the effect of a new policy before issuing it. We concentrated on a given target variable to be predicted (e.g., the health status of a patient) from a number of candidate predictive variables or “features” (e.g., risk factors in the medical domain). We limited ourselves to binary target variables (two-class classification problems), but the input variables are either binary or continuous. For each task, a training set drawn from a “natural” distribution is provided, along with three test sets: one drawn from the same distribution as the training set, and two obtained after an external agent manipulated certain variables (i.e., set them to arbitrary values not drawn from the natural distribution). The target variable itself is never manipulated, and the external agent's interventions are assumed not to alter the mechanisms by which one variable is determined by the values of others. The participants were asked to provide predictions of the target variable on the test data, together with the list of variables (features) used to make those predictions. The challenge platform remains open for post-challenge submissions (see ). The datasets were also used for the task LOCANET, which was part of the second causality challenge we organized for the Neural Information Processing Systems conference (NIPS 2008). The goal of LOCANET was to uncover the LOcal CAusal NETwork around the target. This report was not available to the participants of the challenges.
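The difference between the natural and the manipulated test sets can be illustrated with a toy simulation. The sketch below is not taken from the challenge datasets: the three-variable chain (cause, target, effect), the noise probabilities, and all function and variable names are assumptions chosen for illustration only. It shows why a feature that is an effect of the target can predict well under the natural distribution yet become uninformative once an external agent sets its value, while a direct cause remains equally predictive.

```python
import random


def sample(n, rng, manipulate_effect=False):
    """Draw n rows (cause, effect, target) from a toy chain cause -> target -> effect.

    All probabilities are illustrative assumptions. When manipulate_effect is True,
    an external agent sets the effect variable to a random value, independently of
    the target. The target is never manipulated and the cause -> target mechanism
    is left unchanged.
    """
    rows = []
    for _ in range(n):
        cause = rng.random() < 0.5
        # The target copies its cause 80% of the time.
        target = cause if rng.random() < 0.8 else not cause
        if manipulate_effect:
            # Intervention: value assigned by the agent, not by the mechanism.
            effect = rng.random() < 0.5
        else:
            # Natural mechanism: the effect copies the target 90% of the time.
            effect = target if rng.random() < 0.9 else not target
        rows.append((int(cause), int(effect), int(target)))
    return rows


def accuracy(rows, feature_index):
    """Accuracy of the trivial predictor 'target = feature' on the given rows."""
    return sum(row[feature_index] == row[2] for row in rows) / len(rows)


rng = random.Random(0)
natural_test = sample(20000, rng)
manipulated_test = sample(20000, rng, manipulate_effect=True)

CAUSE, EFFECT = 0, 1
acc_cause_natural = accuracy(natural_test, CAUSE)
acc_cause_manipulated = accuracy(manipulated_test, CAUSE)
acc_effect_natural = accuracy(natural_test, EFFECT)
acc_effect_manipulated = accuracy(manipulated_test, EFFECT)

print("cause  as predictor:", acc_cause_natural, acc_cause_manipulated)
print("effect as predictor:", acc_effect_natural, acc_effect_manipulated)
```

In this toy setting the effect is the better predictor on natural data, but only the cause keeps its accuracy on manipulated data, which is the kind of distinction the manipulated test sets are meant to expose.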

EPrint Type: Monograph (Technical Report)
Additional Information: This report supplements the paper “Design and Analysis of the Causation and Prediction Challenge” by the same authors, JMLR W&CP 3:1-33, 2008.
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 4566
Deposited By: Isabelle Guyon
Deposited On: 13 March 2009