PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

An integral approach to causal inference with latent variables
Sam Maes, Stijn Meganck and Philippe Leray
In: Causality and Probability in the Sciences (2007) London College Publications, pp. 17-41. ISBN 1-904987-35-4


This article discusses graphical models that can handle latent variables without explicitly modeling them quantitatively. In the field of uncertainty in artificial intelligence, several paradigms exist for such problem domains. Two of them are semi-Markovian causal models and maximal ancestral graphs. Applying these techniques to a problem domain typically consists of several steps: structure learning from observational and experimental data, parameter learning, probabilistic inference, and quantitative causal inference. The main problem is that each of the existing approaches focuses on only one or a few of the steps involved in modeling a problem that includes latent variables. The goal of this article is to investigate the integral process, from observational and experimental data through to different types of efficient inference.

Semi-Markovian causal models (SMCMs) (Pearl, 2000; Tian and Pearl, 2002a) are an approach developed by Tian and Pearl. They are specifically suited for performing quantitative causal inference in the presence of latent variables. However, at this time no efficient parametrisation of such models is provided and there are no techniques for performing efficient probabilistic inference. Furthermore, there are no techniques to learn these models from data obtained from observations, experiments or both.

Maximal ancestral graphs (MAGs) (Richardson and Spirtes, 2002) are an approach developed by Richardson and Spirtes. They are specifically suited for structure learning in the presence of latent variables from observational data. However, the techniques only learn up to Markov equivalence and provide no clues on which additional experiments to perform in order to obtain the fully oriented causal graph. See Eberhardt et al. (2005) and Meganck et al. (2006) for results of that type for Bayesian networks without latent variables.
Furthermore, as yet no parametrisation for discrete variables is provided for MAGs and no techniques for probabilistic inference have been developed. There is some work on algorithms for causal inference, but it is restricted to causal inference quantities that are the same for an entire Markov equivalence class of MAGs (Spirtes et al., 2000; Zhang, 2006).

We have chosen to use SMCMs as the final representation in our work, because they are the only formalism that allows causal inference to be performed while fully taking into account the influence of latent variables. However, we will combine existing techniques for learning MAGs with newly developed methods to provide an integral approach that uses both observational data and experiments in order to learn fully oriented semi-Markovian causal models. Furthermore, we have developed an alternative representation for the probability distribution represented by an SMCM, together with a parametrisation for this representation, where the parameters can be learned from data with classical techniques. Finally, we discuss how probabilistic and quantitative causal inference can be performed in these models with the help of the alternative representation and its associated parametrisation.

The next section introduces the necessary notations and definitions. It also discusses the semantic and other differences between SMCMs and MAGs. In Section 3, we discuss structure learning for SMCMs. Then we introduce a new representation for SMCMs that can easily be parametrised. We also show how both probabilistic and causal inference can be performed with the help of this new representation.

EPrint Type: Book Section
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 3766
Deposited By: Philippe Leray
Deposited On: 21 February 2008