PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Causal graphical models with latent variables: learning and inference
Philippe Leray, Stijn Meganck, Sam Maes and Bernard Manderick
In: Innovations in Bayesian Networks: Theory and Applications (2008), Springer, pp. 1-33.


This chapter discusses causal graphical models for discrete variables that can handle latent variables without explicitly modeling them quantitatively. In the \textit{uncertainty in artificial intelligence} area there exist several paradigms for such problem domains, two of which are \textit{semi-Markovian causal models} and \textit{maximal ancestral graphs}. Applying these techniques to a problem domain typically consists of several steps: structure learning from observational and experimental data, parameter learning, probabilistic inference, and quantitative causal inference. We start this chapter by introducing causal graphical models without latent variables and then move on to models with latent variables. We discuss the problem that each of the existing approaches to causal modeling with latent variables covers only one or a few of the steps involved in a generic knowledge discovery process. The goal of this chapter is to investigate the complete process from observational and experimental data to different types of efficient inference.

Semi-Markovian causal models (SMCMs) are an approach developed in \citep{Pearl2000,TianEtAl2002}. They are specifically suited for performing quantitative causal inference in the presence of latent variables. However, at this time no efficient parametrisation of such models has been provided, and there are no techniques for performing efficient probabilistic inference. Furthermore, there are no techniques to learn these models from data obtained from observations, experiments, or both.

Maximal ancestral graphs (MAGs) are an approach developed in \citep{RichardsonEtAL2002}. They are specifically suited for structure learning from observational data in the presence of latent variables. However, the existing techniques learn the structure only up to Markov equivalence and provide no guidance on which additional experiments to perform in order to obtain the fully oriented causal graph. See \cite{EberhardtEtAl2005,MeganckEtAl2006a} for results of that type for Bayesian networks without latent variables. Furthermore, as of yet no parametrisation for discrete variables has been provided for MAGs, and no techniques for probabilistic inference have been developed. There is some work on algorithms for causal inference, but it is restricted to causal inference quantities that are the same for an entire Markov equivalence class of MAGs \citep{SpirtesEtAl2000a,Zhang2006}.

We have chosen SMCMs as the final representation in our work, because they are the only formalism that allows causal inference to be performed while fully taking into account the influence of latent variables. However, we combine existing techniques for learning MAGs with newly developed methods to provide an integrated approach that uses both observational data and experiments in order to learn fully oriented semi-Markovian causal models. Furthermore, we have developed an alternative representation for the probability distribution represented by an SMCM, together with a parametrisation for this representation, where the parameters can be learned from data with classical techniques. Finally, we discuss how probabilistic and quantitative causal inference can be performed in these models with the help of the alternative representation and its associated parametrisation\footnote{By the term parametrisation we understand the definition of a complete set of parameters describing the joint probability distribution that can be used efficiently in computer implementations of probabilistic inference, causal inference and learning algorithms.}.

The next section introduces the simplest causal models and their importance. Then we discuss causal models with latent variables. In section \ref{sec:learning} we discuss structure learning for those models, and in the following section we introduce techniques for learning an SMCM with the help of experiments. We then propose a new representation for SMCMs that can easily be parametrised, and show how both probabilistic and causal inference can be performed with the help of this new representation.
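As a concrete illustration of the structures discussed above, the sketch below encodes an SMCM as two edge sets: directed edges for direct causal influences between observed variables, and bidirected edges summarising a latent common cause of two observed variables (in an SMCM, each latent variable is a root with exactly two observed children). The class, method names and example graph are illustrative assumptions, not taken from the chapter.

```python
# Minimal sketch (hypothetical, not the chapter's implementation) of an
# SMCM structure over observed variables only: latent confounders are
# summarised by bidirected edges rather than modeled explicitly.

class SMCM:
    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.directed = set()    # (a, b) means a -> b: a directly causes b
        self.bidirected = set()  # frozenset({a, b}) means a <-> b: latent common cause

    def add_directed(self, a, b):
        self.directed.add((a, b))

    def add_bidirected(self, a, b):
        self.bidirected.add(frozenset({a, b}))

    def parents(self, v):
        # observed direct causes of v
        return {a for (a, b) in self.directed if b == v}

    def spouses(self, v):
        # variables sharing a latent common cause with v
        return {w for e in self.bidirected if v in e for w in e if w != v}

# Example: X and Z directly cause Y, and a latent variable confounds X and Y.
g = SMCM(["X", "Y", "Z"])
g.add_directed("X", "Y")
g.add_directed("Z", "Y")
g.add_bidirected("X", "Y")
```

With this representation, structure learning amounts to recovering the two edge sets, and the chapter's parametrisation question is how to attach parameters to such a graph so that the joint distribution over the observed variables can be computed.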

EPrint Type: Book Section
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 3765
Deposited By: Philippe Leray
Deposited On: 21 February 2008