Fiveable

📊Causal Inference Unit 9 Review

9.3 Structural causal models (SCMs)

Written by the Fiveable Content Team • Last updated September 2025

Structural causal models (SCMs) represent cause-and-effect relationships by combining a graph with equations that describe how each variable is generated from its direct causes. With these two pieces in place, we can estimate causal effects and answer "what if" questions.

SCMs help us move beyond correlations to the mechanisms that produce them. By making our assumptions about causal structure explicit, they provide a framework for inferring causal effects from data, evaluating interventions, and reasoning about counterfactuals in complex systems.

Structural causal models (SCMs)

  • SCMs provide a formal framework for representing and reasoning about causal relationships in a system
  • They combine graphical models with structural equations to encode causal assumptions and enable causal inference
  • SCMs are a key tool in the field of Causal Inference for estimating causal effects and answering counterfactual questions

Definition of SCMs

  • An SCM is a mathematical model that describes the causal relationships between variables in a system
  • It consists of a set of variables, a directed acyclic graph (DAG) representing the causal structure, and a set of structural equations
  • The structural equations specify the functional relationships between variables and their direct causes

Components of SCMs

  • Variables: The set of variables in the system, which can be observed or unobserved
  • Directed acyclic graph (DAG): A graphical representation of the causal relationships between variables, where edges represent direct causal influences
  • Structural equations: Mathematical equations that describe how each variable is determined by its direct causes in the DAG
  • Probability distribution: A joint probability distribution over the variables in the SCM, which is induced by the distribution of the exogenous variables together with the structural equations

Endogenous vs exogenous variables

  • Endogenous variables are variables that are determined by other variables within the SCM
  • Exogenous variables are variables that are determined by factors outside the SCM and are not caused by any other variables in the model
  • Exogenous variables are often represented as root nodes in the DAG and, in the standard (Markovian) setting, are assumed to be mutually independent

Structural equations

  • Structural equations specify the functional relationships between variables in the SCM
  • They describe how each endogenous variable is determined by its direct causes and an error term
  • The error terms represent unobserved factors or randomness that influence the variable
  • Example: $Y = f(X, \epsilon_Y)$, where $Y$ is an endogenous variable, $X$ is its direct cause, and $\epsilon_Y$ is the error term
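A minimal sketch of how structural equations generate data, assuming a hypothetical three-variable SCM with graph Z → X, Z → Y, X → Y. The function names, linear forms, coefficients, and standard normal error terms are illustrative assumptions, not part of the definitions above.

```python
import random

# Hypothetical structural equations (illustrative linear forms).
def f_z(e_z):
    return e_z

def f_x(z, e_x):
    return 0.8 * z + e_x

def f_y(x, z, e_y):
    return 1.5 * x + 2.0 * z + e_y

def sample_unit(rng):
    """Draw one unit: sample the error terms, then evaluate the equations in causal order."""
    e_z, e_x, e_y = (rng.gauss(0.0, 1.0) for _ in range(3))  # independent error terms
    z = f_z(e_z)
    x = f_x(z, e_x)
    y = f_y(x, z, e_y)
    return {"Z": z, "X": x, "Y": y}

rng = random.Random(0)  # fixed seed so the draw is reproducible
data = [sample_unit(rng) for _ in range(1000)]
```

Each variable is computed only from its direct causes and its own error term, which is what makes the equations "structural" rather than merely predictive.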

Directed acyclic graphs (DAGs)

  • DAGs are graphical representations of the causal structure in an SCM
  • Nodes in the DAG represent variables, and directed edges represent direct causal influences
  • The absence of an edge between two nodes implies that there is no direct causal relationship between the corresponding variables
  • DAGs satisfy the acyclicity property, meaning there are no directed cycles in the graph

Causal Markov condition

  • The causal Markov condition is a fundamental assumption in SCMs
  • It states that a variable is independent of its non-descendants given its direct causes (parents) in the DAG
  • This assumption allows for the factorization of the joint probability distribution based on the DAG structure
  • Example: In a DAG $X \rightarrow Y \rightarrow Z$, the causal Markov condition implies that $X$ and $Z$ are independent given $Y$
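The factorization mentioned above can be written out explicitly. With $pa_i$ denoting the values of the parents of $X_i$ in the DAG (notation introduced here for illustration), the general form and its instance for the chain $X \rightarrow Y \rightarrow Z$ are:

$$
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid pa_i)
$$

$$
P(x, y, z) = P(x)\, P(y \mid x)\, P(z \mid y)
$$

Dividing the chain factorization by $P(y)$ gives $P(x, z \mid y) = P(x \mid y)\, P(z \mid y)$, which is the conditional independence stated in the example.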

Causal sufficiency

  • Causal sufficiency is an assumption in SCMs that states that all common causes of the observed variables are included in the model
  • It implies that there are no unmeasured confounders that affect multiple observed variables
  • Causal sufficiency is a strong assumption and may not always hold in practice
  • Violations of causal sufficiency can lead to biased estimates of causal effects

Causal faithfulness

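  • Causal faithfulness is the assumption that the only independence relationships in the probability distribution are those implied by the DAG, so that, together with the causal Markov condition, the independencies in the distribution correspond exactly to those read off the graph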
  • It ensures that the causal structure and the probability distribution are compatible
  • Faithfulness rules out certain types of cancellations or exact balancing of causal effects
  • Violations of faithfulness can occur in practice, but they are often considered to be rare

Representing interventions with SCMs

  • Interventions are actions that modify the causal structure of a system by setting variables to specific values
  • SCMs provide a formal framework for representing and reasoning about interventions
  • Interventions allow us to estimate the causal effects of variables on outcomes and answer counterfactual questions

Interventional distributions

  • An interventional distribution is the probability distribution of variables in an SCM after an intervention has been performed
  • It represents the distribution of the system under a specific intervention, where some variables are set to fixed values
  • Interventional distributions are denoted as $P(Y \mid do(X=x))$, where $X$ is the variable being intervened upon and set to value $x$
  • Interventional distributions can be derived from the original SCM by modifying the structural equations and the DAG

Graph mutilation

  • Graph mutilation is a process of modifying the DAG of an SCM to represent an intervention
  • When a variable $X$ is intervened upon and set to a fixed value, the incoming edges to $X$ are removed in the mutilated graph
  • The mutilated graph represents the causal structure of the system under the intervention
  • Graph mutilation is a graphical approach to deriving interventional distributions from the original SCM
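Graph mutilation is simple to express in code. The sketch below assumes the DAG is stored as a dictionary mapping each node to its list of parents; the representation and the function name `mutilate` are choices made here for illustration.

```python
def mutilate(parents, intervened):
    """Return a copy of the DAG with every edge into an intervened node removed.

    `parents` maps each node to the list of its direct causes.
    """
    return {
        node: ([] if node in intervened else list(pa))
        for node, pa in parents.items()
    }

# Toy DAG: Z -> X, Z -> Y, X -> Y
dag = {"Z": [], "X": ["Z"], "Y": ["X", "Z"]}
print(mutilate(dag, {"X"}))  # {'Z': [], 'X': [], 'Y': ['X', 'Z']}
```

The edge Z → X disappears because X is now set from outside the system, while the outgoing edge X → Y is kept.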

Truncated factorization

  • Truncated factorization is a method for deriving the interventional distribution from the original SCM
  • It involves factorizing the joint probability distribution of the SCM based on the modified DAG after an intervention
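  • For an intervention $do(X=x)$ on a set of variables $X$, the truncated factorization formula is $P(v_1, \dots, v_n \mid do(X=x)) = \prod_{i:\, V_i \notin X} P(v_i \mid pa_i)$ for all values consistent with $X=x$ (and zero otherwise), where $pa_i$ are the values of the parents of $V_i$; the factors for the intervened variables are dropped and all other factors are unchanged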
  • Truncated factorization provides a mathematical approach to deriving interventional distributions
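As a worked sketch, consider a hypothetical binary SCM with graph Z → X, Z → Y, X → Y. Under $do(X=x)$ the factor for X is dropped from the product and the remaining factors are evaluated at the fixed value x. The conditional probability tables below are made-up numbers for illustration.

```python
from itertools import product

# Hypothetical conditional probability tables (binary variables).
p_z = {0: 0.5, 1: 0.5}                       # P(Z=z)
p_x1_given_z = {0: 0.2, 1: 0.8}              # P(X=1 | Z=z): the factor dropped under do(X=x)
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y=1 | X=x, Z=z), keyed by (x, z)
                 (1, 0): 0.3, (1, 1): 0.7}

def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint_do_x(x):
    """P(z, y | do(X=x)): the product of every factor except the one for X."""
    return {
        (z, y): p_z[z] * bern(p_y1_given_xz[(x, z)], y)
        for z, y in product((0, 1), repeat=2)
    }

def p_y1_do(x):
    """P(Y=1 | do(X=x)), obtained by summing the truncated joint over z."""
    return sum(p for (z, y), p in joint_do_x(x).items() if y == 1)

print(round(p_y1_do(1), 3), round(p_y1_do(0), 3))  # 0.5 0.3
```

Note that `p_x1_given_z` is never used in the interventional computation: how X was assigned in the observational regime is irrelevant once X is set by intervention.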

Identification of causal effects

  • Identification of causal effects refers to the process of determining whether a causal effect can be estimated from observational data
  • SCMs provide graphical criteria and algorithms for identifying causal effects
  • Identification is a crucial step in causal inference, as it determines whether a causal question can be answered using the available data

Back-door criterion

  • The back-door criterion is a graphical condition for identifying causal effects in an SCM
  • It states that a set of variables $Z$ satisfies the back-door criterion relative to a pair of variables $(X, Y)$ if $Z$ blocks all back-door paths between $X$ and $Y$ (paths that start with an arrow pointing into $X$) and no node in $Z$ is a descendant of $X$
  • If a set $Z$ satisfies the back-door criterion, the causal effect of $X$ on $Y$ can be identified by adjusting for $Z$: $P(y \mid do(x)) = \sum_z P(y \mid x, z) P(z)$
  • Example: In a DAG with edges $X \leftarrow Z \rightarrow Y$ and $X \rightarrow Y$, the set $\{Z\}$ satisfies the back-door criterion relative to $(X, Y)$
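The sketch below contrasts the naive conditional probability with the back-door adjusted quantity, computed from an observational joint distribution only. It assumes a hypothetical binary model in which Z confounds X and Y and X also affects Y; all numbers and function names are illustrative.

```python
from itertools import product

def build_joint():
    """Observational joint P(z, x, y) for a hypothetical DAG Z -> X, Z -> Y, X -> Y."""
    p_z1 = 0.5
    p_x1 = {0: 0.2, 1: 0.8}                                        # P(X=1 | Z=z)
    p_y1 = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.7}    # P(Y=1 | x, z)
    joint = {}
    for z, x, y in product((0, 1), repeat=3):
        pz = p_z1 if z else 1 - p_z1
        px = p_x1[z] if x else 1 - p_x1[z]
        py = p_y1[(x, z)] if y else 1 - p_y1[(x, z)]
        joint[(z, x, y)] = pz * px * py
    return joint

def prob(joint, **fixed):
    """Probability of the event given by keyword arguments z=, x=, y=."""
    names = ("z", "x", "y")
    return sum(
        p for key, p in joint.items()
        if all(key[names.index(k)] == v for k, v in fixed.items())
    )

def naive(joint, x):
    """P(Y=1 | X=x): association, still mixed with confounding by Z."""
    return prob(joint, x=x, y=1) / prob(joint, x=x)

def backdoor(joint, x):
    """sum_z P(Y=1 | x, z) P(z): the back-door adjustment over Z."""
    return sum(
        prob(joint, z=z, x=x, y=1) / prob(joint, z=z, x=x) * prob(joint, z=z)
        for z in (0, 1)
    )

joint = build_joint()
print(round(naive(joint, 1) - naive(joint, 0), 3))        # 0.44
print(round(backdoor(joint, 1) - backdoor(joint, 0), 3))  # 0.2
```

In this toy model the unadjusted contrast more than doubles the effect obtained after adjusting for Z, because Z raises both the chance of treatment and the chance of the outcome.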

Front-door criterion

  • The front-door criterion is another graphical condition for identifying causal effects in an SCM
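  • It is useful when no set of observed variables satisfies the back-door criterion, for example because a confounder of $X$ and $Y$ is unobserved
  • The front-door criterion requires a set of variables $Z$ such that $Z$ intercepts all directed paths from $X$ to $Y$, there are no unblocked back-door paths from $X$ to $Z$, and all back-door paths from $Z$ to $Y$ are blocked by $X$
  • If the front-door criterion is satisfied, the causal effect of $X$ on $Y$ can be identified with the front-door formula $P(y \mid do(x)) = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$, which chains the effect of $X$ on $Z$ with the effect of $Z$ on $Y$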
  • Example: In a DAG $X \rightarrow Z \rightarrow Y$ with an unobserved confounder between $X$ and $Y$, the set $Z$ satisfies the front-door criterion
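A numerical check can make the front-door idea concrete. The sketch assumes a hypothetical binary model with a hidden confounder U (U → X, U → Y) and a mediator Z (X → Z → Y). The front-door estimate uses only the joint distribution of the observed variables, and is compared with the true interventional probability computed from the full model. All tables are made-up numbers.

```python
from itertools import product

# Hypothetical model; U is treated as unobserved.
P_U1 = 0.5
P_X1_U = {0: 0.2, 1: 0.8}                                       # P(X=1 | U=u)
P_Z1_X = {0: 0.1, 1: 0.9}                                       # P(Z=1 | X=x)
P_Y1_ZU = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.8}  # P(Y=1 | z, u)

def bern(p_one, v):
    return p_one if v == 1 else 1.0 - p_one

# Observational joint over the observed variables (x, z, y); U is summed out.
obs = {
    (x, z, y): sum(
        bern(P_U1, u) * bern(P_X1_U[u], x) * bern(P_Z1_X[x], z) * bern(P_Y1_ZU[(z, u)], y)
        for u in (0, 1)
    )
    for x, z, y in product((0, 1), repeat=3)
}

def p(x=None, z=None, y=None):
    """Marginal probability of the specified observed values."""
    want = (x, z, y)
    return sum(pr for key, pr in obs.items()
               if all(w is None or w == k for w, k in zip(want, key)))

def front_door(x):
    """sum_z P(z | x) * sum_x' P(Y=1 | x', z) P(x'), using observed variables only."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = p(x=x, z=z) / p(x=x)
        inner = sum(p(x=x2, z=z, y=1) / p(x=x2, z=z) * p(x=x2) for x2 in (0, 1))
        total += p_z_given_x * inner
    return total

def truth(x):
    """P(Y=1 | do(X=x)) from the full model, which requires knowing U."""
    return sum(
        bern(P_Z1_X[x], z) * sum(bern(P_U1, u) * P_Y1_ZU[(z, u)] for u in (0, 1))
        for z in (0, 1)
    )

print(round(front_door(1), 3), round(truth(1), 3))  # 0.57 0.57
```

The two numbers agree even though `front_door` never touches U, which is the point of the criterion.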

Instrumental variables

  • Instrumental variables (IVs) are a method for identifying causal effects when there are unmeasured confounders
  • An instrumental variable $Z$ is a variable that affects the treatment $X$ but does not directly affect the outcome $Y$ except through its effect on $X$
  • IVs satisfy the following conditions: (1) $Z$ is associated with $X$, (2) $Z$ does not directly affect $Y$ except through $X$, and (3) $Z$ is independent of all confounders of the $X$-$Y$ relationship
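  • If a valid instrumental variable is available, the causal effect of $X$ on $Y$ can be identified under additional assumptions (for example, linear structural equations or monotonicity); without them, an instrument generally yields only bounds on the effect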
  • Example: In a study of the effect of education on income, a person's date of birth can serve as an instrumental variable, as it affects education through compulsory schooling laws but does not directly affect income
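A small simulation can illustrate the idea, assuming linear structural equations with a constant effect (an assumption added here, under which the ratio $\mathrm{Cov}(Z, Y) / \mathrm{Cov}(Z, X)$ recovers the effect). The data-generating process, coefficients, and sample size are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance (population form) of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(0)
n = 20000
zs, xs, ys = [], [], []
for _ in range(n):
    u = rng.gauss(0, 1)                      # unobserved confounder of X and Y
    z = rng.gauss(0, 1)                      # instrument: independent of u, affects Y only via X
    x = z + u + rng.gauss(0, 1)              # treatment
    y = 2.0 * x + 3.0 * u + rng.gauss(0, 1)  # outcome; the true effect of X is 2.0
    zs.append(z)
    xs.append(x)
    ys.append(y)

ols_slope = cov(xs, ys) / cov(xs, xs)  # regression of Y on X, biased by u
iv_ratio = cov(zs, ys) / cov(zs, xs)   # instrumental variable (Wald) ratio
print(f"OLS slope: {ols_slope:.2f}, IV estimate: {iv_ratio:.2f}")
```

In this setup the regression slope converges to 3 rather than 2 because the confounder pushes X and Y in the same direction, while the IV ratio stays close to the true value of 2.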

Mediation analysis

  • Mediation analysis is a method for identifying the causal pathways through which a treatment affects an outcome
  • It decomposes the total effect of a treatment into direct and indirect effects mediated by intermediate variables
  • Mediation analysis requires assumptions about the causal structure and the absence of unmeasured confounders
  • SCMs provide a framework for conducting mediation analysis and estimating direct and indirect effects
  • Example: In a study of the effect of a drug on blood pressure, the drug may have a direct effect on blood pressure and an indirect effect mediated by changes in heart rate
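For intuition, the decomposition takes a simple form if we assume linear structural equations with no treatment-mediator interaction and no unmeasured confounding. These assumptions, the mediator symbol $M$, and the coefficients $a$, $b$, $c$ are introduced here for illustration:

$$
M = aX + \epsilon_M, \qquad Y = cX + bM + \epsilon_Y
$$

$$
\text{total effect} = \underbrace{c}_{\text{direct}} + \underbrace{ab}_{\text{indirect}}
$$

In the drug example, $X$ would be the drug, $M$ the heart rate, and $Y$ the blood pressure. Outside the linear, no-interaction case the direct and indirect effects need more careful counterfactual definitions and generally do not reduce to a product of coefficients.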

Counterfactuals in SCMs

  • Counterfactuals are statements about what would have happened under different hypothetical scenarios
  • SCMs provide a framework for reasoning about counterfactuals and estimating counterfactual quantities
  • Counterfactual reasoning is important for understanding the causal effects of interventions and for answering "what if" questions

Potential outcomes framework

  • The potential outcomes framework is a counterfactual approach to causal inference
  • It defines causal effects in terms of potential outcomes, which are the outcomes that would be observed under different treatment conditions
  • In the potential outcomes framework, each unit has a set of potential outcomes corresponding to different treatment levels
  • The causal effect is defined as the difference between the potential outcomes under different treatment conditions
  • Example: In a binary treatment setting, each unit has two potential outcomes: $Y(1)$ for the treated condition and $Y(0)$ for the untreated condition
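In the binary case this can be written compactly. Using $X$ for the treatment indicator (notation assumed here), the average treatment effect and the link between potential and observed outcomes under the consistency assumption are:

$$
\text{ATE} = E[Y(1) - Y(0)], \qquad Y = X\,Y(1) + (1 - X)\,Y(0)
$$

The second expression shows that only one of the two potential outcomes is ever observed for a given unit, which is why individual-level effects cannot be read directly from data.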

Counterfactual queries

  • Counterfactual queries are questions about what would have happened under different hypothetical scenarios
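  • SCMs allow for the computation of counterfactual quantities in three steps: abduction (use the observed evidence to update the distribution of the exogenous variables), action (modify the model to reflect the hypothetical intervention), and prediction (compute the outcome in the modified model)
  • The $do$-operator alone captures population-level interventions; counterfactual queries additionally condition on what was actually observed for the unit in question
  • Example: "What would the patient's blood pressure have been if they had not taken the drug, given that they took it and we observed blood pressure $y$?" can be written as $P(Y_{X=0} \mid X=1, Y=y)$, where $Y$ is blood pressure and $X$ is the drug treatment; the interventional quantity $P(Y \mid do(X=0))$ instead describes blood pressure if no one in the population took the drug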
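A unit-level counterfactual is usually computed in three steps: abduction (infer the unit's error term from what was observed), action (set the treatment to its hypothetical value), and prediction (re-evaluate the structural equation). The sketch below assumes a hypothetical additive structural equation for blood pressure; the functional form and numbers are illustrative only.

```python
def f_y(x, u_y):
    """Hypothetical structural equation for blood pressure (mmHg) given drug x in {0, 1}."""
    return 120.0 - 10.0 * x + u_y

def counterfactual_y(x_obs, y_obs, x_cf):
    # 1. Abduction: recover this patient's error term from the observed (x, y).
    #    Inverting the equation like this relies on the noise being additive.
    u_y = y_obs - f_y(x_obs, 0.0)
    # 2. Action: replace the observed treatment with the hypothetical one.
    # 3. Prediction: evaluate the same equation with the same error term.
    return f_y(x_cf, u_y)

# A patient took the drug (x=1) and had blood pressure 115.
print(counterfactual_y(x_obs=1, y_obs=115.0, x_cf=0))  # 125.0
```

The answer is specific to this patient: the inferred error term (here +5) is carried over to the hypothetical world, which is what distinguishes a counterfactual from a population-level interventional query.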

Twin networks

  • Twin networks are a graphical tool for representing and reasoning about counterfactuals in SCMs
  • They consist of two copies of the original SCM: one representing the actual world and the other representing the counterfactual world
  • The two networks are connected by a set of shared exogenous variables, which capture the common factors between the actual and counterfactual worlds
  • Twin networks allow for the estimation of counterfactual quantities by propagating the effects of interventions through the counterfactual network
  • Example: In a twin network for the effect of a drug on blood pressure, the actual network represents the observed treatment and outcome, while the counterfactual network represents the hypothetical scenario where the drug was not taken

Learning SCMs from data

  • Learning SCMs from data involves estimating the causal structure and parameters of the model from observational or experimental data
  • There are different approaches to learning SCMs, including causal structure learning and parameter estimation
  • Learning SCMs from data is a challenging task due to the presence of latent variables, measurement error, and limited sample sizes

Causal structure learning

  • Causal structure learning aims to infer the causal DAG from observational data
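  • It involves finding the DAG, or more often the set of DAGs, that best explains the observed dependencies and independencies in the data; observational data alone generally identify the structure only up to a Markov equivalence class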
  • Causal structure learning methods can be classified into constraint-based, score-based, and hybrid approaches
  • Constraint-based methods (PC algorithm) use conditional independence tests to identify the causal structure
  • Score-based methods (GES algorithm) search for the DAG that optimizes a scoring function, such as the Bayesian Information Criterion (BIC)
  • Hybrid methods combine constraint-based and score-based approaches to improve the accuracy and efficiency of structure learning

Constraint-based methods

  • Constraint-based methods for causal structure learning use conditional independence tests to identify the causal DAG
  • They rely on the causal Markov condition and the faithfulness assumption to infer the causal structure
  • The PC algorithm is a well-known constraint-based method that starts with a fully connected undirected graph, iteratively removes edges based on conditional independence tests, and then orients the remaining edges where the test results allow
  • Constraint-based methods are computationally efficient on sparse graphs but can be sensitive to violations of the assumptions, to errors in individual tests, and to the choice of independence test
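The edge-removal phase of a PC-style search can be sketched as follows. This is a simplified illustration, not a full implementation: it omits edge orientation, and it takes the independence test as a user-supplied function (here a hand-written oracle for the chain X → Y → Z) rather than a statistical test on data. The names `pc_skeleton` and `is_independent` are chosen here.

```python
from itertools import combinations

def pc_skeleton(nodes, is_independent):
    """Remove edges from a complete undirected graph using conditional independence queries.

    `is_independent(a, b, cond)` should return True when a and b are judged
    independent given the set `cond`.
    """
    adj = {v: set(nodes) - {v} for v in nodes}  # start fully connected
    sepset = {}                                  # separating set found for each removed edge
    size = 0
    # Grow the conditioning-set size until no node has enough neighbours left.
    while any(len(adj[a]) - 1 >= size for a in nodes):
        for a in nodes:
            for b in list(adj[a]):
                candidates = adj[a] - {b}
                for cond in combinations(sorted(candidates), size):
                    if is_independent(a, b, set(cond)):
                        adj[a].discard(b)
                        adj[b].discard(a)
                        sepset[frozenset((a, b))] = set(cond)
                        break
        size += 1
    edges = {frozenset((a, b)) for a in nodes for b in adj[a]}
    return edges, sepset

# Oracle for the chain X -> Y -> Z: the only independence is X _||_ Z given Y.
def chain_oracle(a, b, cond):
    return {a, b} == {"X", "Z"} and "Y" in cond

edges, sepset = pc_skeleton(["X", "Y", "Z"], chain_oracle)
print(sorted(sorted(e) for e in edges))  # [['X', 'Y'], ['Y', 'Z']]
```

With real data the oracle would be replaced by a statistical test, and its errors are what make the method sensitive to sample size and test choice.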

Score-based methods

  • Score-based methods for causal structure learning search for the DAG that optimizes a scoring function
  • The scoring function measures the fit of the DAG to the observed data while penalizing model complexity
  • Common scoring functions include the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC)
  • The Greedy Equivalence Search (GES) algorithm is a popular score-based method that starts with an empty graph, greedily adds edges in a forward phase while the score improves, and then removes edges in a backward phase
  • Score-based methods are less sensitive to individual independence tests but can be computationally expensive for large search spaces
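As a reference for how fit and complexity are traded off, the BIC of a candidate model is commonly written as follows, where $k$ is the number of free parameters, $n$ the sample size, and $\hat{L}$ the maximized likelihood (symbols introduced here):

$$
\text{BIC} = k \ln n - 2 \ln \hat{L}
$$

Under this sign convention lower values are better. Some software reports the negated quantity and maximizes it instead, so it is worth checking the convention before comparing scores.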

Hybrid methods

  • Hybrid methods for causal structure learning combine constraint-based and score-based approaches
  • They aim to leverage the strengths of both approaches while mitigating their weaknesses
  • Hybrid methods typically use constraint-based tests to restrict the search space (for example, the candidate neighbors of each node) and then use score-based search to select the best DAG within that restricted space
  • The Max-Min Hill-Climbing (MMHC) algorithm is an example of a hybrid method that combines the Max-Min Parents and Children (MMPC) algorithm for constraint-based pruning with a hill-climbing search for score-based optimization
  • Hybrid methods can achieve a balance between computational efficiency and robustness to assumption violations

Parameter estimation

  • Parameter estimation involves estimating the parameters of the structural equations in an SCM given the causal structure
  • It aims to find the parameter values that best fit the observed data
  • Parameter estimation can be performed using maximum likelihood estimation (MLE) or Bayesian methods
  • MLE finds the parameter values that maximize the likelihood of the observed data given the SCM structure
  • Bayesian methods specify prior distributions over the parameters and update them based on the observed data to obtain posterior distributions
  • Parameter estimation requires assumptions about the functional form of the structural equations and the distribution of the error terms
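For discrete variables with a known structure, maximum likelihood estimation reduces to counting. The sketch below assumes a single edge X → Y with binary variables and estimates $P(Y=1 \mid X=x)$ from a small made-up dataset; the function name `fit_cpt` is chosen here.

```python
from collections import Counter

def fit_cpt(samples):
    """MLE of P(Y=1 | X=x) from (x, y) pairs: the relative frequency within each stratum of X."""
    n_x = Counter(x for x, _ in samples)
    n_xy1 = Counter(x for x, y in samples if y == 1)
    return {x: n_xy1[x] / n_x[x] for x in n_x}

# Hypothetical data: four units with x=0 and four with x=1.
data = [(0, 0), (0, 0), (0, 1), (0, 0), (1, 1), (1, 1), (1, 0), (1, 1)]
print(fit_cpt(data))  # {0: 0.25, 1: 0.75}
```

A Bayesian variant would add prior pseudo-counts to both counters before dividing, which also avoids estimates of exactly 0 or 1 in small samples.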

Applications of SCMs

  • SCMs have a wide range of applications in various domains, including social sciences, epidemiology, economics, and artificial intelligence
  • They provide a principled framework for estimating causal effects, evaluating policies, and making predictions under different interventions

Causal effect estimation

  • SCMs allow for the estimation of causal effects from observational data
  • By leveraging the causal structure and assumptions encoded in the SCM, researchers can estimate the effects of interventions on outcomes
  • Causal effect estimation is crucial for understanding the impact of treatments, policies, or exposures on outcomes of interest
  • Example: Estimating the causal effect of a drug on patient outcomes using observational data from electronic health records

Policy evaluation

  • SCMs can be used to evaluate the potential impact of policies or interventions before their implementation
  • By simulating interventions on the SCM, researchers can predict the effects of different policy options on outcomes of interest
  • Policy evaluation using SCMs can inform decision-making and help optimize resource allocation
  • Example: Evaluating the impact of different taxation policies on income inequality and economic growth

Transportability

  • Transportability refers to the problem of applying causal knowledge learned from one population to another population with different characteristics
  • SCMs provide a framework for assessing the transportability of causal effects across different settings or populations
  • Transportability analysis involves identifying the conditions under which causal effects can be generalized from one setting to another
  • Example: Applying the causal effect of a drug estimated from a clinical trial to a target population with different demographics and comorbidities
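One simple case can be sketched numerically. Assume the two populations differ only in the distribution of a covariate Z (for example, age group), while the effect within each stratum of Z is the same in both. That assumption is added here and is exactly what a transportability analysis would need to justify. The stratum-specific effects are then reweighted by the target population's distribution of Z. All numbers are hypothetical.

```python
def transport(effect_by_stratum, stratum_dist):
    """Average stratum-specific effects using a population's covariate distribution."""
    return sum(effect_by_stratum[z] * stratum_dist[z] for z in stratum_dist)

# Hypothetical stratum-specific risk differences estimated in a trial.
effect = {"young": 0.10, "old": 0.30}
trial_dist = {"young": 0.8, "old": 0.2}    # covariate distribution in the trial
target_dist = {"young": 0.3, "old": 0.7}   # covariate distribution in the target population

print(round(transport(effect, trial_dist), 3))   # 0.14
print(round(transport(effect, target_dist), 3))  # 0.24
```

The average effect differs between the two populations even though nothing about the mechanism changed; only the mix of strata did.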

Causal discovery

  • Causal discovery aims to infer the causal structure of a system from observational data
  • SCMs provide a framework for causal discovery by learning the causal DAG and estimating the structural equations
  • Causal discovery methods based on SCMs can uncover the underlying causal relationships between variables and generate hypotheses for further investigation
  • Example: Discovering the causal factors influencing the development of a disease using observational data from a cohort study

Limitations and extensions of SCMs

  • While SCMs provide a powerful framework for causal inference, they have certain limitations and can be extended to address more complex scenarios
  • Researchers should be aware of these limitations and consider appropriate extensions when applying SCMs to real-world problems

Latent confounding

  • Latent confounding refers to the presence of unmeasured common causes that affect both the treatment and the outcome
  • Many SCM-based analyses assume causal sufficiency, which means that all common causes of the modeled variables are included in the model
  • In practice, latent confounding can lead to biased estimates of causal effects if not properly accounted for
  • Extensions of SCMs, such as latent variable models and sensitivity analysis, can help address latent confounding
  • Example: In a study of the effect of smoking on lung cancer, there may be unmeasured genetic factors that influence both smoking behavior and cancer risk

Cyclic causal models

  • Standard SCMs assume acyclicity, meaning that there are no feedback loops or cycles in the causal structure
  • However, many real-world systems involve feedback mechanisms and reciprocal relationships between variables
  • Cyclic causal models extend SCMs to allow for cycles in the causal structure
  • Cyclic models require different assumptions and estimation techniques compared to acyclic models
  • Example: Modeling the reciprocal relationship between job satisfaction and job performance, where satisfaction influences performance and vice versa

Time-varying treatments

  • SCMs typically assume static treatment variables that are determined at a single point in time
  • In many applications, treatments