📊Causal Inference Unit 9 Review

9.3 Structural causal models (SCMs)

Written by the Fiveable Content Team • Last updated September 2025

Structural causal models (SCMs) formalize cause-and-effect relationships. They combine graphs and equations to represent how variables influence each other, allowing us to estimate causal effects and answer "what if" questions.

SCMs let us move beyond correlations to the mechanisms that generate them. By encoding assumptions about causal structure explicitly, they provide a framework for inferring causal effects from data, evaluating interventions, and reasoning about counterfactuals in complex systems.

Structural causal models (SCMs)

SCMs provide a formal framework for representing and reasoning about causal relationships in a system
They combine graphical models with structural equations to encode causal assumptions and enable causal inference
SCMs are a key tool in causal inference for estimating causal effects and answering counterfactual questions

Definition of SCMs

An SCM is a mathematical model that describes the causal relationships between variables in a system
It consists of a set of variables, a directed acyclic graph (DAG) representing the causal structure, and a set of structural equations
The structural equations specify the functional relationships between variables and their direct causes

Components of SCMs

Variables: The set of variables in the system, which can be observed or unobserved
Directed acyclic graph (DAG): A graphical representation of the causal relationships between variables, where edges represent direct causal influences
Structural equations: Mathematical equations that describe how each variable is determined by its direct causes in the DAG
Probability distribution: A joint probability distribution over the variables in the SCM, induced by the structural equations together with the distribution of the exogenous variables

Endogenous vs exogenous variables

Endogenous variables are variables that are determined by other variables within the SCM
Exogenous variables are determined by factors outside the SCM and are not caused by any other variable in the model
Exogenous variables appear as root nodes in the DAG (or are left implicit as error terms) and are often assumed to be mutually independent

Structural equations

Structural equations specify the functional relationships between variables in the SCM
They describe how each endogenous variable is determined by its direct causes and an error term
The error terms represent unobserved factors or randomness that influence the variable
Example: $Y = f(X, \epsilon_Y)$, where $Y$ is an endogenous variable, $X$ is its direct cause, and $\epsilon_Y$ is the error term
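
To make the idea concrete, here is a minimal sketch of sampling from a toy SCM. The three-variable structure ($Z \rightarrow X \rightarrow Y$ and $Z \rightarrow Y$), the linear form, and all coefficients are assumptions chosen for illustration; they do not come from this guide.

```python
import numpy as np

def sample_scm(n, rng):
    """Draw n samples from a toy linear SCM with edges Z -> X, Z -> Y, X -> Y."""
    # Exogenous error terms, drawn independently (an assumption of this sketch).
    e_z = rng.normal(size=n)
    e_x = rng.normal(size=n)
    e_y = rng.normal(size=n)
    # Structural equations: each endogenous variable is a function of
    # its direct causes and its own error term.
    z = e_z
    x = 0.8 * z + e_x
    y = 1.5 * x - 1.0 * z + e_y
    return z, x, y

rng = np.random.default_rng(0)
z, x, y = sample_scm(100_000, rng)
```

Each assignment reads as "the left-hand side is generated from the right-hand side", which is what distinguishes a structural equation from an ordinary algebraic one.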

Directed acyclic graphs (DAGs)

DAGs are graphical representations of the causal structure in an SCM
Nodes in the DAG represent variables, and directed edges represent direct causal influences
The absence of an edge between two nodes implies that there is no direct causal relationship between the corresponding variables
DAGs satisfy the acyclicity property, meaning there are no directed cycles in the graph

Causal Markov condition

The causal Markov condition is a fundamental assumption in SCMs
It states that a variable is independent of its non-descendants given its direct causes (parents) in the DAG
This assumption allows for the factorization of the joint probability distribution based on the DAG structure
Example: In a DAG $X \rightarrow Y \rightarrow Z$, the causal Markov condition implies that $X$ and $Z$ are independent given $Y$
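
The chain example can be checked numerically. The sketch below assumes a linear Gaussian version of $X \rightarrow Y \rightarrow Z$ with made-up coefficients, and uses partial correlation as a stand-in for conditional independence (which is only appropriate under that linear Gaussian assumption).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)     # Y depends only on X
z = -1.0 * y + rng.normal(size=n)    # Z depends only on Y

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing c out of both."""
    design = np.column_stack([np.ones_like(c), c])
    res_a = a - design @ np.linalg.lstsq(design, a, rcond=None)[0]
    res_b = b - design @ np.linalg.lstsq(design, b, rcond=None)[0]
    return np.corrcoef(res_a, res_b)[0, 1]

marginal = np.corrcoef(x, z)[0, 1]     # X and Z are strongly correlated
conditional = partial_corr(x, z, y)    # close to zero once Y is held fixed
```

The marginal correlation is far from zero, while the partial correlation given $Y$ is approximately zero, as the causal Markov condition predicts for this graph.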

Causal sufficiency

Causal sufficiency is an assumption in SCMs that states that all common causes of the observed variables are included in the model
It implies that there are no unmeasured confounders that affect multiple observed variables
Causal sufficiency is a strong assumption and may not always hold in practice
Violations of causal sufficiency can lead to biased estimates of causal effects

Causal faithfulness

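Causal faithfulness is the assumption that the only conditional independencies in the probability distribution are those implied by the DAG (via d-separation); combined with the causal Markov condition, the DAG and the distribution then encode exactly the same independencies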
It ensures that the causal structure and the probability distribution are compatible
Faithfulness rules out certain types of cancellations or exact balancing of causal effects
Violations of faithfulness can occur in practice, but they are often considered to be rare

Representing interventions with SCMs

Interventions are actions that modify the causal structure of a system by setting variables to specific values
SCMs provide a formal framework for representing and reasoning about interventions
Interventions allow us to estimate the causal effects of variables on outcomes and answer counterfactual questions

Interventional distributions

An interventional distribution is the probability distribution of variables in an SCM after an intervention has been performed
It represents the distribution of the system under a specific intervention, where some variables are set to fixed values
Interventional distributions are denoted as $P(Y \mid do(X=x))$, where $X$ is the variable being intervened upon and set to value $x$
Interventional distributions can be derived from the original SCM by modifying the structural equations and the DAG
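
The difference between conditioning and intervening can be seen in simulation. This is a sketch under assumed linear equations with illustrative coefficients; the `do_x` argument is a hypothetical name for the value forced by the intervention.

```python
import numpy as np

def simulate(n, rng, do_x=None):
    """Sample from a toy SCM (Z -> X, Z -> Y, X -> Y), optionally under do(X = do_x)."""
    z = rng.normal(size=n)
    if do_x is None:
        x = 0.8 * z + rng.normal(size=n)   # observational regime: X listens to Z
    else:
        x = np.full(n, float(do_x))        # intervention: X's equation is replaced
    y = 1.5 * x - 1.0 * z + rng.normal(size=n)
    return z, x, y

rng = np.random.default_rng(2)
_, x_obs, y_obs = simulate(200_000, rng)
_, _, y_do1 = simulate(200_000, rng, do_x=1.0)
_, _, y_do0 = simulate(200_000, rng, do_x=0.0)

# Regression slope of Y on X in observational data (confounded by Z).
obs_slope = np.cov(x_obs, y_obs)[0, 1] / np.var(x_obs, ddof=1)
# Difference in mean outcome between the two interventional regimes.
causal_effect = y_do1.mean() - y_do0.mean()
```

In this toy model the interventional contrast recovers the structural coefficient of $X$ (1.5), while the observational slope is pulled toward roughly 1.0 because $Z$ influences both $X$ and $Y$.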

Graph mutilation

Graph mutilation is a process of modifying the DAG of an SCM to represent an intervention
When a variable $X$ is intervened upon and set to a fixed value, the incoming edges to $X$ are removed in the mutilated graph
The mutilated graph represents the causal structure of the system under the intervention
Graph mutilation is a graphical approach to deriving interventional distributions from the original SCM
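
Graph mutilation is easy to express on a parent-list representation of the DAG. The dictionary encoding and the function name `mutilate` below are assumptions made for this sketch.

```python
def mutilate(parents, intervened):
    """Return a copy of the DAG with all edges into the intervened nodes removed.

    `parents` maps each node to the list of its direct causes.
    """
    return {node: ([] if node in intervened else list(pa))
            for node, pa in parents.items()}

# Toy DAG: Z -> X, Z -> Y, X -> Y.
dag = {"Z": [], "X": ["Z"], "Y": ["X", "Z"]}
dag_do_x = mutilate(dag, {"X"})
```

After `do(X)`, `X` has no parents, while the edges out of `X` and all other edges are left untouched.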

Truncated factorization

Truncated factorization is a method for deriving the interventional distribution from the original SCM
It starts from the Markov factorization of the joint distribution and drops the factor of each intervened variable, which corresponds to factorizing over the mutilated DAG
The truncated factorization formula is $P(v_1, \dots, v_n \mid do(X=x)) = \prod_{i:\, V_i \neq X} P(v_i \mid pa_i)$ for assignments consistent with $X=x$ (and zero otherwise), where $pa_i$ denotes the values of the parents of $V_i$ in the original DAG
Truncated factorization provides a mathematical approach to deriving interventional distributions
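
As a worked instance, take an assumed three-variable DAG with edges $Z \rightarrow X$, $Z \rightarrow Y$, and $X \rightarrow Y$ (an illustration, not an example from this guide). The observational joint factorizes over the DAG, the intervention removes the factor for $X$, and summing out $Z$ gives the effect on $Y$:

$$P(z, x, y) = P(z)\, P(x \mid z)\, P(y \mid x, z)$$

$$P(z, y \mid do(X=x)) = P(z)\, P(y \mid x, z)$$

$$P(y \mid do(X=x)) = \sum_z P(z)\, P(y \mid x, z)$$

The last line differs from the observational $P(y \mid x) = \sum_z P(z \mid x)\, P(y \mid x, z)$ only in weighting by $P(z)$ rather than $P(z \mid x)$.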

Identification of causal effects

Identification of causal effects refers to the process of determining whether a causal effect can be estimated from observational data
SCMs provide graphical criteria and algorithms for identifying causal effects
Identification is a crucial step in causal inference, as it determines whether a causal question can be answered using the available data

Back-door criterion

The back-door criterion is a graphical condition for identifying causal effects in an SCM
It states that a set of variables $Z$ satisfies the back-door criterion relative to an ordered pair of variables $(X, Y)$ if no node in $Z$ is a descendant of $X$ and $Z$ blocks every path between $X$ and $Y$ that contains an arrow into $X$ (a back-door path)
If a set $Z$ satisfies the back-door criterion, the causal effect of $X$ on $Y$ is identified by the adjustment formula $P(y \mid do(x)) = \sum_z P(y \mid x, z) P(z)$
Example: In a DAG with edges $X \leftarrow Z \rightarrow Y$ and $X \rightarrow Y$, the set $\{Z\}$ satisfies the back-door criterion relative to $(X, Y)$
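
Back-door adjustment can be carried out exactly when the probability tables are known. The sketch below uses hypothetical tables for binary $Z$, $X$, $Y$ with edges $Z \rightarrow X$, $Z \rightarrow Y$, $X \rightarrow Y$; the numbers are invented for illustration.

```python
import numpy as np

# Hypothetical probability tables (not from any real study).
p_z = np.array([0.6, 0.4])                 # P(Z=z)
p_x1_given_z = np.array([0.2, 0.7])        # P(X=1 | Z=z)
p_y1_given_xz = np.array([[0.1, 0.5],      # P(Y=1 | X=0, Z=z)
                          [0.3, 0.8]])     # P(Y=1 | X=1, Z=z)

def p_y1_do_x(x):
    """Back-door adjustment: sum_z P(Y=1 | X=x, Z=z) P(Z=z)."""
    return float(np.sum(p_y1_given_xz[x] * p_z))

def p_y1_given_x(x):
    """Observational P(Y=1 | X=x), which weights z by P(Z=z | X=x) instead."""
    p_x_given_z = p_x1_given_z if x == 1 else 1.0 - p_x1_given_z
    p_z_given_x = p_x_given_z * p_z / np.sum(p_x_given_z * p_z)
    return float(np.sum(p_y1_given_xz[x] * p_z_given_x))

adjusted_effect = p_y1_do_x(1) - p_y1_do_x(0)          # 0.50 - 0.26 = 0.24
naive_difference = p_y1_given_x(1) - p_y1_given_x(0)   # 0.65 - 0.18 = 0.47
```

With these tables the naive contrast roughly doubles the adjusted one, because units with $Z=1$ are both more likely to be treated and more likely to have $Y=1$.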

Front-door criterion

The front-door criterion is another graphical condition for identifying causal effects in an SCM
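It is useful when no set of observed variables satisfies the back-door criterion, for example because $X$ and $Y$ share an unobserved confounder
The front-door criterion requires a set of variables $Z$ such that (1) $Z$ intercepts all directed paths from $X$ to $Y$, (2) there is no unblocked back-door path from $X$ to $Z$, and (3) all back-door paths from $Z$ to $Y$ are blocked by $X$
If the front-door criterion is satisfied, the causal effect of $X$ on $Y$ is identified by the front-door formula $P(y \mid do(x)) = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$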
Example: In a DAG $X \rightarrow Z \rightarrow Y$ with an unobserved confounder between $X$ and $Y$, the set $Z$ satisfies the front-door criterion
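
The front-door formula $P(y \mid do(x)) = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$ can be verified on a small model. The sketch below assumes hypothetical probability tables for a binary model with an unobserved confounder $U$; it computes the effect once from the observed joint over $(X, Z, Y)$ only, and once from the full model, so the two can be compared.

```python
import numpy as np

# Hypothetical binary model: U -> X, U -> Y, X -> Z, Z -> Y, with U unobserved.
p_u = np.array([0.5, 0.5])                  # P(U=u)
p_x1_u = np.array([0.2, 0.8])               # P(X=1 | U=u)
p_z1_x = np.array([0.1, 0.9])               # P(Z=1 | X=x)
p_y1_zu = np.array([[0.1, 0.4],             # P(Y=1 | Z=0, U=u)
                    [0.5, 0.9]])            # P(Y=1 | Z=1, U=u)

# Observed joint P(x, z, y), obtained by summing out U.
joint = np.zeros((2, 2, 2))
for u in range(2):
    for x in range(2):
        for z in range(2):
            for y in range(2):
                px = p_x1_u[u] if x else 1 - p_x1_u[u]
                pz = p_z1_x[x] if z else 1 - p_z1_x[x]
                py = p_y1_zu[z, u] if y else 1 - p_y1_zu[z, u]
                joint[x, z, y] += p_u[u] * px * pz * py

def front_door(x):
    """Front-door estimate of P(Y=1 | do(X=x)) using only the observed joint."""
    p_x = joint.sum(axis=(1, 2))
    total = 0.0
    for z in range(2):
        p_z_given_x = joint[x, z].sum() / p_x[x]
        inner = sum(joint[xp, z, 1] / joint[xp, z].sum() * p_x[xp]
                    for xp in range(2))
        total += p_z_given_x * inner
    return total

def truth(x):
    """P(Y=1 | do(X=x)) from the full model, which uses the unobserved U."""
    return sum(p_u[u] * ((1 - p_z1_x[x]) * p_y1_zu[0, u]
                         + p_z1_x[x] * p_y1_zu[1, u])
               for u in range(2))
```

For both values of `x`, `front_door(x)` agrees with `truth(x)` even though `front_door` never touches `U`.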

Instrumental variables

Instrumental variables (IVs) are a method for identifying causal effects when there are unmeasured confounders
An instrumental variable $Z$ is a variable that affects the treatment $X$ but does not directly affect the outcome $Y$ except through its effect on $X$
IVs satisfy the following conditions: (1) $Z$ is associated with $X$, (2) $Z$ does not directly affect $Y$ except through $X$, and (3) $Z$ is independent of all confounders of the $X$-$Y$ relationship
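If a valid instrumental variable is available, the causal effect of $X$ on $Y$ can be identified under additional assumptions, such as linear structural equations with a constant effect, or monotonicity of the effect of $Z$ on $X$ (which identifies a local average treatment effect)
Example: In a study of the effect of education on income, a person's quarter of birth has been used as an instrumental variable, on the argument that it affects years of schooling through compulsory schooling laws but does not directly affect income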
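
A simulated example shows how an instrument removes confounding bias. The sketch assumes a linear model with a constant treatment effect and invented coefficients, and uses the simple ratio (Wald) form of the IV estimator.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
u = rng.normal(size=n)                       # unmeasured confounder
z = rng.normal(size=n)                       # instrument, independent of u
x = 1.0 * z + 1.0 * u + rng.normal(size=n)   # treatment
y = 2.0 * x + 3.0 * u + rng.normal(size=n)   # outcome; true effect of x is 2.0

# Ordinary regression slope of y on x: biased because u drives both.
ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
# IV (Wald) estimator: cov(z, y) / cov(z, x).
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
```

In this setup the regression slope lands near 3.0, while the IV ratio lands near the true coefficient of 2.0, because $Z$ reaches $Y$ only through $X$.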

Mediation analysis

Mediation analysis is a method for identifying the causal pathways through which a treatment affects an outcome
It decomposes the total effect of a treatment into direct and indirect effects mediated by intermediate variables
Mediation analysis requires assumptions about the causal structure and the absence of unmeasured confounders
SCMs provide a framework for conducting mediation analysis and estimating direct and indirect effects
Example: In a study of the effect of a drug on blood pressure, the drug may have a direct effect on blood pressure and an indirect effect mediated by changes in heart rate
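
Under linear structural equations with no unmeasured confounding, the decomposition reduces to products of path coefficients. The following sketch assumes that setting, with a treatment `x`, a mediator `m`, an outcome `y`, and illustrative coefficients; the helper `ols` is defined here for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)             # mediator: X -> M
y = 0.3 * x + 0.8 * m + rng.normal(size=n)   # outcome: X -> Y and M -> Y

def ols(columns, target):
    """Least-squares slopes of target on the given columns (intercept dropped)."""
    design = np.column_stack([np.ones(len(target))] + list(columns))
    return np.linalg.lstsq(design, target, rcond=None)[0][1:]

(a,) = ols([x], m)            # effect of X on M
direct, b = ols([x, m], y)    # direct effect of X, and effect of M on Y
indirect = a * b              # effect transmitted through M
(total,) = ols([x], y)        # total effect of X on Y
```

Here the direct effect is about 0.3, the indirect effect about 0.5 × 0.8 = 0.4, and the total about 0.7. This additive split relies on linearity; with interactions or nonlinear equations, direct and indirect effects need counterfactual definitions.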

Counterfactuals in SCMs

Counterfactuals are statements about what would have happened under different hypothetical scenarios
SCMs provide a framework for reasoning about counterfactuals and estimating counterfactual quantities
Counterfactual reasoning is important for understanding the causal effects of interventions and for answering "what if" questions

Potential outcomes framework

The potential outcomes framework is a counterfactual approach to causal inference
It defines causal effects in terms of potential outcomes, which are the outcomes that would be observed under different treatment conditions
In the potential outcomes framework, each unit has a set of potential outcomes corresponding to different treatment levels
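The unit-level causal effect is defined as the difference between potential outcomes under different treatment conditions, such as $Y(1) - Y(0)$; because only one potential outcome is observed per unit, analyses usually target averages such as the average treatment effect $E[Y(1) - Y(0)]$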
Example: In a binary treatment setting, each unit has two potential outcomes: $Y(1)$ for the treated condition and $Y(0)$ for the untreated condition
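
In a simulation both potential outcomes can be generated, which makes the framework easy to illustrate. The distributions below are assumptions of this sketch, and treatment is assigned at random so that a simple difference in means is a valid estimate.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
y0 = rng.normal(loc=1.0, size=n)                  # potential outcome Y(0)
y1 = y0 + 2.0 + rng.normal(scale=0.5, size=n)     # potential outcome Y(1)
true_ate = np.mean(y1 - y0)                       # knowable only in simulation

t = rng.integers(0, 2, size=n)                    # randomized treatment assignment
y_obs = np.where(t == 1, y1, y0)                  # only one outcome is observed per unit
estimated_ate = y_obs[t == 1].mean() - y_obs[t == 0].mean()
```

Both quantities come out close to 2.0. With real data `y0` and `y1` are never both available for the same unit, so `true_ate` cannot be computed directly.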

Counterfactual queries

Counterfactual queries are questions about what would have happened under different hypothetical scenarios
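SCMs allow counterfactual quantities to be computed in three steps: abduction (use the observed evidence to update the distribution of the exogenous variables), action (modify the model to reflect the hypothetical intervention), and prediction (compute the outcome in the modified model)
Counterfactual queries go beyond the $do$-operator alone: $P(Y \mid do(X=x))$ describes a population under intervention, whereas a counterfactual also conditions on what actually happened to the unit in question
Example: "What would this patient's blood pressure have been had they not taken the drug, given that they took it and we observed blood pressure $y$?" can be written as $P(Y_{X=0} \mid X=1, Y=y)$, where $Y$ is blood pressure and $X$ is the drug treatment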
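
For a single unit, a counterfactual can be computed by recovering that unit's error term from what was observed, changing the treatment, and recomputing the outcome. This minimal sketch assumes a one-equation linear SCM $Y = \beta X + \epsilon_Y$ with a known, illustrative $\beta$; the function name is hypothetical.

```python
def counterfactual_y(x_obs, y_obs, x_new, beta=1.5):
    """Outcome the unit observed at (x_obs, y_obs) would have had under X = x_new.

    Assumes the structural equation Y = beta * X + e_Y with known beta.
    """
    e_y = y_obs - beta * x_obs   # abduction: recover this unit's error term
    x = x_new                    # action: set X to the hypothetical value
    return beta * x + e_y        # prediction: recompute Y with the same error term

# A unit observed with x = 1.0 and y = 2.0; what if x had been 0.0?
y_cf = counterfactual_y(1.0, 2.0, 0.0)   # 0.5
```

The unit's own error term is carried into the hypothetical world, which is why the answer is specific to that unit rather than a population average.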

Twin networks

Twin networks are a graphical tool for representing and reasoning about counterfactuals in SCMs
They consist of two copies of the original SCM: one representing the actual world and the other representing the counterfactual world
The two networks are connected by a set of shared exogenous variables, which capture the common factors between the actual and counterfactual worlds
Twin networks allow for the estimation of counterfactual quantities by propagating the effects of interventions through the counterfactual network
Example: In a twin network for the effect of a drug on blood pressure, the actual network represents the observed treatment and outcome, while the counterfactual network represents the hypothetical scenario where the drug was not taken

Learning SCMs from data

Learning SCMs from data involves estimating the causal structure and parameters of the model from observational or experimental data
There are different approaches to learning SCMs, including causal structure learning and parameter estimation
Learning SCMs from data is a challenging task due to the presence of latent variables, measurement error, and limited sample sizes

Causal structure learning

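Causal structure learning aims to infer the causal DAG from observational data; without further assumptions, the DAG can usually be recovered only up to its Markov equivalence class (the set of DAGs that imply the same conditional independencies)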
It involves finding the DAG that best explains the observed dependencies and independencies in the data
Causal structure learning methods can be classified into constraint-based, score-based, and hybrid approaches
Constraint-based methods, such as the PC algorithm, use conditional independence tests to identify the causal structure
Score-based methods, such as the GES algorithm, search for the DAG that optimizes a scoring function, such as the Bayesian Information Criterion (BIC)
Hybrid methods combine constraint-based and score-based approaches to improve the accuracy and efficiency of structure learning

Constraint-based methods

Constraint-based methods for causal structure learning use conditional independence tests to identify the causal DAG
They rely on the causal Markov condition and the faithfulness assumption to infer the causal structure
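The PC algorithm is a well-known constraint-based method that starts with a fully connected undirected graph, iteratively removes edges based on conditional independence tests, and then orients the remaining edges where the independencies determine a direction
Constraint-based methods are computationally efficient for sparse graphs but can be sensitive to violations of the assumptions and to the choice of independence test, and errors in early tests can propagate to later steps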
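
The edge-removal idea can be sketched in a few lines. This is a simplified illustration, not the full PC algorithm: it assumes linear Gaussian data, uses a Fisher z-test on partial correlations, conditions on subsets of all other variables rather than only adjacent ones, and skips edge orientation. The function names and the cutoff are choices made for this example.

```python
import itertools
import math
import numpy as np

def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j] + list(cond)
    precision = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    return -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])

def skeleton(data, z_crit, max_cond=1):
    """Start fully connected; drop edge i-j if some test fails to reject independence."""
    n, d = data.shape
    edges = {frozenset(p) for p in itertools.combinations(range(d), 2)}
    for size in range(max_cond + 1):
        for i, j in itertools.combinations(range(d), 2):
            if frozenset((i, j)) not in edges:
                continue
            others = [k for k in range(d) if k not in (i, j)]
            for cond in itertools.combinations(others, size):
                r = partial_corr(data, i, j, cond)
                z_stat = math.sqrt(n - size - 3) * abs(math.atanh(r))
                if z_stat < z_crit:
                    edges.discard(frozenset((i, j)))
                    break
    return edges

# Toy chain X -> Y -> Z stored as columns 0, 1, 2.
rng = np.random.default_rng(6)
n = 20_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.0 * y + rng.normal(size=n)
data = np.column_stack([x, y, z])

# A deliberately strict cutoff keeps this toy example stable.
edges = skeleton(data, z_crit=4.0)
```

On this chain the X-Z edge is removed once Y is conditioned on, leaving the skeleton X-Y-Z.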

Score-based methods

Score-based methods for causal structure learning search for the DAG that optimizes a scoring function
The scoring function measures the fit of the DAG to the observed data while penalizing model complexity
Common scoring functions include the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC)
The Greedy Equivalence Search (GES) algorithm is a popular score-based method that starts with an empty graph and iteratively adds and removes edges to improve the score
Score-based methods are less sensitive to individual independence tests but can be computationally expensive for large search spaces
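
To show what a scoring function looks like, the sketch below computes a Gaussian BIC for a few hand-picked candidate DAGs. It assumes linear Gaussian structural equations and does not implement the GES search itself; the candidate graphs, data, and function names are illustrative.

```python
import numpy as np

def node_score(data, child, parents):
    """Gaussian BIC contribution of one node given its parents (higher is better)."""
    n = data.shape[0]
    design = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    coef = np.linalg.lstsq(design, data[:, child], rcond=None)[0]
    resid = data[:, child] - design @ coef
    sigma2 = resid @ resid / n                      # ML estimate of noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    n_params = design.shape[1] + 1                  # coefficients plus noise variance
    return log_lik - 0.5 * n_params * np.log(n)

def bic_score(data, parents):
    """Sum of node scores; `parents` maps column index -> list of parent columns."""
    return sum(node_score(data, child, pa) for child, pa in parents.items())

# Data from a toy chain X -> Y -> Z (columns 0, 1, 2).
rng = np.random.default_rng(7)
n = 20_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.0 * y + rng.normal(size=n)
data = np.column_stack([x, y, z])

candidates = {
    "empty": {0: [], 1: [], 2: []},
    "chain": {0: [], 1: [0], 2: [1]},
    "full": {0: [], 1: [0], 2: [0, 1]},
}
scores = {name: bic_score(data, dag) for name, dag in candidates.items()}
```

The chain scores far above the empty graph because it explains the dependencies. The fully connected graph fits at least as well but pays a penalty for its extra edge, so the chain will typically score highest. Note that DAGs in the same Markov equivalence class (for example, the reversed chain) receive the same score under this model.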

Hybrid methods

Hybrid methods for causal structure learning combine constraint-based and score-based approaches
They aim to leverage the strengths of both approaches while mitigating their weaknesses
Hybrid methods typically use constraint-based tests to restrict the search space (for example, to a candidate skeleton or candidate parent sets) and then use score-based search to select the best DAG within that space
The Max-Min Hill-Climbing (MMHC) algorithm is an example of a hybrid method that combines the Max-Min Parents and Children (MMPC) algorithm for constraint-based pruning with a hill-climbing search for score-based optimization
Hybrid methods can achieve a balance between computational efficiency and robustness to assumption violations

Parameter estimation

Parameter estimation involves estimating the parameters of the structural equations in an SCM given the causal structure
It aims to find the parameter values that best fit the observed data
Parameter estimation can be performed using maximum likelihood estimation (MLE) or Bayesian methods
MLE finds the parameter values that maximize the likelihood of the observed data given the SCM structure
Bayesian methods specify prior distributions over the parameters and update them based on the observed data to obtain posterior distributions
Parameter estimation requires assumptions about the functional form of the structural equations and the distribution of the error terms
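
When the structural equations are assumed linear with Gaussian errors, maximum likelihood estimation reduces to a least-squares regression of each variable on its parents. The sketch below assumes that setting and a known DAG; `fit_linear_scm` is a hypothetical helper written for this example.

```python
import numpy as np

def fit_linear_scm(data, parents):
    """Fit each structural equation by least squares.

    `parents` maps column index -> list of parent columns.
    Returns {child: (intercept, coefficients, noise_variance)}.
    """
    n = data.shape[0]
    fitted = {}
    for child, pa in parents.items():
        design = np.column_stack([np.ones(n)] + [data[:, p] for p in pa])
        coef = np.linalg.lstsq(design, data[:, child], rcond=None)[0]
        resid = data[:, child] - design @ coef
        fitted[child] = (coef[0], coef[1:], resid @ resid / n)
    return fitted

# Toy data: Z -> X, Z -> Y, X -> Y stored as columns 0 (Z), 1 (X), 2 (Y).
rng = np.random.default_rng(8)
n = 100_000
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 1.5 * x - 1.0 * z + rng.normal(size=n)
data = np.column_stack([z, x, y])

fitted = fit_linear_scm(data, {0: [], 1: [0], 2: [1, 0]})
```

The fitted coefficients for `Y` come out close to the generating values (1.5 for `X`, -1.0 for `Z`). These estimates are only meaningful as causal parameters if the assumed DAG and functional form are correct.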

Applications of SCMs

SCMs have a wide range of applications in various domains, including social sciences, epidemiology, economics, and artificial intelligence
They provide a principled framework for estimating causal effects, evaluating policies, and making predictions under different interventions

Causal effect estimation

SCMs allow causal effects to be estimated from observational data when the effect is identifiable under the assumed causal structure
By leveraging the causal structure and assumptions encoded in the SCM, researchers can estimate the effects of interventions on outcomes
Causal effect estimation is crucial for understanding the impact of treatments, policies, or exposures on outcomes of interest
Example: Estimating the causal effect of a drug on patient outcomes using observational data from electronic health records

Policy evaluation

SCMs can be used to evaluate the potential impact of policies or interventions before their implementation
By simulating interventions on the SCM, researchers can predict the effects of different policy options on outcomes of interest
Policy evaluation using SCMs can inform decision-making and help optimize resource allocation
Example: Evaluating the impact of different taxation policies on income inequality and economic growth

Transportability

Transportability refers to the problem of applying causal knowledge learned from one population to another population with different characteristics
SCMs provide a framework for assessing the transportability of causal effects across different settings or populations
Transportability analysis involves identifying the conditions under which causal effects can be generalized from one setting to another
Example: Applying the causal effect of a drug estimated from a clinical trial to a target population with different demographics and comorbidities

Causal discovery

Causal discovery aims to infer the causal structure of a system from observational data
SCMs provide a framework for causal discovery by learning the causal DAG and estimating the structural equations
Causal discovery methods based on SCMs can uncover the underlying causal relationships between variables and generate hypotheses for further investigation
Example: Discovering the causal factors influencing the development of a disease using observational data from a cohort study

Limitations and extensions of SCMs

While SCMs provide a powerful framework for causal inference, they have certain limitations and can be extended to address more complex scenarios
Researchers should be aware of these limitations and consider appropriate extensions when applying SCMs to real-world problems

Latent confounding

Latent confounding refers to the presence of unmeasured common causes that affect both the treatment and the outcome
Many SCM-based analyses assume causal sufficiency, meaning that all common causes of the modeled variables are included in the model
In practice, latent confounding can lead to biased estimates of causal effects if not properly accounted for
Extensions of SCMs, such as latent variable models and sensitivity analysis, can help address latent confounding
Example: In a study of the effect of smoking on lung cancer, there may be unmeasured genetic factors that influence both smoking behavior and cancer risk

Cyclic causal models

Standard SCMs assume acyclicity, meaning that there are no feedback loops or cycles in the causal structure
However, many real-world systems involve feedback mechanisms and reciprocal relationships between variables
Cyclic causal models extend SCMs to allow for cycles in the causal structure
Cyclic models require different assumptions and estimation techniques compared to acyclic models
Example: Modeling the reciprocal relationship between job satisfaction and job performance, where satisfaction influences performance and vice versa

Time-varying treatments

SCMs typically assume static treatment variables that are determined at a single point in time
In many applications, treatments
