Fiveable

📊 Causal Inference Unit 5 Review

5.3 Inverse probability weighting

Written by the Fiveable Content Team • Last updated September 2025
Inverse probability weighting is a powerful tool in causal inference. It helps estimate treatment effects in observational studies by creating a balanced pseudo-population. By weighting observations based on their likelihood of receiving treatment, IPW mimics randomized experiments.

IPW relies on key assumptions like positivity and exchangeability. Propensity scores are used to calculate weights, which are then applied in outcome models. Assessing covariate balance and understanding IPW's limitations are crucial for proper implementation and interpretation of results.

Overview of inverse probability weighting

  • Inverse probability weighting (IPW) is a statistical technique used in causal inference to estimate the average treatment effect (ATE) or average treatment effect on the treated (ATT) from observational data
  • IPW aims to create a pseudo-population where treatment assignment is independent of confounding variables, allowing for unbiased estimation of causal effects
  • The key idea behind IPW is to weight each observation by the inverse of the probability of receiving the treatment actually received, given the observed covariates
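As a toy illustration of this key idea (hypothetical propensity scores, plain NumPy), weighting each subject by the inverse probability of the treatment they actually received equalizes the total weight carried by the two groups:

```python
import numpy as np

# Four subjects with hypothetical treatment probabilities
e = np.array([0.8, 0.8, 0.2, 0.2])   # P(A=1 | X) for each subject
A = np.array([1, 0, 1, 0])           # treatment actually received

# Weight = inverse probability of the treatment actually received
w = A / e + (1 - A) / (1 - e)
print(w)  # [1.25 5.   5.   1.25]
```

Each group's weights sum to 6.25, so treated and untreated subjects carry equal total weight in the pseudo-population even though treatment probabilities differed sharply.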

Motivation for weighting approach

  • In observational studies, treatment assignment is often influenced by confounding variables, leading to biased estimates of causal effects when using traditional regression methods
  • IPW addresses confounding by creating a weighted sample where the distribution of confounding variables is balanced between treatment groups
  • By weighting observations based on the propensity score (probability of treatment given covariates), IPW mimics a randomized experiment where treatment assignment is independent of confounders

Assumptions of IPW

Positivity assumption

  • The positivity assumption requires that every individual has a non-zero probability of receiving each level of the treatment, given their observed covariates
  • Formally, $P(A=a|X=x) > 0$ for all treatment levels $a$ and all covariate values $x$ in the support of $X$
  • Violations of the positivity assumption can occur when there are regions of the covariate space where no individuals receive a particular treatment level, leading to extreme weights and unstable estimates
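A simple empirical diagnostic for near-violations of positivity is to count propensity scores falling outside a "safe" interval and to inspect the largest implied weight. A sketch on simulated data (the data-generating process and the 0.05 cutoff are illustrative choices, not a standard):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=2000)
# Hypothetical strong confounding pushes many propensity scores
# toward 0 or 1, threatening positivity
e_true = 1 / (1 + np.exp(-3.0 * X))
A = rng.binomial(1, e_true)

# Flag scores outside [eps, 1 - eps] and check the largest weight
eps = 0.05
n_extreme = np.sum((e_true < eps) | (e_true > 1 - eps))
max_w = np.max(A / e_true + (1 - A) / (1 - e_true))
print(f"{n_extreme} of {len(X)} scores outside [{eps}, {1 - eps}]")
print("largest unstabilized weight:", round(max_w, 1))
```

Many flagged scores or a handful of enormous weights are warning signs that a few observations will dominate the weighted analysis.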

Exchangeability assumption

  • The exchangeability assumption, also known as the no unmeasured confounders assumption, states that treatment assignment is independent of potential outcomes given the observed covariates
  • Formally, $Y(a) \perp A | X$ for all values of $a$, where $Y(a)$ denotes the potential outcome under treatment level $a$
  • This assumption implies that all variables that influence both treatment assignment and the outcome are measured and included in the propensity score model
  • Violations of the exchangeability assumption can lead to biased estimates of causal effects, as there may be unobserved confounders that are not accounted for in the weighting process

Estimating weights

Propensity score models

  • The propensity score is the probability of receiving treatment given the observed covariates, denoted as $e(X) = P(A=1|X)$ for a binary treatment
  • Propensity scores are typically estimated using logistic regression, modeling treatment assignment as a function of the observed confounders
  • The choice of variables to include in the propensity score model is crucial, as omitting important confounders can lead to biased estimates, while including too many variables can lead to overfitting and reduced efficiency
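To make the fitting step concrete, here is a minimal propensity score model estimated by Newton-Raphson in plain NumPy on simulated data (in practice one would use a library such as statsmodels or scikit-learn; the coefficients and confounders are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
X = rng.normal(size=(n, 2))                      # two observed confounders
logit_e = 0.5 + 1.0 * X[:, 0] - 1.0 * X[:, 1]    # true treatment model
A = rng.binomial(1, 1 / (1 + np.exp(-logit_e)))

# Logistic regression of A on X by Newton-Raphson
Z = np.column_stack([np.ones(n), X])             # add intercept
beta = np.zeros(Z.shape[1])
for _ in range(25):
    p = 1 / (1 + np.exp(-Z @ beta))
    beta += np.linalg.solve((Z.T * (p * (1 - p))) @ Z, Z.T @ (A - p))

e_hat = 1 / (1 + np.exp(-Z @ beta))              # estimated propensity scores
print(beta.round(2))                             # ≈ [0.5, 1.0, -1.0]
```

With the treatment model correctly specified, the fitted coefficients recover the true values up to sampling error, and `e_hat` supplies the scores used to build the weights.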

Stabilized vs unstabilized weights

  • Unstabilized weights are calculated as the inverse of the propensity score for treated individuals and the inverse of one minus the propensity score for untreated individuals: $w_i = \frac{A_i}{e(X_i)} + \frac{1-A_i}{1-e(X_i)}$
  • Stabilized weights include the marginal probability of treatment in the numerator, which helps to reduce the variability of the weights and improve efficiency: $sw_i = \frac{A_i P(A=1)}{e(X_i)} + \frac{(1-A_i) P(A=0)}{1-e(X_i)}$
  • Stabilized weights are generally preferred, as they have better statistical properties and are less sensitive to extreme propensity scores
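The difference between the two formulas is easy to see numerically. A sketch on simulated data with known propensity scores (the data-generating process is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10000
X = rng.normal(size=n)
e = 1 / (1 + np.exp(-(0.3 + 1.5 * X)))   # propensity scores (known here)
A = rng.binomial(1, e)

# Unstabilized: inverse probability of the treatment received
w_unstab = A / e + (1 - A) / (1 - e)

# Stabilized: marginal treatment probability in the numerator
p1 = A.mean()
w_stab = A * p1 / e + (1 - A) * (1 - p1) / (1 - e)

print("unstabilized: mean %.2f, max %.1f" % (w_unstab.mean(), w_unstab.max()))
print("stabilized:   mean %.2f, max %.1f" % (w_stab.mean(), w_stab.max()))
```

Stabilized weights average roughly one and have a much smaller maximum, which is exactly the variance-reduction property the bullet points describe.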

Fitting outcome models

Weighted regression

  • After estimating the inverse probability weights, causal effects can be estimated by fitting a weighted regression model, where each observation is weighted by its corresponding IPW
  • For binary outcomes, a weighted logistic regression can be used, while for continuous outcomes, a weighted linear regression is appropriate
  • The weighted regression model should include the treatment variable and any additional confounders that were not sufficiently balanced by the weighting process
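The weighted regression step can be sketched in closed form with NumPy (simulated data with a known propensity model and a true ATE of 2; a real analysis would use a library routine such as statsmodels' WLS):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000
X = rng.normal(size=n)
e = 1 / (1 + np.exp(-1.2 * X))               # propensity depends on X
A = rng.binomial(1, e)
Y = 2.0 * A + 3.0 * X + rng.normal(size=n)   # true ATE = 2

w = A / e + (1 - A) / (1 - e)                # unstabilized IP weights

# Weighted least squares of Y on (1, A): beta = (D'WD)^{-1} D'Wy
D = np.column_stack([np.ones(n), A])
beta = np.linalg.solve(D.T @ (w[:, None] * D), D.T @ (w * Y))
ipw_est = beta[1]
naive = Y[A == 1].mean() - Y[A == 0].mean()
print("IPW estimate of ATE:", round(ipw_est, 2))   # ≈ 2
print("naive difference:   ", round(naive, 2))     # confounded, too large
```

The unweighted comparison is badly biased by the confounder $X$, while the weighted regression recovers the true effect.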

Estimating causal effects

  • The coefficient of the treatment variable in the weighted regression model provides an estimate of the average treatment effect (ATE) in the pseudo-population created by the IPW
  • For binary treatments, the ATE represents the difference in the expected outcome between the treated and untreated groups in the weighted sample
  • Confidence intervals for the ATE can be obtained using robust standard errors that account for the weights and the potential misspecification of the propensity score model
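One simple way to get an interval that reflects the weighting is a nonparametric bootstrap. A sketch on simulated data (weights are held fixed across replicates here; refitting the propensity score model inside each replicate would be more faithful but longer):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
X = rng.normal(size=n)
e = 1 / (1 + np.exp(-X))                     # propensity scores (known here)
A = rng.binomial(1, e)
Y = 2.0 * A + 3.0 * X + rng.normal(size=n)   # true ATE = 2
w = A / e + (1 - A) / (1 - e)

def ipw_ate(A, Y, w):
    # Weighted difference in mean outcomes between treatment groups
    return (np.average(Y[A == 1], weights=w[A == 1])
            - np.average(Y[A == 0], weights=w[A == 0]))

ate = ipw_ate(A, Y, w)
boots = [ipw_ate(A[i], Y[i], w[i])
         for i in (rng.integers(0, n, n) for _ in range(500))]
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"ATE estimate {ate:.2f}, 95% bootstrap CI ({lo:.2f}, {hi:.2f})")
```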

Assessing covariate balance

Standardized mean differences

  • After applying the inverse probability weights, it is important to assess the balance of the confounding variables between the treatment groups in the weighted sample
  • Standardized mean differences (SMDs) can be used to compare the means of continuous confounders between the treatment groups, with values close to zero indicating good balance
  • For binary confounders, SMDs can be calculated using the prevalence of the confounder in each treatment group
  • As a rule of thumb, SMDs below 0.1 are considered indicative of adequate balance
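A weighted SMD is straightforward to compute by hand. A sketch on simulated data with one confounder (the data-generating process is invented; the pooled-standard-deviation denominator is one common convention):

```python
import numpy as np

def weighted_smd(x, A, w):
    """Weighted standardized mean difference for one covariate."""
    m1 = np.average(x[A == 1], weights=w[A == 1])
    m0 = np.average(x[A == 0], weights=w[A == 0])
    v1 = np.average((x[A == 1] - m1) ** 2, weights=w[A == 1])
    v0 = np.average((x[A == 0] - m0) ** 2, weights=w[A == 0])
    return (m1 - m0) / np.sqrt((v1 + v0) / 2)

rng = np.random.default_rng(5)
n = 20000
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))        # treatment probability rises with x
A = rng.binomial(1, e)
w = A / e + (1 - A) / (1 - e)

smd_before = weighted_smd(x, A, np.ones(n))
smd_after = weighted_smd(x, A, w)
print("SMD before weighting:", round(smd_before, 3))  # large imbalance
print("SMD after weighting: ", round(smd_after, 3))   # near zero
```

The weighting drives the SMD from well above the 0.1 rule of thumb to approximately zero.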

Variance ratios

  • In addition to comparing means, it is also important to assess the balance of the variances of the confounders between the treatment groups in the weighted sample
  • Variance ratios (VRs) can be calculated by dividing the variance of a confounder in the treated group by its variance in the untreated group
  • VRs close to one indicate good balance, while values far from one suggest that the weighting process may not have adequately balanced the confounder
  • If substantial imbalances remain after weighting, it may be necessary to modify the propensity score model or consider alternative methods for estimating causal effects
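The variance ratio check can be sketched the same way (simulated data; weighted variances computed by hand in NumPy):

```python
import numpy as np

def weighted_var(x, w):
    m = np.average(x, weights=w)
    return np.average((x - m) ** 2, weights=w)

rng = np.random.default_rng(6)
n = 20000
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))
A = rng.binomial(1, e)
w = A / e + (1 - A) / (1 - e)

# Variance of the confounder in the treated vs untreated weighted samples
vr = weighted_var(x[A == 1], w[A == 1]) / weighted_var(x[A == 0], w[A == 0])
print("variance ratio after weighting:", round(vr, 2))  # close to 1
```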

Comparison to other methods

IPW vs matching

  • Like IPW, matching methods aim to create balanced treatment groups by pairing treated and untreated individuals with similar values of the confounding variables
  • Matching can be done using propensity scores (propensity score matching) or by directly matching on the confounders (covariate matching)
  • Compared to IPW, matching may be more intuitive and easier to communicate, but it can be more sensitive to the choice of matching algorithm and may discard a substantial portion of the data

IPW vs stratification

  • Stratification involves dividing the sample into subgroups (strata) based on the values of the confounding variables and estimating causal effects within each stratum
  • Propensity score stratification is a common approach, where individuals are stratified based on their estimated propensity scores
  • Compared to IPW, stratification may be more robust to model misspecification, but it can be less efficient and may not fully remove confounding if the strata are too coarse

Limitations of IPW

Sensitivity to model misspecification

  • The performance of IPW relies heavily on the correct specification of the propensity score model
  • If important confounders are omitted from the model or if the functional form of the relationship between the confounders and treatment assignment is misspecified, the resulting weights may not adequately balance the confounders, leading to biased estimates of causal effects
  • Sensitivity analyses can be conducted to assess the robustness of the results to potential model misspecification, such as varying the set of confounders included in the propensity score model or using different functional forms

Instability with extreme weights

  • In some cases, the estimated propensity scores may be very close to zero or one, resulting in extremely large weights for some observations
  • These extreme weights can lead to unstable estimates of causal effects and inflated standard errors
  • To mitigate this issue, weight truncation or trimming can be employed, where weights above a certain threshold are set to a maximum value (truncation) or observations with extreme weights are removed from the analysis (trimming)
  • However, truncation and trimming may introduce bias and should be used with caution, as they may alter the target population and the interpretation of the causal effect
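Both remedies are one-liners in practice. A sketch (simulated scores; the 99th-percentile cap and the [0.05, 0.95] trimming window are common but arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10000
# Some propensity scores land very close to 0 or 1
e = 1 / (1 + np.exp(-rng.normal(scale=2.0, size=n)))
A = rng.binomial(1, e)
w = A / e + (1 - A) / (1 - e)

# Truncation: cap weights at the 99th percentile
cap = np.percentile(w, 99)
w_trunc = np.minimum(w, cap)

# Trimming: drop observations with scores outside [0.05, 0.95]
keep = (e > 0.05) & (e < 0.95)

print("max weight:          ", round(w.max(), 1))
print("max truncated weight:", round(w_trunc.max(), 1))
print("trimmed sample size: ", keep.sum(), "of", n)
```

Note that trimming changes who is in the analysis, which is why the bullet above warns that it can alter the target population.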

Applications of IPW

Time-varying treatments

  • IPW can be extended to handle time-varying treatments, where individuals may receive different levels of treatment over time
  • In this setting, weights are calculated based on the probability of an individual receiving their observed treatment history up to each time point, given their covariate history
  • Marginal structural models (MSMs) are often used in conjunction with IPW for time-varying treatments to estimate the causal effect of treatment trajectories on outcomes
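The cumulative-weight construction can be sketched directly: each subject's weight is the product over time of the inverse probability of the treatment actually received at each visit (the treatment probabilities below are hypothetical stand-ins for model-based estimates conditional on covariate and treatment history):

```python
import numpy as np

rng = np.random.default_rng(8)
n, T = 1000, 3
# Hypothetical estimated P(A_t = 1 | history) at each of T time points
e_t = rng.uniform(0.2, 0.8, size=(n, T))
A_t = rng.binomial(1, e_t)

# Probability of the treatment actually received at each time point,
# then the weight is 1 over the product across time
p_obs = np.where(A_t == 1, e_t, 1 - e_t)
w = 1.0 / p_obs.prod(axis=1)
print("weight range:", round(w.min(), 2), "to", round(w.max(), 2))
```

Because weights multiply across visits, they grow quickly with follow-up length, which is one reason stabilized versions are near-universal in this setting.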

Marginal structural models

  • MSMs are a class of models that use IPW to estimate the causal effect of a time-varying treatment on an outcome, while accounting for time-varying confounders that may be affected by prior treatment
  • MSMs model the marginal expectation of the potential outcomes as a function of the treatment history, without conditioning on the time-varying confounders
  • The weights used in MSMs are typically calculated using the inverse probability of treatment weighting (IPTW) approach, which accounts for both the probability of treatment at each time point and the probability of censoring or loss to follow-up

Simulation studies of IPW

Evaluating bias reduction

  • Simulation studies can be used to assess the performance of IPW in reducing bias due to confounding in a controlled setting
  • By generating data with known causal structures and comparing the estimated causal effects to the true values, researchers can evaluate the ability of IPW to recover unbiased estimates under various scenarios
  • Simulation studies can also be used to investigate the impact of violations of the positivity and exchangeability assumptions on the performance of IPW

Comparing to other estimators

  • Simulation studies can also be used to compare the performance of IPW to other causal inference methods, such as matching, stratification, or g-computation
  • By evaluating the bias, variance, and mean squared error of the estimators under different data-generating scenarios, researchers can gain insights into the relative strengths and weaknesses of each approach
  • The results of these simulation studies can inform the choice of method for a given research question and data structure

Extensions of IPW

Doubly robust estimation

  • Doubly robust (DR) estimation combines IPW with an outcome model to provide unbiased estimates of causal effects even if either the propensity score model or the outcome model is misspecified (but not both)
  • DR estimators typically involve fitting a weighted outcome model, where the weights are a function of the propensity score and the outcome model predictions
  • The DR property provides a safeguard against model misspecification and can improve the efficiency of the causal effect estimates compared to IPW alone
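One common doubly robust construction is the augmented IPW (AIPW) estimator, which adds an IPW correction to outcome-model predictions. A sketch on simulated data (true propensity known, outcome models fit by ordinary least squares within each arm; all quantities here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 20000
X = rng.normal(size=n)
e = 1 / (1 + np.exp(-X))                     # true propensity (known here)
A = rng.binomial(1, e)
Y = 2.0 * A + 3.0 * X + rng.normal(size=n)   # true ATE = 2

# Outcome models m1, m0: linear regressions within each treatment arm
D = np.column_stack([np.ones(n), X])
b1, *_ = np.linalg.lstsq(D[A == 1], Y[A == 1], rcond=None)
b0, *_ = np.linalg.lstsq(D[A == 0], Y[A == 0], rcond=None)
m1, m0 = D @ b1, D @ b0

# AIPW: outcome-model predictions plus IPW-corrected residuals
ate = np.mean(m1 - m0
              + A * (Y - m1) / e
              - (1 - A) * (Y - m0) / (1 - e))
print("AIPW estimate of ATE:", round(ate, 2))  # ≈ 2
```

If either the outcome models or the propensity scores were misspecified (but not both), the estimator would remain consistent, which is the double-robustness property described above.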

Targeted maximum likelihood

  • Targeted maximum likelihood estimation (TMLE) is a doubly robust method that combines IPW with a targeted outcome model to optimize the bias-variance tradeoff in causal effect estimation
  • TMLE involves fitting an initial outcome model, updating the model using a targeting step that incorporates the propensity score, and obtaining the final causal effect estimate by averaging the targeted predictions
  • TMLE has been shown to have desirable statistical properties, including double robustness, asymptotic efficiency, and optimal bias-variance tradeoff, making it a promising approach for causal inference in observational studies
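The three steps above can be sketched for a binary outcome in plain NumPy (simulated data; the logistic fits are done by Newton-Raphson rather than a TMLE library such as `tmle3` or `zepid`, and the data-generating values are invented):

```python
import numpy as np

def expit(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(10)
n = 20000
X = rng.normal(size=n)
e = expit(X)                                 # true propensity (known here)
A = rng.binomial(1, e)
Y = rng.binomial(1, expit(-1 + A + X))       # binary outcome

# Step 1: initial outcome model Q(A, X) by logistic Newton-Raphson
Z = np.column_stack([np.ones(n), A, X])
beta = np.zeros(3)
for _ in range(25):
    p = expit(Z @ beta)
    beta += np.linalg.solve((Z.T * (p * (1 - p))) @ Z, Z.T @ (Y - p))
Q = expit(Z @ beta)
Q1 = expit(np.column_stack([np.ones(n), np.ones(n), X]) @ beta)
Q0 = expit(np.column_stack([np.ones(n), np.zeros(n), X]) @ beta)

# Step 2: targeting -- one-parameter fluctuation along the "clever
# covariate" H, with logit(Q) as a fixed offset
H = A / e - (1 - A) / (1 - e)
off = np.log(Q / (1 - Q))
eps = 0.0
for _ in range(25):
    p = expit(off + eps * H)
    eps += np.sum(H * (Y - p)) / np.sum(H**2 * p * (1 - p))

# Step 3: average the targeted predictions to get the ATE
Q1s = expit(np.log(Q1 / (1 - Q1)) + eps * (1 / e))
Q0s = expit(np.log(Q0 / (1 - Q0)) + eps * (-1 / (1 - e)))
ate = np.mean(Q1s - Q0s)
print("TMLE estimate of ATE:", round(ate, 3))
```

Because the initial outcome model is correctly specified here, the fluctuation parameter ends up near zero; the targeting step matters most when the initial fit is imperfect.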