Difference-in-differences (DiD) is a widely used method for estimating causal effects in observational studies. It compares changes in outcomes between treatment and control groups over time, which removes pre-existing level differences between the groups and time trends common to both.
DiD relies on the parallel trends assumption: without the intervention, the outcomes of both groups would have followed parallel paths. This method is widely used in policy evaluation, economics, and social sciences to assess the impact of interventions or policy changes.
Definition of DiD
- Difference-in-differences (DiD) is a quasi-experimental research design used to estimate the causal effect of a treatment or intervention on an outcome of interest
- DiD compares the change in outcomes over time between a treatment group and a control group, which removes both time-invariant differences between the groups and time shocks common to both groups
- The key identifying assumption in DiD is the parallel trends assumption, which states that in the absence of treatment, the average outcomes for the treatment and control groups would have followed parallel paths over time
Setup for DiD
- DiD requires panel data or repeated cross-sectional data with observations on both the treatment and control groups before and after the intervention
- The treatment group is exposed to the intervention or policy change at a specific point in time, while the control group remains unexposed throughout the study period
- The outcome variable of interest is measured for both groups at baseline (pre-intervention) and follow-up (post-intervention) periods
Parallel trends assumption
- The parallel trends assumption is crucial for the validity of DiD estimates
- It assumes that in the absence of the intervention, the average change in the outcome would have been the same for both the treatment and control groups
- This assumption allows researchers to use the control group's time trend as a counterfactual for the treatment group's time trend had they not received the intervention
- Violations of the parallel trends assumption can lead to biased estimates of the treatment effect
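In potential-outcomes notation, the assumption can be written as follows. The symbols are introduced here for illustration and do not come from the text above: $Y_t(0)$ is the outcome a unit would have in period $t$ without the intervention, and $D = 1$ marks the treatment group.

$$
E[Y_{\text{Post}}(0) - Y_{\text{Pre}}(0) \mid D = 1] = E[Y_{\text{Post}}(0) - Y_{\text{Pre}}(0) \mid D = 0]
$$

The left-hand side involves the treatment group's untreated outcome after the intervention, which is never observed. This is why the assumption cannot be tested directly and is instead probed with pre-intervention data.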
DiD estimation
- DiD estimates the causal effect of a treatment by comparing the change in outcomes before and after the intervention for the treatment group to the change in outcomes over the same period for the control group
- The DiD estimator is obtained by subtracting the change in the control group's average outcome from the change in the treatment group's average outcome
- DiD can be implemented using regression analysis, where the treatment effect is captured by the coefficient on the interaction term between the treatment group indicator and the post-intervention time period indicator
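One common way to write the regression is sketched below. The variable names $Treat_i$ and $Post_t$ are labels chosen here for the group and period indicators.

$$
Y_{it} = \beta_0 + \beta_1 \, Treat_i + \beta_2 \, Post_t + \beta_3 \, (Treat_i \times Post_t) + \varepsilon_{it}
$$

In this specification $\beta_0$ is the control group's pre-intervention mean, $\beta_1$ is the baseline gap between the groups, $\beta_2$ is the change over time shared by both groups, and $\beta_3$ is the DiD estimate.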
Simple two-period DiD
- In the simplest case, DiD involves two time periods (pre and post) and two groups (treatment and control)
- The DiD estimator is calculated as: $(\bar{Y}_{\text{Treatment, Post}} - \bar{Y}_{\text{Treatment, Pre}}) - (\bar{Y}_{\text{Control, Post}} - \bar{Y}_{\text{Control, Pre}})$, where $\bar{Y}$ represents the average outcome for each group and time period
- This estimator can be obtained from a regression of the outcome on indicators for treatment group, post-intervention period, and their interaction
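The equivalence between the difference of means and the regression coefficient can be checked on a toy dataset. The sketch below uses made-up numbers and only NumPy; the variable names are illustrative.

```python
import numpy as np

# Hypothetical toy data: one row per observation.
# treat = 1 for the treatment group, post = 1 for the post-intervention period.
treat = np.array([0, 0, 0, 0, 1, 1, 1, 1])
post = np.array([0, 0, 1, 1, 0, 0, 1, 1])
y = np.array([10.0, 12.0, 13.0, 15.0, 14.0, 16.0, 21.0, 23.0])

def cell_mean(g, t):
    """Average outcome for group g in period t."""
    return y[(treat == g) & (post == t)].mean()

# Difference of the two before/after changes.
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# Same quantity from OLS: y ~ 1 + treat + post + treat * post.
X = np.column_stack([np.ones_like(y), treat, post, treat * post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
did_ols = beta[3]  # coefficient on the interaction term

print(did_means, did_ols)
```

Here the treatment group's mean rises by 7 and the control group's by 3, so both calculations give a DiD estimate of 4.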
Multi-period DiD
- DiD can be extended to settings with multiple time periods before and after the intervention
- In this case, the DiD estimator is obtained by comparing the change in outcomes for the treatment group to the change in outcomes for the control group, averaged across all pre and post periods
- Multi-period DiD allows for more flexible time trends and can provide additional evidence on the dynamics of the treatment effect over time
Two-way fixed effects model
- DiD can be implemented using a two-way fixed effects regression model
- This model includes fixed effects for both units (e.g., individuals, firms, or states) and time periods, controlling for any time-invariant differences across units and any period-specific shocks common to all units
- The treatment effect is captured by the coefficient on the interaction term between the treatment group indicator and the post-intervention time period indicator
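- Two-way fixed effects models can accommodate multiple treatment and control groups, as well as staggered adoption of the treatment, although with staggered timing the coefficient is reliable only if treatment effects are homogeneous across units and over time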
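A minimal sketch of the two-way fixed effects estimator on a balanced panel follows. It removes unit and period effects by double demeaning rather than by adding dummy variables, which gives the same coefficient when the panel is balanced. The function name and the data are hypothetical, and the treatment effect is assumed constant.

```python
import numpy as np

def twfe_coefficient(Y, D):
    """Coefficient on D in y_it = a_i + g_t + beta * D_it + e_it.

    Y and D are (units x periods) arrays from a balanced panel. Unit and
    period fixed effects are removed by double demeaning.
    """
    def demean(A):
        return A - A.mean(axis=1, keepdims=True) - A.mean(axis=0, keepdims=True) + A.mean()

    Yd, Dd = demean(Y), demean(D)
    return (Dd * Yd).sum() / (Dd ** 2).sum()

# Hypothetical panel: 4 units, 4 periods; units 2 and 3 are treated from period 2 on.
unit_effects = np.array([1.0, 3.0, 5.0, 2.0])[:, None]
time_effects = np.array([0.0, 0.5, 1.5, 2.0])[None, :]
D = np.zeros((4, 4))
D[2:, 2:] = 1.0
true_effect = 2.5  # assumed constant effect, for illustration only
Y = unit_effects + time_effects + true_effect * D

beta_hat = twfe_coefficient(Y, D)
print(beta_hat)
```

Because the simulated outcome is exactly additive in unit effects, period effects, and a constant treatment effect, the estimator recovers 2.5.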
Key assumptions
- The validity of DiD estimates relies on several key assumptions that must be carefully evaluated in any application
Common trends assumption
- The common trends assumption, also known as the parallel trends assumption, is the most critical assumption for DiD
- It states that in the absence of the intervention, the average outcomes for the treatment and control groups would have followed parallel paths over time
- Under this assumption, any difference between the two groups' changes in the outcome over time is attributed to the intervention rather than to pre-existing differences in their time trends
- Researchers can assess the plausibility of this assumption by examining pre-intervention trends in the outcome for both groups
Stable unit treatment value assumption (SUTVA)
- SUTVA requires that the potential outcomes for any unit do not depend on the treatment status of other units
- In other words, there should be no spillover effects or interference between units
- Violations of SUTVA can occur when the treatment affects the outcomes of untreated units (e.g., through social interactions or general equilibrium effects)
- Addressing SUTVA violations may require redefining the unit of analysis or using alternative methods that explicitly model spillovers
No anticipation effects
- DiD assumes that units do not change their behavior in anticipation of future treatment
- Anticipation effects can lead to biased estimates if units in the treatment group adjust their outcomes before the intervention occurs
- To mitigate this concern, researchers can check for evidence of pre-trends or use an event study design to examine the dynamics of the treatment effect
Interpreting DiD estimates
- Under parallel trends, no anticipation, and SUTVA, the DiD estimate can be interpreted as the average treatment effect on the treated (ATT)
Average treatment effect (ATE)
- The ATE is the average effect of the treatment on the entire population, including both treated and untreated units
- DiD estimates the ATT, which is the average effect of the treatment on the units that actually received it
- If treatment effects are homogeneous (i.e., the same for all units), the ATT will be equal to the ATE
- However, if treatment effects are heterogeneous, the ATT may differ from the ATE
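The two estimands can be written out as below, with $Y(1)$ and $Y(0)$ denoting a unit's potential outcomes with and without treatment and $D = 1$ marking treated units. This notation is assumed here and is not used elsewhere in these notes.

$$
ATE = E[Y(1) - Y(0)], \qquad ATT = E[Y(1) - Y(0) \mid D = 1]
$$

As a hypothetical illustration, suppose 25% of units are treated and the treatment raises their outcome by 4, while it would have had no effect on the remaining 75%. Then the ATT is 4, but the ATE is $0.25 \times 4 + 0.75 \times 0 = 1$.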
Heterogeneous treatment effects
- Treatment effects may vary across different subgroups of the population (e.g., by age, gender, or income level)
- DiD can be used to estimate heterogeneous treatment effects by interacting the treatment indicator with indicators for different subgroups
- Examining heterogeneous treatment effects can provide insights into the distributional impacts of the intervention and help identify subpopulations that benefit most from the treatment
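A simple way to see subgroup-specific effects is to compute the DiD separately within each subgroup, which matches what a fully interacted regression would return. The sketch below uses invented data with one observation per cell; the subgroup labels are hypothetical.

```python
import numpy as np

def did(y, treat, post):
    """2x2 difference-in-differences from observation-level arrays."""
    def cell(g, t):
        return y[(treat == g) & (post == t)].mean()
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))

# Hypothetical data: subgroup 0 and subgroup 1 (e.g., two income levels).
sub = np.array([0, 0, 0, 0, 1, 1, 1, 1])
treat = np.array([0, 0, 1, 1, 0, 0, 1, 1])
post = np.array([0, 1, 0, 1, 0, 1, 0, 1])
y = np.array([10.0, 11.0, 12.0, 14.0, 20.0, 22.0, 21.0, 27.0])

# Subgroup-specific estimates.
did_by_sub = {s: did(y[sub == s], treat[sub == s], post[sub == s]) for s in (0, 1)}

# Pooled estimate that ignores the subgroups.
did_pooled = did(y, treat, post)

print(did_by_sub, did_pooled)
```

In this example the estimate is 1 for subgroup 0 and 4 for subgroup 1, while the pooled estimate of 2.5 hides the difference.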
Robustness checks
- Researchers should conduct various robustness checks to assess the sensitivity of DiD estimates to alternative specifications and assumptions
Testing for pre-trends
- One key robustness check is to test for the presence of pre-trends in the outcome variable
- This can be done by estimating a model that includes leads of the treatment indicator (i.e., interactions between the treatment group indicator and indicators for time periods before the intervention)
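- If the coefficients on the lead terms are statistically significant, it suggests that the parallel trends assumption may be violated; insignificant leads are consistent with the assumption but do not prove it, especially when the test has low power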
- Event study designs, which estimate treatment effects for each time period relative to the intervention, can also be used to visually assess pre-trends
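The event study idea can be sketched with group means alone: take the gap between treated and control units in each period and subtract the gap in a reference pre-period. With a single treated cohort, these values correspond to the coefficients on group-by-period interactions with the reference period omitted. The data and function name below are hypothetical.

```python
import numpy as np

def event_study(Y, treated, ref_period):
    """Treated-minus-control gap in each period, relative to the reference period.

    Y is a (units x periods) array and treated is a boolean vector over units.
    """
    gap = Y[treated].mean(axis=0) - Y[~treated].mean(axis=0)
    return gap - gap[ref_period]

# Hypothetical data: 5 periods, intervention starts in period 3.
Y = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],    # control
    [2.0, 3.0, 4.0, 5.0, 6.0],    # control
    [4.0, 5.0, 6.0, 9.0, 10.0],   # treated
    [5.0, 6.0, 7.0, 10.0, 11.0],  # treated
])
treated = np.array([False, False, True, True])

# Use the last pre-intervention period (index 2) as the reference.
coefs = event_study(Y, treated, ref_period=2)
print(coefs)
```

The pre-period values are all zero here, which is what parallel pre-trends look like, and the post-period values show an effect of 2. In real data the pre-period values would be plotted with confidence intervals.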
Placebo tests
- Placebo tests involve estimating DiD models using outcomes or treatments that should not be affected by the intervention
- For example, researchers can estimate the effect of the intervention on an outcome that is known to be unrelated to the treatment
- If the DiD estimate is statistically significant in the placebo test, it suggests that the main results may be driven by unobserved confounders or other sources of bias
- Placebo tests can also be conducted by randomly assigning the treatment to different units or time periods and checking that the DiD estimate is close to zero
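The random-reassignment placebo can be sketched as a permutation exercise: reshuffle the treatment labels many times, recompute the DiD each time, and compare the actual estimate to that distribution. Everything below is simulated, and the sample sizes, effect size, and number of draws are arbitrary assumptions.

```python
import numpy as np

def did_from_panel(Y, treated, post):
    """2x2 DiD from a (units x periods) array using pre/post averages."""
    change = Y[:, post].mean(axis=1) - Y[:, ~post].mean(axis=1)
    return change[treated].mean() - change[~treated].mean()

def permutation_placebo(Y, treated, post, n_draws=500, seed=0):
    """DiD estimates obtained after randomly reshuffling treatment labels."""
    rng = np.random.default_rng(seed)
    return np.array([
        did_from_panel(Y, rng.permutation(treated), post) for _ in range(n_draws)
    ])

# Simulated data (all numbers are assumptions made for the illustration).
rng = np.random.default_rng(42)
n_units, n_periods = 40, 6
treated = np.arange(n_units) < 10
post = np.arange(n_periods) >= 3
Y = (
    rng.normal(size=(n_units, 1))                 # unit effects
    + np.linspace(0.0, 1.0, n_periods)[None, :]   # common time trend
    + 3.0 * np.outer(treated, post)               # true effect of 3
    + rng.normal(scale=0.5, size=(n_units, n_periods))
)

observed = did_from_panel(Y, treated, post)
placebo = permutation_placebo(Y, treated, post)

# Share of placebo estimates at least as large in magnitude as the real one.
p_value = np.mean(np.abs(placebo) >= abs(observed))
print(observed, placebo.mean(), p_value)
```

The placebo estimates should be centered near zero, and an actual estimate far in the tail of that distribution is harder to attribute to chance.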
Triple differences (DDD)
- Triple differences (DDD) is an extension of DiD that involves comparing the DiD estimates across two subgroups that are differentially affected by the treatment
- DDD can help control for time-varying confounders that affect both the treatment and control groups, as long as these confounders do not have a differential impact on the two subgroups
- For example, if a policy change affects one age group (e.g., older workers) but not another (e.g., younger workers), DDD can be used to estimate the treatment effect by comparing the DiD estimates for the two age groups
- DDD requires additional assumptions and may not always be feasible, depending on the available data and the nature of the intervention
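The older-versus-younger worker example can be turned into a small calculation. The sketch below assumes a 2x2x2 array of cell means laid out as [state, subgroup, period]; both the layout and the numbers are invented for illustration.

```python
import numpy as np

def ddd(cell_means):
    """Triple difference from a 2x2x2 array of cell means.

    Assumed axis layout: [state, subgroup, period], where index 1 means
    treated state, affected subgroup, and post period respectively.
    """
    change = cell_means[:, :, 1] - cell_means[:, :, 0]  # before/after change per cell
    did_affected = change[1, 1] - change[0, 1]          # DiD among the affected subgroup
    did_unaffected = change[1, 0] - change[0, 0]        # DiD among the unaffected subgroup
    return did_affected - did_unaffected

# Hypothetical means: a common change of +1, a shock of +2 hitting everyone
# in the treated state, and a policy effect of +3 on older workers only.
cell_means = np.array([
    [[10.0, 11.0], [12.0, 13.0]],  # comparison state: younger, older
    [[9.0, 12.0], [11.0, 17.0]],   # treated state: younger, older
])

change = cell_means[:, :, 1] - cell_means[:, :, 0]
did_older_only = change[1, 1] - change[0, 1]
ddd_estimate = ddd(cell_means)
print(did_older_only, ddd_estimate)
```

A plain DiD on older workers gives 5 because it absorbs the state-specific shock, while the triple difference removes that shock and returns 3.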
Extensions of DiD
- Several extensions of the basic DiD framework have been developed to address specific challenges and improve the flexibility of the method
Staggered adoption DiD
- In many settings, the treatment is adopted by different units at different times, rather than being implemented simultaneously for all treated units
- Staggered adoption DiD methods can be used to estimate treatment effects in this context
- These methods typically involve estimating a two-way fixed effects model that includes indicators for each unit and time period, as well as interactions between the treatment indicator and indicators for the time periods since adoption
- Staggered adoption DiD requires additional assumptions, such as the absence of anticipation effects and the homogeneity of treatment effects across units and time
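The "time since adoption" indicators mentioned above can be built from each unit's adoption period. The sketch below is one possible construction; the convention that -1 marks never-treated units and the binning of long lags into the last indicator are assumptions made here.

```python
import numpy as np

def event_time_dummies(adoption, n_periods, max_lag):
    """Indicators for periods since adoption in a staggered design.

    adoption[i] is the first treated period of unit i, or -1 if the unit is
    never treated (a convention assumed here). Returns a boolean array of
    shape (units, periods, max_lag + 1); slice k flags observations k periods
    after adoption, with lags beyond max_lag binned into the last slice.
    """
    adoption = np.asarray(adoption)
    periods = np.arange(n_periods)
    rel = periods[None, :] - adoption[:, None]   # event time of each unit-period cell
    ever_treated = (adoption >= 0)[:, None]
    active = ever_treated & (rel >= 0)           # cells at or after adoption
    rel_binned = np.minimum(rel, max_lag)
    lags = np.arange(max_lag + 1)
    return active[:, :, None] & (rel_binned[:, :, None] == lags)

# Hypothetical example: unit 0 adopts in period 1, unit 1 in period 3,
# and unit 2 is never treated.
dummies = event_time_dummies([1, 3, -1], n_periods=5, max_lag=2)
print(dummies.astype(int))
```

These indicators would then enter a regression alongside unit and period fixed effects; never-treated units have all indicators equal to zero.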
Synthetic control methods vs DiD
- Synthetic control methods (SCM) are an alternative approach to estimating treatment effects in settings with a single treated unit and multiple control units
- SCM constructs a weighted average of the control units (the synthetic control) that best resembles the treated unit in terms of pre-intervention outcomes and other relevant characteristics
- The treatment effect is then estimated by comparing the post-intervention outcomes of the treated unit to those of the synthetic control
- SCM can be seen as a generalization of DiD, as it allows for more flexible weighting of the control units and can accommodate multiple pre-intervention time periods
- However, SCM may not be feasible when there are multiple treated units or when the available control units do not provide a good match for the treated unit
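The weighting step can be sketched as a constrained least-squares problem: choose non-negative weights that sum to one so that the weighted control units track the treated unit before the intervention. The code below solves it with projected gradient descent, a simple stand-in for the optimizers used in practice, and matches on pre-intervention outcomes only. All data and function names are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def synthetic_control_weights(X0, x1, n_iter=5000):
    """Weights minimizing ||x1 - X0 @ w||^2 over the simplex.

    X0: (pre-periods x control units) outcomes; x1: treated unit's pre-period outcomes.
    """
    n = X0.shape[1]
    w = np.full(n, 1.0 / n)                           # start from equal weights
    step = 1.0 / (2.0 * np.linalg.norm(X0, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = 2.0 * X0.T @ (X0 @ w - x1)
        w = project_to_simplex(w - step * grad)
    return w

# Hypothetical pre-intervention outcomes: 4 periods, 3 control units.
X0 = np.array([
    [1.0, 2.0, 4.0],
    [2.0, 1.0, 3.0],
    [3.0, 3.0, 1.0],
    [2.0, 4.0, 2.0],
])
# For the illustration, the treated unit is an exact mix of the controls.
x1 = X0 @ np.array([0.5, 0.3, 0.2])

w = synthetic_control_weights(X0, x1)

# Post-intervention comparison with made-up outcomes.
y0_post = np.array([3.0, 5.0, 2.0])
y1_post = 5.0
gap = y1_post - y0_post @ w
print(w, gap)
```

Because the treated unit here is constructed as an exact weighted average of the controls, the procedure recovers weights close to (0.5, 0.3, 0.2) and an estimated effect of about 1.6. With real data the pre-intervention fit is rarely exact and should be inspected before interpreting the gap.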
Dynamic treatment effects
- In some cases, the effect of the treatment may vary over time, rather than being constant
- Dynamic DiD models can be used to estimate the evolution of the treatment effect by including interactions between the treatment indicator and indicators for different post-intervention time periods
- These models can help capture any nonlinearities or time-varying patterns in the treatment effect
- Event study designs, which estimate treatment effects for each time period relative to the intervention, are a common way to visualize dynamic treatment effects
- Estimating dynamic treatment effects requires additional assumptions, such as the absence of anticipation effects and the comparability of the treatment and control groups over time
Limitations of DiD
- Despite its widespread use, DiD has several limitations that researchers should be aware of when applying the method
Violations of parallel trends
- The most critical assumption for the validity of DiD estimates is the parallel trends assumption
- If the treatment and control groups have different pre-intervention trends in the outcome variable, DiD estimates may be biased
- Violations of parallel trends can occur due to time-varying confounders that affect the two groups differently, or due to anticipation effects that cause the treatment group to change its behavior before the intervention
- Researchers should carefully assess the plausibility of the parallel trends assumption using visual inspection, pre-trend tests, and other diagnostic tools
Time-varying confounders
- DiD controls for time-invariant confounders by comparing changes over time, but it does not account for time-varying confounders that affect the treatment and control groups differently
- If there are unobserved factors that change over time and have a differential impact on the two groups, DiD estimates may be biased
- Addressing time-varying confounders may require additional data or a different identification strategy, such as instrumental variables or regression discontinuity designs
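The size of the resulting bias can be seen in a noise-free example where the treated group would have grown faster than the control group even without the intervention. The numbers below are invented for the illustration.

```python
import numpy as np

periods = np.arange(4)   # periods 0-1 are pre, 2-3 are post
post = periods >= 2
true_effect = 3.0        # assumed effect, for illustration only

# Untreated paths: the treated group trends upward faster than the control group.
control = 10.0 + 1.0 * periods
treated_untreated_path = 12.0 + 2.0 * periods
treated = treated_untreated_path + true_effect * post

def did(treated_series, control_series):
    """DiD using the average of pre periods and the average of post periods."""
    change_t = treated_series[post].mean() - treated_series[~post].mean()
    change_c = control_series[post].mean() - control_series[~post].mean()
    return change_t - change_c

estimate = did(treated, control)
bias = estimate - true_effect
print(estimate, bias)
```

The estimate is 5 even though the true effect is 3. The extra 2 is the difference in the groups' untreated changes, which DiD wrongly attributes to the treatment.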
Spillover effects
- DiD assumes that there are no spillover effects or interference between units (SUTVA)
- However, in some settings, the treatment may affect the outcomes of untreated units, leading to biased estimates
- Spillover effects can occur through various channels, such as social interactions, economic linkages, or general equilibrium effects
- Addressing spillover effects may require redefining the unit of analysis (e.g., using larger geographic units) or explicitly modeling the interactions between units
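A stylized calculation shows the direction of the bias. Suppose, as a simplifying assumption, that both groups share a common change $\Delta$, that the treatment shifts treated outcomes by $\tau$, and that it also shifts control outcomes by a spillover $s$. The symbols are introduced here for illustration. The DiD comparison then yields:

$$
(\Delta + \tau) - (\Delta + s) = \tau - s
$$

A positive spillover onto the control group therefore understates the effect, while a negative spillover (for example, displacement of activity away from untreated units) overstates it.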
Applications of DiD
- DiD has been widely applied in various fields, including economics, public policy, and social sciences
Policy evaluation examples
- DiD is commonly used to evaluate the impact of policy changes, such as minimum wage increases, health insurance expansions, or environmental regulations
- For example, Card and Krueger (1994) used DiD to estimate the effect of a minimum wage increase in New Jersey on employment in fast-food restaurants, comparing the change in employment to that in neighboring Pennsylvania, which did not increase its minimum wage
- Other notable policy evaluations include the effects of the Clean Air Act Amendments on air quality and infant health (Chay and Greenstone, 2003). By contrast, the Oregon Health Insurance Experiment (Finkelstein et al., 2012) evaluated a Medicaid expansion using a randomized lottery rather than DiD
DiD with panel data
- DiD is often applied to panel data, which consists of repeated observations on the same units over time
- Panel data allows researchers to control for unit-specific fixed effects, which capture any time-invariant differences between the treatment and control groups
- Panel data also enables the estimation of dynamic treatment effects and the examination of pre-trends and post-treatment dynamics
- However, panel data may be subject to attrition bias if units drop out of the sample over time, which can lead to biased DiD estimates if attrition is related to the treatment
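With a two-period panel, the role of unit fixed effects can be seen by first-differencing: subtracting each unit's pre-period outcome removes anything about that unit that does not change over time. The data below are hypothetical.

```python
import numpy as np

# Hypothetical two-period panel: each position is one unit observed twice.
y_pre = np.array([5.0, 7.0, 6.0, 9.0, 8.0, 10.0])
y_post = np.array([6.0, 8.0, 7.0, 13.0, 12.0, 14.0])
treated = np.array([False, False, False, True, True, True])

# First-differencing removes each unit's time-invariant level.
delta = y_post - y_pre
did_fd = delta[treated].mean() - delta[~treated].mean()

# Adding an arbitrary unit-specific constant to both periods changes nothing.
unit_shift = np.array([100.0, -3.0, 0.5, 20.0, -7.0, 1.0])
delta_shifted = (y_post + unit_shift) - (y_pre + unit_shift)
did_shifted = delta_shifted[treated].mean() - delta_shifted[~treated].mean()

print(did_fd, did_shifted)
```

Both calculations give 3, which illustrates that the estimate does not depend on unit-level differences that are constant over time.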
DiD with repeated cross-sections
- In some cases, panel data may not be available, but researchers can still apply DiD using repeated cross-sections of data
- Repeated cross-sections consist of different samples of units drawn from the same population at different points in time
- DiD with repeated cross-sections involves comparing the change in outcomes for the treatment group to the change in outcomes for the control group, using the cross-sectional samples before and after the intervention
- This approach requires the additional assumption that the composition of the treatment and control groups remains stable over time, or that any changes in composition are uncorrelated with the treatment
- DiD with repeated cross-sections is commonly used in settings where panel data is not feasible, such as when evaluating the impact of policy changes on population-level outcomes using survey data