🎳Intro to Econometrics Unit 9 Review

9.2 Instrumental variables

Written by the Fiveable Content Team • Last updated September 2025

Instrumental variables are a powerful tool in econometrics for estimating causal relationships when endogeneity is present. This technique uses an external variable to isolate exogenous variation in the explanatory variable, allowing researchers to obtain consistent estimates of causal effects.

The instrumental variables approach relies on finding a variable that meets specific conditions: relevance, exclusion restriction, and exogeneity. By using methods like two-stage least squares, economists can leverage natural experiments, lagged variables, or geographical variations to estimate causal effects in various fields.

Instrumental variables overview

Instrumental variables (IV) is an econometric technique used to estimate causal relationships when a regression model contains endogenous explanatory variables
The IV approach addresses the endogeneity problem and yields consistent estimates of the causal effect of an explanatory variable on the dependent variable

Endogeneity problem

Endogeneity occurs when an explanatory variable is correlated with the error term in a regression model
Causes of endogeneity include omitted variable bias, measurement error, and simultaneity
Endogeneity leads to biased and inconsistent estimates of the causal effect of the explanatory variable on the dependent variable
Example: Estimating the effect of education on earnings, where unobserved ability is correlated with both education and earnings (omitted variable bias)
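
To make the education example concrete, here is a minimal simulation sketch, not an analysis of real data. The data-generating process, the coefficient values (a true return of 1.0 and an ability effect of 2.0), and the function names are all assumptions chosen for illustration. Under this design the population OLS slope is 1 + 2 × Cov(education, ability) / Var(education) = 2.0, so omitting ability roughly doubles the estimated return.

```python
import random

def simulate_earnings(n=20000, seed=0):
    """Simulate (education, earnings) with an unobserved ability term.

    Hypothetical data-generating process (assumed for illustration):
        education = ability + noise
        earnings  = 1.0 * education + 2.0 * ability + noise
    """
    rng = random.Random(seed)
    education, earnings = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)                 # never observed by the analyst
        edu = ability + rng.gauss(0, 1)
        earn = 1.0 * edu + 2.0 * ability + rng.gauss(0, 1)
        education.append(edu)
        earnings.append(earn)
    return education, earnings

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def ols_slope(x, y):
    """Slope from regressing y on a constant and x."""
    return cov(x, y) / cov(x, x)

education, earnings = simulate_earnings()
print(f"true effect: 1.0, OLS slope: {ols_slope(education, earnings):.3f}")
```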

Causal inference challenges

Establishing causal relationships is crucial for policy analysis and decision-making in economics
Randomized controlled trials (RCTs) are the gold standard for causal inference but are not always feasible or ethical
Observational data often suffers from endogeneity issues, making it difficult to identify causal effects
Instrumental variables provide a way to estimate causal effects using observational data when certain assumptions are met

Instrumental variables approach

The instrumental variables approach relies on finding a variable (the instrument) that is correlated with the endogenous explanatory variable but uncorrelated with the error term
The instrument affects the dependent variable only through its effect on the endogenous explanatory variable
The IV approach estimates the causal effect of the explanatory variable on the dependent variable by using the variation in the explanatory variable that is driven by the instrument
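
For a single instrument and a single endogenous regressor, the logic above can be written as a short derivation. The notation (y, x, z, u, and the coefficients β) is introduced here for illustration and does not come from the original guide.

$$
y_i = \beta_0 + \beta_1 x_i + u_i, \qquad \operatorname{Cov}(z_i, u_i) = 0, \qquad \operatorname{Cov}(z_i, x_i) \neq 0
$$

$$
\operatorname{Cov}(z_i, y_i) = \beta_1 \operatorname{Cov}(z_i, x_i) + \operatorname{Cov}(z_i, u_i) = \beta_1 \operatorname{Cov}(z_i, x_i)
\quad\Longrightarrow\quad
\beta_1 = \frac{\operatorname{Cov}(z_i, y_i)}{\operatorname{Cov}(z_i, x_i)}
$$

Replacing the population covariances with sample covariances gives the IV estimator. The denominator shows why the instrument must be correlated with x: when Cov(z, x) is close to zero, the ratio becomes unstable.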

Relevance condition

The relevance condition requires that the instrument is correlated with the endogenous explanatory variable
A weak correlation between the instrument and the endogenous explanatory variable creates weak instrument problems: 2SLS estimates are biased toward the OLS estimates and standard errors become unreliable
The first-stage regression in the two-stage least squares (2SLS) procedure tests the relevance condition
Example: Using the distance to the nearest college as an instrument for education, the instrument should be correlated with the individual's level of education

Exclusion restriction

The exclusion restriction assumes that the instrument affects the dependent variable only through its effect on the endogenous explanatory variable
The instrument should not have a direct effect on the dependent variable or be correlated with any omitted variables that affect the dependent variable
Violations of the exclusion restriction can lead to biased estimates of the causal effect
Example: The distance to the nearest college (instrument) should not directly affect an individual's earnings (dependent variable), except through its effect on education (endogenous explanatory variable)

Exogeneity assumption

The exogeneity assumption requires that the instrument is uncorrelated with the error term in the regression model
This assumption ensures that the variation in the endogenous explanatory variable captured by the instrument is exogenous and not related to unobserved factors affecting the dependent variable
Violations of the exogeneity assumption can lead to biased estimates of the causal effect
Example: The distance to the nearest college (instrument) should not be correlated with unobserved factors (e.g., family background) that affect both education and earnings
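
The consequences of violating these conditions can be illustrated with a small simulation. This is a sketch under assumed values: the data-generating process, the `direct_effect` parameter, and the function names are hypothetical. When the instrument has no direct effect on the outcome, the IV ratio recovers the true effect of 1.0; with a direct effect of 0.4 and a first-stage coefficient of 0.8, the ratio converges to 1.0 + 0.4 / 0.8 = 1.5 instead.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def iv_estimate(y, x, z):
    """Single-instrument IV estimator: Cov(z, y) / Cov(z, x)."""
    return cov(z, y) / cov(z, x)

def simulate(direct_effect, n=20000, seed=1):
    """Hypothetical DGP; `direct_effect` is the instrument's own effect on y.

    Any nonzero value violates the exclusion restriction.
    """
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)              # instrument
        ability = rng.gauss(0, 1)         # unobserved confounder
        xi = 0.8 * zi + ability + rng.gauss(0, 1)
        yi = 1.0 * xi + 2.0 * ability + direct_effect * zi + rng.gauss(0, 1)
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z

valid = iv_estimate(*simulate(direct_effect=0.0))
invalid = iv_estimate(*simulate(direct_effect=0.4))
print(f"valid instrument:   {valid:.3f}")
print(f"invalid instrument: {invalid:.3f}")
```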

Types of instrumental variables

Various types of instrumental variables can be used depending on the research question and available data
The choice of instrument should be based on theoretical arguments and empirical evidence supporting the relevance and exclusion restriction assumptions

Natural experiments

Natural experiments are events or policies that create exogenous variation in the endogenous explanatory variable
These events or policies are often used as instruments because they are unlikely to be related to unobserved factors affecting the dependent variable
Examples of natural experiments include policy changes, natural disasters, and birthplace or birth timing
Example: Using the Vietnam War draft lottery as an instrument for military service to estimate the effect of military service on future earnings

Lagged variables

Lagged values of the endogenous explanatory variable or other variables can sometimes be used as instruments
The idea is that past values of a variable may be correlated with its current value but uncorrelated with the current error term
Lagged variables are more likely to be valid instruments in settings with time-series or panel data
Example: Using lagged advertising expenditure as an instrument for current advertising expenditure to estimate the effect of advertising on sales

Geographical variations

Geographical variations in policies, infrastructure, or other factors can be used as instruments
The idea is that these variations are often determined by historical or institutional factors that are exogenous to the individuals or firms being studied
Geographical instruments are more likely to be valid in settings with cross-sectional or panel data
Example: Using the presence of a land-grant college in a county as an instrument for individual education to estimate the effect of education on earnings

Two-stage least squares (2SLS)

Two-stage least squares (2SLS) is a commonly used estimation procedure for instrumental variables regression
2SLS involves two regression stages that aim to isolate the exogenous variation in the endogenous explanatory variable and use it to estimate the causal effect on the dependent variable

First stage regression

In the first stage regression, the endogenous explanatory variable is regressed on the instrument(s) and any other exogenous control variables
The predicted values from this regression represent the exogenous variation in the endogenous explanatory variable that is driven by the instrument(s)
The first stage regression tests the relevance condition and provides evidence on the strength of the instrument(s)
Example: Regressing education on the distance to the nearest college and other control variables to obtain predicted values of education

Second stage regression

In the second stage regression, the dependent variable is regressed on the predicted values of the endogenous explanatory variable from the first stage and any other exogenous control variables
The coefficient on the predicted values of the endogenous explanatory variable represents the causal effect of interest
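Standard errors from a second stage that is run by hand are incorrect because its residuals are computed from the predicted values rather than the actual values of the endogenous explanatory variable; dedicated 2SLS routines make this adjustment automatically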
Example: Regressing earnings on the predicted values of education from the first stage and other control variables to estimate the causal effect of education on earnings
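
The two stages can be sketched in a few lines for the simplest case of one endogenous regressor, one instrument, and no additional controls. The simulated data and all function names are assumptions for illustration, and the standard error assumes homoskedastic errors. Note that the residuals are computed with the actual regressor rather than the first-stage fitted values, which is the adjustment that a naive second-stage regression misses.

```python
import random

def mean(v):
    return sum(v) / len(v)

def simple_ols(x, y):
    """Regress y on a constant and x; return (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(y, x, z):
    """Return (beta_hat, standard_error) for one regressor and one instrument."""
    # Stage 1: regress the endogenous regressor on the instrument.
    a1, b1 = simple_ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the first-stage fitted values.
    a2, b2 = simple_ols(x_hat, y)
    # Residuals for the standard error use the actual x, not x_hat.
    resid = [yi - a2 - b2 * xi for yi, xi in zip(y, x)]
    sigma2 = sum(r * r for r in resid) / (len(y) - 2)
    mxh = mean(x_hat)
    se = (sigma2 / sum((v - mxh) ** 2 for v in x_hat)) ** 0.5
    return b2, se

def simulate(n=20000, seed=2):
    """Hypothetical DGP with a true effect of 1.0 and unobserved ability."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ability = rng.gauss(0, 1)
        xi = 0.8 * zi + ability + rng.gauss(0, 1)
        yi = 1.0 * xi + 2.0 * ability + rng.gauss(0, 1)
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z

beta_hat, se = two_stage_least_squares(*simulate())
print(f"2SLS estimate: {beta_hat:.3f} (se {se:.3f})")
```

With a single instrument, the second-stage slope is algebraically identical to the ratio Cov(z, y) / Cov(z, x).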

Interpreting 2SLS estimates

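When treatment effects are heterogeneous and the instrument moves everyone's treatment in the same direction (monotonicity), the 2SLS estimate can be interpreted as the local average treatment effect (LATE) for the subpopulation of individuals whose treatment status is changed by the instrument (compliers)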
The LATE may differ from the average treatment effect (ATE) for the entire population if the treatment effect is heterogeneous
The interpretation of the 2SLS estimates depends on the validity of the instrument and the assumptions underlying the IV approach
Example: The 2SLS estimate of the effect of education on earnings represents the average return to education for individuals whose education level was affected by their distance to the nearest college
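
The complier interpretation can be illustrated with a binary instrument and a binary treatment. The sketch below uses a hypothetical population (the type shares and effect sizes are assumptions): compliers have an effect of 2.0 and always-takers an effect of 5.0. The Wald estimator, which coincides with 2SLS when the instrument is binary, has a population value of (1.0) / (0.5) = 2.0 here, the complier effect, rather than an average over all types.

```python
import random

def simulate_lottery(n=50000, seed=3):
    """Hypothetical population with heterogeneous treatment effects.

    Assumed shares: 50% compliers (effect 2.0),
    25% always-takers (effect 5.0), 25% never-takers.
    Returns a list of (z, d, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0    # randomly assigned instrument
        u = rng.random()
        if u < 0.5:
            d, effect = z, 2.0                # complier: treated only if z == 1
        elif u < 0.75:
            d, effect = 1, 5.0                # always-taker
        else:
            d, effect = 0, 0.0                # never-taker: effect never realized
        y = effect * d + rng.gauss(0, 1)
        rows.append((z, d, y))
    return rows

def wald_estimate(rows):
    """(E[y | z=1] - E[y | z=0]) / (E[d | z=1] - E[d | z=0])."""
    def group_mean(column, z_value):
        values = [r[column] for r in rows if r[0] == z_value]
        return sum(values) / len(values)
    reduced_form = group_mean(2, 1) - group_mean(2, 0)
    first_stage = group_mean(1, 1) - group_mean(1, 0)
    return reduced_form / first_stage

print(f"Wald / IV estimate: {wald_estimate(simulate_lottery()):.3f}")
```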

Validity tests for instruments

Several tests can be used to assess the validity of instruments and the assumptions underlying the IV approach
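Relevance can be tested directly with the first-stage regression, while the exclusion restriction and exogeneity can only be checked indirectly (for example, with overidentifying restrictions tests) and must ultimately be argued on theoretical grounds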

Weak instruments problem

Weak instruments are instruments that are only weakly correlated with the endogenous explanatory variable
Weak instruments can lead to 2SLS estimates that are biased toward the OLS estimates and to unreliable inference
The first-stage F-statistic and the Cragg-Donald Wald F-statistic can be used to test for weak instruments
A rule of thumb is that the first-stage F-statistic should be greater than 10 for the instruments to be considered strong
Example: Testing whether the distance to the nearest college is a strong instrument for education by examining the first-stage F-statistic
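
As a rough illustration of the rule of thumb, the first-stage F-statistic for a single instrument can be computed as the squared t-statistic on the instrument. The sketch below assumes homoskedastic errors and no control variables; the simulated data, the `strength` parameter, and the function names are hypothetical.

```python
import random

def first_stage_f(x, z):
    """F-statistic for z in a regression of x on a constant and z.

    With a single instrument this equals the squared t-statistic on z
    (homoskedastic standard errors are assumed).
    """
    n = len(x)
    mz, mx = sum(z) / n, sum(x) / n
    szz = sum((a - mz) ** 2 for a in z)
    szx = sum((a - mz) * (b - mx) for a, b in zip(z, x))
    slope = szx / szz
    intercept = mx - slope * mz
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(z, x))
    var_slope = (ssr / (n - 2)) / szz
    return slope ** 2 / var_slope

def simulate_first_stage(strength, n=1000, seed=4):
    """Hypothetical first stage: x = strength * z + noise."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [strength * zi + rng.gauss(0, 1) for zi in z]
    return x, z

for label, strength in [("strong", 0.5), ("weak", 0.02)]:
    f_stat = first_stage_f(*simulate_first_stage(strength))
    print(f"{label} instrument: F = {f_stat:.1f}, passes F > 10 rule: {f_stat > 10}")
```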

Overidentifying restrictions test

When there are more instruments than endogenous explanatory variables (overidentified model), the overidentifying restrictions test can be used to assess the validity of the instruments
The test checks whether the instruments are uncorrelated with the error term in the second stage regression
Rejecting the null hypothesis of the test suggests that at least one of the instruments is not valid
The Sargan test (which assumes homoskedastic errors) and the Hansen J test (which is robust to heteroskedasticity) are commonly used overidentifying restrictions tests
Example: Using the Sargan test to check the validity of multiple instruments (e.g., distance to the nearest college and local unemployment rate) for education
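
One common way to compute the Sargan statistic is n × R² from a regression of the 2SLS residuals on all instruments. The sketch below does this for one endogenous regressor and two instruments, assuming homoskedastic errors and no controls. The data-generating process and function names are hypothetical; setting `direct_effect_z2` to a nonzero value makes the second instrument invalid. With two instruments and one endogenous regressor the statistic is compared with a chi-squared distribution with 1 degree of freedom (5% critical value about 3.84).

```python
import random

def mean(v):
    return sum(v) / len(v)

def demean(v):
    m = mean(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def fitted_two_regressors(y, z1, z2):
    """Fitted values from OLS of y on a constant, z1 and z2."""
    yd, d1, d2 = demean(y), demean(z1), demean(z2)
    s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * dot(d1, yd) - s12 * dot(d2, yd)) / det
    b2 = (s11 * dot(d2, yd) - s12 * dot(d1, yd)) / det
    my = mean(y)
    return [my + b1 * p + b2 * q for p, q in zip(d1, d2)]

def sargan_statistic(y, x, z1, z2):
    """n * R^2 from regressing the 2SLS residuals on both instruments."""
    x_hat = fitted_two_regressors(x, z1, z2)             # first stage
    xh, yd = demean(x_hat), demean(y)
    beta = dot(xh, yd) / dot(xh, xh)                     # second-stage slope
    alpha = mean(y) - beta * mean(x)
    resid = [yi - alpha - beta * xi for yi, xi in zip(y, x)]  # uses actual x
    fitted = demean(fitted_two_regressors(resid, z1, z2))
    centered = demean(resid)
    r_squared = dot(fitted, fitted) / dot(centered, centered)
    return len(y) * r_squared

def simulate(direct_effect_z2, n=5000, seed=5):
    """Hypothetical DGP; a nonzero `direct_effect_z2` makes z2 invalid."""
    rng = random.Random(seed)
    y, x, z1, z2 = [], [], [], []
    for _ in range(n):
        a, b, ability = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xi = 0.6 * a + 0.6 * b + ability + rng.gauss(0, 1)
        yi = 1.0 * xi + 2.0 * ability + direct_effect_z2 * b + rng.gauss(0, 1)
        y.append(yi)
        x.append(xi)
        z1.append(a)
        z2.append(b)
    return y, x, z1, z2

print(f"both instruments valid: {sargan_statistic(*simulate(0.0)):.2f}")
print(f"z2 affects y directly:  {sargan_statistic(*simulate(0.5)):.2f}")
```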

Hausman test for endogeneity

The Hausman test can be used to test for the presence of endogeneity in the explanatory variable
The test compares the OLS and 2SLS estimates and checks whether their difference is statistically significant
Rejecting the null hypothesis of the test suggests that the explanatory variable is endogenous and the IV approach is necessary
The Hausman test can help justify the use of instrumental variables in a regression analysis
Example: Using the Hausman test to check whether education is endogenous in the earnings regression and whether the IV approach is necessary
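
The comparison of OLS and IV estimates can be sketched for one regressor and one instrument. This version assumes homoskedastic errors and uses the OLS error-variance estimate in both variances, a common choice that keeps the variance difference non-negative; it is one variant of the test, not the only one. The data-generating process, the `confounding` parameter, and the function names are hypothetical. Under the null that the regressor is exogenous, the statistic is asymptotically chi-squared with 1 degree of freedom.

```python
import random

def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def hausman_statistic(y, x, z):
    """(b_IV - b_OLS)^2 / (Var(b_IV) - Var(b_OLS)) for a single regressor."""
    n = len(y)
    xd, yd, zd = demean(x), demean(y), demean(z)
    sxx, szz, szx = dot(xd, xd), dot(zd, zd), dot(zd, xd)
    b_ols = dot(xd, yd) / sxx
    b_iv = dot(zd, yd) / szx
    resid = [q - b_ols * p for p, q in zip(xd, yd)]   # OLS residuals
    sigma2 = dot(resid, resid) / (n - 2)
    var_ols = sigma2 / sxx
    var_iv = sigma2 * szz / szx ** 2                  # sigma2 / sum of squared demeaned x_hat
    return (b_iv - b_ols) ** 2 / (var_iv - var_ols)

def simulate(confounding, n=5000, seed=6):
    """Hypothetical DGP; `confounding` > 0 makes x endogenous."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ability = rng.gauss(0, 1)
        xi = 0.8 * zi + confounding * ability + rng.gauss(0, 1)
        yi = 1.0 * xi + 2.0 * ability + rng.gauss(0, 1)
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z

print(f"x exogenous:  H = {hausman_statistic(*simulate(0.0)):.2f}")
print(f"x endogenous: H = {hausman_statistic(*simulate(1.0)):.2f}")
```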

Applications of instrumental variables

Instrumental variables have been widely used in various fields of economics to estimate causal effects and address endogeneity issues
Some common applications include supply and demand estimation, policy evaluation, and labor economics

Supply and demand estimation

The IV approach can be used to estimate the price elasticity of supply and demand when prices and quantities are jointly determined (simultaneity bias)
Instruments for price come from shifters of the other curve: cost shifters (e.g., input prices, weather shocks) move the supply curve and identify the demand equation, while demand shifters (e.g., income, population) move the demand curve and identify the supply equation
Example: Using weather shocks as an instrument for crop prices to estimate the price elasticity of demand for agricultural products
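
A small simulated market shows why a supply shifter identifies the demand curve. The sketch below is a hypothetical log-linear market: the elasticities (demand -1.0, supply 0.5), the unit-variance shocks, and the function names are all assumptions. Because weather moves only the supply curve, the price variation it induces traces out the demand curve, so the IV ratio converges to the demand elasticity of -1.0, while the OLS slope of quantity on price mixes both curves and converges to -0.5 under this design.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def simulate_market(n=20000, seed=7):
    """Hypothetical log-linear market (all coefficients are assumptions).

    Demand: q = -1.0 * p + u_d
    Supply: q =  0.5 * p - w + u_s      (w is an adverse weather shock)
    Price and quantity solve both equations at once.
    """
    rng = random.Random(seed)
    price, quantity, weather = [], [], []
    for _ in range(n):
        u_d, u_s, w = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        p = (u_d - u_s + w) / 1.5          # equilibrium price
        q = -1.0 * p + u_d                 # equilibrium quantity
        price.append(p)
        quantity.append(q)
        weather.append(w)
    return price, quantity, weather

price, quantity, weather = simulate_market()
ols = cov(price, quantity) / cov(price, price)
iv = cov(weather, quantity) / cov(weather, price)
print(f"OLS slope of q on p:        {ols:.3f}")
print(f"IV estimate (weather as z): {iv:.3f}")
```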

Policy evaluation examples

The IV approach can be used to evaluate the causal effects of policies or interventions when there are endogeneity issues (e.g., self-selection, omitted variables)
Instruments can be based on policy changes, eligibility rules, or other exogenous factors that affect the treatment variable
Example: Using the introduction of compulsory schooling laws as an instrument for education to estimate the causal effect of education on health outcomes

Limitations and criticisms

The IV approach relies on strong assumptions (relevance, exclusion restriction, exogeneity) that may not always hold in practice
Finding valid instruments can be challenging and requires careful theoretical and empirical justification
IV estimates are local average treatment effects (LATE) and may not generalize to the entire population
The IV approach can be sensitive to specification choices and the choice of instruments
Critics point out that an invalid or weak instrument can leave IV estimates more biased than OLS, so IV does not automatically solve the endogeneity problem

Advanced topics in instrumental variables

Several advanced topics in instrumental variables have been developed to address limitations and extend the applicability of the IV approach
These topics include heterogeneous treatment effects, nonlinear models, and weak and many instruments

Heterogeneous treatment effects

The IV approach can be extended to estimate heterogeneous treatment effects when the causal effect of the explanatory variable varies across individuals
The marginal treatment effect (MTE) framework can be used to estimate treatment effect heterogeneity and construct policy-relevant parameters
Example: Estimating the heterogeneous returns to education across individuals with different levels of unobserved ability using the MTE framework

Nonlinear models with instruments

The IV approach can be adapted to estimate causal effects in nonlinear models (e.g., probit, logit, Poisson)
Two-stage residual inclusion (2SRI) and control function approaches are commonly used for nonlinear models with endogenous explanatory variables
Example: Using the control function approach to estimate the causal effect of health insurance on healthcare utilization in a count data model

Weak and many instruments

Weak instruments and many instruments (relative to the sample size) can lead to biased IV estimates and unreliable inference
Weak instrument robust inference methods (e.g., Anderson-Rubin test, conditional likelihood ratio test) have been developed to address the weak instruments problem
With many weak instruments, the jackknife IV estimator (JIVE) or the limited information maximum likelihood (LIML) estimator is often preferred because both tend to be less biased than 2SLS
Example: Using the LIML estimator to estimate the returns to education when there are many weak instruments based on interactions between birth year and birth quarter
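
The Anderson-Rubin test mentioned above can be sketched for the single-instrument case. To test a hypothesized effect, subtract that effect times the regressor from the outcome and check whether the instrument still predicts what is left; if the hypothesized value is correct, it should not, regardless of how strong the first stage is. The simulated data and function names below are assumptions, homoskedastic errors are assumed, and the statistic is approximately F(1, n - 2) under the null.

```python
import random

def anderson_rubin_f(y, x, z, beta0):
    """Anderson-Rubin statistic for H0: beta = beta0 with one instrument.

    Regresses (y - beta0 * x) on a constant and z and returns the squared
    t-statistic on z (homoskedastic standard errors are assumed).
    """
    n = len(y)
    e = [yi - beta0 * xi for yi, xi in zip(y, x)]
    mz, me = sum(z) / n, sum(e) / n
    szz = sum((a - mz) ** 2 for a in z)
    sze = sum((a - mz) * (b - me) for a, b in zip(z, e))
    slope = sze / szz
    ssr = sum((b - me - slope * (a - mz)) ** 2 for a, b in zip(z, e))
    return slope ** 2 / ((ssr / (n - 2)) / szz)

def simulate(n=2000, seed=8):
    """Hypothetical DGP with a true effect of 1.0."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ability = rng.gauss(0, 1)
        xi = 0.5 * zi + ability + rng.gauss(0, 1)
        yi = 1.0 * xi + 2.0 * ability + rng.gauss(0, 1)
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z

y, x, z = simulate()
for beta0 in (1.0, 3.0):
    print(f"H0: beta = {beta0}: AR F = {anderson_rubin_f(y, x, z, beta0):.2f}")
```

Collecting every hypothesized value that the test does not reject gives a confidence set for the effect that remains valid when the instrument is weak.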
