🥖Linear Modeling Theory Unit 4 Review

4.3 Assessing Normality and Homoscedasticity

Written by the Fiveable Content Team • Last updated September 2025

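Assessing normality and homoscedasticity is a key step in validating a linear regression model. These assumptions underpin the standard errors, hypothesis tests, and confidence intervals that accompany the least squares estimates. Violations do not by themselves bias the coefficient estimates, but they can make the estimates inefficient and the standard errors wrong, which leads to incorrect conclusions about relationships between variables.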

Graphical methods and statistical tests help detect assumption violations. Residual plots, normality tests, and heteroscedasticity checks guide researchers in identifying issues. Understanding these diagnostics is essential for making informed decisions about model validity and potential remedial measures.

Normality and Homoscedasticity Assumptions

Understanding the Assumptions

The normality assumption states that the errors of a linear regression model follow a normal distribution with a mean of zero; because the errors are unobserved, the residuals are used to check it
The homoscedasticity assumption requires that the variance of the errors is constant across all levels of the independent variable(s)
- Homoscedasticity implies that the spread of the residuals should be consistent, without any systematic patterns or changes in variance
Violations of these assumptions do not bias the least squares coefficient estimates as long as the errors have mean zero, but heteroscedasticity makes the estimates inefficient and the usual standard errors biased, and non-normality undermines exact small-sample inference
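
The two assumptions can also be written compactly. The notation below (p predictors, error term ε, common variance σ²) is an assumption made for illustration, since the guide does not fix its own symbols:

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i,
\qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2) \ \text{independently},
\qquad
\operatorname{Var}(\varepsilon_i \mid x_{i1}, \dots, x_{ip}) = \sigma^2 \ \text{for all } i .
```

Normality concerns the shape of the distribution of each error, while homoscedasticity concerns only the fact that σ² carries no subscript i.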

Implications of Violated Assumptions

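Non-normality of residuals can affect the validity of hypothesis tests and confidence intervals in small samples, as their exact t and F distributions rely on normally distributed errors; in large samples the central limit theorem makes these procedures approximately valid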
- Example: If the residuals are heavily skewed or have outliers, the t-tests and confidence intervals for the regression coefficients may be unreliable
Heteroscedasticity (non-constant variance) can result in incorrect standard errors and p-values, leading to invalid inferences about the significance of the regression coefficients
- Example: If the variance of the residuals increases with higher values of the independent variable, the standard errors may be underestimated, resulting in overly optimistic p-values and potentially false conclusions about the significance of the coefficients

Assessing Residual Normality

Graphical Methods

Visual inspection of residual plots can provide insights into the normality assumption
- Histogram of residuals should exhibit a bell-shaped, symmetric distribution around zero
- Normal probability plot (Q-Q plot) of residuals should show points close to a straight diagonal line if the residuals are normally distributed
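  - Example: If the Q-Q plot shows a systematic departure from the diagonal line, it suggests non-normality of the residuals; an S-shaped pattern points to tails that are heavier or lighter than normal, while a consistently curved (bowed) pattern points to skewness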
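
The construction behind a normal Q-Q plot can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the residual values are hypothetical, the function names are invented for this example, and the plotting position (i - 0.5)/n is one common convention among several.

```python
from statistics import NormalDist, mean

def qq_points(residuals):
    """Return (theoretical, sample) quantile pairs for a normal Q-Q plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Plotting position (i - 0.5) / n is an assumed convention.
        p = (i - 0.5) / n
        points.append((std_normal.inv_cdf(p), value))
    return points

def qq_correlation(points):
    """Correlation between theoretical and sample quantiles (near 1 if points lie on a line)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical residuals: one roughly symmetric set, one strongly right-skewed set.
symmetric = [-1.8, -1.1, -0.7, -0.4, -0.1, 0.1, 0.3, 0.8, 1.2, 1.7]
skewed = [-0.9, -0.8, -0.8, -0.7, -0.6, -0.5, -0.3, 0.1, 1.2, 3.3]

r_sym = qq_correlation(qq_points(symmetric))
r_skew = qq_correlation(qq_points(skewed))
print(f"Q-Q correlation, symmetric residuals: {r_sym:.3f}")
print(f"Q-Q correlation, skewed residuals:    {r_skew:.3f}")
```

Plotting the pairs returned by `qq_points` gives the Q-Q plot itself. The correlation is only a rough numerical summary of how straight the plot is: it comes out close to 1 for the symmetric set and noticeably lower for the skewed set.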

Statistical Tests

Shapiro-Wilk test is a commonly used statistical test for assessing normality of residuals
- The null hypothesis of the Shapiro-Wilk test is that the residuals are normally distributed
- A small p-value (typically < 0.05) indicates a deviation from normality, while a large p-value suggests the residuals are consistent with a normal distribution
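Kolmogorov-Smirnov test is another statistical test for normality, comparing the empirical cumulative distribution function of the residuals to the theoretical normal distribution; because the mean and variance are estimated from the residuals, the Lilliefors correction is needed for the p-value to be valid
Skewness and kurtosis measures can also be used to assess the symmetry and heaviness of the tails of the residual distribution, respectively
- Example: A skewness value close to zero indicates a symmetric distribution, while a positive or negative skewness suggests right or left skewness, respectively
- Example: A normal distribution has a kurtosis of 3 (excess kurtosis of 0), so values well above 3 point to heavier tails than normal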
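
Skewness and kurtosis are simple enough to compute by hand. The sketch below uses the plain moment-based definitions without any small-sample correction (statistical packages often apply one, so their values may differ slightly); the residuals are hypothetical and the function name is invented for this example.

```python
def skewness_and_excess_kurtosis(residuals):
    """Moment-based sample skewness and excess kurtosis (no small-sample correction)."""
    n = len(residuals)
    center = sum(residuals) / n
    m2 = sum((r - center) ** 2 for r in residuals) / n  # second central moment
    m3 = sum((r - center) ** 3 for r in residuals) / n  # third central moment
    m4 = sum((r - center) ** 4 for r in residuals) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # a normal distribution gives 0
    return skew, excess_kurtosis

# Hypothetical residuals: one roughly symmetric set, one strongly right-skewed set.
symmetric = [-1.8, -1.1, -0.7, -0.4, -0.1, 0.1, 0.3, 0.8, 1.2, 1.7]
skewed = [-0.9, -0.8, -0.8, -0.7, -0.6, -0.5, -0.3, 0.1, 1.2, 3.3]

skew_sym, kurt_sym = skewness_and_excess_kurtosis(symmetric)
skew_skewed, kurt_skewed = skewness_and_excess_kurtosis(skewed)
print(f"symmetric: skewness={skew_sym:.2f}, excess kurtosis={kurt_sym:.2f}")
print(f"skewed:    skewness={skew_skewed:.2f}, excess kurtosis={kurt_skewed:.2f}")
```

The skewed set produces a clearly positive skewness and a positive excess kurtosis, driven mostly by its single large value, which is a reminder that these measures are sensitive to outliers.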

Detecting Heteroscedasticity

Residual Plots

Residual plots can reveal patterns of heteroscedasticity
- Plot the residuals against the fitted (predicted) values of the dependent variable; under homoscedasticity the points form a horizontal band of roughly constant width around zero
- If the spread of residuals increases or decreases systematically with the predicted values, it indicates the presence of heteroscedasticity
  - Example: If the residual plot shows a fan-shaped pattern, with the spread of residuals increasing as the predicted values increase, it suggests heteroscedasticity
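
The fan shape described above can be checked numerically as well as visually. The following is a rough sketch, not a formal test: it fits a one-predictor least squares line to hypothetical data that were deliberately constructed so the noise grows with x, then compares the average absolute residual in the lower and upper halves of the fitted values. The helper names are invented for this example.

```python
def fit_simple_ols(x, y):
    """Least squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: the noise term alternates in sign and grows in proportion to x.
x = [float(i) for i in range(1, 21)]
y = [2.0 + 0.5 * xi + 0.15 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]

intercept, slope = fit_simple_ols(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Sort by fitted value and compare the residual spread in the two halves.
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
spread_low = sum(abs(r) for _, r in pairs[:half]) / half
spread_high = sum(abs(r) for _, r in pairs[half:]) / (len(pairs) - half)
print(f"mean |residual|, low fitted values:  {spread_low:.2f}")
print(f"mean |residual|, high fitted values: {spread_high:.2f}")
```

A residual-versus-fitted plot of `pairs` would show the same thing as a widening band. The two-halves comparison is only a crude summary, and the formal tests are better suited to drawing conclusions.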

Statistical Tests

Breusch-Pagan test is a statistical test for detecting heteroscedasticity, based on regressing the squared residuals on the independent variables
- The null hypothesis is that the variance of the residuals is constant (homoscedasticity)
- A small p-value (typically < 0.05) suggests the presence of heteroscedasticity
White's test is another statistical test for heteroscedasticity that does not assume a specific form of heteroscedasticity
Goldfeld-Quandt test compares the variance of residuals between two subsamples of the data, typically split based on the values of an independent variable suspected to cause heteroscedasticity
- Example: If the variance of residuals is significantly different between the subsamples, it indicates heteroscedasticity related to that independent variable
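
To make the mechanics of two of these tests concrete, here is a minimal sketch for a single predictor, using only the standard library. It makes several simplifying assumptions that should be read as such: the Breusch-Pagan statistic is computed in its studentized (Koenker) form, n times the R² of the auxiliary regression of squared residuals on x, and compared to a chi-square distribution with 1 degree of freedom; the Goldfeld-Quandt part splits the data in half without dropping middle observations and reports only the variance ratio, not a p-value. The data and function names are hypothetical.

```python
import math

def fit_simple_ols(x, y):
    """Least squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def ols_residuals(x, y):
    intercept, slope = fit_simple_ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def r_squared(x, y):
    my = sum(y) / len(y)
    ss_total = sum((yi - my) ** 2 for yi in y)
    ss_resid = sum(e * e for e in ols_residuals(x, y))
    return 1.0 - ss_resid / ss_total

def breusch_pagan_one_predictor(x, y):
    """Studentized (Koenker) Breusch-Pagan statistic and p-value, one predictor."""
    squared = [e * e for e in ols_residuals(x, y)]
    lm = max(len(x) * r_squared(x, squared), 0.0)
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value

def goldfeld_quandt_ratio(x, y):
    """Residual variance of the upper-x half divided by that of the lower-x half."""
    pairs = sorted(zip(x, y))
    half = len(pairs) // 2

    def residual_variance(group):
        gx = [p[0] for p in group]
        gy = [p[1] for p in group]
        resid = ols_residuals(gx, gy)
        return sum(e * e for e in resid) / (len(group) - 2)

    return residual_variance(pairs[half:]) / residual_variance(pairs[:half])

# Hypothetical data: the noise term alternates in sign and grows in proportion to x.
x = [float(i) for i in range(1, 21)]
y = [2.0 + 0.5 * xi + 0.15 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]

lm_stat, p_value = breusch_pagan_one_predictor(x, y)
gq_ratio = goldfeld_quandt_ratio(x, y)
print(f"Breusch-Pagan LM = {lm_stat:.2f}, p-value = {p_value:.5f}")
print(f"Goldfeld-Quandt variance ratio = {gq_ratio:.2f}")
```

For this constructed data set the Breusch-Pagan p-value falls well below 0.05 and the variance ratio is far above 1, both pointing to variance that increases with x. In practice the Goldfeld-Quandt ratio would be compared to an F distribution, which the standard library does not provide.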

Consequences of Violated Assumptions

Impact on Coefficient Estimates and Inferences

Violation of the normality assumption does not bias the least squares coefficient estimates, but it can make the usual t-tests, F-tests, and confidence intervals unreliable, particularly in small samples
- Non-normal residuals can affect the validity of hypothesis tests and confidence intervals, leading to incorrect conclusions
- In severe cases of non-normality, such as heavy-tailed errors, the least squares estimators may no longer be the most efficient, and robust alternatives may be more appropriate
Heteroscedasticity leaves the coefficient estimates unbiased but inefficient, and it biases the usual standard errors
- The standard errors of the regression coefficients may be underestimated or overestimated, affecting the reliability of hypothesis tests and confidence intervals
- Heteroscedasticity can lead to incorrect conclusions about the significance of the independent variables
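
The effect on standard errors can be illustrated with a small constructed example. The sketch below compares the classical standard error of the slope with a heteroscedasticity-consistent one (the HC0 or White form, written out for a single predictor). The data are hypothetical: each x level appears twice with noise of +0.2x and -0.2x, so the fitted line matches the assumed true line while the noise grows with x, and x is right-skewed so the noisiest points also have the most leverage.

```python
import math

# Hypothetical design: paired observations at each level, noise proportional to x.
levels = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0]
x, y = [], []
for level in levels:
    for sign in (+1, -1):
        x.append(level)
        y.append(2.0 + 0.5 * level + sign * 0.2 * level)

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = my - slope * mx
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Classical standard error: assumes one common error variance.
s_squared = sum(e * e for e in residuals) / (n - 2)
classical_se = math.sqrt(s_squared / sxx)

# HC0 (White) standard error: each squared residual is weighted by its own leverage term.
meat = sum(((xi - mx) ** 2) * (e * e) for xi, e in zip(x, residuals))
hc0_se = math.sqrt(meat / sxx ** 2)

print(f"slope estimate:        {slope:.4f}")
print(f"classical slope SE:    {classical_se:.4f}")
print(f"HC0 robust slope SE:   {hc0_se:.4f}")
```

In this example the slope estimate is unaffected by the unequal variances, but the classical standard error is clearly smaller than the robust one, which is the pattern that produces overly optimistic p-values.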

Remedial Measures

Violations of these assumptions can impact the reliability and validity of the linear regression model and its inferences
Remedial measures, such as data transformations or robust regression techniques, may be necessary to address violations of normality and homoscedasticity assumptions
- Example: Applying a logarithmic transformation to the dependent variable can sometimes help stabilize the variance and improve normality of residuals
- Example: Using weighted least squares regression can account for heteroscedasticity when the variance structure is known or can be estimated, while heteroscedasticity-robust (Huber-White) standard errors keep the least squares coefficients and correct the standard errors used for inference
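
The variance-stabilizing effect of a logarithmic transformation can be seen in a small constructed example. The data below are hypothetical and assume multiplicative noise (each response is 20% above or below an exponential trend), which is the situation in which a log transform is expected to help; the helper names are invented for this sketch.

```python
import math

def fit_simple_ols(x, y):
    """Least squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def spread_ratio(x, y):
    """Mean |residual| in the upper half of fitted values divided by the lower half."""
    intercept, slope = fit_simple_ols(x, y)
    fitted = [intercept + slope * xi for xi in x]
    pairs = sorted((f, yi - f) for f, yi in zip(fitted, y))
    half = len(pairs) // 2
    low = sum(abs(r) for _, r in pairs[:half]) / half
    high = sum(abs(r) for _, r in pairs[half:]) / (len(pairs) - half)
    return high / low

# Hypothetical data with multiplicative noise around an exponential trend.
x = [float(i) for i in range(1, 21)]
y = [math.exp(0.5 + 0.1 * xi) * (1.2 if i % 2 == 0 else 0.8) for i, xi in enumerate(x)]
log_y = [math.log(yi) for yi in y]

ratio_raw = spread_ratio(x, y)
ratio_log = spread_ratio(x, log_y)
print(f"spread ratio (high/low), raw response: {ratio_raw:.2f}")
print(f"spread ratio (high/low), log response: {ratio_log:.2f}")
```

On the raw scale the residual spread is much larger for high fitted values, while on the log scale the ratio is close to 1. A transformation also changes the interpretation of the coefficients, so it should be chosen with the scientific question in mind rather than for diagnostics alone.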
