📊Experimental Design Unit 6 Review

6.4 Assumptions and diagnostics for ANOVA

Written by the Fiveable Content Team • Last updated September 2025

ANOVA results are only valid when its assumptions hold: normally distributed residuals, homogeneity of variance, and independent observations. Violations can distort p-values and lead to incorrect conclusions, so assess each assumption with both visual and formal methods.

Residual plots and formal tests such as Levene's and Shapiro-Wilk are the main diagnostics. When they reveal a violation, data transformations, robust methods, or non-parametric alternatives can restore a reliable analysis and interpretation of results.

Assumptions

Normality and Its Assessment

Normality assumes the residuals (differences between observed and predicted values) are normally distributed
Violations of normality can lead to inaccurate p-values and confidence intervals
Assess normality visually using Q-Q plots or histograms of residuals
- Q-Q plots compare the distribution of residuals to a theoretical normal distribution
- Histograms should show a bell-shaped curve for normally distributed residuals
Formally test normality using the Shapiro-Wilk test
- Null hypothesis: residuals are normally distributed
- P-value < 0.05 suggests a significant departure from normality
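The Q-Q plot described above can be sketched with the Python standard library alone. This is a minimal illustration, not a definitive implementation: the data values, the `groups` dictionary, and the `qq_points` helper are all hypothetical, and the plotting position `(i - 0.5) / n` is one common convention among several.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical data: three treatment groups (values are made up for illustration).
groups = {
    "A": [20.1, 21.4, 19.8, 22.0, 20.7],
    "B": [23.5, 24.1, 22.8, 25.0, 23.9],
    "C": [19.0, 18.2, 20.1, 18.8, 19.5],
}

# In one-way ANOVA the predicted value for an observation is its group mean,
# so the residual is the observation minus that mean.
residuals = []
for values in groups.values():
    group_mean = mean(values)
    residuals.extend(y - group_mean for y in values)


def qq_points(resid):
    """Return (theoretical, sample) quantile pairs for a normal Q-Q plot."""
    n = len(resid)
    ordered = sorted(resid)
    scale = stdev(resid)  # rough scaling so both axes are in standard units
    std_normal = NormalDist()
    pairs = []
    for i, r in enumerate(ordered, start=1):
        p = (i - 0.5) / n  # plotting position (assumed convention)
        pairs.append((std_normal.inv_cdf(p), r / scale))
    return pairs


points = qq_points(residuals)
for theoretical, sample in points:
    print(f"{theoretical:6.2f}  {sample:6.2f}")
```

If the residuals are roughly normal, the printed pairs fall close to the line where both coordinates are equal. Systematic curvature at the ends points to skew or heavy tails.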

Homogeneity of Variance and Independence

Homogeneity of variance (homoscedasticity) assumes equal variances across groups
- Violations (heteroscedasticity) can affect the validity of F-tests and lead to incorrect conclusions
- Assess homogeneity visually using residual plots (residuals vs. fitted values)
  - Patterns or increasing/decreasing spread indicate heteroscedasticity
- Formally test homogeneity using Levene's test
  - Null hypothesis: variances are equal across groups
  - P-value < 0.05 suggests significant differences in variances
Independence of observations assumes that observations within and between groups are not related
- Violations can occur due to repeated measures, clustering, or spatial/temporal correlation
- Assess independence by examining the study design and data collection process
- Violations may require alternative models (repeated measures ANOVA, mixed models)
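To make Levene's test less of a black box, the sketch below computes its W statistic by hand: it is the one-way ANOVA F statistic applied to absolute deviations from each group's center. The function name `levene_statistic` and both data sets are hypothetical. This version centers on the group mean, as in Levene's original proposal; some libraries (for example SciPy's `scipy.stats.levene`) center on the median by default, so their results can differ.

```python
from statistics import mean


def levene_statistic(groups, center=mean):
    """Levene's W: a one-way ANOVA F statistic computed on absolute deviations."""
    # Absolute deviation of each observation from its own group's center.
    devs = []
    for g in groups:
        c = center(g)
        devs.append([abs(y - c) for y in g])

    k = len(devs)                          # number of groups
    n_total = sum(len(d) for d in devs)    # total number of observations
    grand = mean([z for d in devs for z in d])
    group_means = [mean(d) for d in devs]

    between = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, group_means))
    within = sum((z - m) ** 2 for d, m in zip(devs, group_means) for z in d)
    return ((n_total - k) / (k - 1)) * between / within


# Hypothetical data: same spread in every group vs. one much wider group.
equal_spread = [
    [10.0, 12.0, 11.0, 13.0, 9.0],
    [20.0, 22.0, 21.0, 23.0, 19.0],
    [30.0, 32.0, 31.0, 33.0, 29.0],
]
unequal_spread = [
    [10.0, 12.0, 11.0, 13.0, 9.0],
    [0.0, 10.0, 20.0, 30.0, 40.0],
    [5.0, 6.0, 7.0, 8.0, 9.0],
]

w_equal = levene_statistic(equal_spread)
w_unequal = levene_statistic(unequal_spread)
print(f"W (equal spread)   = {w_equal:.3f}")
print(f"W (unequal spread) = {w_unequal:.3f}")
```

W is zero for the first data set because every group has identical absolute deviations. To obtain a p-value, W is compared against an F distribution with k - 1 and N - k degrees of freedom, which a statistics library would normally handle.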

Diagnostic Tests

Residual Plots for Assessing Assumptions

Residual plots are graphical tools for assessing ANOVA assumptions
Residuals vs. Fitted plot
- Assess homogeneity of variance
- Look for patterns, increasing/decreasing spread, or outliers
Normal Q-Q plot
- Assess normality of residuals
- Compare residuals to a theoretical normal distribution
- Deviations from a straight line indicate non-normality
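Scale-Location plot
- Plots the square root of the absolute standardized residuals against the fitted values
- Assess homogeneity of variance
- Look for a trend in the spread; a roughly flat band suggests equal variances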
Residuals vs. Leverage plot
- Identify influential observations
- Points with high leverage and large residuals may have a strong influence on the model
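The quantities behind these four plots are simple to compute for a one-way design. The sketch below is illustrative only: the data and the `diagnostic_rows` helper are hypothetical, and it relies on the fact that in one-way ANOVA the fitted value is the group mean and the leverage of an observation is 1 divided by its group size. Each group needs at least two observations.

```python
from math import sqrt
from statistics import mean

# Hypothetical, made-up data with unequal group sizes.
groups = {
    "control": [4.1, 5.0, 4.6, 5.3],
    "low": [5.9, 6.4, 6.1, 7.0, 6.6],
    "high": [8.2, 9.9, 7.5, 10.4, 8.8, 9.2],
}


def diagnostic_rows(groups):
    """Return the per-observation quantities plotted in the residual diagnostics."""
    n_total = sum(len(values) for values in groups.values())
    k = len(groups)
    means = {name: mean(values) for name, values in groups.items()}

    # Mean squared error: pooled within-group variance on N - k degrees of freedom.
    sse = sum((y - means[name]) ** 2 for name, values in groups.items() for y in values)
    mse = sse / (n_total - k)

    rows = []
    for name, values in groups.items():
        leverage = 1 / len(values)  # one-way ANOVA: h = 1 / n_i
        for y in values:
            residual = y - means[name]
            standardized = residual / sqrt(mse * (1 - leverage))
            rows.append({
                "group": name,
                "fitted": means[name],                     # x-axis: Residuals vs. Fitted
                "residual": residual,                      # y-axis: Residuals vs. Fitted
                "leverage": leverage,                      # x-axis: Residuals vs. Leverage
                "standardized": standardized,              # y-axis: Q-Q and Leverage plots
                "scale_location": sqrt(abs(standardized)), # y-axis: Scale-Location
            })
    return rows


rows = diagnostic_rows(groups)
for row in rows:
    print(f"{row['group']:8s} fitted={row['fitted']:.2f} "
          f"resid={row['residual']:+.2f} std={row['standardized']:+.2f}")
```

Smaller groups have higher leverage, so an unusual value in a small group moves that group's mean, and therefore the F-test, more than the same value in a large group would.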

Formal Tests for Assumptions

Levene's test for homogeneity of variance
- Null hypothesis: variances are equal across groups
- P-value < 0.05 suggests significant differences in variances
- Robust to non-normality, but with large samples it can flag practically trivial differences in variance as significant
Shapiro-Wilk test for normality
- Null hypothesis: residuals are normally distributed
- P-value < 0.05 suggests a significant departure from normality
- More objective than visual assessment, but with large samples it flags even trivial departures from normality as significant
- Alternative: Anderson-Darling test

Addressing Violations

Data Transformations

Transformations can help stabilize variances and improve normality
Common transformations: logarithmic, square root, reciprocal
- Logarithmic: $\log(x)$, or $\log(x+1)$ for data with zero values
- Square root: $\sqrt{x}$ for count data (approximately Poisson), where the variance grows with the mean
- Reciprocal: $\frac{1}{x}$ for data with a strong right skew (requires nonzero values)
Choose a transformation based on the nature of the data and the severity of the violation
Interpret results on the transformed scale, or back-transform estimates to the original units for reporting
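A small sketch shows how a transformation can stabilize variance. The data are hypothetical and deliberately constructed: the second group is the first multiplied by 10, so its standard deviation grows in proportion to its mean. The names `low`, `high`, `transforms`, and `sd_ratio` are illustrative only.

```python
import math
from statistics import geometric_mean, mean, stdev

# Hypothetical data: "high" is "low" scaled by 10, so spread grows with the mean.
low = [1.2, 1.5, 1.1, 1.9, 1.4]
high = [12.0, 15.0, 11.0, 19.0, 14.0]

transforms = {
    "raw": lambda x: x,
    "log": math.log,
    "sqrt": math.sqrt,
    "reciprocal": lambda x: 1 / x,
}

# Ratio of group standard deviations after each transformation.
# A ratio near 1 means the transformation has equalized the spread.
sd_ratio = {}
for name, f in transforms.items():
    sd_low = stdev([f(x) for x in low])
    sd_high = stdev([f(x) for x in high])
    sd_ratio[name] = sd_high / sd_low
    print(f"{name:10s} SD ratio (high / low) = {sd_ratio[name]:.3f}")

# Back-transforming the mean of the logs gives the geometric mean,
# not the arithmetic mean, of the original data.
log_mean = mean([math.log(x) for x in high])
back_transformed = math.exp(log_mean)
print(f"back-transformed log mean = {back_transformed:.3f}")
print(f"arithmetic mean           = {mean(high):.3f}")
```

On the raw scale the standard deviations differ by a factor of about 10, while on the log scale they are essentially equal, because multiplying by a constant only shifts the logged values. The back-transformed mean is a geometric mean, which is why results reported in original units after a log transformation describe ratios rather than differences.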

Robust ANOVA Methods and Non-Parametric Alternatives

Robust ANOVA methods are less sensitive to violations of assumptions
- Welch's ANOVA: does not assume equal variances
- Trimmed means ANOVA: robust to non-normality and outliers
- Bootstrapping: resampling method to obtain robust confidence intervals and p-values
Non-parametric alternatives do not rely on distributional assumptions
- Kruskal-Wallis test: rank-based test for comparing groups; it compares medians only when the group distributions have the same shape
- Friedman test: rank-based test for repeated measures designs
- Permutation tests: resampling method that gives exact p-values when all permutations are enumerated, and Monte Carlo approximations otherwise
Consider the trade-offs between robustness and power when selecting an alternative method
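Two of the alternatives above combine naturally: the Kruskal-Wallis H statistic can be computed from ranks, and its p-value can be approximated by permuting group labels instead of relying on a chi-square approximation. The sketch below assumes this combination for illustration; the helper names (`average_ranks`, `kruskal_h`, `permutation_p_value`), the data, the number of permutations, and the seed are all hypothetical choices. H is computed without a tie correction.

```python
import random


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def kruskal_h(ranks, sizes):
    """Kruskal-Wallis H (no tie correction) from pooled ranks and group sizes."""
    n = len(ranks)
    total, start = 0.0, 0
    for size in sizes:
        rank_sum = sum(ranks[start:start + size])
        total += rank_sum ** 2 / size
        start += size
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def permutation_p_value(groups, n_perm=5000, seed=0):
    """Return (observed H, Monte Carlo permutation p-value)."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    ranks = average_ranks([y for g in groups for y in g])
    observed = kruskal_h(ranks, sizes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)  # shuffling ranks is equivalent to shuffling labels
        if kruskal_h(ranks, sizes) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data with completely separated groups.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0, 9.0, 10.0],
    [11.0, 12.0, 13.0, 14.0, 15.0],
]
h_stat, p_value = permutation_p_value(data)
print(f"H = {h_stat:.2f}, permutation p = {p_value:.4f}")
```

Because only a random sample of permutations is drawn, the p-value is an approximation whose precision depends on `n_perm`. Enumerating every possible relabeling would make it exact, at a cost that grows very quickly with sample size.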
