One-way ANOVA is a statistical method that compares means across three or more groups to determine if significant differences exist. It's used when you have a categorical independent variable with three or more levels and a continuous, normally distributed dependent variable.
The test calculates an F-statistic, which compares between-group variability to within-group variability. A large F-statistic and small p-value suggest significant differences among group means. Post-hoc tests and effect size calculations help pinpoint specific differences and their magnitude.
One-Way ANOVA
Purpose of one-way ANOVA
- Compares means across three or more groups or populations to determine if significant differences exist
- Used when independent variable is categorical with three or more levels (groups) and dependent variable is continuous and normally distributed within each group
- Assumes equal variances of dependent variable across all groups (homogeneity of variances)
- Commonly applied to compare effectiveness of different treatments, evaluate impact of factors on continuous outcome variables, or determine significant differences in performance across multiple groups (e.g., comparing test scores of students from different schools)
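The school-comparison use case above can be run in a few lines with SciPy; this is a minimal sketch with made-up scores for three hypothetical schools.

```python
# One-way ANOVA on test scores from three hypothetical schools
# (all data below are invented for illustration).
from scipy import stats

school_a = [78, 85, 82, 88, 75, 90]
school_b = [81, 79, 84, 86, 83, 80]
school_c = [92, 95, 89, 94, 91, 96]

# f_oneway returns the F-statistic and its p-value.
f_stat, p_value = stats.f_oneway(school_a, school_b, school_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

Here school C's scores sit well above the other two groups relative to the within-school spread, so the test returns a small p-value.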
Hypotheses in one-way ANOVA
- Null hypothesis ($H_0$) states all group means are equal: $H_0: \mu_1 = \mu_2 = ... = \mu_k$, where $k$ is the number of groups
- Alternative hypothesis ($H_a$) states at least one group mean differs: $H_a: \text{At least one } \mu_i \text{ is different}$
- If p-value < chosen significance level (typically 0.05), reject $H_0$ and conclude significant difference among group means
- If p-value ≥ chosen significance level, fail to reject $H_0$ and conclude insufficient evidence to suggest significant difference among group means

F-statistic and p-value interpretation
- F-statistic calculated as ratio of between-group variability to within-group variability: $F = \frac{\text{Between-group variability}}{\text{Within-group variability}}$
- F-statistic follows F-distribution with $(k-1)$ and $(N-k)$ degrees of freedom, where $k$ is number of groups and $N$ is total sample size
- Larger F-statistic indicates greater difference among group means relative to within-group variability
- p-value represents probability of obtaining F-statistic as extreme as or more extreme than observed value, assuming $H_0$ is true
- Small p-value (< 0.05) suggests observed differences among group means unlikely due to chance, providing evidence against $H_0$
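The ratio above can be computed by hand from the sums of squares and checked against SciPy's built-in test; the arrays below are toy data chosen so the arithmetic is easy to follow.

```python
# Computing the F-statistic by hand from between- and within-group
# sums of squares, then the p-value from the F-distribution (toy data).
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([10.0, 11.0, 12.0])]
k = len(groups)                    # number of groups
N = sum(len(g) for g in groups)    # total sample size
grand_mean = np.mean(np.concatenate(groups))

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)  # between-group variability
ms_within = ss_within / (N - k)    # within-group variability
F = ms_between / ms_within
p = stats.f.sf(F, k - 1, N - k)    # P(F' >= F), assuming H0 is true

# The hand computation agrees with SciPy's built-in test.
f_check, p_check = stats.f_oneway(*groups)
```

For these data the group means are 5, 8, and 11 with a grand mean of 8, giving $SS_{\text{between}} = 54$, $SS_{\text{within}} = 6$, and $F = 27$ on $(2, 6)$ degrees of freedom.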
Assumptions of one-way ANOVA
- Independence: Observations within each group must be independent
  - Violation occurs when observations within groups are related or dependent
- Normality: Dependent variable should be approximately normally distributed within each group
  - Violation occurs when distribution is heavily skewed or has outliers in one or more groups
- Homogeneity of variances: Variances of dependent variable must be equal across all groups
  - Violation (heteroscedasticity) occurs when variances substantially differ across groups
- Use diagnostic plots (residual plots, Q-Q plots) to assess normality and homogeneity of variances
- Conduct formal tests (Levene's test, Bartlett's test) to assess equality of variances
- If assumptions violated, consider data transformations (log transformation), non-parametric alternatives (Kruskal-Wallis test) for severe normality violations, or robust methods (Welch's ANOVA) for heteroscedasticity
Post-Hoc Tests and Effect Size
Conduct post-hoc tests to determine which specific group means differ significantly
- If one-way ANOVA yields significant F-statistic, use post-hoc tests to determine which specific group means differ
- Common post-hoc tests: Tukey's HSD, Bonferroni correction, Scheffé's test, Dunnett's test (compares multiple treatments to control)
- Post-hoc tests control familywise error rate (FWER) or false discovery rate (FDR) to maintain overall Type I error rate at desired level (0.05)
- Identify pairwise comparisons with p-values < adjusted significance level (e.g., 0.05 divided by number of comparisons for Bonferroni)
- Conclude corresponding group means are significantly different
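Tukey's HSD is available directly in SciPy (version 1.8 or later); this sketch compares a control group against two hypothetical treatments with invented data.

```python
# Tukey's HSD for all pairwise comparisons after a significant F-test
# (requires SciPy >= 1.8; data are invented for illustration).
from scipy import stats

control = [10.1, 9.8, 10.4, 10.0, 9.9]
treat_a = [11.5, 11.9, 11.2, 11.8, 11.6]
treat_b = [10.2, 10.0, 10.5, 9.9, 10.3]

res = stats.tukey_hsd(control, treat_a, treat_b)
# res.pvalue is a k x k matrix of familywise-adjusted pairwise p-values.
print(res)
```

In this toy example only treat_a is shifted away from the control, so the control-vs-treat_a comparison yields a small adjusted p-value while control-vs-treat_b does not.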
Calculate and interpret the effect size in one-way ANOVA
- Effect size measures magnitude of difference among group means, independent of sample size
- Common effect size measures:
  - Eta-squared ($\eta^2$): Proportion of total variance in dependent variable explained by independent variable (groups), $\eta^2 = \frac{\text{Sum of squares between groups}}{\text{Total sum of squares}}$
  - Omega-squared ($\omega^2$): Unbiased estimate of population effect size, $\omega^2 = \frac{\text{Sum of squares between groups} - (k-1) \times \text{Mean square within groups}}{\text{Total sum of squares} + \text{Mean square within groups}}$
- Cohen's guidelines for $\eta^2$: 0.01 (small), 0.06 (medium), 0.14 (large)
- Larger effect sizes indicate greater proportion of variance in dependent variable explained by independent variable (groups)
- Reporting effect size with F-statistic and p-value provides comprehensive understanding of results and facilitates cross-study comparisons
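Both effect size formulas above follow directly from the ANOVA sums of squares; a minimal sketch with toy data:

```python
# Eta-squared and omega-squared from the ANOVA sums of squares
# (toy data; groups chosen so the arithmetic is exact).
import numpy as np

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([10.0, 11.0, 12.0])]
k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = np.mean(np.concatenate(groups))

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_total = ss_between + ss_within
ms_within = ss_within / (N - k)

eta_sq = ss_between / ss_total
omega_sq = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)
print(f"eta^2 = {eta_sq:.3f}, omega^2 = {omega_sq:.3f}")
```

Here $SS_{\text{between}} = 54$ and $SS_{\text{total}} = 60$, so $\eta^2 = 0.9$, while the bias-corrected $\omega^2 \approx 0.85$ is slightly smaller, as expected.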