10.3 Multiple Comparisons and Post-hoc Tests

Written by the Fiveable Content Team • Last updated September 2025

After a one-way ANOVA returns a significant result, we often need to dig deeper to find out which groups differ. Multiple comparisons and post-hoc tests let us do this while keeping the inflated Type I error rate under control. These methods compare groups pairwise and show where the differences lie.

Choosing the right post-hoc test depends on factors like the number of groups and the sample sizes. Popular choices include Tukey's HSD and the Bonferroni correction. Each has its strengths and weaknesses, balancing error control against statistical power. Understanding these trade-offs is key to interpreting results accurately.

Multiple Comparisons in ANOVA

The Need for Multiple Comparison Procedures

  • A significant ANOVA result indicates that at least one group mean differs from another, but it does not specify which groups differ or the direction of the differences
  • Conducting multiple pairwise comparisons without adjusting for the increased probability of making a Type I error can lead to an inflated familywise error rate
    • Type I error involves rejecting a true null hypothesis
    • Familywise error rate represents the probability of making at least one Type I error across all comparisons in a family of tests
  • Multiple comparison procedures, also known as post-hoc tests, control the familywise error rate while allowing for pairwise comparisons between groups (Tukey's HSD, Bonferroni correction)
  • The choice of multiple comparison procedure depends on factors such as the number of groups, sample sizes, and the desired balance between power and Type I error control
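
To see how quickly the error rate inflates, here is a minimal Python sketch. It assumes the comparisons are independent, which pairwise tests that share group means are not, so treat the numbers as a rough guide rather than exact values. The function name `familywise_error_rate` is ours, for illustration only.

```python
from math import comb

def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Approximate P(at least one Type I error) for independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_comparisons

# k groups give k * (k - 1) / 2 pairwise comparisons.
for k in (3, 4, 5, 10):
    m = comb(k, 2)
    fwer = familywise_error_rate(0.05, m)
    print(f"{k} groups -> {m} comparisons -> approximate FWER {fwer:.3f}")
```

With three groups (three comparisons) at alpha = 0.05, the approximate familywise error rate is already about 14%, nearly three times the nominal level.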

Factors Influencing the Choice of Multiple Comparison Procedure

  • The number of groups being compared affects the choice of procedure
    • Some procedures are more suitable for a large number of groups (Tukey's HSD), while others may become overly conservative (Bonferroni correction)
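  • Sample sizes and their equality across groups can impact the validity of certain procedures
    • Equal sample sizes are assumed by the original form of Tukey's HSD; the Tukey-Kramer modification extends it to unequal group sizes
    • Unequal sample sizes or unequal variances may require alternative procedures (Scheffé's test accommodates unequal sample sizes; the Games-Howell test accommodates unequal variances as well)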
  • The desired balance between power and Type I error control guides the selection of a procedure
    • More conservative procedures prioritize Type I error control at the expense of reduced power (Bonferroni correction)
    • Less conservative procedures have higher power because their actual familywise error rate sits close to the nominal alpha level rather than well below it (Tukey's HSD for all pairwise comparisons)

Post-Hoc Tests for Pairwise Comparisons

Tukey's Honestly Significant Difference (HSD) Test

  • Tukey's HSD is a widely used post-hoc test that controls the familywise error rate for all pairwise comparisons
  • It calculates a critical value based on the studentized range distribution, which accounts for the number of groups and the degrees of freedom for the error term in the ANOVA
  • Pairwise differences between group means are compared to the critical value to determine statistical significance
  • Tukey's HSD in its original form assumes equal sample sizes and homogeneity of variances across groups (the Tukey-Kramer modification relaxes the equal sample size requirement)
    • Violations of these assumptions may affect the validity of the test results
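
The comparison step can be sketched in plain Python for a balanced design, where the threshold is q × sqrt(MSE / n). The data below are made up for illustration, the helper name `tukey_hsd_threshold` is hypothetical, and the critical value q ≈ 3.77 (alpha = 0.05, 3 groups, 12 error degrees of freedom) is an assumed value read from a studentized range table rather than computed.

```python
from itertools import combinations
from math import sqrt

def group_mean(values):
    return sum(values) / len(values)

def tukey_hsd_threshold(groups, q_crit):
    """Smallest absolute mean difference declared significant (equal group sizes only)."""
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch only handles equal group sizes")
    n = sizes.pop()
    k = len(groups)
    # Pooled within-group variance, i.e. the ANOVA mean square error.
    sse = 0.0
    for values in groups.values():
        m = group_mean(values)
        sse += sum((x - m) ** 2 for x in values)
    mse = sse / (k * n - k)
    return q_crit * sqrt(mse / n)

# Made-up data: three groups of five observations each.
groups = {
    "A": [10, 12, 11, 13, 14],
    "B": [15, 17, 16, 18, 19],
    "C": [11, 13, 12, 14, 15],
}
Q_CRIT = 3.77  # assumed table value for alpha = 0.05, k = 3, df_error = 12

hsd = tukey_hsd_threshold(groups, Q_CRIT)
significant = {}
for a, b in combinations(groups, 2):
    diff = group_mean(groups[a]) - group_mean(groups[b])
    significant[(a, b)] = abs(diff) > hsd
    print(f"{a} vs {b}: diff = {diff:+.2f}, significant = {significant[(a, b)]}")
```

For this toy data the threshold is roughly 2.67, so A vs B and B vs C are flagged as significant while A vs C is not. In practice a statistics library would supply the critical value and handle unequal group sizes.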

Bonferroni Correction

  • The Bonferroni correction is a simple and conservative approach to controlling the familywise error rate
  • It involves dividing the desired alpha level by the number of comparisons
    • The Bonferroni-adjusted alpha level is used as the criterion for determining statistical significance for each pairwise comparison
  • The Bonferroni correction can be overly conservative, especially when the number of comparisons is large, leading to reduced power to detect true differences
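
As a minimal sketch of the adjustment, the snippet below applies the Bonferroni cutoff to three unadjusted p-values. The p-values and the function name `bonferroni_decisions` are hypothetical, chosen only to illustrate the arithmetic.

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Return the per-comparison alpha and a reject decision for each test."""
    adjusted_alpha = alpha / len(p_values)
    decisions = {label: p <= adjusted_alpha for label, p in p_values.items()}
    return adjusted_alpha, decisions

# Hypothetical unadjusted p-values from three pairwise tests.
p_values = {"A vs B": 0.004, "A vs C": 0.030, "B vs C": 0.020}

adjusted_alpha, decisions = bonferroni_decisions(p_values)
print(f"per-comparison alpha = {adjusted_alpha:.4f}")
for label, reject in decisions.items():
    print(f"{label}: reject H0 = {reject}")
```

Each test is held to 0.05 / 3 ≈ 0.0167, so only A vs B is declared significant even though all three p-values are below 0.05. This is the loss of power the last bullet describes.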

Other Common Post-Hoc Tests

  • Scheffé's test is a conservative procedure that controls the familywise error rate for all possible contrasts, not just pairwise comparisons; it can be used with unequal sample sizes but still assumes homogeneity of variances
  • Dunnett's test is used for comparing each group to a control group, rather than all pairwise comparisons
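  • The Holm-Bonferroni method is a step-down procedure that controls the familywise error rate while being at least as powerful as the standard Bonferroni correction
    • It sorts the p-values from smallest to largest and compares each one to a progressively less strict alpha level, stopping at the first comparison that fails to reach significance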
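
The step-down logic of the Holm-Bonferroni method can be sketched as follows. The p-values and the function name `holm_decisions` are hypothetical and serve only to show the mechanics.

```python
def holm_decisions(p_values, alpha=0.05):
    """Holm step-down: test p-values in ascending order against alpha / (m - rank)."""
    m = len(p_values)
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    rejected = {label: False for label in p_values}
    for rank, (label, p) in enumerate(ordered):
        # Thresholds are alpha/m for the smallest p-value, then alpha/(m-1), and so on.
        if p > alpha / (m - rank):
            break  # once one test fails, all larger p-values are retained too
        rejected[label] = True
    return rejected

# Hypothetical unadjusted p-values from three pairwise tests.
p_values = {"A vs B": 0.004, "A vs C": 0.030, "B vs C": 0.020}

for label, reject in holm_decisions(p_values).items():
    print(f"{label}: reject H0 = {reject}")
```

Here the sorted p-values 0.004, 0.020, and 0.030 are compared to 0.05/3, 0.05/2, and 0.05 in turn, and all three are rejected. A single Bonferroni cutoff of 0.05/3 applied to the same values would reject only the first.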

Interpreting Post-Hoc Test Results

Statistical Significance and Direction of Differences

  • Post-hoc tests provide p-values and/or confidence intervals for each pairwise comparison, indicating the statistical significance and the direction of the differences between groups
  • A statistically significant pairwise difference provides evidence, at the chosen alpha level, that the population means of the two groups being compared differ
  • The direction of the difference can be determined by comparing the sample means or examining the sign of the difference (positive or negative)

Non-Significant Differences and Practical Significance

  • Non-significant pairwise differences suggest that there is insufficient evidence to conclude that the population means of the two groups differ, given the observed sample means and the chosen alpha level
  • When interpreting post-hoc test results, it is important to consider the practical significance of the differences in addition to statistical significance
    • Practical significance refers to the magnitude and relevance of the differences in the context of the research question
  • The context of the research question and the limitations of the study design should also be considered when drawing conclusions from post-hoc test results

Multiple Comparison Procedures: Trade-offs

Balancing Type I Error Control and Power

  • Multiple comparison procedures control the familywise error rate at the expense of reduced power to detect true differences between groups
  • More conservative procedures, such as the Bonferroni correction, provide stronger control over Type I errors but may have lower power, especially when the number of comparisons is large
  • Less conservative procedures, such as Tukey's HSD, have higher power because their actual familywise error rate sits close to the nominal alpha level rather than well below it
  • The choice of multiple comparison procedure should be based on the specific research question, the number of groups, the desired balance between Type I error control and power, and the assumptions of the test

Assumptions and Limitations

  • Some multiple comparison procedures, such as Tukey's HSD in its original form, assume equal sample sizes and homogeneity of variances across groups
    • Violations of these assumptions may affect the validity of the test results
    • Alternative procedures or modifications may be necessary when assumptions are violated (Tukey-Kramer or Scheffé's test for unequal sample sizes, Games-Howell test for unequal variances)
  • Researchers should be aware of the assumptions and limitations of the chosen multiple comparison procedure and consider them when interpreting the results

Transparency and Justification

  • Researchers should be transparent about the multiple comparison procedure used and justify their choice based on the study design and research objectives
  • Providing a clear rationale for the selected procedure helps readers understand the trade-offs and limitations of the analysis
  • Transparency in reporting also facilitates the reproducibility and critical evaluation of the research findings