When conducting ANOVA, post-hoc tests pinpoint which group means differ significantly. They are needed when the overall F-test shows that differences exist but does not specify where, and they are essential for controlling error rates across multiple comparisons.
Post-hoc tests adjust for the increased Type I error risk that arises when comparing multiple groups. By controlling the familywise error rate or the false discovery rate, they keep false positives at an acceptable level (typically $\alpha = 0.05$ for the familywise rate). This balance between detecting real differences and avoiding false positives is key in biological experiments.
Post-Hoc Tests in ANOVA
The Need for Post-Hoc Tests
- Post-hoc tests are used in ANOVA when the overall F-test is significant, indicating that at least one group mean differs from the others
- Post-hoc tests identify which specific group means are significantly different from each other (a minimal workflow sketch follows this list)
- Multiple comparisons arise when performing multiple hypothesis tests simultaneously on the same data set
- In ANOVA, this occurs when comparing more than two group means
- The use of post-hoc tests and multiple comparison adjustments is necessary to maintain the desired Type I error rate (usually $\alpha = 0.05$) across all comparisons
- Without these adjustments, the probability of making a Type I error increases with the number of comparisons performed
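To make the first step concrete, here is a minimal sketch using SciPy's `f_oneway` on hypothetical data for three groups; the group means and sample sizes are invented for illustration. A significant F-test here says only that some difference exists, which is what motivates the post-hoc comparisons discussed below.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three treatment groups (invented for illustration)
rng = np.random.default_rng(42)
g1 = rng.normal(10.0, 2.0, 20)
g2 = rng.normal(10.5, 2.0, 20)
g3 = rng.normal(12.0, 2.0, 20)

# Overall one-way ANOVA F-test: a significant result says *some* group
# means differ, but not which pairs; that is the post-hoc test's job
f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```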
Multiple Comparisons and Type I Error
- Multiple comparisons occur when conducting multiple hypothesis tests on the same data set simultaneously
- Each additional comparison increases the likelihood of making a Type I error (rejecting a true null hypothesis)
- The familywise error rate (FWER) is the probability of making at least one Type I error across all comparisons
- Controlling the FWER ensures that the overall Type I error rate is maintained at the desired level (0.05)
- Post-hoc tests and multiple comparison adjustments are crucial for maintaining the desired Type I error rate in ANOVA with multiple group means
- These methods adjust the significance level for each comparison to account for the increased risk of Type I errors; the calculation after this list shows how quickly that risk grows
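The inflation is easy to quantify under a simplifying assumption of independent tests: with $m$ tests each run at $\alpha = 0.05$, the probability of at least one false positive is $1 - (1 - \alpha)^m$. A short calculation:

```python
# FWER for m independent tests, each at per-test alpha = 0.05:
# FWER = 1 - (1 - alpha)^m (independence is a simplifying assumption)
alpha = 0.05
for m in (1, 3, 6, 10, 20):
    fwer = 1 - (1 - alpha) ** m
    print(f"m = {m:2d} comparisons -> FWER = {fwer:.3f}")
# At m = 10 the FWER is already about 0.40, far above the nominal 0.05
```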
Choosing Post-Hoc Tests
Common Post-Hoc Tests for One-Way ANOVA
- Tukey's Honestly Significant Difference (HSD) test is a commonly used post-hoc test for one-way ANOVA
- Controls the familywise error rate; derived for equal sample sizes (the Tukey-Kramer modification handles unequal sizes) and appropriate when all pairwise comparisons are of interest
- Calculates the minimum difference between group means required for significance, based on the studentized range distribution
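A minimal sketch of Tukey's HSD using `pairwise_tukeyhsd` from statsmodels; the data, group labels, and means are hypothetical:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical data: 20 observations per group, labels invented
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(m, 2.0, 20) for m in (10.0, 10.5, 12.0)])
groups = np.repeat(["A", "B", "C"], 20)

# All pairwise comparisons with the familywise error rate held at 0.05
result = pairwise_tukeyhsd(values, groups, alpha=0.05)
print(result)  # mean differences, adjusted p-values, and confidence intervals
```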
- Bonferroni correction is a simple and conservative method for adjusting the significance level for multiple comparisons
- The adjusted significance level is calculated as $\alpha/m$, where $m$ is the number of comparisons
- Can be applied to various ANOVA designs, but may be overly conservative when the number of comparisons is large
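A sketch of the Bonferroni correction via statsmodels' `multipletests`; the raw p-values are hypothetical:

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values from m = 4 pairwise comparisons
raw_p = [0.010, 0.020, 0.030, 0.040]

# Bonferroni: test each comparison at alpha/m (equivalently, multiply each
# p-value by m, cap at 1, and compare to alpha)
reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="bonferroni")
print(adj_p)   # [0.04 0.08 0.12 0.16]
print(reject)  # [ True False False False]
```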
- Dunnett's test is used when comparing each treatment group to a control group in a one-way ANOVA
- Maintains the familywise error rate and is more powerful than the Bonferroni correction for this specific comparison type
- Calculates critical values based on the multivariate t-distribution
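A sketch of Dunnett's test using `scipy.stats.dunnett` (available in SciPy 1.11 and later); all data are hypothetical:

```python
import numpy as np
from scipy import stats  # stats.dunnett requires SciPy >= 1.11

# Hypothetical data: a control group and two treatment groups
rng = np.random.default_rng(1)
control = rng.normal(10.0, 2.0, 20)
treat_a = rng.normal(11.0, 2.0, 20)
treat_b = rng.normal(12.5, 2.0, 20)

# Each treatment compared against the control, with FWER control
res = stats.dunnett(treat_a, treat_b, control=control)
print(res.pvalue)  # one adjusted p-value per treatment-vs-control comparison
```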
- Scheffé's test is a conservative post-hoc test that can be used for all possible contrasts in a one-way ANOVA, not just pairwise comparisons
- Appropriate when the number of comparisons is large or when the comparisons are not planned in advance
- Uses the F-distribution to calculate critical values; because it protects every possible contrast simultaneously, it is more conservative than the other methods
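Scheffé's test is not a single call in the common Python statistics libraries, so here is a sketch of computing its critical value from the F-distribution; the group count and total sample size are illustrative. A contrast $L$ is significant when $|L|/SE(L)$ exceeds $\sqrt{(k-1)\,F_{\alpha;\,k-1,\,N-k}}$:

```python
from scipy import stats

# Scheffé's criterion: a contrast L is significant when |L| / SE(L)
# exceeds sqrt((k - 1) * F_crit(alpha; k - 1, N - k))
k, N, alpha = 3, 60, 0.05  # group count and total sample size (illustrative)
f_crit = stats.f.ppf(1 - alpha, k - 1, N - k)
scheffe_crit = ((k - 1) * f_crit) ** 0.5
print(f"Scheffe critical value: {scheffe_crit:.3f}")
```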
Post-Hoc Tests for Factorial ANOVA
- For factorial ANOVA designs, post-hoc tests such as Tukey's HSD or Bonferroni correction can be applied to main effects or simple effects when the corresponding F-test is significant
- Main effects compare the means of each level of a factor, averaged across all levels of the other factor(s)
- Simple effects compare the means of one factor at a specific level of another factor
- When interpreting post-hoc tests in factorial ANOVA, consider the presence of interactions between factors
- If a significant interaction exists, focus on interpreting simple effects rather than main effects
- Adjust the significance level for the number of comparisons within each main effect or simple effect
- For example, in a 2×3 factorial ANOVA, the main effect of the 3-level factor involves 3 pairwise comparisons, while each simple effect of the 2-level factor at a given level of the other factor involves a single comparison (the sketch below shows the omnibus two-way ANOVA from which such follow-ups proceed)
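As a sketch of the first step in a factorial analysis, here is a two-way ANOVA with interaction using statsmodels' formula API on hypothetical 2×3 data; the factor names and effect sizes are invented:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical 2x3 data: factor A (2 levels) crossed with factor B (3 levels)
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "A": np.repeat(["a1", "a2"], 30),
    "B": np.tile(np.repeat(["b1", "b2", "b3"], 10), 2),
})
df["y"] = rng.normal(10.0, 2.0, 60) + np.where(df["B"] == "b3", 2.0, 0.0)

# Two-way ANOVA with interaction; inspect the A:B row before deciding
# whether to follow up on main effects or simple effects
model = ols("y ~ C(A) * C(B)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```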
Interpreting Post-Hoc Results
Understanding Adjusted P-Values and Significance
- Post-hoc test results are typically presented as adjusted p-values for each pairwise comparison
- A comparison is considered statistically significant if the adjusted p-value is less than the chosen significance level (0.05)
- Adjusted p-values account for the multiple comparisons performed, ensuring that the familywise error rate or false discovery rate is controlled
- The specific adjustment method (Bonferroni, Holm-Bonferroni, Tukey's HSD, etc.) should be reported along with the results
Practical Significance and Confidence Intervals
- When interpreting post-hoc test results, it is essential to consider the practical significance of the observed differences in addition to statistical significance
- The magnitude of the differences between group means and the context of the research question should be taken into account
- A statistically significant difference may not always be practically meaningful or relevant
- Confidence intervals for the difference between group means can provide additional information about the precision and practical significance of the observed differences
- A 95% confidence interval that does not contain zero indicates a statistically significant difference at the 0.05 level
- The width of the confidence interval reflects the precision of the estimated difference, with narrower intervals indicating greater precision
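A minimal sketch of an unadjusted 95% pooled-variance t interval for a difference in means, on hypothetical data; post-hoc procedures such as Tukey's HSD widen this interval to account for multiplicity:

```python
import numpy as np
from scipy import stats

# Hypothetical data for two of the groups being compared
rng = np.random.default_rng(3)
g1 = rng.normal(10.0, 2.0, 20)
g2 = rng.normal(12.0, 2.0, 20)

# Pooled-variance 95% t interval for the difference in means
n1, n2 = len(g1), len(g2)
diff = g2.mean() - g1.mean()
sp2 = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
se = (sp2 * (1 / n1 + 1 / n2)) ** 0.5
t_crit = stats.t.ppf(0.975, n1 + n2 - 2)
lo, hi = diff - t_crit * se, diff + t_crit * se
print(f"diff = {diff:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
# An interval excluding zero corresponds to significance at the 0.05 level
```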
Drawing Valid Conclusions
- When drawing conclusions based on post-hoc tests, researchers should be cautious about making claims that extend beyond the specific comparisons tested and the study's design limitations
- Post-hoc tests provide information about pairwise differences between group means, but do not establish causal relationships or generalize beyond the study sample
- Consider the context of the research question, the study design, and any potential confounding variables when interpreting post-hoc test results
- Differences between group means may be attributed to factors other than the independent variable of interest
- Be transparent about the post-hoc tests performed, the adjustment methods used, and any limitations or assumptions of the analysis
- Clearly state which comparisons were planned a priori and which were conducted post-hoc
Controlling Type I Error in ANOVA
Familywise Error Rate (FWER) Control Methods
- As defined above, the familywise error rate (FWER) is the probability of making at least one Type I error across all comparisons
- Controlling the FWER keeps the overall Type I error rate at the desired level (0.05)
- The Bonferroni correction is a simple method for controlling the FWER by dividing the desired significance level by the number of comparisons
- While conservative, it can be applied to various ANOVA designs and post-hoc tests
- The adjusted significance level is $\alpha/m$, where $\alpha$ is the desired familywise error rate and $m$ is the number of comparisons
- The Holm-Bonferroni method is a step-down procedure that offers more power than the standard Bonferroni correction while still controlling the FWER
- It sequentially adjusts the significance level for each comparison based on the rank of the corresponding p-value
- Begin with the smallest p-value and compare it to $\alpha/(m - i + 1)$, where $i$ is the rank of the p-value, and proceed until the first non-significant result is obtained (see the sketch after this list)
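A minimal sketch of the step-down logic, hand-rolled for illustration rather than taken from a library API; `statsmodels.stats.multitest.multipletests` with `method='holm'` provides the same procedure:

```python
import numpy as np

def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure (hand-rolled illustration)."""
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)  # process p-values from smallest to largest
    reject = np.zeros(m, dtype=bool)
    for i, idx in enumerate(order, start=1):
        if p[idx] <= alpha / (m - i + 1):  # threshold loosens as rank i grows
            reject[idx] = True
        else:
            break  # stop at the first non-significant result
    return reject

print(holm_bonferroni([0.010, 0.015, 0.030, 0.040]))  # [ True  True False False]
```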
False Discovery Rate (FDR) Control Methods
- The false discovery rate (FDR) is an alternative to the FWER that controls the expected proportion of false positives among all significant results
- The FDR is less conservative than FWER-controlling methods and may be preferred when a higher number of false positives is acceptable in exchange for increased power
- The Benjamini-Hochberg procedure is a popular method for controlling the FDR
- Sort the p-values from smallest to largest and assign ranks ($i$) to each p-value
- Compare each p-value to $(i/m)q$, where $m$ is the total number of comparisons and $q$ is the desired FDR level
- The largest p-value satisfying $p_i \le (i/m)q$, together with all smaller p-values, is considered significant (see the sketch below)
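A minimal sketch of the procedure, hand-rolled for illustration; `multipletests` with `method='fdr_bh'` is the library route:

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (hand-rolled illustration)."""
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)
    thresholds = (np.arange(1, m + 1) / m) * q  # (i/m) * q for rank i
    below = np.nonzero(p[order] <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:  # largest rank meeting its threshold, plus all smaller ranks
        reject[order[: below[-1] + 1]] = True
    return reject

print(benjamini_hochberg([0.001, 0.010, 0.030, 0.040, 0.200]))
```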
Choosing a Multiple Comparison Adjustment Method
- When selecting a multiple comparison adjustment method, researchers should consider factors such as:
- The desired balance between Type I and Type II error rates
- The number and type of comparisons (pairwise, many-to-one, etc.)
- The specific research question and study design
- The assumptions and limitations of each method
- In general, FWER-controlling methods (Bonferroni, Holm-Bonferroni, Tukey's HSD) are more conservative and prioritize controlling Type I errors
- These methods are appropriate when the cost of a Type I error is high or when the number of comparisons is relatively small
- FDR-controlling methods (Benjamini-Hochberg) are less conservative and prioritize maintaining power while controlling the proportion of false positives
- These methods may be preferred when the cost of a Type II error is high, when the number of comparisons is large, or when some false positives are acceptable in exchange for increased power
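To see these trade-offs side by side, here is a sketch applying three adjustment methods to the same hypothetical p-values via statsmodels; the FDR method typically rejects more hypotheses than the FWER methods:

```python
from statsmodels.stats.multitest import multipletests

# The same hypothetical p-values under three adjustment methods
raw_p = [0.001, 0.008, 0.020, 0.035, 0.120]
for method in ("bonferroni", "holm", "fdr_bh"):
    reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method=method)
    print(f"{method:10s} rejects {reject.sum()} of {len(raw_p)} comparisons")
```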