Hypothesis testing is a crucial statistical tool for making informed decisions based on data. It involves formulating null and alternative hypotheses, analyzing data, and drawing conclusions about population parameters.
Understanding p-values, significance levels, and statistical power helps researchers balance the risks of Type I and Type II errors. These concepts guide study design, data interpretation, and decision-making across various fields, from medicine to business.
Understanding Hypothesis Testing
Null vs alternative hypotheses
- Null hypothesis (H₀): assumes no effect or no difference; typically includes equality (=, ≤, or ≥) (e.g., "the Earth is flat")
- Alternative hypothesis (H₁ or Hₐ): opposes the null hypothesis and expresses the claim the researcher aims to support; typically includes a strict inequality (≠, >, <) (e.g., "the Earth is round")
- Together, the two hypotheses frame the research question, guide the statistical analysis, and determine the test direction (one-tailed or two-tailed)
- Examples: Drug effectiveness (H₀: new drug = placebo, H₁: new drug > placebo), Gender pay gap (H₀: male salary ≤ female salary, H₁: male salary > female salary)
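To make the link between the alternative hypothesis and the test direction concrete, here is a minimal sketch of a one-sample z-test using only the Python standard library. The function name `z_test_p_value`, its `alternative` argument, and the numbers in the usage lines are illustrative assumptions, and the sketch assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """p-value for a one-sample z-test of H0: mu = mu0 (sigma assumed known)."""
    # Standardize the observed mean under the null hypothesis.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":      # H1: mu > mu0 (right-tailed)
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # H1: mu < mu0 (left-tailed)
        return std_normal.cdf(z)
    if alternative == "two-sided":    # H1: mu != mu0 (both tails)
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Hypothetical data: sample mean 52, null value 50, sigma 10, n = 100 (z = 2.0).
for alt in ("greater", "two-sided", "less"):
    print(alt, f"{z_test_p_value(52, 50, 10, 100, alt):.4f}")
# greater 0.0228
# two-sided 0.0455
# less 0.9772
```

The same data yield different p-values depending on which alternative is tested, which is why the direction should be fixed before looking at the data.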
Type I and Type II errors
- Type I error (false positive): rejecting a true null hypothesis; occurs with probability α (alpha); leads to unnecessary changes and wasted resources (like convicting an innocent person)
- Type II error (false negative): failing to reject a false null hypothesis; occurs with probability β (beta); results in missed opportunities and undetected effects (like acquitting a guilty person)
- Trade-off between errors: for a fixed sample size and effect size, decreasing one increases the other
- Impact on decision-making: balance the two risks by weighing the costs and consequences of each kind of incorrect decision (medical diagnosis, quality control)
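The trade-off between α and β can be checked numerically. The sketch below assumes a right-tailed one-sample z-test with known standard deviation; the helper name `type_ii_error` and the chosen effect, sigma, and sample size are hypothetical values picked only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """beta for a right-tailed z-test when the true mean is mu0 + effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # reject H0 when z > z_crit
    shift = effect / (sigma / sqrt(n))       # mean of z when H0 is false
    return std_normal.cdf(z_crit - shift)    # P(fail to reject | H0 false)


# Sample size and effect are held fixed; only the significance level changes.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, effect=2.0, sigma=10.0, n=100)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

With everything else held constant, a stricter α produces a larger β. Both error rates can only fall together when the sample size or the effect size increases.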
P-values and significance levels
- P-value: probability of obtaining results at least as extreme as those observed, assuming H₀ is true; calculated from the test statistic and its distribution under H₀
- Significance level (α): predetermined threshold for rejecting H₀; common values: 0.05, 0.01, 0.1
- Interpretation: reject H₀ if p-value < α, fail to reject if p-value ≥ α
- Smaller p-values indicate stronger evidence against H₀
- Limitations: a p-value doesn't measure effect size or practical significance
- Examples: Clinical trials (p = 0.03 < α = 0.05, reject H₀), Market research (p = 0.08 > α = 0.05, fail to reject H₀)
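As a rough illustration of the decision rule and of the effect-size limitation, the sketch below applies the rule to the two p-values from the examples above and then shows how one small, fixed difference moves from "not significant" to "significant" purely because the sample grows. The helper names `decide` and `two_sided_p`, and the difference of 0.5 with sigma 10, are assumptions made for this example; a z-test with known sigma is assumed.

```python
from math import sqrt
from statistics import NormalDist


def decide(p_value, alpha=0.05):
    """Decision rule: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.03))  # clinical trial example: p = 0.03 < 0.05
print(decide(0.08))  # market research example: p = 0.08 > 0.05


def two_sided_p(mean_diff, sigma, n):
    """Two-sided z-test p-value for an observed mean difference (sigma known)."""
    z = mean_diff / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The observed difference stays at 0.5 units; only the sample size changes.
for n in (100, 1_000, 10_000):
    p = two_sided_p(0.5, 10.0, n)
    print(f"n = {n:>6}  p = {p:.4g}  {decide(p)}")
```

The difference is equally small in all three runs, so a low p-value at large n says nothing about whether the effect matters in practice.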
Statistical power in testing
- Power: probability of correctly rejecting a false null hypothesis, represented as 1 - β
- Affected by sample size, effect size, significance level (α), and data variability
- Determines the ability to detect true effects; aids study design and sample size calculation
- Power analysis determines the sample size required for a desired power, balancing Type I and Type II error risks
- Low power increases Type II error risk and reduces research reproducibility
- Examples: Drug trials (80% power to detect 20% improvement), Psychology experiments (90% power for medium effect size)
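A power analysis of the kind described above can be sketched with the normal approximation. The code assumes a right-tailed one-sample z-test with known standard deviation. The function names `power_one_sided` and `required_n` are hypothetical, and "medium effect" is taken here to mean a standardized effect of 0.5, which is a common convention rather than something stated in the text.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD = NormalDist()


def power_one_sided(effect, sigma, n, alpha=0.05):
    """Power (1 - beta) of a right-tailed z-test for a true shift of `effect`."""
    z_crit = _STD.inv_cdf(1 - alpha)
    return 1 - _STD.cdf(z_crit - effect * sqrt(n) / sigma)


def required_n(effect, sigma, alpha=0.05, power=0.80):
    """Smallest n reaching the target power under the same assumptions."""
    z_alpha = _STD.inv_cdf(1 - alpha)   # critical value for the test
    z_beta = _STD.inv_cdf(power)        # quantile matching the target power
    return ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


# Assumed scenario: effect of 5 units with sigma 10 (standardized effect 0.5).
for target in (0.80, 0.90):
    n = required_n(effect=5.0, sigma=10.0, power=target)
    achieved = power_one_sided(5.0, 10.0, n)
    print(f"target power {target:.2f}: n = {n}, achieved power = {achieved:.3f}")
```

Raising the target power, shrinking the effect, or tightening α all increase the required sample size. Real studies usually rely on t-based or design-specific calculations, so this approximation is best read as a lower bound.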