Fiveable

Data, Inference, and Decisions Unit 5 Review

5.3 Estimating means, proportions, and variances

Written by the Fiveable Content Team • Last updated September 2025

Estimating means, proportions, and variances is crucial for making inferences about populations. This topic covers the t-distribution for means from small samples, the normal approximation for proportions, and the chi-square distribution for variances.

These techniques help create confidence intervals, giving us a range of likely values for population parameters. Understanding when to use each method and how sample size affects our estimates is key to accurate statistical analysis.

Confidence Intervals for Means

T-Distribution for Small Samples

  • Use t-distribution for constructing confidence intervals when population standard deviation unknown and sample size small (typically n < 30)
  • Calculate degrees of freedom for t-distribution as n - 1 (n represents sample size)
  • Construct confidence interval for population mean using formula $\bar{x} \pm t \, (s / \sqrt{n})$
    • $\bar{x}$ represents sample mean
    • t represents critical value from t-distribution
    • s represents sample standard deviation
    • n represents sample size
  • Determine critical value t based on desired confidence level and degrees of freedom
  • T-distribution approaches normal distribution as sample size increases
    • Normal (z) critical values become a reasonable approximation for larger samples
  • Margin of error in t-distribution confidence interval influenced by:
    • Sample size
    • Sample standard deviation
    • Chosen confidence level
  • Example: Constructing 95% confidence interval for mean height of 25 students
    • Sample mean = 170 cm, sample standard deviation = 8 cm
    • Degrees of freedom = 24, t-critical value (95% confidence) = 2.064
    • Confidence interval: 170 ± (2.064 × (8 / √25)) = (166.7 cm, 173.3 cm)
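
The height example above can be reproduced with a minimal Python sketch. The function name `t_confidence_interval` is hypothetical, and the critical value is passed in as a table lookup (2.064 for 24 degrees of freedom, as in the example) because the Python standard library does not provide t quantiles.

```python
import math

def t_confidence_interval(sample_mean, sample_sd, n, t_crit):
    """Return (lower, upper) for a population mean.

    t_crit is the t critical value for n - 1 degrees of freedom,
    assumed here to come from a t-table.
    """
    standard_error = sample_sd / math.sqrt(n)  # s / sqrt(n)
    margin = t_crit * standard_error           # margin of error
    return sample_mean - margin, sample_mean + margin

# Height example: n = 25, so df = 24; 2.064 is the 95% table value.
lower, upper = t_confidence_interval(170.0, 8.0, 25, 2.064)
print(f"95% CI: ({lower:.1f} cm, {upper:.1f} cm)")  # 95% CI: (166.7 cm, 173.3 cm)
```

Because the margin is `t_crit * s / sqrt(n)`, a larger sample, a smaller sample standard deviation, or a lower confidence level each narrows the interval.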

Estimating Proportions and Confidence Intervals

Normal Approximation for Proportions

  • Use normal approximation to binomial distribution for constructing confidence intervals for population proportions
    • Applicable when np ≥ 5 and n(1-p) ≥ 5
    • n represents sample size
    • p represents sample proportion
  • Calculate confidence interval for population proportion using formula $\hat{p} \pm z \sqrt{\hat{p}(1-\hat{p})/n}$
    • $\hat{p}$ represents sample proportion
    • z represents critical value from standard normal distribution
    • n represents sample size
  • Compute standard error of proportion as $\sqrt{\hat{p}(1-\hat{p})/n}$
  • Margin of error in proportion confidence interval influenced by:
    • Sample size
    • Sample proportion
    • Chosen confidence level
  • Consider alternative methods for small samples or extreme proportions:
    • Wilson score interval
    • Clopper-Pearson interval
  • Normal approximation method assumes sampling distribution of sample proportion approximately normal
    • Generally true for large samples due to Central Limit Theorem
  • Example: Estimating proportion of left-handed people in population
    • Sample size = 1000, left-handed individuals = 110
    • Sample proportion = 0.11, z-critical value (95% confidence) = 1.96
    • Confidence interval: 0.11 ± (1.96 × √((0.11 × 0.89) / 1000)) = (0.091, 0.129)
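
As a rough illustration, the sketch below computes the normal-approximation interval for the left-handedness example and, for comparison, the Wilson score interval mentioned as an alternative. The function names are hypothetical, and the z critical value comes from the standard library's `statistics.NormalDist` rather than a table, so results can differ from hand calculation in the last decimal place.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value from the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def normal_approx_interval(successes, n, confidence=0.95):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval, often preferred for small n or extreme p_hat."""
    p_hat = successes / n
    z = z_critical(confidence)
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Left-handedness example: 110 of 1000 sampled individuals.
approx = normal_approx_interval(110, 1000)
wilson = wilson_interval(110, 1000)
print(f"Normal approximation: ({approx[0]:.3f}, {approx[1]:.3f})")
print(f"Wilson score:         ({wilson[0]:.3f}, {wilson[1]:.3f})")
```

With a sample this large and a proportion not too close to 0 or 1, the two intervals nearly coincide; the gap widens as n shrinks or the proportion approaches either extreme.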

Point Estimates and Confidence Intervals for Variance

Estimating Population Variance and Standard Deviation

  • Use sample variance (s²) as point estimate for population variance (σ²)
  • Use sample standard deviation (s) to estimate population standard deviation (σ)
  • Construct confidence intervals for population variances using chi-square distribution
    • For a normal population, scaled sample variance (n-1)s²/σ² follows chi-square distribution with n - 1 degrees of freedom
  • Calculate confidence interval for population variance using formula $\frac{(n-1)s^2}{\chi^2_{upper}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{lower}}$
    • n represents sample size
    • s² represents sample variance
    • χ² represents critical values from chi-square distribution
  • Set degrees of freedom for chi-square distribution in variance estimation to n - 1
  • Obtain confidence intervals for population standard deviations by taking square root of lower and upper bounds of variance confidence interval
  • Width of confidence interval for variances and standard deviations influenced by:
    • Sample size
    • Chosen confidence level
  • Assume population normally distributed for validity of confidence intervals
  • Example: Estimating population variance of exam scores
    • Sample size = 40, sample variance = 25
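    • Degrees of freedom = 39, chi-square critical values (95% confidence) ≈ 23.65 (lower) and 58.12 (upper)
    • 95% confidence interval for variance: (39 × 25 / 58.12, 39 × 25 / 23.65) ≈ (16.8, 41.2)
    • 95% confidence interval for standard deviation: ≈ (4.10, 6.42)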
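
A minimal sketch of the chi-square interval calculation follows, using the exam-score inputs (n = 40, s² = 25). The function name `variance_interval` is hypothetical, and the two critical values are assumed to come from a standard chi-square table for 39 degrees of freedom; with SciPy installed, `scipy.stats.chi2.ppf` would supply them instead.

```python
import math

def variance_interval(sample_variance, n, chi2_lower, chi2_upper):
    """Return (lower, upper) bounds for the population variance.

    chi2_lower and chi2_upper are the lower- and upper-tail chi-square
    critical values for n - 1 degrees of freedom.
    """
    scaled = (n - 1) * sample_variance
    # Dividing by the larger critical value gives the lower bound.
    return scaled / chi2_upper, scaled / chi2_lower

# Assumed table values for df = 39 at 95% confidence (2.5% in each tail).
CHI2_LOWER_39 = 23.654
CHI2_UPPER_39 = 58.120

var_low, var_high = variance_interval(25.0, 40, CHI2_LOWER_39, CHI2_UPPER_39)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)
print(f"Variance CI: ({var_low:.2f}, {var_high:.2f})")
print(f"Std. dev. CI: ({sd_low:.2f}, {sd_high:.2f})")
```

The interval is not symmetric around the sample variance because the chi-square distribution is right-skewed.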

Central Limit Theorem and Normal Approximations

Applying Central Limit Theorem in Interval Estimation

  • Central Limit Theorem (CLT) states sampling distribution of sample mean approaches normal distribution as sample size increases
    • Applies regardless of shape of underlying population distribution, provided its variance is finite
  • Apply CLT for practical purposes when sample size ≥ 30
    • Assumes population not extremely skewed
  • CLT justifies use of normal approximations in interval estimation for means
    • Applicable even when population distribution unknown or non-normal
  • Apply CLT to sampling distribution of sample proportion for proportions
    • Allows normal approximations when np ≥ 5 and n(1-p) ≥ 5
  • Standard error of mean (SEM) decreases as sample size increases
    • Key implication of CLT in interval estimation
  • CLT enables use of z-scores and t-scores in constructing confidence intervals
    • Based on normal and approximately normal distributions
  • Understanding CLT crucial for determining appropriate interval estimation methods
    • Parametric methods for large samples or normal populations
    • Non-parametric or bootstrap methods for small samples or non-normal populations
  • Example: Applying CLT to estimate mean household income
    • Sample size = 100, sample mean = $50,000, sample standard deviation = $10,000
    • 95% confidence interval: $50,000 ± (1.96 × ($10,000 / √100)) = ($48,040, $51,960)
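
The sketch below illustrates two of the points above under stated assumptions: a z-based interval for the household income example, and a small simulation showing that the spread of sample means from a skewed population shrinks roughly like σ/√n. The exponential population, the replication count, the seed, and both function names are illustrative choices, not part of the original material.

```python
import math
import random
import statistics

def z_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Large-sample interval for a mean: x_bar +/- z * s / sqrt(n)."""
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

def simulated_sd_of_means(n, replications=2000, seed=0):
    """Standard deviation of sample means drawn from a skewed population.

    The population is exponential with mean 1 and standard deviation 1,
    so theory predicts a standard error of about 1 / sqrt(n).
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]
    return statistics.stdev(means)

# Household income example: n = 100, mean $50,000, standard deviation $10,000.
income_low, income_high = z_interval(50_000, 10_000, 100)
print(f"95% CI: (${income_low:,.0f}, ${income_high:,.0f})")

# Quadrupling n should roughly halve the spread of the sample means.
sd_means_25 = simulated_sd_of_means(25)
sd_means_100 = simulated_sd_of_means(100)
print(f"n = 25:  simulated SE = {sd_means_25:.3f}, theory = {1 / math.sqrt(25):.3f}")
print(f"n = 100: simulated SE = {sd_means_100:.3f}, theory = {1 / math.sqrt(100):.3f}")
```

The simulated values will not match the theoretical ones exactly, since they are themselves estimates from a finite number of replications.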