The Central Limit Theorem (CLT) is one of the most useful results in statistics. It tells us that as the sample size grows, the distribution of sample means approaches a normal distribution, no matter what the original population looks like, provided the population has a finite mean and standard deviation. This is what makes it possible to base business decisions on sample data.
With this theorem, we can draw conclusions about an entire population using just a sample. It lets us estimate probabilities and build confidence intervals, even when we don't know the shape of the population we're studying.
Central Limit Theorem
Central Limit Theorem significance
- States sample mean distribution approaches normal as sample size increases, regardless of population distribution shape
- Holds true for sufficiently large samples (typically n ≥ 30) drawn independently
- Allows inferential statistics for business decisions
- Infer population parameters from sample statistics
- Calculate probability of sample means within specific range
- Construct confidence intervals and conduct hypothesis tests for population means, even with unknown or non-normal population distributions
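As a rough illustration of the confidence-interval use mentioned above, the sketch below builds a normal-approximation interval for a population mean. It assumes the population standard deviation is known and the sample is large enough for the CLT to apply. The function name `mean_confidence_interval` and the input values are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, sigma, n, level=0.95):
    """Normal-approximation interval for a population mean (sigma assumed known)."""
    # Critical value of the standard normal, about 1.96 for a 95% level
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    # Margin of error: critical value times the standard error sigma / sqrt(n)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Illustrative inputs (assumed, not real data): sample mean 105, sigma 15, n = 36
low, high = mean_confidence_interval(x_bar=105, sigma=15, n=36)
print(f"95% CI for the population mean: ({low:.2f}, {high:.2f})")
```

A higher confidence level or a smaller sample widens the interval; a larger sample narrows it.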
Normality in sampling distributions
- Sampling distribution of the mean: distribution of all possible sample means for given sample size
- As sample size (n) increases, sampling distribution of mean gets closer to a normal distribution
- Applies even if population distribution is non-normal (skewed or bimodal)
- Random variation in sample means decreases with larger samples, and the distribution of sample means becomes more symmetric and bell-shaped
- Applies to populations with finite mean and standard deviation, given large enough samples drawn independently
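One way to see this behaviour is a small simulation. The sketch below assumes an exponential population (strongly right-skewed, with mean 1) and measures the skewness of the simulated sample means for a few sample sizes; a skewness near 0 indicates a more symmetric shape. The population choice, the sample sizes, the number of trials, and the helper names are all assumptions made for illustration.

```python
import random
import statistics


def sample_means(n, trials=5000, seed=0):
    """Means of `trials` independent samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]


def skewness(values):
    """Skewness coefficient: 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


# Skewness of the sampling distribution of the mean for increasing sample sizes
skew_by_n = {n: skewness(sample_means(n)) for n in (2, 30, 200)}
for n, skew in skew_by_n.items():
    print(f"n={n:>3}  skewness of sample means = {skew:.3f}")
```

The skewness should shrink toward 0 as n increases, even though the population itself stays heavily skewed.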
Sampling Distribution of the Mean
Calculations for sampling distributions
- Mean of sampling distribution of mean ($\mu_{\bar{x}}$) equals population mean ($\mu$)
- $\mu_{\bar{x}} = \mu$
- Standard deviation of sampling distribution of mean ($\sigma_{\bar{x}}$), or standard error of mean, equals population standard deviation ($\sigma$) divided by square root of sample size ($n$)
- $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
- Larger sample sizes decrease standard error of mean, indicating sample means cluster more closely around population mean
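The two formulas above can be checked against a simulation. The sketch below assumes a normal population with mean 100 and standard deviation 15, draws many samples of size 36, and compares the spread of the simulated sample means with the population standard deviation divided by the square root of n. The population shape, the number of trials, and the function name `standard_error` are illustrative assumptions.

```python
import math
import random
import statistics


def standard_error(sigma, n):
    """Standard error of the mean: population sigma over sqrt(n)."""
    return sigma / math.sqrt(n)


mu, sigma, n = 100.0, 15.0, 36  # assumed population parameters and sample size
rng = random.Random(42)

# Simulate the sampling distribution of the mean with 4000 independent samples
means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(4000)
]

theoretical = standard_error(sigma, n)   # 15 / 6 = 2.5
empirical = statistics.pstdev(means)     # spread of the simulated sample means
centre = statistics.fmean(means)         # should sit close to mu

print(f"theoretical SE = {theoretical:.3f}, simulated SE = {empirical:.3f}")
print(f"mean of sample means = {centre:.2f}")
```

Quadrupling the sample size halves the standard error, which is why precision improves slowly as samples get larger.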
Probability applications of CLT
- For large samples (n ≥ 30) with known population standard deviation, approximate sampling distribution of mean with normal distribution using $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$
- Determine probability of sample means within specific range:
- Standardize sample mean by calculating z-score: $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$
- Find probability associated with z-score using standard normal distribution table or calculator
- Example: Population mean 100, standard deviation 15, sample of 36 observations with mean 105
- Calculate z-score: $z = \frac{105 - 100}{\frac{15}{\sqrt{36}}} = 2$
- Probability of z-score ≤ 2 is approximately 0.9772, so probability of sample mean ≤ 105 is about 0.9772
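The worked example can be reproduced in a few lines instead of looking up a table. This sketch uses Python's standard-library `statistics.NormalDist` in place of a z-table. The helper `prob_mean_between` is a hypothetical name, and the 95 to 105 range is an added illustration of the "within a specific range" case rather than part of the example above.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, x_bar = 100, 15, 36, 105  # values from the example above

se = sigma / sqrt(n)            # standard error of the mean: 15 / 6 = 2.5
z = (x_bar - mu) / se           # z-score of the observed sample mean
p_below = NormalDist().cdf(z)   # P(Z <= z) under the standard normal

print(f"z = {z:.2f}, P(sample mean <= {x_bar}) = {p_below:.4f}")
# prints: z = 2.00, P(sample mean <= 105) = 0.9772


def prob_mean_between(low, high, mu, sigma, n):
    """P(low <= sample mean <= high) under the CLT normal approximation."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


p_range = prob_mean_between(95, 105, mu, sigma, n)
print(f"P(95 <= sample mean <= 105) = {p_range:.4f}")
```

The range probability is the difference of two cumulative probabilities, which is the same subtraction done by hand with a z-table.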