The standard normal distribution is a normal distribution with mean 0 and standard deviation 1. Converting values to z-scores places them on this common scale, so data from different sources can be compared directly.
A z-score tells us how many standard deviations a data point lies from the mean. For normally distributed data, the empirical rule gives quick estimates of the probability that a value falls within one, two, or three standard deviations of the mean.
The Standard Normal Distribution
Calculation of z-scores
- Standardizes data from different distributions using formula $z = \frac{x - \mu}{\sigma}$ (standardization)
  - $x$ represents individual data point
  - $\mu$ represents mean of distribution
  - $\sigma$ represents standard deviation of distribution
- Represents number of standard deviations data point is from mean
  - Positive z-scores indicate data point above mean (e.g. height of 6 feet in a group whose mean height is below 6 feet)
  - Negative z-scores indicate data point below mean (e.g. height of 5 feet in a group whose mean height is above 5 feet)
- Allows for comparison of data from different distributions (e.g. SAT scores and IQ scores)
  - Once standardized, z-scores have mean of 0 and standard deviation of 1
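
The standardization formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `z_score`, the SAT and IQ parameters, and the small dataset are all hypothetical values chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, pstdev

def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical parameters, assumed only for illustration.
sat_z = z_score(1300, mu=1000, sigma=200)  # (1300 - 1000) / 200 = 1.5
iq_z = z_score(120, mu=100, sigma=15)      # (120 - 100) / 15, about 1.33

# Standardizing a whole dataset with its own mean and (population)
# standard deviation yields values with mean 0 and standard deviation 1.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma = mean(data), pstdev(data)
standardized = [z_score(x, mu, sigma) for x in data]
```

Under these assumed parameters the SAT score has the larger z-score, so it sits further above its own mean than the IQ score does, even though the raw numbers are on entirely different scales.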
Interpretation of z-scores
- Standard normal distribution has mean of 0 and standard deviation of 1
- Z-score determines relative position of data point within standard normal distribution
  - Z-score of 0 corresponds to mean of distribution (50th percentile)
  - Z-score of 1 corresponds to one standard deviation above mean (about 84th percentile)
  - Z-score of -1 corresponds to one standard deviation below mean (about 16th percentile)
- Probability of data point falling within certain range determined using z-scores and standard normal distribution
  - Area under curve between two z-scores represents probability of data point falling within that range (probability of z-score between -1 and 1 is about 68%)
- Probability density function gives relative likelihood (density) of continuous random variable near a value; probability of any single exact value is 0, so probabilities come from area under the density curve
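
The percentiles and areas listed above can be checked with `NormalDist` from Python's standard `statistics` module. The sketch below is one possible way to do it; the helper names `percentile_of` and `prob_between` are made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def percentile_of(z):
    """Percentage of the standard normal distribution at or below z."""
    return 100.0 * standard_normal.cdf(z)

def prob_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

p_mean = percentile_of(0.0)             # 50.0
p_above = percentile_of(1.0)            # about 84.1
p_below = percentile_of(-1.0)           # about 15.9
p_within_one = prob_between(-1.0, 1.0)  # about 0.683

# The density at a point is a height of the curve, not a probability.
density_at_mean = standard_normal.pdf(0.0)  # about 0.399
```

Note that `density_at_mean` is not the probability that a z-score equals exactly 0; only areas such as `p_within_one` are probabilities.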
Empirical rule for z-score probabilities
- 68-95-99.7 rule provides quick way to estimate probabilities for normally distributed data
- States that for normal distribution:
  - ~68% of data falls within one standard deviation of mean ($\mu \pm 1\sigma$)
  - ~95% of data falls within two standard deviations of mean ($\mu \pm 2\sigma$)
  - ~99.7% of data falls within three standard deviations of mean ($\mu \pm 3\sigma$)
- To estimate probabilities using empirical rule:
  - Convert given values to z-scores
  - Read each z-score as number of standard deviations from mean
  - Apply appropriate percentage from empirical rule, using symmetry of curve to halve percentage for range on one side of mean
- Provides quick estimate only; for more precise probabilities, use standard normal table (z-table) or calculator
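
The estimation steps above can be sketched as follows, with the empirical-rule estimate compared against the exact area. The distribution (mean 100, standard deviation 15) and the range 85 to 130 are assumed values for illustration, and the helpers `signed_half_area` and `empirical_estimate` are hypothetical names that handle only whole-number z-scores from -3 to 3.

```python
from statistics import NormalDist

# Share of a normal distribution within k standard deviations of the mean.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

def signed_half_area(z):
    """Approximate area between the mean and a whole-number z-score.

    The result is negative when z lies to the left of the mean.
    """
    k = abs(z)
    if k == 0:
        return 0.0
    if k not in EMPIRICAL_RULE:
        raise ValueError("empirical rule covers only z = -3..3 in whole steps")
    half = EMPIRICAL_RULE[k] / 2  # symmetry: half the area on each side
    return half if z > 0 else -half

def empirical_estimate(z_low, z_high):
    """Estimate P(z_low < Z < z_high) from the 68-95-99.7 rule."""
    return signed_half_area(z_high) - signed_half_area(z_low)

# Assumed example: values between 85 and 130 when mu = 100 and sigma = 15.
mu, sigma = 100.0, 15.0
z_low = (85.0 - mu) / sigma    # -1.0
z_high = (130.0 - mu) / sigma  # 2.0

estimate = empirical_estimate(z_low, z_high)  # 0.34 + 0.475, about 0.815
dist = NormalDist(mu, sigma)
exact = dist.cdf(130.0) - dist.cdf(85.0)      # about 0.8186
```

The two results agree to about two decimal places here, which is the kind of accuracy the empirical rule is meant to give; the exact figure plays the role of the z-table lookup.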
Additional Concepts
- Cumulative distribution function gives the probability that a random variable is less than or equal to a specific value
- Normal probability plot is used to assess whether a dataset follows a normal distribution
- Central limit theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, for independent observations drawn from the same population with finite variance, whatever the shape of the population distribution
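
The central limit theorem can be illustrated with a small simulation. The sketch below draws repeated samples from a strongly skewed population, an exponential distribution with mean 1 and standard deviation 1, and checks whether the sample means behave like a normal distribution. The sample size, replicate count, seed, and the helper name `sample_means` are arbitrary choices made for this example.

```python
import random
from statistics import mean, stdev

def sample_means(n, replicates, seed=0):
    """Return the means of `replicates` samples of size n from Exponential(1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(replicates):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

n, replicates = 50, 5000
means = sample_means(n, replicates)

center = mean(means)             # theory: population mean, 1.0
spread = stdev(means)            # theory: sigma / sqrt(n)
standard_error = 1.0 / n ** 0.5  # population sigma is 1 for Exponential(1)

# If the sample means are close to normal, roughly 68% of them should lie
# within one standard error of the population mean (empirical rule).
within_one = sum(abs(m - 1.0) <= standard_error for m in means) / replicates
```

Even though individual exponential values are far from normal, the share of sample means within one standard error should land near the 68% predicted by the empirical rule, and it tends to get closer as `n` grows.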