The normal distribution is a fundamental concept in probability and statistics. It's characterized by its symmetric, bell-shaped curve and is defined by two parameters: the mean and standard deviation. This distribution is crucial for understanding many natural phenomena and forms the basis for numerous statistical techniques.
Normal distributions have several key properties, including symmetry and the 68-95-99.7 rule. The probability density function and cumulative distribution function are essential mathematical tools for working with normal distributions. The standard normal distribution, with a mean of 0 and standard deviation of 1, is particularly useful for standardizing data and making comparisons.
Definition of normal distribution
- Continuous probability distribution that is symmetric and bell-shaped, with the mean, median, and mode all equal
- Approximately describes many natural phenomena, such as heights, weights, and IQ scores
- Defined by two parameters: the mean ($\mu$) and standard deviation ($\sigma$)
Properties of normal distribution
Symmetry of normal distribution
- Normal distribution is symmetric about the mean
- 50% of the data falls below the mean and 50% falls above the mean
- Skewness, a measure of asymmetry, is zero for a normal distribution
Mean, median, mode of normal distribution
- In a normal distribution, the mean, median, and mode are all equal
- Mean represents the average value of the data
- Median is the middle value when data is arranged in order
- Mode is the most frequently occurring value
Standard deviation of normal distribution
- Measures the spread or dispersion of data from the mean
- Approximately 68% of data falls within one standard deviation of the mean
- Approximately 95% of data falls within two standard deviations of the mean
- Approximately 99.7% of data falls within three standard deviations of the mean
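The 68-95-99.7 rule can be verified numerically. A minimal sketch using SciPy, where `norm.cdf` is the standard normal CDF (the result holds for any $\mu$ and $\sigma$ after standardizing):

```python
from scipy.stats import norm

# P(mu - k*sigma <= X <= mu + k*sigma) equals Phi(k) - Phi(-k)
# for every normal distribution, so the standard normal suffices.
for k in (1, 2, 3):
    prob = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} sd: {prob:.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```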
Probability density function
Formula for probability density function
- $f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}$
- $\mu$ is the mean
- $\sigma$ is the standard deviation
- $\pi \approx 3.14159$
- $e \approx 2.71828$
Characteristics of probability density function
- Gives the relative likelihood of a continuous random variable taking on a specific value
- Area under the curve between two points represents the probability of the variable falling within that range
- Total area under the curve is equal to 1
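The density formula can be implemented directly from the definition above. A minimal sketch, with the test point ($x = 1.5$, $\mu = 1$, $\sigma = 2$) chosen arbitrarily and checked against SciPy:

```python
import math
from scipy.stats import norm

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, straight from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

print(normal_pdf(1.5, mu=1.0, sigma=2.0))  # 0.19333...
print(norm.pdf(1.5, loc=1.0, scale=2.0))   # same value from SciPy
```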
Cumulative distribution function
Definition of cumulative distribution function
- Gives the probability that a random variable $X$ takes a value less than or equal to $x$
- Denoted as $F(x) = P(X \leq x)$
- Obtained by integrating the probability density function from $-\infty$ to $x$
Properties of cumulative distribution function
- Non-decreasing function, i.e., $F(a) \leq F(b)$ if $a \leq b$
- Ranges from 0 to 1
- $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$
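Since the CDF is the integral of the PDF, numerical integration should reproduce SciPy's closed-form `norm.cdf`. A sketch with an arbitrary evaluation point:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

x = 1.0
area, _ = quad(norm.pdf, -np.inf, x)  # integrate the density up to x
print(area)                           # ~0.8413
print(norm.cdf(x))                    # 0.8413..., closed form
```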
Standard normal distribution
Definition of standard normal distribution
- Normal distribution with a mean of 0 and a standard deviation of 1
- Denoted as $Z \sim N(0, 1)$
- Any normal distribution can be transformed into a standard normal distribution using z-scores
Z-scores in standard normal distribution
- Measures the number of standard deviations a data point is from the mean
- Calculated as $z = \frac{x - \mu}{\sigma}$
- $x$ is the data point
- $\mu$ is the mean
- $\sigma$ is the standard deviation
- Allows for comparison of data points from different normal distributions
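A small sketch of z-scores in action; the exam-score numbers are hypothetical, chosen only to show how standardization makes values from different normal distributions comparable:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical: a score of 85 in a class with mean 75 and sd 5,
# versus a score of 90 in a class with mean 80 and sd 10.
print(z_score(85, 75, 5))   # 2.0 -> the more exceptional result
print(z_score(90, 80, 10))  # 1.0
```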
Applications of normal distribution
Normal approximation to binomial distribution
- Binomial distribution can be approximated by a normal distribution with mean $np$ and standard deviation $\sqrt{np(1-p)}$ when certain conditions are met
- Sample size is large ($n \geq 30$)
- Success probability is not too close to 0 or 1 ($np \geq 5$ and $n(1-p) \geq 5$)
- Simplifies calculations for binomial probabilities; a continuity correction of $\pm 0.5$ improves the approximation, as in the sketch below
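A sketch of the approximation with arbitrary parameters ($n = 100$, $p = 0.4$); the $+0.5$ is the continuity correction for approximating a discrete distribution with a continuous one:

```python
import math
from scipy.stats import binom, norm

n, p = 100, 0.4
mu, sigma = n * p, math.sqrt(n * p * (1 - p))  # parameters of the approximating normal

# P(X <= 45): exact binomial vs. normal approximation
exact = binom.cdf(45, n, p)
approx = norm.cdf((45 + 0.5 - mu) / sigma)
print(exact, approx)  # both ~0.869
```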
Confidence intervals using normal distribution
- Used to estimate population parameters based on sample data
- For large samples, confidence intervals for the mean can be constructed using the normal distribution
- Example: 95% confidence interval for the mean is $\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}}$, where $\bar{x}$ is the sample mean, $\sigma$ is the population standard deviation, and $n$ is the sample size
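A sketch of this interval; the sample figures are hypothetical, and `norm.ppf` supplies the critical value (1.96 for 95% confidence):

```python
import math
from scipy.stats import norm

def normal_ci(xbar, sigma, n, confidence=0.95):
    """CI for the mean when the population sd is known."""
    z = norm.ppf(1 - (1 - confidence) / 2)  # 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical sample: mean 50, known sigma 8, n = 64
print(normal_ci(50, 8, 64))  # (48.04, 51.96)
```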
Hypothesis testing with normal distribution
- Used to test claims about population parameters based on sample data
- For large samples, the normal distribution can be used to calculate test statistics and p-values
- Example: Z-test for a population mean with known standard deviation
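A sketch of a two-sided Z-test under the same assumptions; the sample figures are again hypothetical:

```python
import math
from scipy.stats import norm

def z_test(xbar, mu0, sigma, n):
    """Two-sided Z-test of H0: mu = mu0 with known population sd."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return z, p_value

# Hypothetical: sample mean 52 from n = 36, testing mu0 = 50 with sigma = 6
print(z_test(52, 50, 6, 36))  # z = 2.0, p ~ 0.0455
```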
Assessing normality
Graphical methods for assessing normality
- Histogram: Should be approximately bell-shaped and symmetric
- Normal probability plot (Q-Q plot): Data points should fall close to a straight line
- Box plot: Should be roughly symmetric, with few or no outliers
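A sketch producing the first two diagnostics on simulated data (the distribution parameters and seed are arbitrary); `scipy.stats.probplot` draws the normal Q-Q plot:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=10, scale=2, size=200)  # simulated, truly normal data

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.hist(data, bins=20)                      # should look roughly bell-shaped
ax1.set_title("Histogram")
stats.probplot(data, dist="norm", plot=ax2)  # points should hug the line
plt.show()
```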
Quantitative methods for assessing normality
- Shapiro-Wilk test: Null hypothesis is that the data is normally distributed
- P-value > 0.05 fails to reject normality (it does not prove the data is normal)
- Kolmogorov-Smirnov test: Compares the empirical distribution function to the theoretical normal distribution function
- P-value > 0.05 fails to reject normality
- Skewness and excess kurtosis: Measures of asymmetry and tail heaviness, respectively
- Values close to 0 suggest normality (the normal distribution has skewness 0 and excess kurtosis 0, i.e., kurtosis 3)
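A sketch running these checks on simulated data (seed and sample size arbitrary). Note that `scipy.stats.kurtosis` reports excess kurtosis by default, and that the plain KS test assumes the parameters were not estimated from the data (estimating them strictly calls for the Lilliefors variant):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(size=200)  # simulated normal data

stat, p = stats.shapiro(data)
print(f"Shapiro-Wilk p = {p:.3f}")  # large p: fail to reject normality

# Standardize before the KS test so the reference N(0, 1) is fully specified
z = (data - data.mean()) / data.std(ddof=1)
stat, p = stats.kstest(z, "norm")
print(f"Kolmogorov-Smirnov p = {p:.3f}")

print(stats.skew(data), stats.kurtosis(data))  # both near 0 for normal data
```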
Transforming data to normal distribution
Box-Cox transformation
- Family of power transformations that can help to normalize skewed data
- Defined as: $y^{(\lambda)} = \begin{cases} \frac{y^\lambda - 1}{\lambda}, & \lambda \neq 0 \ \log(y), & \lambda = 0 \end{cases}$
- $y$ is the original data, which must be positive
- $\lambda$ is the transformation parameter
- Optimal $\lambda$ can be found using maximum likelihood estimation
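A sketch using SciPy's `boxcox`, which estimates $\lambda$ by maximum likelihood; the lognormal test data is arbitrary (and conveniently positive, as Box-Cox requires):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
skewed = rng.lognormal(mean=0.0, sigma=0.8, size=500)  # right-skewed, positive

transformed, lam = stats.boxcox(skewed)  # returns data and the MLE of lambda
print(f"optimal lambda ~ {lam:.3f}")     # near 0, since log normalizes lognormal data
print(stats.skew(skewed), stats.skew(transformed))  # skewness drops toward 0
```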
Other transformations for normality
- Square root transformation: $\sqrt{y}$, useful for count data with Poisson distribution
- Logarithmic transformation: $\log(y)$, useful for right-skewed data
- Reciprocal transformation: $\frac{1}{y}$, a stronger option for severely right-skewed data (it reverses the order of values; $-\frac{1}{y}$ preserves order)
Relationship to other distributions
Normal distribution vs t-distribution
- T-distribution has heavier tails than the normal distribution
- Used when the sample size is small ($n < 30$) and the population standard deviation is unknown
- Converges to the normal distribution as the degrees of freedom increase
Normal distribution vs chi-square distribution
- Chi-square distribution is right-skewed and non-negative
- Used in hypothesis testing and confidence intervals for variance
- Obtained by summing the squares of $k$ independent standard normal variables, yielding a chi-square distribution with $k$ degrees of freedom, as the simulation below illustrates
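A quick simulation of that construction (sample size and degrees of freedom arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k = 5  # degrees of freedom
# Sum of squares of k independent standard normals ~ chi-square(k)
samples = (rng.standard_normal((100_000, k)) ** 2).sum(axis=1)
print(samples.mean())           # ~5, matching the chi-square mean of k
print(stats.chi2(df=k).mean())  # 5.0
```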
Normal distribution vs F-distribution
- F-distribution is right-skewed and non-negative
- Used in hypothesis testing and confidence intervals for the ratio of two variances
- Obtained as the ratio of two independent chi-square variables, each divided by its degrees of freedom
Limitations of normal distribution
Situations where normal distribution is inappropriate
- Data with extreme outliers or heavy tails
- Strongly skewed data
- Discrete or categorical data
Alternatives to normal distribution
- Student's t-distribution: For small sample sizes with unknown population standard deviation
- Poisson distribution: For count data with rare events
- Binomial distribution: For binary data with a fixed number of trials
- Exponential distribution: For modeling waiting times or time-to-event data