📈Theoretical Statistics Unit 2 Review

2.6 Common probability distributions

Written by the Fiveable Content Team • Last updated September 2025
Probability distributions are the backbone of statistical inference, allowing us to model real-world phenomena and analyze data effectively. By understanding different types of distributions, we can select appropriate methods for various research questions and make accurate predictions.

This topic covers discrete and continuous distributions, univariate and multivariate cases, and special distributions used in hypothesis testing. We'll explore key properties, relationships between distributions, and their applications in statistical inference, hypothesis testing, and confidence intervals.

Types of probability distributions

  • Probability distributions form the foundation of statistical inference in Theoretical Statistics
  • Understanding different types of distributions enables accurate modeling of real-world phenomena and data analysis
  • Classifying distributions helps in selecting appropriate statistical methods for various research questions

Discrete vs continuous distributions

  • Discrete distributions model random variables with countable outcomes (integers)
  • Continuous distributions represent variables that can take any value within a range
  • Probability mass functions describe discrete distributions while probability density functions characterize continuous distributions
  • Examples of discrete distributions include (Poisson, binomial)
  • Continuous distribution examples encompass (normal, exponential)

Univariate vs multivariate distributions

  • Univariate distributions describe a single random variable
  • Multivariate distributions model the joint behavior of two or more random variables
  • Univariate distributions use single-variable functions while multivariate distributions employ multidimensional functions
  • Correlation and covariance play crucial roles in multivariate distributions
  • Applications of multivariate distributions include (portfolio analysis, climate modeling)

Discrete probability distributions

  • Discrete distributions model random variables with distinct, separate outcomes
  • These distributions are essential in analyzing count data and categorical variables
  • Understanding discrete distributions aids in solving problems involving finite sets of possibilities

Bernoulli distribution

  • Models a single trial with two possible outcomes (success or failure)
  • Probability mass function given by P(X=x) = p^x(1-p)^{1-x} where x is 0 or 1
  • Mean (expected value) equals p, variance equals p(1-p)
  • Used in modeling binary outcomes (coin flips, yes/no surveys)
  • Forms the basis for more complex discrete distributions
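To make the moments concrete, here is a minimal Python sketch that computes the mean and variance directly from the pmf above (p = 0.3 is an arbitrary illustrative value):

```python
# Bernoulli(p) pmf and its moments, computed from first principles.
p = 0.3  # illustrative success probability

def bernoulli_pmf(x, p):
    """P(X = x) = p^x * (1 - p)^(1 - x) for x in {0, 1}."""
    return p**x * (1 - p) ** (1 - x)

# E[X] = sum of x * P(X = x); Var(X) = E[X^2] - E[X]^2
mean_x = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
var_x = sum(x**2 * bernoulli_pmf(x, p) for x in (0, 1)) - mean_x**2
# mean_x equals p = 0.3 and var_x equals p * (1 - p) = 0.21
```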

Binomial distribution

  • Represents the number of successes in n independent Bernoulli trials
  • Probability mass function: P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}
  • Mean equals np, variance equals np(1-p)
  • Applies to scenarios with fixed number of trials and constant probability of success
  • Examples include (number of defective items in a batch, correct answers in a multiple-choice test)
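The pmf, mean, and variance above can be checked numerically with a short sketch (n = 10 and p = 0.4 are illustrative choices):

```python
from math import comb

# Binomial(n, p): verify the pmf sums to 1 and matches the stated moments.
n, p = 10, 0.4  # illustrative parameters

def binom_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

total = sum(binom_pmf(k, n, p) for k in range(n + 1))                     # 1.0
mean_x = sum(k * binom_pmf(k, n, p) for k in range(n + 1))                # n*p = 4.0
var_x = sum(k**2 * binom_pmf(k, n, p) for k in range(n + 1)) - mean_x**2  # n*p*(1-p) = 2.4
```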

Poisson distribution

  • Models the number of events occurring in a fixed interval of time or space
  • Probability mass function: P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}
  • Mean and variance both equal λ (rate parameter)
  • Assumes events occur independently and at a constant average rate
  • Applications include (number of customers arriving at a store, radioactive decay events)
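A quick numerical sketch confirms that the mean and variance both equal λ (lam = 2.5 is illustrative; the infinite sums are truncated at k = 99, where the omitted tail is negligible):

```python
from math import exp, factorial

# Poisson(lam): mean and variance both equal lam.
lam = 2.5  # illustrative rate parameter

def poisson_pmf(k, lam):
    """P(X = k) = lam^k * e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)

ks = range(100)  # truncated support; the tail beyond k = 99 is negligible
mean_x = sum(k * poisson_pmf(k, lam) for k in ks)
var_x = sum(k**2 * poisson_pmf(k, lam) for k in ks) - mean_x**2
```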

Geometric distribution

  • Represents the number of trials until the first success in a sequence of Bernoulli trials
  • Probability mass function: P(X=k) = (1-p)^{k-1}p
  • Mean equals 1/p, variance equals (1-p)/p^2
  • Models waiting time scenarios (number of coin flips until first heads)
  • Used in reliability analysis and quality control

Negative binomial distribution

  • Generalizes the geometric distribution to model the number of failures before r successes
  • Probability mass function: P(X=k) = \binom{k+r-1}{k} p^r (1-p)^k
  • Mean equals r(1-p)/p, variance equals r(1-p)/p^2
  • Applies to scenarios requiring multiple successes (number of insurance claims until r payouts)
  • Used in modeling overdispersed count data

Continuous probability distributions

  • Continuous distributions model random variables that can take any value within a range
  • These distributions are crucial for analyzing measurements and time-related data
  • Understanding continuous distributions enables sophisticated modeling of real-world phenomena

Uniform distribution

  • Represents equal probability for all values within a given interval [a,b]
  • Probability density function: f(x) = \frac{1}{b-a} for a ≤ x ≤ b
  • Mean equals (a+b)/2, variance equals (b-a)^2/12
  • Serves as a basis for generating random numbers in simulations
  • Applied in modeling random selection processes (lottery numbers, roulette wheel outcomes)

Normal distribution

  • Characterized by its bell-shaped curve and symmetry around the mean
  • Probability density function: f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}
  • Defined by two parameters: mean (μ) and standard deviation (σ)
  • Central Limit Theorem establishes its importance in statistical inference
  • Widely used in natural and social sciences (height distributions, measurement errors)

Exponential distribution

  • Models the time between events in a Poisson process
  • Probability density function: f(x) = \lambda e^{-\lambda x} for x ≥ 0
  • Mean equals 1/λ, variance equals 1/λ^2
  • Exhibits the memoryless property
  • Applications include (waiting times, equipment failure rates)
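The memoryless property means P(X > s + t | X > s) = P(X > t): having already waited s units tells you nothing about the remaining wait. A minimal sketch using the survival function (lam, s, t are arbitrary illustrative values):

```python
from math import exp

# Memoryless property of the Exponential(lam) distribution.
lam = 1.5  # illustrative rate

def survival(x, lam):
    """P(X > x) = e^(-lam * x) for an Exponential(lam) variable."""
    return exp(-lam * x)

s, t = 0.8, 1.2
conditional = survival(s + t, lam) / survival(s, lam)  # P(X > s+t | X > s)
unconditional = survival(t, lam)                       # P(X > t)
# the two probabilities agree: the elapsed time s is "forgotten"
```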

Gamma distribution

  • Generalizes the exponential distribution with shape (k) and scale (θ) parameters
  • Probability density function: f(x) = \frac{x^{k-1}e^{-x/\theta}}{\theta^k\Gamma(k)} for x > 0
  • Mean equals kθ, variance equals kθ^2
  • Models waiting times for k events in a Poisson process
  • Used in reliability analysis and modeling rainfall amounts

Beta distribution

  • Defined on the interval [0,1] with shape parameters α and β
  • Probability density function: f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}
  • Mean equals α/(α+β), variance equals αβ/((α+β)^2(α+β+1))
  • Often used to model probabilities and proportions
  • Applications in Bayesian statistics and modeling of random variables with finite range

Properties of distributions

  • Understanding distribution properties enables effective statistical analysis and inference
  • These properties provide insights into the behavior and characteristics of random variables
  • Mastering distribution properties forms the basis for advanced statistical techniques

Probability density function

  • Describes the relative likelihood of a continuous random variable taking on a specific value
  • Integral of the PDF over an interval gives the probability of the variable falling within that range
  • Must be non-negative and integrate to 1 over its entire domain
  • Derivatives of the PDF reveal important features of the distribution
  • Used to calculate probabilities and derive other distribution properties

Cumulative distribution function

  • Represents the probability that a random variable takes a value less than or equal to a given point
  • For continuous distributions, CDF is the integral of the PDF from negative infinity to x
  • Always monotonically increasing and ranges from 0 to 1
  • Useful for calculating probabilities and quantiles
  • Relationship to PDF: F'(x) = f(x) for continuous distributions
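The relationship F'(x) = f(x) can be checked numerically by differentiating the CDF with a central difference. A sketch for the Exponential(λ) distribution (lam, x, and the step h are illustrative values):

```python
from math import exp

# Numerically verify F'(x) ≈ f(x) for the Exponential(lam) distribution.
lam = 2.0  # illustrative rate

def exp_cdf(x, lam):
    """F(x) = 1 - e^(-lam * x) for x >= 0."""
    return 1.0 - exp(-lam * x)

def exp_pdf(x, lam):
    """f(x) = lam * e^(-lam * x) for x >= 0."""
    return lam * exp(-lam * x)

x, h = 0.7, 1e-6
derivative = (exp_cdf(x + h, lam) - exp_cdf(x - h, lam)) / (2 * h)  # central difference
# derivative is very close to exp_pdf(x, lam)
```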

Moments of distributions

  • Describe various aspects of the probability distribution's shape and location
  • First moment (mean) represents the center of the distribution
  • Second central moment (variance) measures the spread around the mean
  • Higher moments (skewness, kurtosis) provide information about asymmetry and tail behavior
  • Moment generating functions uniquely determine probability distributions

Expected value and variance

  • Expected value (mean) represents the long-run average of a random variable
  • Calculated as E[X] = \sum_{x} xP(X=x) for discrete and E[X] = \int_{-\infty}^{\infty} x f(x)\,dx for continuous distributions
  • Variance measures the spread of the distribution around the mean
  • Computed as Var(X) = E[(X-\mu)^2] = E[X^2] - (E[X])^2
  • Standard deviation, the square root of variance, provides a measure of dispersion in the same units as the random variable
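These definitions can be applied directly to a simple discrete example, a fair six-sided die (a standard textbook case):

```python
# E[X] and Var(X) computed straight from the definitions for a fair die.
outcomes = range(1, 7)
pmf = {x: 1 / 6 for x in outcomes}  # uniform pmf over {1, ..., 6}

ex = sum(x * pmf[x] for x in outcomes)        # E[X] = 3.5
ex2 = sum(x**2 * pmf[x] for x in outcomes)    # E[X^2] = 91/6
var_x = ex2 - ex**2                           # Var(X) = E[X^2] - E[X]^2 = 35/12
```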

Relationships between distributions

  • Understanding relationships between distributions aids in statistical modeling and analysis
  • These connections often arise from transformations or limiting behaviors of random variables
  • Recognizing distribution relationships enables efficient problem-solving and data interpretation

Normal vs standard normal

  • Standard normal distribution has mean 0 and standard deviation 1
  • Any normal distribution can be transformed to standard normal: Z = \frac{X-\mu}{\sigma}
  • Standard normal distribution simplifies probability calculations and hypothesis testing
  • Z-scores derived from this relationship allow comparison across different normal distributions
  • Tables of standard normal probabilities facilitate quick computations
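Z-scores let values from different normal distributions be compared on one scale. A short sketch (the IQ and height figures are illustrative, using the conventional N(100, 15²) and an assumed N(70, 3²)):

```python
# Standardizing values from two different normal distributions.
def z_score(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

z_iq = z_score(130, 100, 15)   # 2.0 standard deviations above the mean
z_height = z_score(75, 70, 3)  # about 1.67 standard deviations above the mean
# the IQ score is more extreme relative to its own distribution
```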

Poisson vs exponential

  • Poisson distribution models the number of events in a fixed interval
  • Exponential distribution represents the time between events in a Poisson process
  • If X ~ Poisson(λt), then the time until the next event Y ~ Exponential(λ)
  • Both distributions are characterized by a single parameter λ
  • Relationship useful in modeling queuing systems and arrival processes
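The connection can be seen by simulation: build a Poisson process from Exponential(λ) inter-arrival gaps and check that the count per unit interval averages about λ (lam, horizon, runs, and the seed are arbitrary choices):

```python
import random
from statistics import mean

# Simulate a Poisson process from Exponential(lam) inter-arrival times.
random.seed(0)
lam, horizon, runs = 3.0, 1.0, 20_000  # illustrative rate, window, replications

def events_in_interval(lam, horizon):
    """Count events before `horizon` when gaps are Exponential(lam)."""
    t, count = 0.0, 0
    while True:
        t += random.expovariate(lam)  # time to the next event
        if t > horizon:
            return count
        count += 1

avg_count = mean(events_in_interval(lam, horizon) for _ in range(runs))
# avg_count is close to lam * horizon = 3.0, as Poisson(lam * t) predicts
```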

Gamma vs beta

  • Gamma distribution generalizes the exponential distribution
  • Beta distribution is related to the gamma through the following property: if X ~ Gamma(α, θ) and Y ~ Gamma(β, θ) are independent, then X/(X+Y) ~ Beta(α, β)
  • Both distributions are versatile and used in Bayesian analysis
  • Gamma often models waiting times, while Beta models probabilities or proportions
  • Their relationship allows for flexible modeling of various phenomena

Special distributions

  • Special distributions arise in specific statistical contexts and hypothesis testing
  • These distributions play crucial roles in inferential statistics and experimental design
  • Understanding special distributions is essential for advanced statistical analysis techniques

Chi-square distribution

  • Arises from the sum of squares of independent standard normal random variables
  • Probability density function: f(x) = \frac{x^{(k/2)-1}e^{-x/2}}{2^{k/2}\Gamma(k/2)} for x > 0
  • Characterized by degrees of freedom (k) parameter
  • Used in goodness-of-fit tests and analysis of variance
  • Relationship to normal distribution: if Z ~ N(0,1), then Z^2 ~ χ^2(1)
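The Z² relationship can be illustrated by simulation: squared standard normal draws should show the χ²(1) moments, mean 1 and variance 2 (sample size and seed are arbitrary):

```python
import random
from statistics import mean, variance

# If Z ~ N(0,1), then Z^2 ~ chi-square(1), which has mean 1 and variance 2.
random.seed(1)
squares = [random.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]

m = mean(squares)      # close to 1
v = variance(squares)  # close to 2
```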

Student's t-distribution

  • Arises when estimating the mean of a normally distributed population with unknown variance
  • Probability density function: f(x) = \frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}\left(1+\frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}}
  • Characterized by degrees of freedom (ν) parameter
  • Approaches normal distribution as degrees of freedom increase
  • Used in hypothesis testing and constructing confidence intervals for means

F-distribution

  • Arises as the ratio of two independent chi-square random variables, each divided by its degrees of freedom
  • Probability density function involves complex beta and gamma functions
  • Characterized by two degrees of freedom parameters (d1, d2)
  • Used in analysis of variance (ANOVA) and regression analysis
  • Tests equality of variances in different populations

Applications of distributions

  • Probability distributions form the foundation for various statistical inference techniques
  • Understanding distribution applications enables proper selection of statistical methods
  • These applications bridge theoretical concepts with practical data analysis

Statistical inference

  • Uses probability distributions to draw conclusions about populations from sample data
  • Involves parameter estimation, hypothesis testing, and confidence interval construction
  • Requires understanding of sampling distributions and their properties
  • Employs maximum likelihood estimation and method of moments for parameter estimation
  • Bayesian inference incorporates prior distributions to update beliefs based on observed data

Hypothesis testing

  • Tests claims about population parameters using sample data
  • Utilizes test statistics derived from appropriate probability distributions
  • Involves null and alternative hypotheses, significance levels, and p-values
  • Examples include t-tests for means, chi-square tests for independence, and F-tests for variances
  • Power analysis uses distributions to determine sample size requirements for detecting effects

Confidence intervals

  • Provide a range of plausible values for population parameters
  • Constructed using the sampling distribution of the estimator
  • Typically based on normal, t, or chi-square distributions depending on the parameter and sample size
  • Confidence level determines the probability that the interval contains the true parameter value
  • Used in various fields to quantify uncertainty in parameter estimates

Transformations of distributions

  • Transformations allow manipulation of random variables to create new distributions
  • Understanding transformations aids in solving complex probability problems
  • These techniques are crucial for deriving sampling distributions and developing statistical methods

Linear transformations

  • Involve adding a constant (shift) or multiplying by a constant (scale) to a random variable
  • For X with mean μ and variance σ^2, Y = aX + b has mean aμ + b and variance a^2σ^2
  • Preserve the general shape of the distribution but change location and spread
  • Useful for standardizing variables (z-scores) and unit conversions
  • Examples include transforming between Fahrenheit and Celsius temperatures
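The moment rules for Y = aX + b can be verified by simulation; the sketch below uses the Celsius-to-Fahrenheit constants and an assumed N(20, 5²) population of temperatures (sample size and seed are arbitrary):

```python
import random
from statistics import mean, variance

# Verify that Y = a*X + b has mean a*mu + b and variance a^2 * sigma^2.
random.seed(2)
a, b = 1.8, 32.0  # Celsius -> Fahrenheit conversion constants
xs = [random.gauss(20.0, 5.0) for _ in range(100_000)]  # X ~ N(20, 25), illustrative
ys = [a * x + b for x in xs]

mean_y = mean(ys)     # close to a*mu + b = 1.8*20 + 32 = 68
var_y = variance(ys)  # close to a^2 * sigma^2 = 1.8^2 * 25 = 81
```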

Non-linear transformations

  • Involve applying non-linear functions to random variables
  • Can significantly alter the shape and properties of the original distribution
  • Examples include exponential, logarithmic, and power transformations
  • Used to stabilize variance, normalize data, or linearize relationships
  • Require careful consideration of how the transformation affects probabilities and moments

Multivariate distributions

  • Model the joint behavior of two or more random variables simultaneously
  • Essential for analyzing complex systems and relationships between variables
  • Require understanding of concepts like correlation, covariance, and conditional distributions

Multivariate normal distribution

  • Generalizes the univariate normal distribution to multiple dimensions
  • Characterized by a mean vector μ and covariance matrix Σ
  • Probability density function involves the determinant and inverse of the covariance matrix
  • Marginal and conditional distributions are also normal
  • Widely used in multivariate statistical analysis, including factor analysis and discriminant analysis

Multinomial distribution

  • Generalizes the binomial distribution to multiple categories
  • Models the outcomes of n independent trials with k possible outcomes
  • Probability mass function involves multinomial coefficients and category probabilities
  • Mean and variance can be calculated for each category
  • Applications include modeling voting outcomes and market share analysis

Sampling distributions

  • Describe the distribution of sample statistics across different samples from a population
  • Crucial for understanding the behavior of estimators and conducting statistical inference
  • Form the basis for constructing confidence intervals and performing hypothesis tests

Distribution of sample mean

  • Central Limit Theorem states that the sampling distribution of the mean approaches normal as sample size increases
  • For large samples, X̄ ~ N(μ, σ^2/n) where μ and σ^2 are population parameters
  • Standard error of the mean (SEM) equals σ/√n
  • t-distribution used when the population standard deviation is unknown and the sample size is small
  • Enables inference about population means using sample data

Distribution of sample variance

  • Sample variance (s^2) follows a scaled chi-square distribution
  • For normal populations, (n-1)s^2/σ^2 ~ χ^2(n-1)
  • Used to construct confidence intervals for population variance
  • F-distribution arises when comparing variances from two independent samples
  • Understanding this distribution is crucial for ANOVA and regression analysis

Limit theorems

  • Describe the asymptotic behavior of random variables and their functions
  • Provide theoretical foundations for many statistical inference techniques
  • Enable approximations that simplify complex probability calculations

Law of large numbers

  • States that the sample mean converges to the population mean as sample size increases
  • Weak law of large numbers deals with convergence in probability
  • Strong law of large numbers concerns almost sure convergence
  • Justifies the use of sample means as estimators of population means
  • Fundamental to the concept of consistency in statistical estimation
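A simulation sketch of the law of large numbers using fair-die rolls, whose population mean is E[X] = 3.5 (the sample sizes and seed are arbitrary):

```python
import random
from statistics import mean

# Sample means of fair-die rolls converge to the population mean 3.5.
random.seed(4)
rolls = [random.randint(1, 6) for _ in range(100_000)]

err_small = abs(mean(rolls[:1_000]) - 3.5)  # early estimate, typically noisier
err_full = abs(mean(rolls) - 3.5)           # full-sample estimate, much closer
```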

Central limit theorem

  • States that the sum (or average) of a large number of independent, identically distributed random variables approaches a normal distribution
  • Applies regardless of the underlying distribution of the individual variables
  • Convergence rate depends on the original distribution and sample size
  • Enables normal approximations for various distributions (binomial, Poisson) with large n
  • Forms the basis for many statistical inference procedures and hypothesis tests
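A simulation sketch of the theorem: averages of n uniform(0, 1) draws (population mean 0.5, variance 1/12) should behave like a normal distribution, with about 68% of them falling within one predicted standard deviation of 0.5 (n, reps, and the seed are arbitrary choices):

```python
import random
from statistics import mean

# CLT demo: averages of uniform(0, 1) draws look approximately normal.
random.seed(5)
n, reps = 48, 10_000  # illustrative sample size and replication count

avgs = [mean(random.random() for _ in range(n)) for _ in range(reps)]

sd_pred = (1 / 12 / n) ** 0.5                                  # CLT-predicted sd of the average
within_1sd = sum(abs(a - 0.5) < sd_pred for a in avgs) / reps  # about 0.68 if normal
```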