📊 Probability and Statistics: Unit 3 Review

3.3 Bernoulli and binomial distributions

Written by the Fiveable Content Team • Last updated September 2025

Bernoulli and binomial distributions are fundamental concepts in probability theory. They model events with two possible outcomes, like coin flips or product defects. Understanding these distributions is crucial for analyzing experiments, quality control, and decision-making in various fields.

The Bernoulli distribution represents a single trial with binary outcomes, while the binomial distribution extends this to multiple trials. These distributions form the basis for more complex probability models and are widely used in statistics, engineering, and data science applications.

Bernoulli distribution

  • Fundamental probability distribution that models a single trial with two possible outcomes (success or failure)
  • Forms the basis for more complex probability distributions, such as the binomial distribution
  • Used to model events with binary outcomes, such as coin flips, defective products, or yes/no survey responses

Bernoulli trial definition

  • A single experiment with only two possible outcomes, typically labeled as success (1) or failure (0)
  • Probability of success remains constant across multiple trials
  • Trials are independent of each other, meaning the outcome of one trial does not influence the outcome of another
  • Examples include flipping a coin (heads or tails), testing a product (defective or non-defective), or a medical test (positive or negative)

Bernoulli random variable

  • A random variable that takes the value 1 with probability $p$ (success) and the value 0 with probability $1-p$ (failure)
  • Denoted by $X \sim Bern(p)$, where $p$ is the probability of success
  • Probability mass function (PMF) of a Bernoulli random variable is given by $P(X=x) = p^x(1-p)^{1-x}$ for $x \in \{0,1\}$
  • Expected value (mean) of a Bernoulli random variable is $E(X) = p$, and the variance is $Var(X) = p(1-p)$

Probability mass function

  • A function that gives the probability of a discrete random variable taking on a specific value
  • For a Bernoulli random variable $X$ with success probability $p$, the PMF is given by $P(X=x) = p^x(1-p)^{1-x}$ for $x \in \{0,1\}$
  • The PMF satisfies two conditions:
    1. $P(X=x) \geq 0$ for all $x$
    2. $\sum_{x} P(X=x) = 1$

Mean and variance

  • The expected value (mean) of a Bernoulli random variable $X$ with success probability $p$ is given by $E(X) = p$
  • The variance of a Bernoulli random variable $X$ with success probability $p$ is given by $Var(X) = p(1-p)$
  • The standard deviation is the square root of the variance, $\sigma = \sqrt{p(1-p)}$
  • These properties are derived using the PMF and the definitions of expected value and variance for discrete random variables
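
These formulas are easy to verify numerically. Below is a minimal sketch in Python (standard library only); the value $p = 0.3$ is an arbitrary choice for illustration.

```python
# A minimal sketch of the Bernoulli PMF, mean, and variance
# (p = 0.3 is an arbitrary example value).
import random

p = 0.3

def bern_pmf(x: int, p: float) -> float:
    """P(X = x) = p^x * (1 - p)^(1 - x) for x in {0, 1}."""
    return p**x * (1 - p)**(1 - x)

# The PMF is non-negative and sums to 1 over {0, 1}.
assert abs(bern_pmf(0, p) + bern_pmf(1, p) - 1.0) < 1e-12

# Mean and variance from the definitions for discrete random variables.
mean = sum(x * bern_pmf(x, p) for x in (0, 1))               # = p
var = sum((x - mean) ** 2 * bern_pmf(x, p) for x in (0, 1))  # = p(1 - p)
print(mean, var)  # 0.3, 0.21

# Simulation cross-check: the sample mean of many draws approaches p.
draws = [1 if random.random() < p else 0 for _ in range(100_000)]
print(sum(draws) / len(draws))  # close to 0.3
```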

Applications of Bernoulli distribution

  • Modeling binary outcomes in various fields, such as quality control (defective or non-defective products), medical testing (positive or negative results), and survey responses (yes or no)
  • Serves as a building block for more complex probability distributions, like the binomial distribution, which models the number of successes in a fixed number of independent Bernoulli trials
  • Used in logistic regression, a statistical method for modeling binary dependent variables based on one or more independent variables
  • Applied in reliability analysis to model the probability of a component or system functioning or failing at a given time

Binomial distribution

  • A discrete probability distribution that models the number of successes in a fixed number of independent Bernoulli trials
  • Extends the Bernoulli distribution to multiple trials, allowing for the calculation of probabilities for various numbers of successes
  • Widely used in various fields, such as quality control, clinical trials, and modeling of success/failure outcomes

Binomial experiment definition

  • Consists of a fixed number of independent Bernoulli trials, denoted by $n$
  • Each trial has only two possible outcomes, success (with probability $p$) or failure (with probability $1-p$)
  • The probability of success remains constant across all trials
  • The trials are independent, meaning the outcome of one trial does not influence the outcome of another
  • The random variable of interest is the number of successes in the $n$ trials

Binomial random variable

  • A discrete random variable $X$ that represents the number of successes in a binomial experiment with $n$ trials and success probability $p$
  • Denoted by $X \sim B(n,p)$, where $n$ is the number of trials and $p$ is the probability of success in each trial
  • The possible values of $X$ range from 0 to $n$, representing the number of successes in the $n$ trials
  • The probability mass function (PMF) of a binomial random variable is given by $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k = 0, 1, \dots, n$

Probability mass function

  • The PMF of a binomial random variable $X \sim B(n,p)$ is given by $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k = 0, 1, \dots, n$
  • $\binom{n}{k}$ is the binomial coefficient, which represents the number of ways to choose $k$ successes from $n$ trials
  • The PMF gives the probability of observing exactly $k$ successes in $n$ trials, given the success probability $p$
  • The PMF satisfies the conditions: $P(X=k) \geq 0$ for all $k$ and $\sum_{k=0}^n P(X=k) = 1$
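
As a quick illustration, the sketch below evaluates this PMF with Python's math.comb (available since Python 3.8) and checks that the probabilities sum to 1; the values $n = 10$ and $p = 0.4$ are arbitrary.

```python
# A short sketch of the binomial PMF using math.comb.
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.4
# The PMF values are non-negative and sum to 1 over k = 0, ..., n.
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
assert abs(total - 1.0) < 1e-12
print(binom_pmf(4, n, p))  # ≈ 0.2508, probability of exactly 4 successes
```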

Cumulative distribution function

  • The cumulative distribution function (CDF) of a binomial random variable $X \sim B(n,p)$ is given by $F(x) = P(X \leq x) = \sum_{k=0}^{\lfloor x \rfloor} \binom{n}{k} p^k (1-p)^{n-k}$
  • The CDF gives the probability of observing at most $x$ successes in $n$ trials, given the success probability $p$
  • $\lfloor x \rfloor$ denotes the floor function, which returns the greatest integer less than or equal to $x$
  • The CDF is a non-decreasing function, with $F(x) = 0$ for $x < 0$ and $F(x) = 1$ for $x \geq n$
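
A minimal sketch of the CDF as a partial sum of the PMF, using the same arbitrary parameters as above.

```python
# The binomial CDF as a partial sum of the PMF (n = 10, p = 0.4
# are arbitrary example values).
from math import comb, floor

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(x, n, p):
    """F(x) = P(X <= x) = sum of the PMF for k = 0, ..., floor(x)."""
    if x < 0:
        return 0.0
    return sum(binom_pmf(k, n, p) for k in range(min(floor(x), n) + 1))

n, p = 10, 0.4
print(binom_cdf(4, n, p))   # ≈ 0.6331, P(at most 4 successes)
print(binom_cdf(10, n, p))  # 1.0
```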

Mean, variance, and standard deviation

  • The expected value (mean) of a binomial random variable $X \sim B(n,p)$ is given by $E(X) = np$
  • The variance of a binomial random variable $X \sim B(n,p)$ is given by $Var(X) = np(1-p)$
  • The standard deviation is the square root of the variance, $\sigma = \sqrt{np(1-p)}$
  • These properties are derived using the PMF and the definitions of expected value and variance for discrete random variables
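
These identities can be confirmed by summing over the PMF directly, as in the short sketch below (again with arbitrary $n = 10$, $p = 0.4$).

```python
# Checking E(X) = np and Var(X) = np(1-p) from the PMF definitions.
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.4
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
var = sum((k - mean) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))
print(mean, var)  # 4.0 and 2.4, matching np = 4 and np(1-p) = 2.4
```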

Moment generating function

  • The moment generating function (MGF) of a binomial random variable $X \sim B(n,p)$ is given by $M_X(t) = E(e^{tX}) = (pe^t + 1 - p)^n$
  • The MGF is a powerful tool for deriving moments and other properties of the binomial distribution
  • The $k$-th moment of $X$ can be obtained by evaluating the $k$-th derivative of the MGF at $t=0$: $E(X^k) = M_X^{(k)}(0)$
  • The MGF can also be used to establish relationships between the binomial distribution and other probability distributions
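
One way to see this in practice is symbolic differentiation. The sketch below, assuming the sympy library is available, differentiates the MGF at $t = 0$ and recovers the mean and variance with $n$ and $p$ kept general.

```python
# Deriving the first two moments of B(n, p) from the MGF with sympy.
import sympy as sp

t, p, n = sp.symbols('t p n', positive=True)
M = (p * sp.exp(t) + 1 - p) ** n  # binomial MGF

EX = sp.simplify(sp.diff(M, t, 1).subs(t, 0))   # E(X)
EX2 = sp.simplify(sp.diff(M, t, 2).subs(t, 0))  # E(X^2)
var = sp.factor(sp.expand(EX2 - EX**2))         # Var(X) = E(X^2) - E(X)^2

print(EX)   # n*p
print(var)  # n*p*(1 - p), possibly displayed as -n*p*(p - 1)
```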

Binomial coefficient

  • The binomial coefficient, denoted by $\binom{n}{k}$ or $C(n,k)$, represents the number of ways to choose $k$ items from a set of $n$ items, where the order of selection does not matter
  • It is calculated using the formula $\binom{n}{k} = \frac{n!}{k!(n-k)!}$, where $n!$ represents the factorial of $n$
  • The binomial coefficient appears in the PMF of the binomial distribution, as it counts the number of ways to arrange $k$ successes among $n$ trials
  • Binomial coefficients have various properties, such as symmetry ($\binom{n}{k} = \binom{n}{n-k}$) and the binomial theorem
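
In Python, math.comb computes binomial coefficients directly; the sketch below checks it against the factorial formula and the symmetry property.

```python
# Binomial coefficients via math.comb, cross-checked two ways.
from math import comb, factorial

n, k = 10, 3
print(comb(n, k))  # 120
# Agrees with the factorial formula n! / (k! (n - k)!)
assert comb(n, k) == factorial(n) // (factorial(k) * factorial(n - k))
# Symmetry: C(n, k) = C(n, n - k)
assert comb(n, k) == comb(n, n - k)
```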

Pascal's triangle

  • A triangular array of numbers in which each number is the sum of the two numbers directly above it
  • The entries in Pascal's triangle are the binomial coefficients $\binom{n}{k}$, where $n$ represents the row number (starting from 0) and $k$ represents the position within the row (starting from 0)
  • Pascal's triangle provides a convenient way to calculate binomial coefficients without using the factorial formula
  • The triangle has various properties and applications, such as the binomial theorem, probability calculations, and combinatorial identities
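
The additive rule translates directly into code. The sketch below builds the first few rows of the triangle and verifies each entry against math.comb.

```python
# Building Pascal's triangle row by row using the "sum of the two
# numbers above" rule, then checking each entry against math.comb.
from math import comb

def pascal_rows(num_rows: int):
    row = [1]
    for _ in range(num_rows):
        yield row
        # Each interior entry is the sum of the two entries above it.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]

for n, row in enumerate(pascal_rows(6)):
    assert row == [comb(n, k) for k in range(n + 1)]
    print(row)
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]
# [1, 5, 10, 10, 5, 1]
```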

Bernoulli vs binomial distribution

  • The Bernoulli distribution models a single trial with two possible outcomes (success or failure), while the binomial distribution models the number of successes in a fixed number of independent Bernoulli trials
  • The Bernoulli distribution is a special case of the binomial distribution with $n=1$
  • The PMF of a Bernoulli random variable is $P(X=x) = p^x(1-p)^{1-x}$ for $x \in \{0,1\}$, while the PMF of a binomial random variable is $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k = 0, 1, \dots, n$
  • The mean and variance of a Bernoulli random variable are $E(X) = p$ and $Var(X) = p(1-p)$, while the mean and variance of a binomial random variable are $E(X) = np$ and $Var(X) = np(1-p)$

Properties of binomial distribution

  • The binomial distribution has several important properties that make it useful for modeling and analyzing various phenomena
  • These properties include the reproductive property, additive property, most probable outcome, and the median of the distribution
  • Understanding these properties helps in applying the binomial distribution to real-world problems and in deriving related probability distributions

Reproductive property

  • If $X_1 \sim B(n_1, p)$ and $X_2 \sim B(n_2, p)$ are independent binomial random variables with the same success probability $p$, then their sum $X_1 + X_2$ follows a binomial distribution with parameters $n_1 + n_2$ and $p$
  • In other words, the sum of two independent binomial random variables with the same success probability is also a binomial random variable
  • This property allows for the combination of multiple binomial experiments into a single binomial experiment, simplifying calculations and analysis
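
This can be verified exactly by convolving the two PMFs, as in the sketch below; the parameters $n_1 = 3$, $n_2 = 5$, $p = 0.4$ are arbitrary.

```python
# An exact check of the reproductive property: convolving the PMFs of
# B(3, 0.4) and B(5, 0.4) reproduces the PMF of B(8, 0.4).
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n1, n2, p = 3, 5, 0.4
# PMF of X1 + X2 by discrete convolution of the two PMFs.
for s in range(n1 + n2 + 1):
    conv = sum(
        binom_pmf(j, n1, p) * binom_pmf(s - j, n2, p)
        for j in range(max(0, s - n2), min(n1, s) + 1)
    )
    assert abs(conv - binom_pmf(s, n1 + n2, p)) < 1e-12
print("X1 + X2 has the B(n1 + n2, p) distribution")
```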

Additive property

  • If $X_1 \sim Poisson(\lambda_1)$ and $X_2 \sim Poisson(\lambda_2)$ are independent Poisson random variables, their sum follows $Poisson(\lambda_1 + \lambda_2)$, and the conditional distribution of $X_1$ given $X_1 + X_2 = k$ is binomial with parameters $k$ and $\frac{\lambda_1}{\lambda_1 + \lambda_2}$
  • This property is useful in situations where the total number of events is known, and we want to determine how they are split between the two sources (see the sketch after this list)
  • Conditioning more than two independent Poisson counts on their total yields the multinomial distribution, which generalizes the binomial distribution to more than two categories
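
The sketch below checks this result numerically for arbitrary rates $\lambda_1 = 2$ and $\lambda_2 = 3$ with total $k = 6$; the conditional PMF of $X_1$ matches the binomial formula term by term.

```python
# Numerical check: conditioning independent Poisson counts on their
# total yields a binomial distribution (lam1, lam2, k are arbitrary).
from math import comb, exp, factorial

def poisson_pmf(j, lam):
    return exp(-lam) * lam**j / factorial(j)

lam1, lam2, k = 2.0, 3.0, 6
q = lam1 / (lam1 + lam2)  # conditional success probability
for j in range(k + 1):
    cond = (poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2)
            / poisson_pmf(k, lam1 + lam2))
    binom = comb(k, j) * q**j * (1 - q)**(k - j)
    assert abs(cond - binom) < 1e-12
print("X1 | X1 + X2 = k follows B(k, lam1 / (lam1 + lam2))")
```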

Most probable outcome

  • The most probable outcome (mode) of a binomial distribution $B(n, p)$ is the value of $k$ that maximizes the probability mass function $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$
  • The mode of a binomial distribution is either $\lfloor (n+1)p \rfloor$ or $\lceil (n+1)p \rceil - 1$, where $\lfloor \cdot \rfloor$ and $\lceil \cdot \rceil$ denote the floor and ceiling functions, respectively
  • When $(n+1)p$ is an integer, the binomial distribution has two modes: $(n+1)p - 1$ and $(n+1)p$
  • The most probable outcome provides insight into the likely number of successes in a binomial experiment
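
A brute-force check: the sketch below compares the closed-form mode $\lfloor (n+1)p \rfloor$ with the argmax of the PMF, using arbitrary $n = 10$, $p = 0.4$ (so $(n+1)p = 4.4$).

```python
# Comparing the closed-form mode with a brute-force argmax of the PMF.
from math import comb, floor

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.4
mode_formula = floor((n + 1) * p)  # 4
mode_argmax = max(range(n + 1), key=lambda k: binom_pmf(k, n, p))
assert mode_formula == mode_argmax
print(mode_formula)  # 4
```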

Median of binomial distribution

  • The median of a binomial distribution $B(n, p)$ is the value $m$ such that $P(X \leq m) \geq 0.5$ and $P(X \geq m) \geq 0.5$
  • In general, the median of a binomial distribution is not equal to its mean $np$; however, any median lies between $\lfloor np \rfloor$ and $\lceil np \rceil$, and when $np$ is an integer the mean and median coincide at $np$
  • For large values of $n$, the median can be approximated using the normal distribution, as the binomial distribution becomes approximately normal when $n$ is large and $p$ is not too close to 0 or 1
  • The median provides a measure of the central tendency of the binomial distribution, which is less sensitive to extreme values than the mean
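
The median can be found by scanning the CDF for the smallest $m$ with $F(m) \geq 0.5$, as in the sketch below (arbitrary $n = 10$, $p = 0.3$, chosen so that $np = 3$ is an integer and the mean and median coincide).

```python
# Finding the median of B(10, 0.3) by scanning the CDF.
from math import comb

def binom_cdf(m, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m + 1))

n, p = 10, 0.3
median = next(m for m in range(n + 1) if binom_cdf(m, n, p) >= 0.5)
print(median)  # 3, which here equals the mean np = 3
```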

Approximations to binomial distribution

  • When the number of trials $n$ is large, calculating probabilities using the binomial PMF can be computationally intensive
  • In such cases, approximations to the binomial distribution can be used to simplify calculations and provide accurate estimates of probabilities
  • The two most common approximations are the normal approximation and the Poisson approximation, each with its own set of conditions and guidelines for application

Normal approximation

  • The normal distribution can be used to approximate the binomial distribution when $n$ is large and $p$ is not too close to 0 or 1
  • A common rule of thumb for the approximation to be appropriate is $np \geq 10$ and $n(1-p) \geq 10$ (some texts use 5 instead of 10)
  • Under these conditions, the binomial random variable $X \sim B(n, p)$ can be approximated by a normal random variable $Y$ with mean $np$ and standard deviation $\sqrt{np(1-p)}$
  • A continuity correction of 0.5 is often applied to improve the accuracy of the approximation, e.g. $P(X \leq x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{np(1-p)}}\right)$, as illustrated in the sketch below
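
The sketch below, assuming scipy is available, compares the exact binomial CDF with the continuity-corrected normal approximation for an example with $n = 100$, $p = 0.4$ (which satisfies the rule of thumb above).

```python
# Normal approximation with continuity correction vs. the exact CDF.
from math import sqrt
from scipy.stats import binom, norm

n, p, x = 100, 0.4, 45
exact = binom.cdf(x, n, p)
# Continuity correction: P(X <= x) ≈ Φ((x + 0.5 - np) / sqrt(np(1-p)))
approx = norm.cdf((x + 0.5 - n * p) / sqrt(n * p * (1 - p)))
print(exact, approx)  # both ≈ 0.87; the approximation is close
```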

Poisson approximation

  • The Poisson distribution can be used to approximate the binomial distribution when $n$ is large and $p$ is small, so that $np$ stays moderate
  • A common guideline for the approximation to be appropriate is $n \geq 100$ and $p \leq 0.1$, with $np \leq 10$
  • Under these conditions, the binomial random variable $X \sim B(n, p)$ can be approximated by a Poisson random variable $Y \sim Poisson(np)$
  • The Poisson approximation is particularly useful when modeling rare events, such as defects in manufacturing or mutations in DNA sequences
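
The sketch below, again assuming scipy, compares the exact binomial PMF with its Poisson approximation for $n = 1000$ and $p = 0.005$ (arbitrary values satisfying the guideline, with $np = 5$).

```python
# Poisson approximation for large n and small p vs. the exact PMF.
from scipy.stats import binom, poisson

n, p = 1000, 0.005
lam = n * p  # Poisson rate
for k in (0, 2, 5, 10):
    print(k, binom.pmf(k, n, p), poisson.pmf(k, lam))
# The two columns agree closely (to roughly three decimal places).
```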

Rule of thumb for approximations

  • As a general rule of thumb, the normal approximation is more appropriate when $p$ is close to 0.5, while the Poisson approximation is more appropriate when $p$ is close to 0 (or, by counting failures instead of successes, when $p$ is close to 1)
  • When both approximations are applicable, the normal approximation is generally preferred due to its greater flexibility and the availability of continuity correction
  • It is important to check the conditions for each approximation before applying them to ensure the accuracy of the results
  • In cases where the conditions for both approximations are not met, it is recommended to use the exact binomial PMF for probability calculations

Applications of binomial distribution

  • The binomial distribution has numerous applications across various fields, including quality control, clinical trials, and modeling of success/failure outcomes
  • Understanding the binomial distribution and its properties is crucial for making informed decisions and drawing valid conclusions in these contexts
  • Some of the most common applications of the binomial distribution are discussed below

Quality control and inspection

  • In manufacturing, the binomial distribution can be used to model the number of defective items in a batch of products
  • By setting a threshold for the acceptable number of defective items, quality control managers can make decisions on whether to accept or reject a batch
  • The binomial distribution can also be used to determine the optimal sample size for inspection, balancing the cost of inspection with the risk of accepting a defective batch
  • Example: A factory produces light bulbs with a 2% defect rate. If a random sample of 100 bulbs is inspected, the binomial distribution can be used to calculate the probability of finding at most 3 defective bulbs
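
The light-bulb example can be computed directly from the binomial CDF; the sketch below uses only the standard library.

```python
# Worked example: n = 100 bulbs, defect rate p = 0.02,
# probability of at most 3 defectives.
from math import comb

n, p = 100, 0.02
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(4))
print(prob)  # ≈ 0.859, so about an 86% chance of 3 or fewer defects
```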

Clinical trials and drug testing

  • In medical research, the binomial distribution is used to model the number of patients who respond positively to a treatment or experience side effects
  • Clinical trials often involve comparing the success rates of two or more treatments, which can be modeled using the difference between two binomial proportions
  • The binomial distribution is also used to determine the sample size required to detect a significant difference between treatments, while controlling for Type I and Type II errors
  • Example: In a clinical trial, 60% of patients respond positively to a new drug, while only 40% respond positively to a placebo. The binomial distribution can be used to calculate the probability of observing a significant difference in response rates between the two groups
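
A simulation sketch of this example is below. The response rates 0.6 and 0.4 come from the example, but the group size of 50 patients per arm and the use of a simple two-proportion $z$-test are assumptions made purely for illustration.

```python
# Hedged simulation: how often would a two-proportion z-test flag the
# drug/placebo difference as significant at the 5% level? The group
# size n = 50 per arm is an assumed value, not from the example.
import random
from math import sqrt

def simulate_trial(n=50, p_drug=0.6, p_placebo=0.4):
    x1 = sum(random.random() < p_drug for _ in range(n))     # drug responders
    x2 = sum(random.random() < p_placebo for _ in range(n))  # placebo responders
    p_pool = (x1 + x2) / (2 * n)                  # pooled response rate
    se = sqrt(p_pool * (1 - p_pool) * (2 / n))    # standard error under H0
    z = (x1 / n - x2 / n) / se
    return abs(z) > 1.96  # significant at the 5% level

random.seed(0)
trials = 10_000
power = sum(simulate_trial() for _ in range(trials)) / trials
print(power)  # roughly 0.5: the estimated power under these assumptions
```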

Modeling of success/failure outcomes