Bayesian hypothesis testing offers a powerful framework for comparing competing theories using probability. It combines prior beliefs with observed data to update our understanding of hypotheses.
This approach allows for direct statements about the probability of hypotheses, unlike traditional frequentist methods. Bayesian testing incorporates uncertainty and prior knowledge, providing a nuanced way to evaluate scientific theories and make decisions.
Bayesian vs frequentist approaches
- Bayesian and frequentist approaches are two main philosophical frameworks for statistical inference and hypothesis testing in probability and statistics
- The choice between Bayesian and frequentist methods can have significant implications for study design, data analysis, and interpretation of results
Philosophical differences
- Bayesian approach treats probability as a measure of subjective belief or uncertainty, while frequentist approach defines probability in terms of long-run frequencies of events
- Bayesians use prior information and update beliefs based on observed data, whereas frequentists rely solely on the likelihood of the data
- Bayesian methods aim to estimate the posterior probability distribution of parameters, while frequentist methods focus on point estimates and confidence intervals
Practical implications
- Bayesian approach allows for incorporation of prior knowledge and expert opinion, which can be advantageous in fields with strong theoretical foundations or historical data
- Frequentist methods are often seen as more objective and less dependent on subjective priors, making them popular in fields that emphasize strict empirical evidence
- Bayesian methods provide direct probability statements about parameters and hypotheses, while frequentist results are interpreted in terms of long-run performance and error rates
- Bayesian analysis can be computationally intensive, especially with complex models and large datasets, while frequentist methods are often more tractable and efficient
Bayes' theorem
- Bayes' theorem is a fundamental rule in probability theory that describes how to update probabilities based on new evidence or information
- It forms the foundation of Bayesian inference and provides a systematic way to combine prior knowledge with observed data
Conditional probabilities
- Bayes' theorem deals with conditional probabilities, which are probabilities of events given that some other event has occurred
- Denoted as $P(A|B)$, the probability of event A given event B
- Conditional probabilities are related to joint probabilities and marginal probabilities through the multiplication rule: $P(A|B) = \frac{P(A \cap B)}{P(B)}$
Derivation of Bayes' theorem
- Bayes' theorem can be derived from the definition of conditional probability and the multiplication rule
- Starting with $P(A|B) = \frac{P(A \cap B)}{P(B)}$ and $P(B|A) = \frac{P(A \cap B)}{P(A)}$, we can rearrange to get $P(A \cap B) = P(A|B)P(B) = P(B|A)P(A)$
- Solving for $P(A|B)$ yields Bayes' theorem: $P(A|B) = \frac{P(B|A)P(A)}{P(B)}$
Components of Bayes' theorem
- $P(A|B)$ is the posterior probability, the probability of event A after observing event B
- $P(B|A)$ is the likelihood, the probability of observing event B given that event A is true
- $P(A)$ is the prior probability, the initial probability of event A before observing event B
- $P(B)$ is the marginal likelihood or evidence, the total probability of observing event B
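A standard numeric illustration (with hypothetical test characteristics, not from the text) makes the roles of these four components concrete:

```python
# Hypothetical numbers: a disease with 1% prevalence, a test with 95%
# sensitivity and 90% specificity. Event A = has disease, B = positive test.
p_A = 0.01              # prior P(A)
p_B_given_A = 0.95      # likelihood P(B|A)
p_B_given_notA = 0.10   # false-positive rate, 1 - specificity

# Evidence P(B) by the law of total probability
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Posterior P(A|B) via Bayes' theorem
p_A_given_B = p_B_given_A * p_A / p_B
print(round(p_A_given_B, 4))  # ≈ 0.0876
```

Even with a positive test, the posterior probability of disease stays below 9% because the prior is so small, a classic base-rate effect.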
Prior probabilities
- Prior probabilities represent the initial beliefs or knowledge about the parameters or hypotheses of interest before observing the data
- The choice of prior distribution can have a significant impact on the posterior inferences, especially when the sample size is small
Choosing prior distributions
- Prior distributions should reflect the available information and uncertainty about the parameters
- Informative priors incorporate specific knowledge or expertise, while non-informative priors aim to minimize the influence on the posterior
- Conjugate priors are chosen to have the same functional form as the likelihood, leading to analytically tractable posterior distributions
Informative vs non-informative priors
- Informative priors are based on previous studies, expert opinion, or theoretical considerations and assign higher probabilities to certain parameter values
- Non-informative priors, such as uniform or Jeffreys priors, aim to be objective and let the data dominate the posterior inference
- The choice between informative and non-informative priors depends on the context and the desired balance between prior knowledge and data-driven results
Conjugate priors
- Conjugate priors are families of distributions that, when combined with the likelihood function, result in a posterior distribution from the same family
- Examples include beta priors for binomial likelihood, gamma priors for Poisson likelihood, and normal priors for normal likelihood with known variance
- Conjugate priors simplify the computation of the posterior distribution and allow for analytical updates as new data becomes available
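A sketch of the beta-binomial case: with a $\text{Beta}(a, b)$ prior on a coin's heads probability and a binomial likelihood, the posterior is $\text{Beta}(a + \text{heads},\, b + \text{tails})$ in closed form (the prior pseudo-counts and data below are invented for illustration):

```python
# Conjugate beta-binomial update; all numbers are illustrative.
a, b = 2.0, 2.0          # Beta prior pseudo-counts
heads, tails = 7, 3      # observed coin flips

a_post, b_post = a + heads, b + tails    # closed-form conjugate update
post_mean = a_post / (a_post + b_post)   # posterior mean of heads probability
print(post_mean)  # 9/14 ≈ 0.643
```

The update is just addition of counts, which is why conjugate models can absorb new data incrementally without any numerical integration.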
Likelihood functions
- The likelihood function quantifies the probability of observing the data given the parameters or hypotheses under consideration
- It plays a central role in both Bayesian and frequentist inference, as it connects the data to the parameters of interest
Probability of data given hypothesis
- In the context of hypothesis testing, the likelihood function represents the probability of observing the data under each competing hypothesis
- For a hypothesis $H$ and data $D$, the likelihood is denoted as $P(D|H)$
- The likelihood is not a probability distribution over the hypotheses, but rather a measure of how well each hypothesis explains the observed data
Constructing likelihood functions
- The likelihood function is constructed based on the assumed probability model for the data
- For independent and identically distributed (i.i.d.) observations, the likelihood is the product of the individual probability densities or mass functions
- The functional form of the likelihood depends on the type of data and the chosen probability distribution (e.g., normal, binomial, Poisson)
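Because the log of a product is a sum, i.i.d. likelihoods are usually evaluated on the log scale. A minimal sketch for normal data with known variance (the data values are made up):

```python
import math

# Log-likelihood of i.i.d. normal observations: a sum of log-densities,
# i.e. the product of densities taken on the log scale.
def normal_loglik(data, mu, sigma=1.0):
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2)
        - (x - mu) ** 2 / (2 * sigma**2)
        for x in data
    )

data = [0.8, 1.1, 1.4, 0.9]  # illustrative observations; sample mean is 1.05
# The likelihood should peak near the sample mean
print(normal_loglik(data, 1.05) > normal_loglik(data, 0.0))  # True
```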
Maximum likelihood estimation
- Maximum likelihood estimation (MLE) is a frequentist method for estimating parameters by finding the values that maximize the likelihood function
- The MLE estimates are the parameter values that make the observed data most probable under the assumed probability model
- MLE is often used as a point estimate in frequentist inference and can serve as a starting point for Bayesian inference with non-informative priors
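As a sketch of the idea (with invented data): for i.i.d. normal data with known variance, the MLE of the mean has the closed form $\hat{\mu} = \bar{x}$, which a simple grid scan over the likelihood recovers numerically:

```python
# MLE by maximizing the (known-variance) normal log-likelihood over a grid.
# The data are illustrative; constants in the log-likelihood are dropped.
data = [2.1, 1.9, 2.4, 2.0, 2.6]

def loglik(mu):
    return sum(-(x - mu) ** 2 / 2 for x in data)  # sigma = 1 assumed

grid = [i / 1000 for i in range(1000, 3001)]      # candidate means in [1.0, 3.0]
mu_mle = max(grid, key=loglik)

print(mu_mle, sum(data) / len(data))  # both 2.2: grid MLE matches the sample mean
```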
Posterior probabilities
- Posterior probabilities represent the updated beliefs about the parameters or hypotheses after observing the data
- They combine the prior probabilities and the likelihood function through Bayes' theorem to incorporate both prior knowledge and empirical evidence
Updating beliefs with evidence
- Bayes' theorem provides a systematic way to update prior beliefs in light of new evidence or data
- The posterior probability is proportional to the product of the prior probability and the likelihood: $P(H|D) \propto P(D|H)P(H)$
- As more data becomes available, the posterior distribution evolves to reflect the cumulative evidence and, under standard regularity conditions, concentrates around the true parameter values
Calculating posterior distributions
- To obtain the posterior distribution, the product of the prior and likelihood is normalized by dividing by the marginal likelihood: $P(H|D) = \frac{P(D|H)P(H)}{P(D)}$
- The marginal likelihood $P(D)$ is the probability of observing the data under all possible hypotheses, obtained by integrating or summing over the parameter space
- In practice, the posterior distribution is often computed using numerical methods such as Markov chain Monte Carlo (MCMC) sampling
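For a one-parameter model, the normalization step can also be approximated on a simple grid before reaching for MCMC. A sketch for a coin's heads probability $\theta$ with a uniform prior (the 6-heads-in-9-flips data are invented):

```python
# Grid approximation of a posterior: prior times likelihood at each grid
# point, normalized by their sum (a discrete stand-in for P(D)).
heads, n = 6, 9
grid = [(i + 0.5) / 200 for i in range(200)]             # theta values in (0, 1)

prior = [1.0 for _ in grid]                              # uniform prior
lik = [t**heads * (1 - t) ** (n - heads) for t in grid]  # binomial kernel
unnorm = [p * l for p, l in zip(prior, lik)]
evidence = sum(unnorm)                                   # approximates P(D)
posterior = [u / evidence for u in unnorm]               # sums to 1

post_mean = sum(t * p for t, p in zip(grid, posterior))
print(round(post_mean, 3))  # close to the exact Beta(7, 4) mean, 7/11 ≈ 0.636
```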
Summarizing posterior distributions
- Posterior distributions can be summarized using point estimates, such as the posterior mean, median, or mode, to provide a single representative value
- Credible intervals, such as the highest posterior density (HPD) interval or equal-tailed interval, quantify the uncertainty in the parameter estimates
- Posterior probabilities of hypotheses can be compared to assess the relative support for competing models or theories
Bayesian hypothesis testing
- Bayesian hypothesis testing involves comparing the posterior probabilities of competing hypotheses given the observed data
- It provides a direct measure of the relative support for each hypothesis and allows for the incorporation of prior beliefs
Setting up hypotheses
- In Bayesian hypothesis testing, the competing hypotheses are treated as random variables with prior probabilities
- The null hypothesis $H_0$ and alternative hypothesis $H_1$ are assigned prior probabilities $P(H_0)$ and $P(H_1)$, reflecting the initial beliefs about their plausibility
- The prior probabilities should sum to one and can be based on scientific knowledge, previous studies, or subjective judgment
Bayes factors
- Bayes factors quantify the relative evidence in favor of one hypothesis over another, based on the observed data
- The Bayes factor $BF_{10}$ is the ratio of the marginal likelihoods under the alternative and null hypotheses: $BF_{10} = \frac{P(D|H_1)}{P(D|H_0)}$
- A Bayes factor greater than 1 indicates support for the alternative hypothesis, while a Bayes factor less than 1 favors the null hypothesis
- Bayes factors can be interpreted using established guidelines, such as Jeffreys' scale, to assess the strength of evidence
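As a sketch of the computation (with invented coin-flip data): for a point null $\theta = 0.5$ against a uniform-prior alternative, both marginal likelihoods have closed forms, since $\int_0^1 \binom{n}{k}\theta^k(1-\theta)^{n-k}\,d\theta = \binom{n}{k}B(k+1, n-k+1) = \frac{1}{n+1}$:

```python
import math

# Bayes factor BF_10 for k heads in n flips: H0 fixes theta = 0.5,
# H1 puts a Uniform(0, 1) prior on theta. Data are illustrative.
k, n = 8, 10
binom = math.comb(n, k)

p_D_H0 = binom * 0.5**n                    # marginal likelihood under H0
beta = math.gamma(k + 1) * math.gamma(n - k + 1) / math.gamma(n + 2)
p_D_H1 = binom * beta                      # integrates to 1 / (n + 1)

bf10 = p_D_H1 / p_D_H0
print(round(bf10, 2))  # ≈ 2.07, mild evidence for H1 on Jeffreys' scale
```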
Posterior odds ratios
- Posterior odds ratios combine the prior odds and the Bayes factor to update the relative plausibility of the hypotheses after observing the data
- The posterior odds ratio is the product of the prior odds ratio and the Bayes factor: $\frac{P(H_1|D)}{P(H_0|D)} = \frac{P(H_1)}{P(H_0)} \times BF_{10}$
- Posterior odds ratios provide a direct measure of the relative support for the hypotheses and can be used to make decisions based on established thresholds or utility functions
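The update itself is a single multiplication; a tiny sketch with assumed numbers (both the prior odds and the Bayes factor below are invented):

```python
# Posterior odds = prior odds x Bayes factor; numbers are illustrative.
prior_odds = 0.25   # P(H1)/P(H0): H1 initially judged 4x less plausible
bf10 = 12.0         # assumed Bayes factor favoring H1

posterior_odds = prior_odds * bf10
posterior_p_h1 = posterior_odds / (1 + posterior_odds)  # odds -> probability
print(posterior_odds, posterior_p_h1)  # 3.0 and 0.75
```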
Bayesian decision making
- Bayesian decision theory provides a framework for making optimal decisions under uncertainty, taking into account prior beliefs, observed data, and the consequences of different actions
- It combines Bayesian inference with utility theory to balance the costs and benefits of decisions in the presence of incomplete information
Expected utility theory
- Expected utility theory is a normative model for rational decision making under uncertainty
- It assigns utilities to the possible outcomes of each action, representing the relative desirability or value of each outcome
- The expected utility of an action is the sum of the utilities of its outcomes, weighted by their respective probabilities
- The optimal decision is the action that maximizes the expected utility
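The calculation above can be sketched with an invented two-action, two-state utility table (all probabilities and utilities are made up for illustration):

```python
# Expected-utility maximization over a small table of invented numbers.
p_state = {"rain": 0.3, "dry": 0.7}          # probabilities of states of nature
utility = {
    ("take umbrella", "rain"): 5,  ("take umbrella", "dry"): -1,
    ("leave it",      "rain"): -10, ("leave it",     "dry"): 2,
}
actions = ["take umbrella", "leave it"]

def expected_utility(action):
    # Sum of outcome utilities weighted by state probabilities
    return sum(p_state[s] * utility[(action, s)] for s in p_state)

best = max(actions, key=expected_utility)
print(best, expected_utility(best))  # 'take umbrella', EU = 0.8 vs -1.6
```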
Minimizing expected loss
- In many practical applications, the focus is on minimizing the expected loss or cost, rather than maximizing utility
- The loss function quantifies the penalty or cost associated with each possible outcome, given the true state of nature
- The expected loss of an action is the sum of the losses of its outcomes, weighted by their respective probabilities
- The optimal decision is the action that minimizes the expected loss
Optimal decisions under uncertainty
- Bayesian decision making incorporates the posterior probabilities of different states of nature, obtained through Bayesian inference, to guide the choice of actions
- The posterior expected utility or loss of an action is calculated by weighting the utilities or losses of its outcomes by their posterior probabilities
- The optimal decision is the action that maximizes the posterior expected utility or minimizes the posterior expected loss
- Sensitivity analysis can be conducted to assess the robustness of the optimal decision to changes in the prior probabilities, utilities, or losses
Bayesian credible intervals
- Bayesian credible intervals are ranges of parameter values that contain a specified probability of the true parameter value, based on the posterior distribution
- They provide an intuitive measure of the uncertainty in the parameter estimates and are analogous to confidence intervals in frequentist inference
Highest posterior density intervals
- The highest posterior density (HPD) interval is the narrowest interval that contains a specified probability of the posterior distribution
- It is constructed by selecting the range of parameter values with the highest posterior density, such that the interval has the desired probability content
- HPD intervals can be asymmetric for skewed posteriors, and for multimodal posteriors the highest-density region may consist of several disjoint intervals
Equal-tailed intervals
- Equal-tailed intervals are constructed so that equal probability mass is excluded in each tail of the posterior distribution
- For a 95% equal-tailed interval, the lower and upper bounds are the 2.5th and 97.5th percentiles of the posterior distribution
- Equal-tailed intervals are symmetric in tail probability (not necessarily in width) and are often easier to compute than HPD intervals, but may include parameter values with lower posterior density than some values outside the interval
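Both interval types can be read off directly from posterior draws. A sketch using simulated normal draws as a stand-in for a real posterior (the distribution and draw count are illustrative):

```python
import random

# Equal-tailed and approximate HPD intervals from sorted posterior draws.
random.seed(0)
draws = sorted(random.gauss(10.0, 2.0) for _ in range(20000))
n = len(draws)

# 95% equal-tailed interval: 2.5th and 97.5th percentile ranks
lower, upper = round(0.025 * n), round(0.975 * n)
eq_tail = (draws[lower], draws[upper])

# 95% HPD interval: the narrowest window covering 95% of the draws
m = upper - lower
lo = min(range(n - m), key=lambda i: draws[i + m] - draws[i])
hpd = (draws[lo], draws[lo + m])

print(eq_tail, hpd)  # both near (10 - 1.96*2, 10 + 1.96*2) for this posterior
```

For this symmetric posterior the two intervals nearly coincide; for a skewed posterior the HPD interval would be visibly narrower.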
Interpreting credible intervals
- Bayesian credible intervals have a direct probability interpretation: a 95% credible interval contains the true parameter value with a probability of 0.95, given the observed data and the prior distribution
- This interpretation differs from frequentist confidence intervals, which are based on the sampling distribution of the estimator and have a long-run coverage probability
- Credible intervals can be used to assess the precision of the parameter estimates, test hypotheses, and make decisions based on the posterior distribution
Sensitivity analysis
- Sensitivity analysis investigates the robustness of Bayesian inferences and decisions to changes in the prior distributions, likelihood functions, or model assumptions
- It helps to assess the impact of subjective choices and potential sources of uncertainty on the posterior results
Robustness to prior choice
- Sensitivity to the choice of prior distribution can be evaluated by comparing the posterior inferences obtained under different priors, such as informative vs non-informative or conjugate vs non-conjugate priors
- If the posterior results are similar across a range of reasonable prior distributions, the inferences are considered robust to the prior choice
- If the posterior results are heavily influenced by the prior, caution should be exercised in interpreting the results, and the sensitivity should be clearly communicated
Influence of individual data points
- The influence of individual data points on the posterior inferences can be assessed using case deletion or cross-validation techniques
- By removing or perturbing individual observations and re-estimating the posterior distribution, the sensitivity to outliers or influential points can be evaluated
- If the posterior results are strongly affected by a small number of observations, further investigation and potential model adjustments may be necessary
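A sketch of case deletion for a conjugate beta-binomial model (the data and the uniform prior are invented): refit the posterior with each observation left out and record how far the posterior mean moves.

```python
# Leave-one-out sensitivity of a beta-binomial posterior mean.
data = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]   # coin flips: 8 heads, 2 tails
a, b = 1.0, 1.0                          # Beta(1, 1) = uniform prior

def post_mean(obs):
    heads = sum(obs)
    return (a + heads) / (a + b + len(obs))  # conjugate posterior mean

full = post_mean(data)
shifts = [abs(post_mean(data[:i] + data[i + 1:]) - full)
          for i in range(len(data))]
print(round(full, 3), round(max(shifts), 3))  # 0.75; tails shift the mean most
```

Here deleting one of the two tails moves the posterior mean about three times as far as deleting a head, identifying the minority observations as the most influential.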
Checking model assumptions
- Sensitivity analysis can also involve checking the assumptions of the likelihood function or the probability model for the data
- Residual analysis, goodness-of-fit tests, or posterior predictive checks can be used to assess the adequacy of the assumed model
- If the model assumptions are violated or the fit is poor, alternative models or more flexible likelihood functions may need to be considered
- Sensitivity to the choice of likelihood function can be evaluated by comparing the posterior results obtained under different probability models
Bayesian model comparison
- Bayesian model comparison involves selecting among competing models or hypotheses based on their relative evidence in light of the observed data
- It provides a principled way to balance model fit and complexity, and to quantify the uncertainty in model selection
Marginal likelihoods
- The marginal likelihood, also known as the evidence, is the probability of the observed data under a given model, integrated over the prior distribution of the parameters
- It quantifies the overall fit of the model to the data, while automatically penalizing for model complexity and integrating out the uncertainty in the parameters
- Marginal likelihoods can be difficult to compute, especially for complex models, and may require numerical methods such as importance sampling or bridge sampling
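For the beta-binomial model the integral is tractable, which makes it a convenient check on numerical approximations. A sketch comparing the closed form $\binom{n}{k}\frac{B(a+k,\, b+n-k)}{B(a,b)}$ against a simple Riemann sum (the data and prior are invented):

```python
import math

# Marginal likelihood of k heads in n flips under a Beta(a, b) prior,
# computed analytically and by midpoint-rule integration over theta.
k, n, a, b = 3, 12, 2.0, 2.0

def beta_fn(x, y):
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

binom = math.comb(n, k)
analytic = binom * beta_fn(a + k, b + n - k) / beta_fn(a, b)

steps = 10000
numeric = sum(
    binom * t**k * (1 - t) ** (n - k)          # binomial likelihood
    * t ** (a - 1) * (1 - t) ** (b - 1) / beta_fn(a, b)  # Beta prior density
    for t in ((i + 0.5) / steps for i in range(steps))
) / steps

print(analytic, numeric)  # the two estimates agree closely
```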
Bayes factors for model selection
- Bayes factors compare the marginal likelihoods of two competing models, providing a measure of the relative evidence in favor of one model over another
- The Bayes factor $BF_{12}$ is the ratio of the marginal likelihoods of model 1 and model 2: $BF_{12} = \frac{P(D|M_1)}{P(D|M_2)}$
- A Bayes factor greater than 1 indicates support for model 1, while a Bayes factor less than 1 favors model 2
- Bayes factors can be interpreted using established guidelines, such as Jeffreys' scale, to assess the strength of evidence for each model
Bayesian model averaging
- Bayesian model averaging (BMA) accounts for the uncertainty in model selection by combining the predictions or inferences from multiple models, weighted by their posterior probabilities
- The posterior probability of each model is proportional to the product of its prior probability and marginal likelihood: $P(M_k|D) \propto P(D|M_k)P(M_k)$
- BMA provides a coherent way to incorporate model uncertainty into the final inferences and can improve predictive performance and robustness
- The implementation of BMA can be challenging, especially with a large number of models, and may require efficient sampling or approximation techniques
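A sketch of the weighting step, with invented priors, marginal likelihoods, and per-model predictions:

```python
# BMA weights: posterior model probability is proportional to prior times
# marginal likelihood; predictions are averaged with those weights.
# All numbers are illustrative.
models = {
    "M1": {"prior": 0.5, "marginal_lik": 0.012, "prediction": 3.1},
    "M2": {"prior": 0.3, "marginal_lik": 0.020, "prediction": 2.6},
    "M3": {"prior": 0.2, "marginal_lik": 0.004, "prediction": 4.0},
}

unnorm = {k: m["prior"] * m["marginal_lik"] for k, m in models.items()}
total = sum(unnorm.values())
weights = {k: u / total for k, u in unnorm.items()}   # sum to 1

bma_prediction = sum(weights[k] * models[k]["prediction"] for k in models)
print(weights, round(bma_prediction, 3))
```

Note that M2's smaller prior is offset by its larger marginal likelihood, so M1 and M2 end up with equal weight.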
Computational methods
- Bayesian inference often involves complex and high-dimensional posterior distributions that cannot be analytically derived or easily summarized
- Computational methods, such as Markov chain Monte Carlo (MCMC) and variational inference, are used to approximate the posterior distribution and obtain samples or estimates of the parameters
Markov chain Monte Carlo (MCMC)
- MCMC methods generate samples from the posterior distribution by constructing a Markov chain that has the desired posterior as its stationary distribution
- The samples are obtained by iteratively simulating from the Markov chain, with each new sample depending only on the previous one
- Common MCMC algorithms include the Metropolis-Hastings algorithm and the Gibbs sampler
- MCMC samples can be used to estimate posterior summaries, such as means, medians, and credible intervals, and to assess convergence and mixing of the Markov chain
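A minimal random-walk Metropolis sampler, targeting a standard normal so the right answer is known; the step size, chain length, and burn-in below are illustrative choices, not tuned recommendations:

```python
import math, random

# Random-walk Metropolis targeting N(0, 1) via its log-density.
random.seed(1)

def log_post(x):
    return -0.5 * x * x  # log-density of N(0, 1), up to an additive constant

x, samples = 0.0, []
for _ in range(50000):
    prop = x + random.gauss(0.0, 1.0)          # symmetric random-walk proposal
    log_accept = log_post(prop) - log_post(x)  # Metropolis log acceptance ratio
    if log_accept >= 0 or random.random() < math.exp(log_accept):
        x = prop                               # accept the proposal
    samples.append(x)                          # current state kept either way

kept = samples[5000:]                          # discard burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(round(mean, 2), round(var, 2))  # should land near 0 and 1
```

Because the proposal is symmetric, the Hastings correction term cancels and only the posterior density ratio appears in the acceptance step.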
Gibbs sampling
- Gibbs sampling is a special case of the Metropolis-Hastings algorithm that is particularly useful when the posterior distribution is difficult to sample directly, but the conditional distributions of each parameter given the others are easy to simulate from
- It iteratively samples from the conditional distributions of each parameter, updating one parameter at a time while keeping the others fixed
- Gibbs sampling can be efficient and easy to implement for models with conjugate priors and tractable conditional distributions
- It is widely used in hierarchical models and latent variable models, such as Bayesian