📊 Actuarial Mathematics Unit 11 Review

11.1 Generalized linear models and regression analysis

Written by the Fiveable Content Team • Last updated September 2025
Generalized linear models expand on traditional linear regression, allowing for analysis of non-normal data common in insurance and finance. They combine a response variable, linear predictor, link function, and variance function to model complex relationships between variables.

GLMs are crucial for actuaries in risk assessment and pricing. By accommodating various data distributions, they provide flexible tools for modeling claim frequency, severity, and other key metrics in insurance and financial applications.

Fundamentals of generalized linear models

  • Generalized linear models (GLMs) extend the concept of linear regression to accommodate a wider range of response variable distributions, making them essential tools in actuarial modeling and risk assessment
  • GLMs allow for the analysis of non-normal data, such as count data, binary outcomes, and continuous positive data, which are common in insurance and financial applications

Components of GLMs

  • Response variable: The dependent variable in a GLM, which follows a distribution from the exponential family
  • Linear predictor: A linear combination of the explanatory variables, denoted as $\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p$
  • Link function: A function that relates the expected value of the response variable to the linear predictor, allowing for non-linear relationships between the predictors and the response
  • Variance function: Describes the relationship between the mean and variance of the response variable, which depends on the chosen distribution
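For example, a claim-frequency model specifies all four components at once:

$$Y_i \sim \text{Poisson}(\mu_i), \qquad \eta_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2}, \qquad \log(\mu_i) = \eta_i, \qquad \text{Var}(Y_i) = \mu_i$$

Here the log link keeps the fitted mean positive, and the variance function $V(\mu) = \mu$ is dictated by the Poisson distribution.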

Exponential family of distributions

  • The exponential family includes a wide range of distributions, such as Normal, Poisson, Binomial, Gamma, and Inverse Gaussian
  • Distributions in the exponential family have a common form for their probability density or mass function, given by $f(y; \theta, \phi) = \exp\left(\frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi)\right)$ (the Poisson case is worked out after this list)
    • $\theta$: natural parameter
    • $\phi$: dispersion parameter
    • $a(\cdot)$, $b(\cdot)$, and $c(\cdot)$: functions specific to each distribution
  • The link function $g(\cdot)$ relates the expected value of the response variable $\mu = \mathbb{E}(Y)$ to the linear predictor $\eta$, such that $g(\mu) = \eta$
  • Common link functions include:
    • Identity link: $g(\mu) = \mu$ (used in linear regression)
    • Log link: $g(\mu) = \log(\mu)$ (used in Poisson regression)
    • Logit link: $g(\mu) = \log\left(\frac{\mu}{1-\mu}\right)$ (used in logistic regression)
    • Inverse link: $g(\mu) = \frac{1}{\mu}$ (used in Gamma regression)
  • The choice of link function depends on the distribution of the response variable and the desired interpretation of the model coefficients
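To make the exponential-family notation concrete, the Poisson probability mass function can be rewritten in the common form above:

$$P(Y = y) = \frac{e^{-\lambda}\lambda^y}{y!} = \exp\left(y\log\lambda - \lambda - \log y!\right)$$

so that $\theta = \log\lambda$, $b(\theta) = e^\theta$, $a(\phi) = 1$, and $c(y, \phi) = -\log y!$. The canonical link is the one that sets $g(\mu) = \theta$; since $\mu = \lambda$ here, this recovers the log link used in Poisson regression.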

Maximum likelihood estimation in GLMs

  • Maximum likelihood estimation (MLE) is a method for estimating the parameters of a GLM by maximizing the likelihood function, which measures the probability of observing the data given the model parameters
  • MLE is a fundamental concept in actuarial science, as it allows for the estimation of risk parameters and the assessment of model fit

Log-likelihood function

  • The log-likelihood function is the natural logarithm of the likelihood function, given by $\ell(\boldsymbol{\beta}) = \sum_{i=1}^n \log f(y_i; \theta_i, \phi)$, where $\boldsymbol{\beta}$ is the vector of regression coefficients
  • Maximizing the log-likelihood function is equivalent to maximizing the likelihood function, but it is often more convenient to work with the log-likelihood due to its additive properties

Fisher scoring algorithm

  • The Fisher scoring algorithm is an iterative method for finding the maximum likelihood estimates of the regression coefficients in a GLM
  • The algorithm updates the estimates at each iteration using the Fisher information matrix, which is the expected value of the negative Hessian matrix of the log-likelihood function
  • The Fisher scoring algorithm is often more stable than the Newton-Raphson method because it replaces the observed Hessian with its expectation, which helps when the likelihood surface is not well-behaved

Iteratively reweighted least squares

  • Iteratively reweighted least squares (IRLS) is an alternative formulation of the Fisher scoring algorithm that is commonly used in statistical software packages
  • IRLS recasts the GLM estimation problem as a weighted least squares problem, where the weights are updated at each iteration based on the current estimates of the regression coefficients and the variance function
  • The IRLS algorithm converges to the same maximum likelihood estimates as the Fisher scoring algorithm, but it provides additional insights into the structure of the GLM and the role of the variance function
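A minimal IRLS sketch for Poisson regression with the log link, written in plain numpy; the function name and simulated data are illustrative, not taken from any particular package:

```python
import numpy as np

def irls_poisson(X, y, tol=1e-8, max_iter=100):
    """Fit a Poisson GLM with log link via iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta                 # current linear predictor
        mu = np.exp(eta)               # inverse of the log link
        W = mu                         # IRLS weights; for the canonical log link, W = mu
        z = eta + (y - mu) / mu        # working response
        XtW = X.T * W                  # broadcasts W across columns: X' W
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Illustrative check on simulated data: estimates should approach (0.5, 0.3).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(1000), rng.normal(size=1000)])
y = rng.poisson(np.exp(X @ np.array([0.5, 0.3])))
print(irls_poisson(X, y))
```

Because the log link is canonical for the Poisson family, the weights reduce to $W = \mu$ and each iteration is an ordinary weighted least squares solve.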

Model selection and validation

  • Model selection and validation are crucial steps in the development of GLMs, as they help to identify the most appropriate model structure and assess its performance on new data
  • Actuaries use various model selection criteria and validation techniques to balance model complexity and predictive accuracy, ensuring that the chosen model is suitable for pricing, reserving, and risk management purposes

Deviance and likelihood ratio tests

  • Deviance is a measure of the goodness of fit of a GLM, defined as twice the difference between the log-likelihood of the saturated model (a model with a separate parameter for each observation) and the log-likelihood of the fitted model
  • Likelihood ratio tests compare the deviance of two nested models (where one model is a special case of the other) to assess whether the more complex model provides a significantly better fit to the data
  • Under the null hypothesis that the simpler model is adequate, the likelihood ratio test statistic asymptotically follows a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters between the two models
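A small helper shows the mechanics, assuming the two log-likelihoods have already been extracted from the fitted models (the function name is illustrative):

```python
from scipy.stats import chi2

def likelihood_ratio_test(ll_simple, ll_complex, df):
    """Likelihood ratio test for two nested GLMs fitted by maximum likelihood.

    ll_simple / ll_complex: log-likelihoods of the restricted and full models.
    df: difference in the number of estimated parameters.
    """
    stat = 2.0 * (ll_complex - ll_simple)  # equivalently, the deviance difference
    return stat, chi2.sf(stat, df)         # upper-tail chi-squared p-value
```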

Akaike information criterion (AIC)

  • The Akaike information criterion (AIC) is a model selection criterion that balances the goodness of fit of a model with its complexity, penalizing models with a larger number of parameters
  • AIC is defined as $\text{AIC} = -2\ell(\hat{\boldsymbol{\beta}}) + 2p$, where $\ell(\hat{\boldsymbol{\beta}})$ is the log-likelihood of the fitted model and $p$ is the number of parameters
  • Models with lower AIC values are preferred, as they provide a better trade-off between fit and complexity

Bayesian information criterion (BIC)

  • The Bayesian information criterion (BIC), also known as the Schwarz criterion, is another model selection criterion that penalizes model complexity more heavily than AIC
  • BIC is defined as $\text{BIC} = -2\ell(\hat{\boldsymbol{\beta}}) + p\log(n)$, where $n$ is the sample size
  • Like AIC, models with lower BIC values are preferred, but BIC tends to favor simpler models than AIC, particularly when the sample size is large
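Both criteria are simple to compute once the maximized log-likelihood is available; a minimal sketch:

```python
import numpy as np

def aic(loglik, p):
    """Akaike information criterion: -2*loglik + 2p."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """Bayesian information criterion: -2*loglik + p*log(n)."""
    return -2.0 * loglik + p * np.log(n)
```

Note that the BIC penalty $p\log(n)$ exceeds the AIC penalty $2p$ whenever $n > e^2 \approx 7.4$, which is why BIC favors simpler models in all but the smallest samples.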

Residual analysis and diagnostics

  • Residual analysis involves examining the differences between the observed response values and the fitted values from the GLM to assess model assumptions and identify potential outliers or influential observations
  • Common diagnostic plots for GLMs include:
    • Residuals vs. fitted values plot: Checks for non-linearity, heteroscedasticity, and outliers
    • Normal Q-Q plot of residuals: Assesses whether the standardized (deviance) residuals are approximately normal; in a GLM the raw responses need not be normal, but deviance residuals from an adequate model should be roughly so
    • Scale-location plot: Examines the relationship between the absolute residuals and the fitted values to detect heteroscedasticity
    • Cook's distance plot: Identifies influential observations that may have a disproportionate impact on the model estimates
  • Residual analysis helps actuaries to refine their models, detect violations of assumptions, and improve the reliability of their predictions
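As a concrete example of one such diagnostic, deviance residuals for a Poisson GLM can be computed directly from the observed counts and fitted means; a numpy sketch (using the convention $0 \log 0 = 0$):

```python
import numpy as np

def poisson_deviance_residuals(y, mu):
    """Deviance residuals sign(y - mu) * sqrt(d_i) for a Poisson GLM."""
    # Unit deviance d_i = 2*(y*log(y/mu) - (y - mu)), with y*log(y/mu) = 0 when y = 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        term = np.where(y > 0, y * np.log(y / mu), 0.0)
    d = 2.0 * (term - (y - mu))
    return np.sign(y - mu) * np.sqrt(np.maximum(d, 0.0))  # guard tiny negative d
```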

Poisson regression for count data

  • Poisson regression is a type of GLM used to model count data, where the response variable represents the number of events occurring in a fixed interval of time or space
  • In actuarial applications, Poisson regression is often used to model claim frequency, the number of accidents, or the number of policy renewals

Poisson distribution and assumptions

  • The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval, assuming a constant average rate of occurrence
  • The probability mass function of the Poisson distribution is given by $P(Y = k) = \frac{e^{-\lambda}\lambda^k}{k!}$, where $\lambda$ is the average rate of occurrence
  • Poisson regression assumes that:
    • The response variable follows a Poisson distribution
    • The logarithm of the expected value of the response variable is linearly related to the predictors
    • The events occur independently of each other

Log-linear models and interpretation

  • In Poisson regression, the link function is the natural logarithm, resulting in a log-linear model: $\log(\mathbb{E}(Y)) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p$
  • The coefficients in a Poisson regression model can be interpreted as the change in the log of the expected count for a one-unit increase in the corresponding predictor, holding all other predictors constant
  • To obtain the multiplicative effect of a predictor on the expected count, we can exponentiate the coefficient: $\exp(\beta_j)$ represents the ratio of the expected count for a one-unit increase in $x_j$, holding all other predictors constant
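A sketch of such a fit using statsmodels, with simulated illustrative data (the predictor name and true coefficients are hypothetical):

```python
import numpy as np
import statsmodels.api as sm

# Simulated, illustrative claim-count data with one rating variable.
rng = np.random.default_rng(1)
driver_age = rng.uniform(18, 80, size=500)
X = sm.add_constant(driver_age)                        # prepends an intercept column
counts = rng.poisson(np.exp(0.2 - 0.01 * driver_age))  # true rate decreases with age

result = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
print(result.params)          # coefficients on the log scale
print(np.exp(result.params))  # rate ratios: multiplicative effects on E(Y)
```

The exponentiated slope is the rate ratio: a fitted value of about $\exp(-0.01) \approx 0.99$ would mean each additional year of driver age multiplies the expected claim count by roughly 0.99.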

Overdispersion and quasi-Poisson models

  • Overdispersion occurs when the variance of the response variable is greater than its mean, violating the equidispersion assumption of the Poisson distribution
  • Overdispersion can lead to underestimated standard errors and incorrect inferences about the significance of predictors
  • Quasi-Poisson models address overdispersion by introducing a dispersion parameter $\phi$ that scales the variance of the response variable: $\text{Var}(Y) = \phi\mathbb{E}(Y)$
  • The quasi-Poisson model retains the same mean structure as the Poisson model but adjusts the standard errors and inference procedures to account for overdispersion
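A common moment-based check estimates the dispersion parameter from Pearson residuals; continuing the hypothetical Poisson fit above:

```python
# Pearson estimate of the dispersion parameter for the fitted model:
# phi_hat = sum((y - mu_hat)^2 / mu_hat) / (n - p); values well above 1 flag overdispersion.
mu_hat = result.predict(X)
phi_hat = np.sum((counts - mu_hat) ** 2 / mu_hat) / (len(counts) - X.shape[1])
print(phi_hat)
```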

Logistic regression for binary outcomes

  • Logistic regression is a type of GLM used to model binary or categorical response variables, where the outcome of interest is the probability of an event occurring
  • In actuarial applications, logistic regression is often used to model the probability of a claim being filed, the likelihood of a policyholder renewing their coverage, or the risk of default on a loan
  • In logistic regression, the link function is the logit function, which is the natural logarithm of the odds: $\text{logit}(p) = \log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p$, where $p$ is the probability of the event occurring
  • The coefficients in a logistic regression model can be interpreted as the change in the log odds of the event for a one-unit increase in the corresponding predictor, holding all other predictors constant
  • The exponential of a coefficient, $\exp(\beta_j)$, represents the odds ratio for a one-unit increase in $x_j$, holding all other predictors constant
    • An odds ratio greater than 1 indicates an increased likelihood of the event
    • An odds ratio less than 1 indicates a decreased likelihood of the event
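A minimal logistic regression sketch with statsmodels, using simulated illustrative data (the variable names are hypothetical):

```python
import numpy as np
import statsmodels.api as sm

# Simulated, illustrative data: probability of filing a claim vs. prior claim count.
rng = np.random.default_rng(2)
prior_claims = rng.poisson(0.5, size=1000)
X = sm.add_constant(prior_claims)
p_true = 1.0 / (1.0 + np.exp(-(-2.0 + 0.8 * prior_claims)))
filed = rng.binomial(1, p_true)

logit_result = sm.GLM(filed, X, family=sm.families.Binomial()).fit()
print(np.exp(logit_result.params))  # odds ratios; values > 1 raise the odds of a claim
```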

Interpretation of coefficients

  • To interpret the coefficients in a logistic regression model, it is often helpful to convert the log odds to probabilities using the inverse logit function: $p = \frac{\exp(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p)}{1 + \exp(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p)}$
  • The change in the probability of the event for a one-unit increase in a predictor depends on the values of the other predictors, due to the non-linear nature of the logit function
  • To assess the impact of a predictor on the probability scale, it is common to compute average marginal effects or predict probabilities at representative values of the predictors
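Continuing the hypothetical logistic fit above, the fitted log odds can be converted to a probability at a representative predictor value:

```python
# Predicted probability for a policyholder with 2 prior claims, via the inverse logit.
eta = logit_result.params[0] + logit_result.params[1] * 2
print(np.exp(eta) / (1.0 + np.exp(eta)))
```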

Receiver operating characteristic (ROC) curves

  • Receiver operating characteristic (ROC) curves are a graphical tool for evaluating the performance of a binary classifier, such as a logistic regression model
  • An ROC curve plots the true positive rate (sensitivity) against the false positive rate (1 - specificity) for different classification thresholds
  • The area under the ROC curve (AUC) is a summary measure of the model's discriminatory power, with values closer to 1 indicating better performance
  • Actuaries use ROC curves and AUC to compare different logistic regression models, assess the trade-off between sensitivity and specificity, and select optimal classification thresholds based on business objectives
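Scikit-learn provides both quantities directly; a sketch continuing the logistic fit above:

```python
from sklearn.metrics import roc_auc_score, roc_curve

p_pred = logit_result.predict(X)                 # fitted claim probabilities
fpr, tpr, thresholds = roc_curve(filed, p_pred)  # one (FPR, TPR) point per threshold
print(roc_auc_score(filed, p_pred))              # AUC; closer to 1 is better
```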

Gamma regression for continuous positive data

  • Gamma regression is a type of GLM used to model continuous, positive response variables that exhibit a right-skewed distribution
  • In actuarial applications, Gamma regression is often used to model claim severity, loss amounts, or insurance premiums

Gamma distribution and assumptions

  • The Gamma distribution is a continuous probability distribution that describes the waiting time until a specified number of events occur in a Poisson process
  • The probability density function of the Gamma distribution is given by $f(y; \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}y^{\alpha-1}e^{-\beta y}$, where $\alpha$ is the shape parameter and $\beta$ is the rate parameter
  • Gamma regression assumes that:
    • The response variable follows a Gamma distribution
    • The reciprocal of the expected value of the response variable is linearly related to the predictors
    • The variance of the response variable is proportional to the square of its mean
  • In Gamma regression, the canonical link is the inverse link, resulting in the model $\frac{1}{\mathbb{E}(Y)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p$
  • Under the inverse link, each coefficient represents the change in the reciprocal of the expected value for a one-unit increase in the corresponding predictor, holding all other predictors constant, so a positive coefficient decreases the expected value
  • Because effects under the inverse link are not multiplicative, Gamma regression is often fitted with a log link in practice; in that case $\exp(\beta_j)$ gives the ratio of expected values for a one-unit increase in $x_j$, holding all other predictors constant (a sketch follows this list)
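A sketch of a Gamma severity fit with statsmodels, on simulated illustrative data; statsmodels' Gamma family defaults to the canonical inverse link, matching the model above:

```python
import numpy as np
import statsmodels.api as sm

# Simulated, illustrative severity data with an inverse-link mean structure.
rng = np.random.default_rng(3)
vehicle_value = rng.uniform(5, 50, size=800)      # in thousands
X = sm.add_constant(vehicle_value)
mu = 1.0 / (0.05 + 0.01 * vehicle_value)          # 1/E(Y) linear in the predictor
losses = rng.gamma(shape=2.0, scale=mu / 2.0)     # mean mu, variance mu^2 / 2

gamma_result = sm.GLM(losses, X, family=sm.families.Gamma()).fit()
print(gamma_result.params)  # estimates of (0.05, 0.01) on the inverse scale
```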

Applications in insurance modeling

  • Gamma regression is widely used in insurance modeling, particularly for pricing and reserving purposes
  • Examples of applications include:
    • Modeling the severity of auto insurance claims, with predictors such as driver age, vehicle type, and accident history
    • Estimating the average cost per claim for a health insurance portfolio, based on policyholder characteristics and medical conditions
    • Predicting the loss amount for property insurance policies, considering factors such as property value, location, and construction type
  • By accurately modeling claim severity and loss amounts, actuaries can develop more precise pricing models, set adequate reserves, and manage risk exposure for insurance companies

Tweedie regression for compound Poisson-Gamma data

  • Tweedie regression is a type of GLM that combines the properties of Poisson and Gamma distributions to model continuous, non-negative data with a mass at zero
  • In actuarial applications, Tweedie regression is often used to model aggregate losses, where some observations have zero losses and others have positive loss amounts

Tweedie distribution and properties

  • The Tweedie distribution is a family of probability distributions that includes the Poisson, Gamma, and Gaussian distributions as special cases
  • The Tweedie distribution is characterized by a power variance function, where the variance is proportional to the mean raised to a power $p$: $\text{Var}(Y) = \phi\mathbb{E}(Y)^p$
  • The value of $p$ determines the specific distribution within the Tweedie family:
    • $p = 0$: Normal distribution
    • $p = 1$: Poisson distribution
    • $1 < p < 2$: Compound Poisson-Gamma distribution
    • $p = 2$: Gamma distribution
    • $p = 3$: Inverse Gaussian distribution

Power variance function and the power parameter

  • In Tweedie regression, the link function is the log link, and the variance function is the power variance function: $\log(\mathbb{E}(Y)) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p$ and $\text{Var}(Y) = \phi\mathbb{E}(Y)^p$ (here the exponent $p$ is the Tweedie power parameter, not the number of predictors)
  • The power parameter $p$ is estimated along with the regression coefficients and the dispersion parameter $\phi$ using maximum likelihood estimation
  • The choice of $p$ affects the interpretation of the model coefficients and the handling of zero observations in the data
  • Statistical software packages often provide tools for estimating the optimal value of $p$ based on the data and for assessing the goodness of fit of Tweedie regression models
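Scikit-learn's TweedieRegressor is one such implementation; a sketch on simulated compound Poisson-Gamma data (power=1.5 is an illustrative starting value, and in practice $p$ would be tuned or profiled):

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

# Simulated, illustrative aggregate losses: many zeros plus skewed positive amounts.
rng = np.random.default_rng(4)
X = rng.uniform(0, 1, size=(2000, 2))
n_claims = rng.poisson(np.exp(-1.5 + X[:, 0]))  # most policies have 0 claims
total_loss = np.array([rng.gamma(2.0, 1.0, size=k).sum() for k in n_claims])

# 1 < power < 2 selects the compound Poisson-Gamma member of the family;
# link='log' gives the multiplicative structure typical of ratemaking models.
model = TweedieRegressor(power=1.5, alpha=0.0, link='log', max_iter=1000)
model.fit(X, total_loss)
print(model.intercept_, model.coef_)
```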

Applications in actuarial science

  • Tweedie regression is particularly useful in actuarial science for modeling aggregate losses, which exhibit a mixture of zero and positive values
  • Examples of applications include:
    • Modeling the total claim amount for a portfolio of insurance policies, where some policyholders have no claims and others have varying claim amounts
    • Estimating the pure premium (expected loss per unit of exposure) for a block of business, capturing claim frequency and severity in a single model