Measures of model fit help us gauge how well our linear regression model explains the data. R-squared tells us what percentage of variation in the dependent variable our model accounts for, ranging from 0 to 1.
While R-squared is useful, it has limitations. Enter adjusted R-squared, which penalizes adding unnecessary variables. This helps us avoid overfitting and compare models with different numbers of predictors more accurately.
Coefficient of determination (R-squared)
Definition and interpretation
- R-squared is a statistical measure representing the proportion of variance in the dependent variable predictable from the independent variable(s) in a linear regression model
- Ranges from 0 to 1, with higher values indicating a better fit of the model to the data
- An R-squared of 1 means the model explains all the variability of the response data around its mean
- Interpreted as the percentage of variation in the dependent variable explained by the independent variable(s) in the model
- Also known as the coefficient of determination, commonly used to assess the goodness of fit of a linear regression model
- Formula for R-squared: $R^2 = 1 - \frac{SSR}{SST}$, where SSR is the sum of squared residuals and SST is the total sum of squares
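As a quick illustration of the formula, a minimal sketch in pure Python; the actual and predicted values are hypothetical numbers chosen only to show the arithmetic:

```python
# Toy actual and predicted values (hypothetical numbers for illustration)
y = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.8, 5.3, 6.9, 9.4, 10.6]

mean_y = sum(y) / len(y)
ssr = sum((a - p) ** 2 for a, p in zip(y, y_pred))   # sum of squared residuals
sst = sum((a - mean_y) ** 2 for a in y)              # total sum of squares
r_squared = 1 - ssr / sst
print(round(r_squared, 4))  # → 0.9885: the predictions track the data closely
```

Because the residuals are small relative to the total spread around the mean, R-squared comes out close to 1.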
Importance and usage
- R-squared provides a quantitative measure of how well the linear regression model fits the observed data
- Helps evaluate the strength of the relationship between the dependent and independent variables
- Allows comparison of different models to determine which one better explains the variability in the data
- Widely used in various fields (economics, social sciences, engineering) to assess the explanatory power of linear regression models
Calculating R-squared
Required components
- To calculate R-squared, you need the sum of squared residuals (SSR) and the total sum of squares (SST) from the linear regression model
- SSR is the sum of the squared differences between the predicted values and the actual values of the dependent variable
- Represents the amount of variation in the dependent variable not explained by the model
- SST is the sum of the squared differences between the actual values of the dependent variable and its mean
- Represents the total variation in the dependent variable
Calculation methods
- Once you have SSR and SST, use the formula $R^2 = 1 - \frac{SSR}{SST}$ to calculate R-squared
- Alternatively, most statistical software packages (SPSS, R) and programming languages (Python) provide functions to directly compute R-squared for a given linear regression model
- Example in R: `summary(lm_model)$r.squared` returns the R-squared value for the fitted linear model `lm_model`
- Example in Python with scikit-learn: `from sklearn.metrics import r2_score; r2_score(y_true, y_pred)` calculates R-squared given the true values (`y_true`) and predicted values (`y_pred`)
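Putting the Python route end to end, a small sketch (assuming scikit-learn and NumPy are installed; the data are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Hypothetical data: y is roughly linear in x with a little noise
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

# model.score and r2_score compute the same quantity
print(model.score(X, y))
print(r2_score(y, y_pred))
```

Both calls return the same R-squared, so either route works once the model's predictions are in hand.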
R-squared limitations vs adjusted R-squared
Limitations of R-squared
- R-squared increases as more independent variables are added to the model, even if those variables do not have a significant impact on the dependent variable
- This can lead to the inclusion of irrelevant variables and overfitting
- Does not indicate whether the independent variables are statistically significant or if the model is appropriate for the data
- Only measures the goodness of fit without considering the model's validity
- Does not consider the number of independent variables in the model, potentially leading to overfitting if too many variables are included
Adjusted R-squared as an alternative
- Adjusted R-squared addresses the limitations of R-squared by adjusting for the number of independent variables in the model
- Penalizes the addition of unnecessary independent variables, providing a more reliable measure of the model's goodness of fit
- Particularly useful when comparing models with different numbers of independent variables
- Helps determine if adding more variables truly improves the model's explanatory power
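The penalty can be seen directly by refitting a model with an irrelevant predictor added. A sketch using NumPy's least squares (all data simulated for illustration): R-squared can only go up when a variable is added, while adjusted R-squared pays for the extra variable.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_r2(X, y):
    """Fit OLS with an intercept; return (R-squared, adjusted R-squared)."""
    n, k = X.shape
    X1 = np.column_stack([np.ones(n), X])            # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    ssr = np.sum((y - X1 @ beta) ** 2)               # sum of squared residuals
    sst = np.sum((y - y.mean()) ** 2)                # total sum of squares
    r2 = 1 - ssr / sst
    adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

n = 30
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=0.5, size=n)          # y depends only on x
noise = rng.normal(size=n)                           # irrelevant predictor

r2_one, adj_one = fit_r2(x.reshape(-1, 1), y)
r2_two, adj_two = fit_r2(np.column_stack([x, noise]), y)

print(r2_two >= r2_one)   # True: R-squared never drops when a variable is added
```

Adjusted R-squared, by contrast, improves only if the new variable reduces the residuals enough to outweigh the $(n-1)/(n-k-1)$ penalty.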
Adjusted R-squared interpretation
Calculation and formula
- Adjusted R-squared is calculated using the formula $R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$, where n is the number of observations and k is the number of independent variables in the model
- The adjusted R-squared value will always be less than or equal to the R-squared value
- Decreases when the number of independent variables increases without a corresponding improvement in the model's fit
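In code, the adjustment is one line; the numbers below are hypothetical, chosen only to show how the penalty grows with k at a fixed R-squared:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same R-squared of 0.85 from 50 observations, with fewer or more predictors
print(round(adjusted_r2(0.85, n=50, k=3), 4))    # → 0.8402 (small penalty)
print(round(adjusted_r2(0.85, n=50, k=10), 4))   # → 0.8115 (larger penalty)
```

With k held against n, the more predictors a model uses to reach the same R-squared, the lower its adjusted R-squared falls.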
Interpretation and comparison
- The interpretation of adjusted R-squared is similar to R-squared
- Represents the proportion of variance in the dependent variable predictable from the independent variable(s), adjusted for the number of variables in the model
- A higher adjusted R-squared value indicates a better fit of the model to the data, considering the number of independent variables used
- When comparing models with different numbers of independent variables, adjusted R-squared is a more appropriate measure than R-squared
- Helps identify the model that strikes a balance between explanatory power and parsimony (using fewer variables)