Fiveable

Intro to Statistics Unit 12 Review

12.3 The Regression Equation

Written by the Fiveable Content Team • Last updated September 2025

Linear regression helps us understand relationships between variables. It's like drawing a line through scattered dots to see patterns. We use this to predict outcomes based on input data.

The regression equation gives us a formula for that line. It shows how one variable changes when another does. This helps us make educated guesses about future trends or unknown values.

The Regression Equation

Least-squares regression line calculation

  • Least-squares regression line fits data points best by minimizing sum of squared vertical distances between points and line
  • Equation of least-squares regression line: $\hat{y} = b_0 + b_1x$
    • $\hat{y}$: predicted value of response variable (dependent variable)
    • $b_0$: y-intercept of regression line
    • $b_1$: slope of regression line
    • $x$: value of explanatory variable (predictor variable)
  • Calculate slope ($b_1$) and y-intercept ($b_0$) using formulas:
    1. $b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
    2. $b_0 = \bar{y} - b_1\bar{x}$
    • $\bar{x}$: mean of explanatory variable
    • $\bar{y}$: mean of response variable
    • $x_i$ and $y_i$: individual values of explanatory and response variables
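The two formulas above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function name `least_squares_fit` and the `hours`/`scores` data are made up for the example.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    # Means of the explanatory and response variables
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Numerator and denominator of the slope formula
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # y-intercept
    return b0, b1


# Made-up illustrative data (not from the study guide)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

b0, b1 = least_squares_fit(hours, scores)
print(f"y_hat = {b0:.1f} + {b1:.1f}x")  # y_hat = 2.2 + 0.6x
```

Here $\bar{x} = 3$ and $\bar{y} = 4$, so $b_1 = 6/10 = 0.6$ and $b_0 = 4 - 0.6(3) = 2.2$.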

Interpretation of regression slope

  • Slope ($b_1$) represents average (predicted) change in response variable ($y$) for one-unit increase in explanatory variable ($x$)
  • Interpretation depends on context of variables studied
    • Salary example: slope of 1,500 indicates employee's predicted salary increases by \$1,500 on average for each additional year of experience
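One way to see the slope interpretation is to compare two predictions that differ by one unit of $x$. In the sketch below, the slope of 1,500 comes from the salary example; the intercept of 40,000 and the function name `predict_salary` are assumptions added purely for illustration.

```python
# Hypothetical fitted line for the salary example.
# Slope (1,500) is from the example; the intercept (40,000) is an assumed value.
b0 = 40_000  # predicted salary at 0 years of experience (assumed)
b1 = 1_500   # predicted change in salary per additional year

def predict_salary(years):
    """Predicted salary from the line y_hat = b0 + b1 * years."""
    return b0 + b1 * years

# Predictions one year apart always differ by exactly the slope
print(predict_salary(6) - predict_salary(5))   # 1500
# Three years apart: three times the slope
print(predict_salary(10) - predict_salary(7))  # 4500
```

The difference does not depend on the intercept or on the starting number of years, which is why the slope alone describes the predicted change per unit of $x$.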

Correlation and determination coefficients

  • Correlation coefficient ($r$) measures strength and direction of linear relationship between explanatory and response variables
    • $r$ ranges from -1 to 1; values closer to -1 or 1 indicate stronger linear relationship, values near 0 indicate weak or no linear relationship
    • Positive $r$: positive linear relationship
    • Negative $r$: negative linear relationship
  • Coefficient of determination ($r^2$) represents proportion of variation in response variable explained by explanatory variable in regression model
    • $r^2$ ranges from 0 to 1; values closer to 1 indicate more variation explained by explanatory variable
    • Example: $r^2 = 0.75$ means 75% of variation in response variable is explained by explanatory variable; remaining 25% is due to other factors or random variation
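As a rough sketch, $r$ can be computed with the standard Pearson formula, $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$, and then squared to get $r^2$. The function name `correlation` and the data below are made up for illustration.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for two equal-length samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data (not from the study guide)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = correlation(hours, scores)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.2f}")  # r = 0.775, r^2 = 0.60
```

For this data, $r \approx 0.775$ indicates a fairly strong positive linear relationship, and $r^2 = 0.60$ means about 60% of the variation in the response variable is explained by the explanatory variable.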

Linear Regression Analysis

  • Linear regression is a statistical method used to model the linear relationship between an explanatory variable and a response variable
  • Scatterplots are used to visualize the relationship between two variables
  • The line of best fit (regression line) is determined through regression analysis
  • Regression analysis helps identify patterns and make predictions based on the relationship between variables