📈 Intro to Probability for Business Unit 11 Review

11.2 Simple Linear Regression Model

Written by the Fiveable Content Team • Last updated September 2025

Linear regression is a powerful statistical tool for understanding relationships between variables. It helps us predict one variable based on another, using a simple equation that captures their connection. This method is crucial for business decisions, from sales forecasting to understanding customer behavior.

The key components of linear regression include the slope, y-intercept, and error term. By interpreting these elements and assessing the model's fit through R-squared values, we can gauge how well our predictions match reality and make informed business choices.

Components and Interpretation of Simple Linear Regression

Components of linear regression

  • Simple linear regression model expressed as $y = \beta_0 + \beta_1x + \epsilon$
    • $y$ dependent variable (response variable) being predicted or explained
    • $x$ independent variable (explanatory variable) used to predict or explain changes in $y$
    • $\beta_0$ y-intercept, expected value of $y$ when $x$ equals zero
    • $\beta_1$ slope, expected change in $y$ for a one-unit increase in $x$
    • $\epsilon$ random error term, accounts for variability in $y$ not explained by linear relationship with $x$
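
To make the roles of the components concrete, the sketch below generates data from the model equation. It is only an illustration: the function name `simulate_linear_data`, the parameter values, and the choice of normally distributed errors are assumptions made here, not something the model itself requires.

```python
import random

def simulate_linear_data(beta0, beta1, xs, sigma, seed=0):
    """Return y = beta0 + beta1 * x + epsilon for each x.

    Assumption for this sketch: epsilon is drawn from Normal(0, sigma).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical values chosen only for illustration.
xs = [1, 2, 3, 4, 5]
noisy = simulate_linear_data(beta0=100.0, beta1=50.0, xs=xs, sigma=10.0)

# With sigma = 0 the error term vanishes and every point lies on the line.
noise_free = simulate_linear_data(beta0=100.0, beta1=50.0, xs=xs, sigma=0.0)
```

Comparing `noisy` with `noise_free` shows what the error term contributes: the same intercept and slope, plus scatter around the line.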

Interpretation of slope vs y-intercept

  • Slope ($\beta_1$) change in dependent variable ($y$) for one-unit increase in independent variable ($x$)
    • Interpretation depends on context and units of variables
      • Sales ($y$, in dollars) and advertising expenditure ($x$, in thousands of dollars), slope of 50 means each additional \$1,000 of advertising is associated with a \$50 increase in predicted sales
  • Y-intercept ($\beta_0$) value of dependent variable ($y$) when independent variable ($x$) equals zero
    • Interpretation depends on context and whether $x = 0$ is meaningful
      • If $x$ is number of employees, $\beta_0$ might not have a practical interpretation, as an operating company cannot have zero employees

Equation and Prediction in Simple Linear Regression

Equation of regression models

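  • Least squares method estimates slope ($\beta_1$) and y-intercept ($\beta_0$) from data points by choosing the line that minimizes the sum of squared residuals $\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$
    • Resulting values are sample estimates, often written $\hat{\beta}_0$ and $\hat{\beta}_1$ (or $b_0$ and $b_1$) to distinguish them from the unknown population parameters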
    • Calculate slope: $\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
      • $x_i$ and $y_i$ individual data points
      • $\bar{x}$ and $\bar{y}$ means of $x$ and $y$
      • $n$ number of data points
    • Calculate y-intercept: $\beta_0 = \bar{y} - \beta_1\bar{x}$
  • Substitute estimated slope and y-intercept into simple linear regression model equation: $\hat{y} = \beta_0 + \beta_1x$
    • $\hat{y}$ predicted value of dependent variable
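
The slope and intercept formulas above can be sketched directly in code. This is a minimal illustration using only the standard library; the function name `fit_simple_linear_regression` and the small data set are hypothetical, invented for this example.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) computed from the least squares formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Numerator and denominator of the slope formula.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
intercept, slope = fit_simple_linear_regression(xs, ys)
```

For this data the means are 3 and 4, the slope numerator is 6, and the denominator is 10, so the slope is 0.6 and the intercept is about 2.2 (up to floating-point rounding).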

Predictions from regression equations

  • Use estimated simple linear regression model equation $\hat{y} = \beta_0 + \beta_1x$ to predict value of dependent variable ($\hat{y}$) for given value of independent variable ($x$)
    1. Substitute given value of $x$ into equation
    2. Calculate predicted value of $\hat{y}$
      • With estimated regression equation $\hat{y} = 100 + 50x$ and $x = 2$, the predicted value is $\hat{y} = 100 + 50(2) = 200$
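
The two prediction steps can be sketched as a small function. The name `predict` is a hypothetical helper; the intercept 100, slope 50, and input 2 come from the worked example in the text.

```python
def predict(intercept, slope, x):
    """Substitute x into the estimated equation y_hat = intercept + slope * x."""
    return intercept + slope * x

# Worked example from the text: y_hat = 100 + 50 * 2
y_hat = predict(100, 50, 2)

# The same function can be applied to several x values at once.
predictions = [predict(100, 50, x) for x in [0, 1, 2, 3]]
```

Here `y_hat` equals 200, matching the hand calculation.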

Goodness of Fit in Simple Linear Regression

Goodness of fit assessment

  • Assess goodness of fit using coefficient of determination (R-squared)
    • R-squared proportion of variance in dependent variable ($y$) predictable from independent variable ($x$)
    • Formula: $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
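      • $SSR$ sum of squares regression (explained variation), $SSR = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$
      • $SSE$ sum of squares error (unexplained variation), $SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$
      • $SST$ total sum of squares (total variation), $SST = \sum_{i=1}^{n}(y_i - \bar{y})^2 = SSR + SSE$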
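
The R-squared formula can be checked numerically with a short sketch. The function name `goodness_of_fit` and the data are hypothetical; the intercept 2.2 and slope 0.6 passed in are the least squares estimates for this particular made-up data set, which matters because $SSR + SSE = SST$ is only guaranteed for a least squares line that includes an intercept.

```python
def goodness_of_fit(xs, ys, intercept, slope):
    """Return SST, SSE, SSR and R-squared for a fitted line."""
    n = len(ys)
    y_bar = sum(ys) / n
    y_hat = [intercept + slope * x for x in xs]

    sst = sum((y - y_bar) ** 2 for y in ys)                # total variation
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # unexplained variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    if sst == 0:
        raise ValueError("y has no variation; R-squared is undefined")

    return {"SST": sst, "SSE": sse, "SSR": ssr, "R2": 1 - sse / sst}

# Hypothetical data, made up for illustration, with its least squares line.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
fit = goodness_of_fit(xs, ys, intercept=2.2, slope=0.6)
```

For this data SST is 6, SSE is about 2.4, and SSR is about 3.6, so both forms of the formula give an R-squared of about 0.6.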

Meaning of R-squared values

  • R-squared ranges from 0 to 1; higher values indicate a better fit, lower values a poorer fit
    • R-squared of 0 means none of variance in $y$ explained by $x$
    • R-squared of 1 means all of variance in $y$ explained by $x$
    • R-squared of 0.75 means 75% of variance in dependent variable explained by independent variable, 25% unexplained