📊Honors Statistics Unit 11 Review

11.3 Test of Independence

Written by the Fiveable Content Team • Last updated September 2025

The test of independence determines whether two categorical variables are related. Using a contingency table and a chi-square calculation, we compare the observed frequencies with the frequencies we would expect if the variables were independent.

The test follows a step-by-step process: calculate the test statistic, compare it to a critical value (or compute a p-value), and draw a conclusion about whether the variables are dependent.

Test of Independence

Construction of contingency tables

A contingency table (two-way frequency table) displays the relationship between two categorical variables
- Rows represent one categorical variable (e.g., gender)
- Columns represent the other categorical variable (e.g., preferred color)
- Each cell contains the observed frequency (count) for that combination of row and column categories
Steps to construct a contingency table:
1. Identify the two categorical variables of interest
2. Determine the levels or categories of each variable
3. Create a table with rows and columns representing the levels of each variable
4. Fill in each cell with the observed frequency for that combination of row and column categories (e.g., 25 males prefer blue)
The total of each row or column is called a marginal frequency
The grand total is the sum of all observations in the table
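
The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the raw responses, the category names, and every count other than "25 males prefer blue" are hypothetical values invented for the example.

```python
from collections import Counter

# Hypothetical raw data: one (gender, preferred color) pair per respondent.
# Only the "25 males prefer blue" count comes from the text; the rest is made up.
responses = (
    [("male", "blue")] * 25 + [("male", "green")] * 15 + [("male", "red")] * 10
    + [("female", "blue")] * 15 + [("female", "green")] * 20 + [("female", "red")] * 15
)

# Steps 1-2: the two variables and their levels.
row_levels = ["male", "female"]          # row variable: gender
col_levels = ["blue", "green", "red"]    # column variable: preferred color

# Steps 3-4: count each (row, column) combination and lay the counts out as a table.
counts = Counter(responses)
table = [[counts[(r, c)] for c in col_levels] for r in row_levels]

# Marginal frequencies and grand total.
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

for level, row, total in zip(row_levels, table, row_totals):
    print(level, row, "row total:", total)
print("column totals:", col_totals, "grand total:", grand_total)
```

The row totals and column totals printed here are the marginal frequencies, and both sets add up to the same grand total.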

Calculation of chi-square test statistic

When the null hypothesis is true, the test statistic for the test of independence approximately follows a chi-square distribution
- Chi-square distribution is a right-skewed distribution with degrees of freedom equal to $(r-1)(c-1)$, where $r$ is the number of rows and $c$ is the number of columns in the contingency table (not counting total rows or columns)
To calculate test statistic:
1. Compute expected frequency for each cell in contingency table
  - Expected frequency $E_{ij} = \frac{(\text{row }i\text{ total}) \times (\text{column }j\text{ total})}{\text{grand total}}$
2. Calculate chi-square test statistic using formula:
  - $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
  - $O_{ij}$ is observed frequency in $i$-th row and $j$-th column
  - $E_{ij}$ is expected frequency in $i$-th row and $j$-th column
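Test statistic measures discrepancy between observed and expected frequencies
- Larger test statistic indicates greater difference between observed and expected values and stronger evidence against independence; whether a given value is significant depends on the degrees of freedom (at $\alpha = 0.05$, a test statistic of 12.5 exceeds the critical value of 7.81 for 3 degrees of freedom, but not the critical value of 12.59 for 6 degrees of freedom)
- The test statistic grows with sample size, so it does not measure how strong the relationship is; an effect size such as Cramér's V can be calculated to quantify the strength of the relationship between variables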
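
As a rough sketch of the calculation above, the following Python computes the expected frequencies, the chi-square statistic, and the degrees of freedom for a small table. The observed counts are hypothetical, the function names are made up for this example, and Cramér's V is included only as one common effect-size choice (the text does not name a specific measure).

```python
import math

# Hypothetical observed counts: rows = gender (male, female),
# columns = preferred color (blue, green, red).
observed = [
    [25, 15, 10],
    [15, 20, 15],
]

def expected_counts(table):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * min(r - 1, c - 1))), an assumed effect-size choice."""
    n = sum(sum(row) for row in table)
    k = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(chi_square_statistic(table) / (n * k))

expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
v = cramers_v(observed)

print("expected:", expected)
print("chi-square:", round(chi2, 3), "df:", df, "Cramér's V:", round(v, 3))
```

For this made-up table the statistic is about 4.21 with 2 degrees of freedom, and Cramér's V is about 0.21.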

Determination of factor independence

Test of independence used to determine if there is significant relationship between two categorical variables
Null hypothesis ($H_0$): Two categorical variables are independent
Alternative hypothesis ($H_1$): Two categorical variables are dependent
Steps to conduct test of independence:
1. State null and alternative hypotheses
2. Construct contingency table and calculate expected frequencies
3. Calculate chi-square test statistic
4. Determine degrees of freedom $(r-1)(c-1)$
5. Choose significance level (commonly $\alpha = 0.05$)
6. Find critical value from chi-square distribution table using degrees of freedom and significance level (the test is right-tailed)
7. Compare test statistic to critical value or calculate p-value
  - If test statistic is greater than critical value, or p-value is less than significance level, reject null hypothesis and conclude variables are dependent (with 3 degrees of freedom and $\alpha = 0.05$, test statistic of 15.2 > critical value of 7.81, reject $H_0$)
  - If test statistic is less than critical value, or p-value is greater than significance level, fail to reject null hypothesis and conclude there is insufficient evidence of dependence between variables (with 3 degrees of freedom and $\alpha = 0.05$, test statistic of 3.5 < critical value of 7.81, fail to reject $H_0$)
Sample size affects the power of the test: with a larger sample, the test is more likely to detect a relationship that really exists
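
The decision step can be sketched with the p-value approach in plain Python. This is an illustrative sketch under stated assumptions: `chi2_sf` and `decide` are hypothetical helper names, and the tail probability uses the standard closed-form expressions for whole-number degrees of freedom rather than a table or a statistics library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail part plus a finite sum of correction terms.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for k in range((df - 1) // 2):
        total += term
        term *= x / (2 * k + 3)
    return tail + total

def decide(chi2_stat, df, alpha=0.05):
    """Right-tailed decision rule: reject H0 when the p-value is below alpha."""
    p_value = chi2_sf(chi2_stat, df)
    return "reject H0" if p_value < alpha else "fail to reject H0"

# The two examples from the steps above, both with 3 degrees of freedom.
for stat in (15.2, 3.5):
    print(stat, "p-value:", round(chi2_sf(stat, 3), 4), "->", decide(stat, 3))
```

Comparing the p-value to $\alpha$ gives the same decision as comparing the test statistic to the critical value, because the critical value is the point whose upper-tail probability equals $\alpha$.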

Additional Analysis

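When the test is significant, post-hoc analysis can be conducted to identify the specific categories that drive the result
Standardized residuals, $\frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}$, show which cells in the contingency table contribute most to the chi-square statistic; as a rule of thumb, cells with residuals beyond about $\pm 2$ deserve a closer look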
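
A minimal sketch of the residual calculation follows, using hypothetical counts and a made-up function name. It uses the form $(O - E)/\sqrt{E}$, which some texts call the Pearson residual and reserve "standardized" for an adjusted version; that naming choice is an assumption here.

```python
import math

# Hypothetical observed counts: rows = gender (male, female),
# columns = preferred color (blue, green, red).
observed = [
    [25, 15, 10],
    [15, 20, 15],
]

def pearson_residuals(table):
    """Residual per cell: (observed - expected) / sqrt(expected)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            res_row.append((o - e) / math.sqrt(e))
        residuals.append(res_row)
    return residuals

residuals = pearson_residuals(observed)

# The squared residuals are the per-cell contributions to the chi-square statistic.
chi2_from_residuals = sum(r ** 2 for row in residuals for r in row)

for row in residuals:
    print([round(r, 3) for r in row])
print("sum of squared residuals:", round(chi2_from_residuals, 3))
```

A positive residual means the cell holds more observations than independence predicts, a negative residual means fewer, and the squared residuals add up to the chi-square statistic.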
