Fiveable

📊 Probability and Statistics Unit 9 Review


9.5 t-tests and z-tests


Written by the Fiveable Content Team • Last updated September 2025

Hypothesis testing is a crucial statistical method for making decisions about populations based on sample data. It involves formulating null and alternative hypotheses, collecting data, and using statistical tests to assess the significance of research findings.

T-tests and z-tests are common tools for comparing means and proportions. These tests help researchers determine if there are significant differences between samples or if sample statistics differ from hypothesized population parameters. Understanding their assumptions and applications is essential for accurate data analysis.

Fundamentals of hypothesis testing

  • Hypothesis testing is a statistical method used to make decisions or draw conclusions about a population based on sample data
  • Involves formulating a null hypothesis ($H_0$) and an alternative hypothesis ($H_a$), collecting data, and using statistical tests to determine whether to reject or fail to reject the null hypothesis
  • Hypothesis tests are widely used in various fields, including psychology, biology, economics, and social sciences, to assess the significance of research findings and make data-driven decisions

Null and alternative hypotheses

  • The null hypothesis ($H_0$) represents the default or status quo position, typically stating that there is no significant difference or relationship between variables
  • The alternative hypothesis ($H_a$) is the claim that the researcher wants to support, usually indicating the presence of a significant difference or relationship
  • The choice of the null and alternative hypotheses depends on the research question and the direction of the expected effect

One-tailed vs two-tailed tests

  • One-tailed tests are used when the alternative hypothesis specifies a direction (greater than or less than) for the difference or relationship
  • Two-tailed tests are used when the alternative hypothesis does not specify a direction, only that there is a difference or relationship
  • The choice between a one-tailed or two-tailed test affects the critical values and the interpretation of the results
    • One-tailed tests allocate the entire significance level (e.g., $\alpha = 0.05$) to one side of the distribution
    • Two-tailed tests split the significance level equally between both sides of the distribution (e.g., $\alpha/2 = 0.025$ on each side)
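As a sketch, the critical values implied by this allocation can be computed from the standard normal quantile function (assuming scipy is available; $\alpha = 0.05$ as in the text):

```python
from scipy.stats import norm

alpha = 0.05

# One-tailed (upper) test: the entire alpha sits in the right tail
z_one_tailed = norm.ppf(1 - alpha)        # about 1.645

# Two-tailed test: alpha/2 in each tail, so the cutoff moves outward
z_two_tailed = norm.ppf(1 - alpha / 2)    # about 1.960
```

Note that the two-tailed cutoff is larger: with the same $\alpha$, a two-tailed test demands a more extreme statistic before rejecting.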

Test statistics

  • Test statistics are calculated values used to make decisions in hypothesis testing by comparing them to critical values or p-values
  • The choice of the test statistic depends on the type of data, the sample size, and the assumptions of the test
  • Common test statistics include the t-statistic for t-tests and the z-statistic for z-tests

t-statistic for t-tests

  • The t-statistic is used when the population standard deviation is unknown and must be estimated from the sample; the distinction from the z-statistic matters most for small samples ($n < 30$)
  • It follows a t-distribution with $n-1$ degrees of freedom
  • The formula for the t-statistic is: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ where $\bar{x}$ is the sample mean, $\mu$ is the hypothesized population mean, $s$ is the sample standard deviation, and $n$ is the sample size
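A minimal sketch of this formula, checked against scipy's built-in one-sample t-test (the sample values below are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical sample and hypothesized population mean mu
sample = np.array([5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.4, 4.7])
mu = 5.0

n = len(sample)
x_bar = sample.mean()
s = sample.std(ddof=1)                     # sample standard deviation
t_manual = (x_bar - mu) / (s / np.sqrt(n))

# scipy computes the same statistic plus a two-sided p-value
t_scipy, p_value = stats.ttest_1samp(sample, mu)
```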

z-statistic for z-tests

  • The z-statistic is used when the sample size is large ($n \geq 30$) and the population standard deviation is known
  • It follows a standard normal distribution (mean = 0, standard deviation = 1)
  • The formula for the z-statistic is: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ where $\bar{x}$ is the sample mean, $\mu$ is the hypothesized population mean, $\sigma$ is the population standard deviation, and $n$ is the sample size
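The same formula as a sketch, with hypothetical values where $\sigma$ is known:

```python
import math
from scipy.stats import norm

x_bar = 102.3   # sample mean (hypothetical)
mu = 100.0      # hypothesized population mean
sigma = 15.0    # known population standard deviation
n = 100

z = (x_bar - mu) / (sigma / math.sqrt(n))

# Two-sided p-value from the standard normal distribution
p_value = 2 * norm.sf(abs(z))
```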

t-tests for means

  • t-tests are used to compare means and determine if there is a significant difference between them
  • There are three main types of t-tests: one-sample t-test, two-sample t-test, and paired t-test
  • t-tests assume that the data are approximately normally distributed and have equal variances (for two-sample t-tests)

One-sample t-test

  • Used to compare a sample mean to a known or hypothesized population mean
  • Tests whether the sample mean is significantly different from the population mean
  • Example: Testing if the average height of a sample of students is significantly different from the national average height

Two-sample t-test

  • Used to compare the means of two independent samples
  • Tests whether the means of the two samples are significantly different from each other
  • Example: Comparing the average test scores of students in two different teaching methods (traditional vs. online)
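Assuming scipy, the comparison above might look like this (the score data are invented for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical test scores for two independent groups
traditional = np.array([72, 75, 78, 80, 69, 74, 77, 73])
online = np.array([78, 82, 80, 85, 79, 81, 84, 77])

# Student's two-sample t-test (assumes equal variances)
t_stat, p_value = stats.ttest_ind(traditional, online)
```

A negative t-statistic here simply means the first group's mean is below the second's.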

Paired t-test

  • Used to compare the means of two related or dependent samples (e.g., before and after measurements on the same individuals)
  • Tests whether the mean difference between the paired observations is significantly different from zero
  • Example: Comparing the blood pressure of patients before and after taking a new medication
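A sketch of the paired test with hypothetical blood-pressure readings; note that it is equivalent to a one-sample t-test on the per-patient differences:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements on the same patients before/after
before = np.array([140, 138, 150, 145, 142, 148, 151, 139])
after = np.array([135, 136, 144, 141, 140, 142, 146, 137])

# Paired t-test on the dependent samples
t_stat, p_value = stats.ttest_rel(before, after)

# Equivalent: one-sample t-test of the differences against 0
t_diff, p_diff = stats.ttest_1samp(before - after, 0.0)
```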

Assumptions of t-tests

  • The data should be approximately normally distributed
    • For larger sample sizes ($n \geq 30$), the t-test is robust to violations of normality due to the Central Limit Theorem
  • The samples should be independent (except for paired t-tests)
  • The variances of the samples should be equal (for two-sample t-tests)
    • If the variances are unequal, alternative tests like Welch's t-test can be used
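In scipy, Welch's t-test is the `equal_var=False` variant of `ttest_ind`; a sketch with simulated unequal-variance samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated samples with clearly unequal variances (and sizes)
a = rng.normal(10, 1, size=30)
b = rng.normal(10, 5, size=40)

# Welch's t-test: variances are not pooled
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)

# Student's t-test for comparison (pools the variances)
t_student, p_student = stats.ttest_ind(a, b, equal_var=True)
```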

z-tests for proportions

  • z-tests are used to compare proportions and determine if there is a significant difference between them
  • There are two main types of z-tests for proportions: one-sample z-test and two-sample z-test
  • z-tests assume that the sample size is large enough ($np \geq 10$ and $n(1-p) \geq 10$) and that the samples are independent

One-sample z-test

  • Used to compare a sample proportion to a known or hypothesized population proportion
  • Tests whether the sample proportion is significantly different from the population proportion
  • Example: Testing if the proportion of defective products in a sample is significantly different from the claimed proportion
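A sketch of this test with hypothetical counts; note that the standard error uses the null proportion $p_0$, not the sample proportion:

```python
import math
from scipy.stats import norm

# Hypothetical: 34 defectives in 400 units; claimed rate p0 = 0.05
x, n = 34, 400
p0 = 0.05
p_hat = x / n

# Standard error under H0 uses p0
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
p_value = 2 * norm.sf(abs(z))
```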

Two-sample z-test

  • Used to compare the proportions of two independent samples
  • Tests whether the proportions of the two samples are significantly different from each other
  • Example: Comparing the proportion of smokers in two different age groups (18-30 vs. 31-50)
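A sketch with invented smoker counts; under $H_0$ the two proportions are equal, so a pooled estimate goes into the standard error:

```python
import math
from scipy.stats import norm

# Hypothetical smoker counts in two age groups
x1, n1 = 60, 200   # ages 18-30
x2, n2 = 45, 220   # ages 31-50

p1, p2 = x1 / n1, x2 / n2

# Pooled proportion under H0: p1 == p2
p_pool = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))
```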

Assumptions of z-tests

  • The sample size should be large enough ($np \geq 10$ and $n(1-p) \geq 10$)
  • The samples should be independent
  • The sampling distribution of the sample proportion should be approximately normal, which holds when the sample-size condition above is met

Significance level and p-values

  • The significance level ($\alpha$) is the probability of rejecting the null hypothesis when it is true (Type I error)
    • Common significance levels are 0.05 and 0.01
  • The p-value is the probability of obtaining a test statistic as extreme as or more extreme than the observed value, assuming the null hypothesis is true
  • If the p-value is less than the significance level, we reject the null hypothesis; otherwise, we fail to reject the null hypothesis
  • The p-value provides a measure of the strength of evidence against the null hypothesis
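The decision rule reduces to a single comparison; a sketch with a hypothetical observed z-statistic:

```python
from scipy.stats import norm

alpha = 0.05
z = 2.1  # hypothetical observed test statistic

p_value = 2 * norm.sf(abs(z))   # two-sided p-value

if p_value < alpha:
    decision = "reject H0"
else:
    decision = "fail to reject H0"
```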

Type I and Type II errors

  • Type I error (false positive) occurs when the null hypothesis is rejected when it is true
    • The probability of a Type I error is equal to the significance level ($\alpha$)
  • Type II error (false negative) occurs when the null hypothesis is not rejected when it is false
    • The probability of a Type II error is denoted by $\beta$
  • The power of a test is the probability of correctly rejecting the null hypothesis when it is false (1 - $\beta$)
  • There is a trade-off between Type I and Type II errors; for a fixed sample size, decreasing one type of error increases the other
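The claim that the Type I error rate equals $\alpha$ can be checked by simulation: repeatedly test a true null hypothesis and count false rejections (a sketch, using a one-sample t-test):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_sims = 2000

# Simulate t-tests where H0 is TRUE: the fraction of (false)
# rejections should be close to alpha
rejections = 0
for _ in range(n_sims):
    sample = rng.normal(0, 1, size=20)       # true mean really is 0
    _, p = stats.ttest_1samp(sample, 0.0)    # test H0: mu = 0
    if p < alpha:
        rejections += 1

type_i_rate = rejections / n_sims
```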

Confidence intervals

  • Confidence intervals provide a range of plausible values for a population parameter (e.g., mean or proportion) based on sample data
  • The confidence level (e.g., 95%) represents the proportion of intervals that would contain the true population parameter if the sampling process were repeated many times
  • Confidence intervals are constructed using the sample statistic (e.g., sample mean or proportion) and the standard error

Interpreting confidence intervals

  • A 95% confidence interval means that if the sampling process were repeated many times, 95% of the resulting intervals would contain the true population parameter
  • The width of the confidence interval indicates the precision of the estimate; narrower intervals suggest more precise estimates
  • Confidence intervals can be used to assess significance: if a 95% interval does not contain the null hypothesis value, the result is significant at the corresponding 5% significance level
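A sketch of a t-based interval for a mean (hypothetical data), illustrating the duality with the two-sided one-sample t-test:

```python
import numpy as np
from scipy import stats

# Hypothetical sample
sample = np.array([9.8, 10.2, 10.1, 9.9, 10.4, 9.7, 10.0, 10.3])

n = len(sample)
x_bar = sample.mean()
se = sample.std(ddof=1) / np.sqrt(n)   # standard error of the mean

# 95% t-based confidence interval for the population mean
lo, hi = stats.t.interval(0.95, df=n - 1, loc=x_bar, scale=se)

# Any hypothesized mean outside (lo, hi) would be rejected at the
# 5% level by a two-sided one-sample t-test on this sample
```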

Confidence intervals vs hypothesis tests

  • Confidence intervals and hypothesis tests are related but serve different purposes
  • Hypothesis tests provide a decision about the significance of a result based on a pre-specified significance level
  • Confidence intervals provide a range of plausible values for the population parameter and indicate the precision of the estimate
  • Confidence intervals can be used to perform hypothesis tests by checking if the null hypothesis value falls within the interval

Power and sample size

  • Power is the probability of correctly rejecting the null hypothesis when it is false (1 - $\beta$)
  • Higher power means a higher chance of detecting a true effect or difference
  • Sample size is a key factor that influences the power of a test; larger sample sizes generally increase power

Factors affecting power

  • Effect size: Larger effects are easier to detect and require smaller sample sizes to achieve the same power
  • Significance level ($\alpha$): Smaller significance levels (e.g., 0.01) require larger sample sizes to maintain the same power compared to larger significance levels (e.g., 0.05)
  • Variability of the data: Higher variability requires larger sample sizes to achieve the same power
  • Type of test: Some tests have higher power than others for the same sample size and effect size; for example, a one-tailed test has higher power than a two-tailed test when the true effect lies in the hypothesized direction
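The sample-size effect on power can be estimated by simulation: generate data where $H_a$ is true and count how often the test rejects (a sketch with a one-sample t-test and a standardized effect of 0.5):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n_sims = 1000

def estimated_power(n, effect):
    # Fraction of simulated tests that reject H0: mu = 0
    # when the true mean really is `effect`
    hits = 0
    for _ in range(n_sims):
        sample = rng.normal(effect, 1.0, size=n)
        if stats.ttest_1samp(sample, 0.0).pvalue < alpha:
            hits += 1
    return hits / n_sims

power_small_n = estimated_power(10, 0.5)
power_large_n = estimated_power(50, 0.5)   # larger n -> higher power
```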

Calculating required sample size

  • The required sample size can be calculated based on the desired power, significance level, effect size, and variability of the data
  • There are formulas and software packages available to determine the required sample size for various types of tests
  • Example: Using G*Power software to calculate the sample size needed for a two-sample t-test with a power of 0.80, significance level of 0.05, and a medium effect size (Cohen's d = 0.5)
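A normal-approximation version of that calculation can be sketched directly from the formula $n = 2\left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{d}\right)^2$ per group (an exact t-based calculation, as in G*Power, gives a slightly larger answer):

```python
import math
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    # Normal-approximation sample size per group for a
    # two-sided, two-sample test with standardized effect d
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

n = n_per_group(0.5)   # medium effect (Cohen's d = 0.5)
```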

Limitations and alternatives

  • Hypothesis tests and confidence intervals have limitations and assumptions that should be considered when interpreting results
  • Violations of assumptions (e.g., non-normality, unequal variances) can affect the validity of the results
  • Alternative methods can be used when assumptions are violated or when the data are not suitable for parametric tests

Nonparametric tests

  • Nonparametric tests do not assume a specific distribution of the data and are based on ranks or order statistics
  • Examples of nonparametric tests include the Wilcoxon rank-sum test (for two independent samples), the Wilcoxon signed-rank test (for paired samples), and the Kruskal-Wallis test (for three or more independent samples)
  • Nonparametric tests are less powerful than parametric tests when the assumptions are met, but they are more robust to violations of assumptions
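Both rank-based tests named above are available in scipy; a sketch with simulated skewed data where normality is doubtful:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Simulated skewed (exponential) data
a = rng.exponential(1.0, size=25)
b = rng.exponential(2.0, size=25)

# Wilcoxon rank-sum / Mann-Whitney U test (two independent samples)
u_stat, p_ind = stats.mannwhitneyu(a, b)

# Wilcoxon signed-rank test (paired samples); `after` is shifted
# upward so the paired test should detect a difference
before = rng.exponential(1.0, size=25)
after = before + rng.normal(0.5, 0.2, size=25)
w_stat, p_paired = stats.wilcoxon(before, after)
```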

Bayesian hypothesis testing

  • Bayesian hypothesis testing is an alternative approach that incorporates prior information and updates the probability of the hypotheses based on the observed data
  • Bayesian methods provide posterior probabilities of the hypotheses, which can be easier to interpret than p-values
  • Bayesian hypothesis testing requires specifying prior distributions for the parameters of interest, which can be subjective and may influence the results
  • Bayesian methods are becoming increasingly popular in various fields, including psychology, economics, and machine learning
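As a minimal illustration of the Bayesian updating idea (not a full Bayesian test), a Beta prior on a proportion updates conjugately with binomial data; the counts below are hypothetical:

```python
from scipy.stats import beta

# Hypothetical coin-bias example with a Beta prior on p
a_prior, b_prior = 1, 1          # uniform Beta(1, 1) prior
heads, tails = 63, 37            # observed data

# Conjugate update: posterior is Beta(a + heads, b + tails)
posterior = beta(a_prior + heads, b_prior + tails)

# Posterior probability that p > 0.5 -- directly interpretable,
# unlike a p-value (cf. a one-sided frequentist test)
prob_p_gt_half = posterior.sf(0.5)
```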