🐛Biostatistics Unit 4 Review

4.1 Sampling methods and sample size determination

Written by the Fiveable Content Team • Last updated September 2025

Sampling methods and sample size determination shape how far the results of a biological study can be generalized. A sound sampling technique reduces selection bias, so that conclusions drawn from the sample can reasonably be extended to the population it came from.

Calculating the right sample size balances statistical power against resource constraints. A sample that is too small can miss real effects, while one that is too large spends time and money on precision the study does not need. Understanding these concepts is key to designing effective experiments and interpreting research findings in biology.

Sampling Methods in Research

Probability Sampling Techniques

Simple random sampling
- Each member of the population has an equal chance of being selected
- Often used as a benchmark for other sampling methods
- Gives unbiased estimates on average, although any single sample can still differ from the population by chance
- Requires a complete list of the population
Systematic sampling
- Individuals are selected from a population at regular intervals after a random starting point
- Useful when a complete list of the population is available and the population is homogeneous
- Easier to implement than simple random sampling
- May lead to biased results if there is a hidden pattern in the population
Stratified sampling
- Population is divided into distinct subgroups (strata) based on specific characteristics
- A random sample is taken from each stratum
- Ensures that all subgroups are represented in the sample
- Useful when the population is heterogeneous and subgroup comparisons are of interest
- Requires knowledge of the population's characteristics to define the strata
Cluster sampling
- Population is divided into clusters (naturally occurring groups, such as schools or neighborhoods)
- A subset of these clusters is randomly selected
- All individuals within the selected clusters are sampled
- Cost-effective when the population is geographically dispersed
- Usually less precise than simple random sampling of the same size, because individuals within a cluster tend to resemble one another
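
The four probability designs above can be sketched in a few lines of Python. This is a minimal illustration rather than a field protocol: the sampling frame of 120 plants, the plot and habitat labels, and all function names are hypothetical, and the systematic sampler assumes the frame size is at least the sample size.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every unit is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start within the first interval, then every k-th unit."""
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random starting point in [0, k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Simple random sample of a fixed size within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical frame: 120 plants in 6 plots, each tagged with a habitat
plants = [{"id": i, "plot": i % 6, "habitat": "shade" if i % 3 == 0 else "sun"}
          for i in range(120)]
plots = {}
for plant in plants:
    plots.setdefault(plant["plot"], []).append(plant)

srs = simple_random_sample(plants, 12, rng)
systematic = systematic_sample(plants, 12, rng)
stratified = stratified_sample(plants, lambda p: p["habitat"], 6, rng)
clustered = cluster_sample(plots, 2, rng)

print(len(srs), len(systematic),
      {h: len(s) for h, s in stratified.items()}, len(clustered))
```

The stratified sampler here takes the same number of units from every stratum; sampling in proportion to stratum size is an equally common choice.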

Non-Probability Sampling Techniques

Convenience sampling
- Individuals are selected based on their availability and willingness to participate
- Easy to implement and less time-consuming than probability sampling methods
- May lead to biased results as the sample may not be representative of the population
- Useful for pilot studies or when the population is hard to access
Purposive sampling
- Individuals are selected based on the researcher's judgment and the study's objectives
- Useful when targeting a specific subgroup or when the population is hard to access
- Allows for the selection of information-rich cases
- Prone to researcher bias and may not be representative of the population
- Snowball sampling, often grouped with purposive sampling, involves participants recruiting other participants from their networks

Sample Size Calculation

Factors Influencing Sample Size

Desired level of confidence
- Describes how often intervals built with this method would contain the true population parameter across repeated samples
- Commonly set at 95% (corresponding to a Z-score of 1.96)
- Higher confidence levels require larger sample sizes
Margin of error
- The maximum acceptable difference between the sample estimate and the true population parameter
- Smaller margins of error require larger sample sizes
Variability of the population
- Measured by the standard deviation for continuous outcomes, or by $p(1-p)$ for proportions (largest at $p = 0.5$)
- More heterogeneous populations require larger sample sizes to capture the variability
Expected effect size
- The magnitude of the difference or relationship between variables
- Smaller effect sizes require larger sample sizes to be detected

Sample Size Formulas for Different Study Designs

Estimating a proportion (simple random sample)
- Formula: $n = \frac{Z^2 \, p(1-p)}{e^2}$
- $n$ is the sample size
- $Z$ is the Z-score corresponding to the desired confidence level
- $p$ is the estimated proportion of the population with the characteristic of interest
- $e$ is the margin of error
Comparing two means
- Formula: $n = \frac{2 (Z_{\alpha/2} + Z_{\beta})^2 \sigma^2}{\Delta^2}$
- $n$ is the sample size per group
- $Z_{\alpha/2}$ is the Z-score corresponding to the desired two-sided significance level (1.96 for $\alpha = 0.05$)
- $Z_{\beta}$ is the Z-score corresponding to the desired power (0.84 for 80% power)
- $\sigma$ is the standard deviation of the outcome, assumed equal in both groups
- $\Delta$ is the minimum difference to be detected
Comparing two proportions
- Formula: $n = \frac{\left( Z_{\alpha/2} \sqrt{2 \bar{p}(1-\bar{p})} + Z_{\beta} \sqrt{p_1(1-p_1) + p_2(1-p_2)} \right)^2}{(p_1 - p_2)^2}$
- $n$ is the sample size per group
- $Z_{\alpha/2}$ and $Z_{\beta}$ are as defined above
- $\bar{p}$ is the average of the two proportions, $(p_1 + p_2)/2$
- $p_1$ and $p_2$ are the proportions in the two groups
Sample size calculators and statistical software
- Used to determine the appropriate sample size for more complex study designs (factorial experiments, repeated measures designs)
- Incorporate additional factors such as the number of groups, the correlation between repeated measures, and the desired effect size
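
The three formulas above translate directly into code. The sketch below uses only the Python standard library; the function names and the example inputs are illustrative assumptions, and results are rounded up because a sample size must be a whole number.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_score(prob):
    """Standard normal quantile, e.g. z_score(0.975) is about 1.96."""
    return NormalDist().inv_cdf(prob)

def n_for_proportion(p, e, confidence=0.95):
    """Sample size to estimate a proportion p within margin of error e."""
    z = z_score(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / e ** 2)

def n_per_group_two_means(sigma, delta, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference delta between two means."""
    z_a, z_b = z_score(1 - alpha / 2), z_score(power)
    return ceil(2 * (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2)

def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference between two proportions."""
    z_a, z_b = z_score(1 - alpha / 2), z_score(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Illustrative inputs (not taken from any real study)
print(n_for_proportion(p=0.5, e=0.05))             # 385
print(n_per_group_two_means(sigma=10, delta=5))    # 63
print(n_per_group_two_proportions(0.6, 0.4))       # 97
```

Using $p = 0.5$ in the first function gives the most conservative (largest) sample size, which is a common choice when no prior estimate of the proportion is available.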

Sampling Bias and Representativeness

Types of Sampling Bias

Selection bias
- Sample is not representative of the population due to the sampling method or the willingness of individuals to participate
- Can lead to overestimation or underestimation of population parameters
- Example: Recruiting participants through social media may exclude individuals without internet access
Non-response bias
- Occurs when individuals who do not respond to a survey or participate in a study differ systematically from those who do respond or participate
- Can affect the generalizability of the results
- Example: Individuals with strong opinions on a topic may be more likely to respond to a survey than those with neutral opinions
Volunteer bias
- Occurs when individuals who volunteer to participate in a study differ from those who do not volunteer
- Can lead to biased results, especially if the study involves sensitive topics or requires a significant time commitment
- Example: Individuals who volunteer for a study on exercise habits may be more health-conscious than the general population
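
A small simulation can make volunteer bias concrete. Everything below is an assumed toy model: the population of 10,000 weekly exercise values is simulated, and the rule that the chance of volunteering rises with exercise hours is invented purely for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: weekly exercise hours for 10,000 people
population = [max(0.0, rng.gauss(3.0, 1.5)) for _ in range(10_000)]

# Simple random sample: every person is equally likely to be chosen
random_sample = rng.sample(population, 500)

# Volunteer sample: assumed mechanism in which people who exercise more
# are more likely to sign up (probability 0.02 per weekly hour, capped at 1)
volunteers = [h for h in population if rng.random() < min(1.0, 0.02 * h)]

print(f"population mean:    {mean(population):.2f} h/week")
print(f"random sample mean: {mean(random_sample):.2f} h/week")
print(f"volunteer mean:     {mean(volunteers):.2f} h/week")
```

Under these assumptions the random sample should track the population mean closely, while the volunteer sample overstates it, even though both samples are of similar size.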

Assessing Sample Representativeness

Compare sample characteristics to known population characteristics
- Demographic variables (age, gender, education level)
- Relevant clinical or behavioral characteristics
- Helps identify potential discrepancies between the sample and the population
Conduct a non-response analysis
- Compare characteristics of respondents and non-respondents
- Identify potential differences that may affect the generalizability of the results
- Example: Comparing the age distribution of survey respondents with that of invited individuals who did not respond
Use probability sampling methods
- Simple random sampling and stratified sampling are more likely to produce representative samples than non-probability methods
- Ensure that all members of the population have a known, non-zero probability of being selected
- Example: Using a random number generator to select participants from a complete list of the population
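
One way to compare sample characteristics with known population characteristics is a chi-square goodness-of-fit statistic. The sketch below is a minimal, hedged example: the age groups, population proportions, and sample counts are made-up numbers, and the critical value 5.991 is the standard table value for 2 degrees of freedom at the 5% level.

```python
def chi_square_gof(sample_counts, population_props):
    """Chi-square statistic comparing observed counts with expected counts."""
    n = sum(sample_counts.values())
    stat = 0.0
    for group, prop in population_props.items():
        expected = n * prop
        observed = sample_counts.get(group, 0)
        stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical age structure of the target population and of the sample
population_props = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
sample_counts = {"18-34": 90, "35-54": 80, "55+": 30}

CRITICAL_DF2_ALPHA05 = 5.991  # chi-square table value, df = 3 groups - 1

stat = chi_square_gof(sample_counts, population_props)
print(f"chi-square = {stat:.1f}")
if stat > CRITICAL_DF2_ALPHA05:
    print("Sample age distribution differs from the population")
else:
    print("No evidence of a difference in age distribution")
```

In this made-up case the sample over-represents the youngest group and under-represents the oldest, so the statistic is far above the critical value.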

Sample Size and Statistical Power

Importance of Adequate Sample Size

Ensures sufficient statistical power to detect a true effect or difference when it exists
- Statistical power is the probability of correctly rejecting a false null hypothesis
- Larger sample sizes increase the power of a study to detect a given effect size
- Example: A study with 100 participants may have 80% power to detect a medium effect size, while a study with 500 participants may have 99% power to detect the same effect size
Avoids false-negative results (Type II error)
- Underpowered studies may fail to detect a true effect due to small sample sizes
- Can lead to the erroneous conclusion that there is no significant difference or relationship between variables
- Example: A study with 50 participants may fail to detect a true difference in blood pressure between two treatment groups, even if the difference exists in the population
Prevents overinterpretation of results
- Overpowered studies may detect statistically significant but practically insignificant effects
- Can waste resources and overemphasize minor differences
- Example: A study with 10,000 participants may find a statistically significant difference in weight loss between two diet groups, even if the difference is only 0.5 pounds
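
The link between sample size and power can be sketched with a normal approximation for a two-sided comparison of two group means. The function name and the choice of a standardized difference of 0.5 (often described as a medium effect) are assumptions made for illustration; dedicated power software uses the t distribution and will give slightly different values.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    d is the standardized difference (difference in means divided by the
    common standard deviation). Uses the normal approximation and ignores
    the negligible chance of rejecting in the wrong direction.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(d * sqrt(n_per_group / 2) - z_crit)

for n in (25, 64, 250):
    print(f"n per group = {n:>3}: power ~ {approx_power(0.5, n):.2f}")
```

Power rises steeply at first and then flattens, which is why adding participants beyond a certain point buys little additional ability to detect the same effect.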

Balancing Type I and Type II Errors

Type I error (false positive)
- Rejecting a true null hypothesis
- Its probability, the significance level α, is commonly set at 5% (α = 0.05)
- Smaller α levels require larger sample sizes
Type II error (false negative)
- Failing to reject a false null hypothesis
- Its probability, β, is commonly set at 20% (β = 0.20, corresponding to a power of 80%)
- Smaller β levels (higher power) require larger sample sizes
Adequate sample size determination helps balance the risks of Type I and Type II errors
- Ensures sufficient power to detect meaningful effects
- Keeps the false-positive rate at the chosen α
- Example: A study with a sample size of 200 may have a power of 90% to detect a medium effect size, while maintaining a Type I error rate of 5%
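
To see how tightening either error rate raises the required sample size, the two-means formula from earlier in this guide can be evaluated over a small grid of α and power values. This is a sketch under the same normal approximation; the standardized difference of 0.5 and the function name are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha, power):
    """Per-group sample size for a standardized difference d (delta / sigma)."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2)

# Required n per group for each combination of alpha and power
table = {(alpha, power): n_per_group(0.5, alpha, power)
         for alpha in (0.05, 0.01)
         for power in (0.80, 0.90)}

for (alpha, power), n in table.items():
    print(f"alpha = {alpha:.2f}, power = {power:.0%}: n per group = {n}")
```

Reading across the output shows both effects named in the list: a smaller α and a smaller β each push the required sample size up, and demanding both at once is the most expensive option.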

Power Analysis and Study Design

Conduct power analysis before the study begins
- Determine the appropriate sample size based on the desired power, effect size, and significance level
- Helps researchers design more efficient and informative studies
- Ensures that the study has sufficient power to answer the research question
Use the results of power analysis to guide study design
- Adjust the sample size, the smallest effect size the study aims to detect, or the significance level as needed
- Consider the feasibility and cost of recruiting the required number of participants
- Example: If a power analysis indicates that a sample size of 500 is needed to detect a small effect size, but recruiting 500 participants is not feasible, researchers may need to adjust their research question or consider alternative study designs
Adequate sample size determination leads to more reliable and reproducible results
- Increases the chances of detecting true effects
- Reduces the risk of false negatives while the false-positive rate stays at the chosen α
- Enhances the credibility and generalizability of the study findings
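
When the required sample size is not feasible, the same relationship can be turned around to ask what effect a feasible sample could detect. The sketch below rearranges the two-means formula under the normal approximation; the function name and the candidate sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized difference (delta / sigma) detectable with
    the given per-group sample size, significance level, and power."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(power)) * sqrt(2 / n_per_group)

for n in (50, 100, 250):
    print(f"n per group = {n:>3}: "
          f"detectable difference ~ {min_detectable_effect(n):.2f} SD")
```

If the smallest detectable difference at the feasible sample size is larger than any effect that would matter biologically, that is a signal to revisit the research question or the study design before collecting data.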
