📊Probability and Statistics Unit 6 Review

6.5 Observational studies and experiments

Written by the Fiveable Content Team • Last updated September 2025

Observational studies and experiments are crucial tools in scientific research. They help us understand relationships between variables and test hypotheses. Each method has its strengths and limitations, shaping how we gather and interpret data.

Observational studies examine real-world phenomena without manipulation, while experiments control variables to establish causality. Understanding their differences is key to choosing the right approach and interpreting results accurately in various research contexts.

Types of observational studies

Observational studies involve collecting data without manipulating the variables of interest, allowing researchers to study relationships between variables in real-world settings
Different types of observational studies are used depending on the research question, available resources, and the nature of the variables being studied

Prospective vs retrospective studies

Prospective studies follow participants forward in time, collecting data on exposures and outcomes as they occur (Framingham Heart Study)
Retrospective studies look back in time, using existing data to examine the relationship between past exposures and current outcomes (case-control studies of smoking and lung cancer)
Prospective studies are generally less prone to recall bias and incomplete records, because exposures are measured before outcomes occur, but they are more time-consuming and expensive than retrospective studies

Cross-sectional vs longitudinal studies

Cross-sectional studies collect data from a population at a single point in time, providing a snapshot of the variables of interest (National Health and Nutrition Examination Survey)
Longitudinal studies follow the same individuals over an extended period, allowing researchers to track changes in variables over time (Nurses' Health Study)
Cross-sectional studies are quicker and less expensive but cannot establish temporal relationships between variables, while longitudinal studies can show that an exposure preceded an outcome, which is necessary (though not sufficient) for a cause-and-effect claim

Cohort vs case-control studies

Cohort studies follow a group of individuals who share a common characteristic (exposure) and compare their outcomes to a group without the exposure (Physicians' Health Study)
Case-control studies compare individuals with a specific outcome (cases) to those without the outcome (controls) and look back to determine the exposure status (study of thalidomide and birth defects)
Cohort studies are better for studying rare exposures and multiple outcomes, while case-control studies are more efficient for studying rare outcomes
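The two designs also lead to different measures of association: a cohort study can estimate risks directly and so supports a relative risk, while a case-control study fixes the number of cases and controls and is usually summarized with an odds ratio. A minimal sketch of both calculations, using made-up counts (the function names and all numbers are hypothetical, not taken from any study named above):

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk ratio from cohort data: risk in exposed / risk in unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Exposure odds ratio from case-control data (cross-product ratio)."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical cohort: 1,000 exposed (30 develop the outcome),
# 1,000 unexposed (10 develop the outcome)
rr = relative_risk(30, 1000, 10, 1000)

# Hypothetical case-control study: 100 cases (60 exposed),
# 100 controls (30 exposed)
or_ = odds_ratio(60, 40, 30, 70)

print(f"relative risk: {rr:.2f}")
print(f"odds ratio:    {or_:.2f}")
```

When the outcome is rare, the odds ratio approximates the relative risk, which is one reason case-control studies remain useful for rare outcomes.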

Designing observational studies

Careful design of observational studies is crucial to ensure the validity and reliability of the results, minimizing bias and confounding factors
Key steps in designing an observational study include defining the research question, selecting the study population, determining the sample size, choosing the sampling method, and collecting data

Defining the research question

A clear and specific research question guides the entire study design process, ensuring that the study is focused and feasible
The research question should be based on a thorough literature review and should address a gap in the existing knowledge
Example research question: "Is there an association between daily coffee consumption and the risk of developing type 2 diabetes?"

Selecting the study population

The study population should be representative of the target population to which the results will be generalized
Inclusion and exclusion criteria should be clearly defined to ensure that the study population is appropriate for answering the research question
Example study population: "Adults aged 18-65 years living in the United States who have no history of type 2 diabetes at baseline"

Determining the sample size

An adequate sample size is necessary to detect meaningful differences or associations between variables with sufficient statistical power
Sample size calculations should take into account the expected effect size, the desired level of significance (the type I error rate), and the desired power (1 minus the acceptable type II error rate)
Example sample size calculation: "To detect a 20% difference in the risk of developing type 2 diabetes between coffee drinkers and non-drinkers with 80% power and a significance level of 0.05, a sample size of 1,000 participants is required" (an illustrative figure; the actual number depends on the baseline risk of diabetes in the study population)
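The required sample size depends heavily on the baseline risk, which the example above leaves unstated. The sketch below uses the usual normal-approximation formula for comparing two proportions, n per group = (z_(1-α/2) + z_(power))² · [p1(1-p1) + p2(1-p2)] / (p1 - p2)². The baseline risk of 10% and the function name `n_per_group` are assumptions made for illustration only:

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # quantile matching desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Assumed (hypothetical) risks: 10% in non-drinkers, 8% in coffee drinkers,
# i.e. a 20% relative reduction
n_small_effect = n_per_group(0.10, 0.08)

# A larger assumed difference (10% vs 5%) needs far fewer participants
n_large_effect = n_per_group(0.10, 0.05)

print(n_small_effect, n_large_effect)
```

Under these assumed risks the formula calls for a few thousand participants per group, which illustrates how quickly the requirement grows when the effect is small relative to the baseline risk.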

Choosing the sampling method

The sampling method should be chosen based on the research question, the available resources, and the characteristics of the study population
Common sampling methods include simple random sampling, stratified sampling, cluster sampling, and convenience sampling
Example sampling method: "Stratified random sampling will be used to ensure that the sample is representative of the U.S. adult population in terms of age, gender, and race/ethnicity"
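A minimal sketch of proportional stratified random sampling, assuming a sampling frame is available as a list of records. The frame, the age groups, and the helper `stratified_sample` are hypothetical and stand in for whatever frame a real study would use:

```python
import random


def stratified_sample(units, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum by simple random sampling."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)

    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)     # proportional allocation
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample


# Hypothetical frame of 1,000 adults: (person_id, age_group)
frame = [(i, "18-39") for i in range(600)] + [(i, "40-65") for i in range(600, 1000)]

sample = stratified_sample(frame, stratum_of=lambda person: person[1], fraction=0.1)
younger = sum(1 for _, group in sample if group == "18-39")
print(len(sample), younger)
```

Because each stratum contributes in proportion to its size, the sample matches the frame on the stratifying variable by construction rather than by chance.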

Collecting data using surveys or measurements

Data collection methods should be standardized and validated to ensure the accuracy and reliability of the data
Surveys should be carefully designed to minimize response bias and ensure that the questions are clear and unambiguous
Measurements should be performed using calibrated instruments and standardized protocols to minimize measurement error
Example data collection: "Participants will complete a validated food frequency questionnaire to assess their daily coffee consumption, and their fasting blood glucose levels will be measured at baseline and at annual follow-up visits"

Advantages of observational studies

Observational studies offer several advantages over experimental studies, particularly in terms of their ability to study real-world populations and outcomes
These advantages make observational studies an essential tool in epidemiology and public health research

Real-world settings and populations

Observational studies allow researchers to study variables and relationships as they occur in natural settings, without the artificial constraints of an experimental setting
This enables researchers to investigate the effects of exposures and interventions in diverse, representative populations, enhancing the external validity of the findings
Example: A study of the association between air pollution and respiratory health in a large urban population

Efficiency in time and cost

Observational studies are often less time-consuming and less expensive than experimental studies, as they do not require the manipulation of variables or the creation of controlled conditions
This efficiency allows researchers to study larger sample sizes and longer time periods, which can be particularly valuable for investigating rare outcomes or long-term effects
Example: A retrospective study using existing medical records to investigate the association between obesity and the risk of developing certain cancers

Ability to study rare outcomes or exposures

Observational studies are well-suited for studying rare outcomes or exposures, as they can include large, diverse populations and long follow-up periods
This is particularly valuable for investigating the effects of uncommon exposures (environmental toxins) or outcomes (rare diseases) that would be difficult or unethical to study in an experimental setting
Example: A case-control study investigating the association between a rare genetic mutation and the development of a specific type of cancer

Limitations of observational studies

Despite their advantages, observational studies have several limitations that can affect the validity and reliability of their findings
These limitations should be carefully considered when designing and interpreting observational studies

Lack of control over extraneous variables

In observational studies, researchers do not have control over extraneous variables that may influence the relationship between the variables of interest
This lack of control can make it difficult to isolate the effect of a specific exposure or intervention, as other factors may be contributing to the observed outcomes
Example: A study investigating the association between coffee consumption and heart disease may be confounded by factors such as smoking, diet, and physical activity

Potential for confounding and bias

Observational studies are susceptible to confounding, where a third variable influences both the exposure and the outcome, creating a spurious association
Selection bias can occur when the study population is not representative of the target population, while information bias can result from inaccurate or incomplete data collection
Example: A study finding an association between alcohol consumption and lung cancer may be confounded by smoking, as individuals who drink heavily are more likely to smoke
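The alcohol and smoking example can be made concrete by comparing the crude association with the association inside each smoking stratum. The counts below are invented so that drinking has no effect on lung cancer within either stratum; they are not real data:

```python
# Hypothetical counts: (group size, lung cancer cases)
data = {
    "smoker":     {"drinker": (800, 80), "non_drinker": (200, 20)},
    "non_smoker": {"drinker": (200, 2),  "non_drinker": (800, 8)},
}


def risk_ratio(drinker, non_drinker):
    """Risk in drinkers divided by risk in non-drinkers."""
    (n1, c1), (n0, c0) = drinker, non_drinker
    return (c1 / n1) / (c0 / n0)


def pooled(status):
    """Combine both smoking strata, ignoring smoking status."""
    size = sum(data[s][status][0] for s in data)
    cases = sum(data[s][status][1] for s in data)
    return size, cases


# Crude comparison ignores smoking and suggests drinking is harmful
crude_rr = risk_ratio(pooled("drinker"), pooled("non_drinker"))

# Within each smoking stratum the association disappears
stratum_rr = {s: risk_ratio(data[s]["drinker"], data[s]["non_drinker"]) for s in data}

print(f"crude risk ratio: {crude_rr:.2f}")
for status, rr in stratum_rr.items():
    print(f"{status}: {rr:.2f}")
```

In this constructed example the crude risk ratio is well above 1 only because drinkers are mostly smokers; stratifying on the confounder removes the spurious association.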

Difficulty in establishing causality

Observational studies can demonstrate associations between variables but cannot definitively establish causality
This is because the lack of random assignment and control over extraneous variables makes it difficult to determine the direction of the relationship and rule out alternative explanations
Example: A study showing an association between high levels of stress and the development of heart disease cannot prove that stress causes heart disease, as other factors (poor diet, lack of exercise) may be contributing to both the stress and the heart disease

Types of experiments

Experiments are a key tool in scientific research, allowing researchers to test hypotheses and establish causal relationships between variables
Different types of experiments are used depending on the research question, the nature of the variables being studied, and the available resources

Controlled vs natural experiments

Controlled experiments are conducted in a setting the researcher controls (often a laboratory or clinic), where the researcher manipulates the independent variable and controls for extraneous variables
Natural experiments occur when an event or policy change, rather than the researcher, determines who is exposed, creating a quasi-experimental situation that allows researchers to study the effects of the change in a real-world setting
Example of a controlled experiment: Testing the effectiveness of a new drug in a randomized controlled trial
Example of a natural experiment: Studying the impact of a smoking ban on air quality and respiratory health in a city

Between-subjects vs within-subjects designs

Between-subjects designs involve comparing different groups of participants, each exposed to a different level of the independent variable
Within-subjects designs involve exposing each participant to all levels of the independent variable, allowing for comparisons within individuals
Between-subjects designs are less susceptible to carryover effects but require larger sample sizes, while within-subjects designs are more efficient but may be affected by order effects
Example of a between-subjects design: Comparing the effects of three different exercise regimens on weight loss in separate groups of participants
Example of a within-subjects design: Testing the effects of different noise levels on task performance, with each participant exposed to all noise levels

Factorial vs non-factorial designs

Factorial designs involve manipulating two or more independent variables simultaneously, allowing researchers to study the main effects and interactions between the variables
Non-factorial designs involve manipulating only one independent variable at a time
Factorial designs are more efficient and can provide more information but may be more complex to interpret
Example of a factorial design: Studying the effects of both drug dosage and frequency of administration on patient outcomes
Example of a non-factorial design: Comparing the effectiveness of a new teaching method to a traditional method, with only the teaching method being manipulated
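A minimal sketch of what a 2 x 2 factorial design looks like in practice, using the dosage and frequency example. The factor levels and the mean outcomes in each condition are invented for illustration:

```python
from itertools import product

# Hypothetical factor levels for the drug example
dosages = ["low", "high"]
frequencies = ["once daily", "twice daily"]

# Every combination of levels is one treatment condition (2 x 2 = 4)
conditions = list(product(dosages, frequencies))

# Hypothetical mean improvement observed in each condition (made-up numbers)
cell_means = {
    ("low", "once daily"): 4.0,
    ("low", "twice daily"): 6.0,
    ("high", "once daily"): 8.0,
    ("high", "twice daily"): 14.0,
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Main effect: average difference between the levels of one factor
dosage_effect = (mean(cell_means[("high", f)] for f in frequencies)
                 - mean(cell_means[("low", f)] for f in frequencies))
frequency_effect = (mean(cell_means[(d, "twice daily")] for d in dosages)
                    - mean(cell_means[(d, "once daily")] for d in dosages))

# Interaction: does the frequency effect differ between dosage levels?
interaction = ((cell_means[("high", "twice daily")] - cell_means[("high", "once daily")])
               - (cell_means[("low", "twice daily")] - cell_means[("low", "once daily")]))

print(conditions)
print(dosage_effect, frequency_effect, interaction)
```

A nonzero interaction means the effect of one factor depends on the level of the other, which is information a one-factor-at-a-time design cannot provide.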

Designing experiments

Proper design is essential for ensuring the validity and reliability of experimental results
Key steps in designing an experiment include formulating testable hypotheses, identifying variables, defining experimental units, randomization and random assignment, and controlling for confounding variables

Formulating testable hypotheses

A testable hypothesis is a prediction about the relationship between the independent and dependent variables that can be verified through experimentation
Hypotheses should be based on existing theory and evidence and should be stated in a clear, concise, and falsifiable manner
Example hypothesis: "Participants who receive the new drug will have significantly lower blood pressure compared to those who receive a placebo"

Identifying independent and dependent variables

The independent variable is the factor that is manipulated by the researcher, while the dependent variable is the outcome that is measured
Clearly defining and operationalizing the variables is essential for ensuring that the experiment is replicable and the results are interpretable
Example independent variable: Type of exercise (aerobic, strength training, or flexibility)
Example dependent variable: Change in body mass index (BMI) after 12 weeks

Defining the experimental units

Experimental units are the entities that are randomly assigned to different treatment conditions
Experimental units can be individuals, groups, or even objects, depending on the nature of the research question
Example experimental units: Individual participants in a study comparing the effects of different diets on weight loss

Randomization and random assignment

Randomization involves using a chance process, rather than the researcher's or participant's choice, to assign experimental units to different treatment conditions
Random assignment balances known and unknown characteristics across groups on average, so that differences in outcomes can be attributed to the treatment rather than pre-existing differences
Example randomization procedure: Using a computer-generated random number sequence to assign participants to either the treatment or control group
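A minimal sketch of such a procedure, assuming participants are identified by simple labels. The helper `randomly_assign`, the participant IDs, and the seed are hypothetical:

```python
import random


def randomly_assign(units, groups=("treatment", "control"), seed=None):
    """Shuffle the units, then deal them out to the groups in turn."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)

    assignment = {group: [] for group in groups}
    for index, unit in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(unit)
    return assignment


# Hypothetical participant IDs P01 ... P20
participants = [f"P{i:02d}" for i in range(1, 21)]
assignment = randomly_assign(participants, seed=42)

for group, members in assignment.items():
    print(group, sorted(members))
```

Dealing out a shuffled list keeps the group sizes equal (or within one of each other), which is a common design choice; assigning each unit by an independent coin flip is also valid but can produce unequal groups.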

Controlling for confounding variables

Confounding variables are factors that are related to both the independent and dependent variables and can obscure the true relationship between them
Controlling for confounding variables through random assignment, blocking (matching or stratifying units before assignment), or statistical adjustment is essential for ensuring the internal validity of the experiment
Example confounding variable: Age in a study of the effects of a new medication on blood pressure; if older participants were more likely to end up in the treatment group, age would be related to both medication use and blood pressure

Advantages of experiments

Experiments offer several key advantages over observational studies, particularly in terms of their ability to establish causal relationships and control for extraneous variables
These advantages make experiments a powerful tool for testing hypotheses and advancing scientific knowledge

Control over extraneous variables

In experiments, researchers have control over extraneous variables that may influence the relationship between the independent and dependent variables
This control allows researchers to isolate the effect of the independent variable and minimize the impact of confounding factors
Example: In a study of the effects of a new teaching method on student performance, researchers can control for factors such as class size, teacher experience, and student background

Ability to establish causality

Experiments are designed to test causal hypotheses and determine the direction of the relationship between variables
By manipulating the independent variable and controlling for extraneous factors, researchers can conclude that changes in the dependent variable are caused by changes in the independent variable
Example: A randomized controlled trial demonstrating that a new drug causes a reduction in blood pressure, rather than simply being associated with lower blood pressure

High internal validity

Internal validity refers to the extent to which the results of a study can be attributed to the manipulation of the independent variable rather than other factors
Experiments typically have high internal validity due to their ability to control for extraneous variables and randomly assign participants to treatment conditions
Example: A well-designed experiment comparing the effects of different exercise regimens on weight loss, with participants randomly assigned to each regimen and all other factors held constant

Limitations of experiments

Despite their strengths, experiments also have several limitations that can affect the generalizability and applicability of their findings
These limitations should be carefully considered when designing and interpreting experiments

Artificiality of the setting

Experiments are often conducted in highly controlled, artificial settings that may not reflect the complexity and variability of real-world environments
This artificiality can limit the external validity of the findings and make it difficult to generalize the results to other populations or settings
Example: A laboratory study of the effects of noise on task performance may not fully capture the impact of noise in a real-world office environment

Limited generalizability to real-world populations

Experimental samples are often highly selected and may not be representative of the broader population of interest
This limited generalizability can make it difficult to apply the findings of an experiment to real-world populations and settings
Example: A study of the effects of a new educational intervention on a sample of high-achieving students may not generalize to students with diverse backgrounds and abilities

Ethical considerations and constraints

Experiments often involve exposing participants to different treatment conditions, which can raise ethical concerns about the potential risks and benefits to participants
Ethical guidelines and institutional review boards place constraints on the types of experiments that can be conducted and the populations that can be studied
Example: A proposed experiment exposing participants to a potentially harmful substance may not be approved due to ethical concerns, even if it could provide valuable scientific information

Comparing observational studies and experiments

Observational studies and experiments are two key approaches to scientific research, each with its own strengths and limitations
Understanding the differences between these approaches is essential for selecting the most appropriate method for a given research question and interpreting the results of scientific studies

Internal vs external validity

Internal validity refers to the extent to which the results of a study can be attributed to the manipulation of the independent variable, while external validity refers to the extent to which the results can be generalized to other populations and settings
Experiments typically have high internal validity due to their ability to control for extraneous variables, but may have limited external validity due to the artificiality of the setting and the selectivity of the sample
Observational studies often have higher external validity as they are conducted in real-world settings with diverse populations, but may have lower internal validity due to the lack of control over extraneous variables

Causal inference vs association

Experiments are designed to test causal hypotheses and establish the direction of the relationship between variables, while observational studies can only demonstrate associations between variables
The ability to make causal inferences is a key strength of experiments, but the lack of random assignment and control in observational studies makes it difficult to rule out alternative explanations for the observed associations
Example: An experiment showing that a new drug causes a reduction in blood pressure vs an observational study showing an association between taking the drug and lower blood pressure
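The contrast can be illustrated with a small simulation. It assumes a made-up world in which the drug truly lowers blood pressure by 5 mmHg, a hidden "healthy lifestyle" factor lowers it by 10 mmHg, and, in the observational setting, people with healthy lifestyles are more likely to take the drug. All numbers and names are hypothetical:

```python
import random


def estimate_drug_effect(n, randomized, seed=1):
    """Difference in mean blood pressure: drug takers minus non-takers."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        healthy = rng.random() < 0.5  # hidden confounder
        if randomized:
            takes_drug = rng.random() < 0.5  # coin flip, unrelated to lifestyle
        else:
            # Self-selection: healthy people choose the drug more often
            takes_drug = rng.random() < (0.8 if healthy else 0.2)
        blood_pressure = (140
                          - (5 if takes_drug else 0)   # true drug effect
                          - (10 if healthy else 0)     # lifestyle effect
                          + rng.gauss(0, 5))           # individual variation
        (treated if takes_drug else untreated).append(blood_pressure)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


experimental_estimate = estimate_drug_effect(20_000, randomized=True)
observational_estimate = estimate_drug_effect(20_000, randomized=False)

print(f"randomized experiment: {experimental_estimate:.1f} mmHg")
print(f"observational study:   {observational_estimate:.1f} mmHg")
```

With random assignment the estimate lands close to the true effect of -5 mmHg, while the observational comparison overstates the benefit (roughly -11 mmHg under these assumptions) because it mixes the drug effect with the lifestyle effect.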

Practicality vs control

Observational studies are often more practical and feasible than experiments, as they do not require the manipulation of variables or the creation of controlled conditions
However, the lack of control in observational studies can make it difficult to isolate the effects of specific variables and rule out confounding factors
Experiments offer greater control over variables and the ability to establish causality, but may be more resource-intensive and less applicable to real-world settings
Example: A large-scale observational study of the relationship between diet and heart disease vs a controlled feeding experiment testing the effects of specific dietary factors on cardiovascular risk markers
