Fiveable

📊Probability and Statistics Unit 6 Review

6.5 Observational studies and experiments

Written by the Fiveable Content Team • Last updated September 2025

Observational studies and experiments are crucial tools in scientific research. They help us understand relationships between variables and test hypotheses. Each method has its strengths and limitations, shaping how we gather and interpret data.

Observational studies examine real-world phenomena without manipulation, while experiments control variables to establish causality. Understanding their differences is key to choosing the right approach and interpreting results accurately in various research contexts.

Types of observational studies

  • Observational studies involve collecting data without manipulating the variables of interest, allowing researchers to study relationships between variables in real-world settings
  • Different types of observational studies are used depending on the research question, available resources, and the nature of the variables being studied

Prospective vs retrospective studies

  • Prospective studies follow participants forward in time, collecting data on exposures and outcomes as they occur (Framingham Heart Study)
  • Retrospective studies look back in time, using existing data to examine the relationship between past exposures and current outcomes (case-control studies of smoking and lung cancer)
  • Prospective studies are generally less prone to recall bias, because exposures are recorded before outcomes occur, but they are more time-consuming and expensive than retrospective studies

Cross-sectional vs longitudinal studies

  • Cross-sectional studies collect data from a population at a single point in time, providing a snapshot of the variables of interest (National Health and Nutrition Examination Survey)
  • Longitudinal studies follow the same individuals over an extended period, allowing researchers to track changes in variables over time (Nurses' Health Study)
  • Cross-sectional studies are quicker and less expensive but cannot establish temporal relationships between variables, while longitudinal studies can show that an exposure preceded an outcome, which strengthens (but does not by itself prove) a cause-and-effect interpretation

Cohort vs case-control studies

  • Cohort studies follow a group of individuals who share a common characteristic (exposure) and compare their outcomes to a group without the exposure (Physicians' Health Study)
  • Case-control studies compare individuals with a specific outcome (cases) to those without the outcome (controls) and look back to determine the exposure status (study of thalidomide and birth defects)
  • Cohort studies are better for studying rare exposures and multiple outcomes, while case-control studies are more efficient for studying rare outcomes
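The two designs also lead to different measures of association. As a minimal sketch, assuming a hypothetical 2x2 table of counts (the numbers below are invented for illustration), a cohort study can estimate the risk in each exposure group directly and so supports a risk ratio, while a case-control study fixes the numbers of cases and controls in advance and typically reports an odds ratio:

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk ratio: risk of the outcome among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio: cross-product ratio of the 2x2 table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical counts, invented for illustration only.
table = dict(exposed_cases=30, exposed_noncases=70,
             unexposed_cases=10, unexposed_noncases=90)

rr = risk_ratio(**table)
or_ = odds_ratio(**table)
print(f"risk ratio = {rr:.2f}, odds ratio = {or_:.2f}")
```

When the outcome is rare, the odds ratio is close to the risk ratio; in this made-up table the outcome is common, so the two measures differ noticeably.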

Designing observational studies

  • Careful design of observational studies is crucial to ensure the validity and reliability of the results, minimizing bias and confounding factors
  • Key steps in designing an observational study include defining the research question, selecting the study population, determining the sample size, choosing the sampling method, and collecting data

Defining the research question

  • A clear and specific research question guides the entire study design process, ensuring that the study is focused and feasible
  • The research question should be based on a thorough literature review and should address a gap in the existing knowledge
  • Example research question: "Is there an association between daily coffee consumption and the risk of developing type 2 diabetes?"

Selecting the study population

  • The study population should be representative of the target population to which the results will be generalized
  • Inclusion and exclusion criteria should be clearly defined to ensure that the study population is appropriate for answering the research question
  • Example study population: "Adults aged 18-65 years living in the United States who have no history of type 2 diabetes at baseline"

Determining the sample size

  • An adequate sample size is necessary to detect meaningful differences or associations between variables with sufficient statistical power
  • Sample size calculations should take into account the expected effect size, the desired significance level (the type I error rate), and the desired power (1 minus the type II error rate)
  • Example sample size statement: "To detect a 20% difference in the risk of developing type 2 diabetes between coffee drinkers and non-drinkers with 80% power and a significance level of 0.05, a sample size of 1,000 participants is required" (the number required in a real study depends on the assumed baseline risk)
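A calculation of this kind can be sketched as follows. This is one common normal-approximation formula for comparing two proportions with equal group sizes, not necessarily the method behind the example above, and the risks of 10% and 8% are assumptions chosen only to represent a 20% relative difference:

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two proportions
    (two-sided test, normal approximation, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical risks: 10% among non-drinkers, 8% among coffee drinkers
# (a 20% relative reduction). These values are assumptions, not study data.
n = n_per_group(0.10, 0.08)
print(n, "participants per group")
```

The result is sensitive to the inputs: a larger expected difference lowers the required sample size, and asking for more power raises it.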

Choosing the sampling method

  • The sampling method should be chosen based on the research question, the available resources, and the characteristics of the study population
  • Common sampling methods include simple random sampling, stratified sampling, cluster sampling, and convenience sampling
  • Example sampling method: "Stratified random sampling will be used to ensure that the sample is representative of the U.S. adult population in terms of age, gender, and race/ethnicity"
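As a rough sketch of stratified random sampling with proportional allocation, the following draws the same fraction at random from each stratum. The sampling frame, the age groups, and the helper name `stratified_sample` are all hypothetical:

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # number of units to draw from this stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical sampling frame: (person_id, age_group) pairs.
frame = [(i, "18-39") for i in range(600)] + [(i, "40-65") for i in range(600, 1000)]
sample = stratified_sample(frame, stratum_of=lambda person: person[1], fraction=0.1)
print(len(sample), "people sampled")
```

Because each stratum contributes in proportion to its size, the sample matches the frame on the stratifying variable by construction, which a simple random sample only achieves on average.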

Collecting data using surveys or measurements

  • Data collection methods should be standardized and validated to ensure the accuracy and reliability of the data
  • Surveys should be carefully designed to minimize response bias and ensure that the questions are clear and unambiguous
  • Measurements should be performed using calibrated instruments and standardized protocols to minimize measurement error
  • Example data collection: "Participants will complete a validated food frequency questionnaire to assess their daily coffee consumption, and their fasting blood glucose levels will be measured at baseline and at annual follow-up visits"

Advantages of observational studies

  • Observational studies offer several advantages over experimental studies, particularly in terms of their ability to study real-world populations and outcomes
  • These advantages make observational studies an essential tool in epidemiology and public health research

Real-world settings and populations

  • Observational studies allow researchers to study variables and relationships as they occur in natural settings, without the artificial constraints of an experimental setting
  • This enables researchers to investigate the effects of exposures and interventions in diverse, representative populations, enhancing the external validity of the findings
  • Example: A study of the association between air pollution and respiratory health in a large urban population

Efficiency in time and cost

  • Observational studies are often less time-consuming and less expensive than experimental studies, as they do not require the manipulation of variables or the creation of controlled conditions
  • This efficiency allows researchers to study larger sample sizes and longer time periods, which can be particularly valuable for investigating rare outcomes or long-term effects
  • Example: A retrospective study using existing medical records to investigate the association between obesity and the risk of developing certain cancers

Ability to study rare outcomes or exposures

  • Observational studies are well-suited for studying rare outcomes or exposures, as they can include large, diverse populations and long follow-up periods
  • This is particularly valuable for investigating the effects of uncommon exposures (environmental toxins) or outcomes (rare diseases) that would be difficult or unethical to study in an experimental setting
  • Example: A case-control study investigating the association between a rare genetic mutation and the development of a specific type of cancer

Limitations of observational studies

  • Despite their advantages, observational studies have several limitations that can affect the validity and reliability of their findings
  • These limitations should be carefully considered when designing and interpreting observational studies

Lack of control over extraneous variables

  • In observational studies, researchers do not have control over extraneous variables that may influence the relationship between the variables of interest
  • This lack of control can make it difficult to isolate the effect of a specific exposure or intervention, as other factors may be contributing to the observed outcomes
  • Example: A study investigating the association between coffee consumption and heart disease may be confounded by factors such as smoking, diet, and physical activity

Potential for confounding and bias

  • Observational studies are susceptible to confounding, where a third variable influences both the exposure and the outcome, creating a spurious association
  • Selection bias can occur when the study population is not representative of the target population, while information bias can result from inaccurate or incomplete data collection
  • Example: A study finding an association between alcohol consumption and lung cancer may be confounded by smoking, as individuals who drink heavily are more likely to smoke
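The alcohol and smoking example can be illustrated with a small simulation. This is a toy model under stated assumptions: every probability below is invented, and drinking is given no effect on cancer at all, so any association between the two arises only through smoking:

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility


def simulate_person():
    """Smoking raises both the chance of heavy drinking and the chance of cancer;
    drinking itself has no effect on cancer in this toy model."""
    smoker = rng.random() < 0.3
    drinker = rng.random() < (0.6 if smoker else 0.2)
    cancer = rng.random() < (0.15 if smoker else 0.01)
    return smoker, drinker, cancer


def risk(people):
    """Proportion of people in the list who developed cancer."""
    return sum(cancer for _, _, cancer in people) / len(people)


people = [simulate_person() for _ in range(200_000)]

# Crude comparison ignores smoking and shows a spurious association.
crude_diff = (risk([p for p in people if p[1]])
              - risk([p for p in people if not p[1]]))

# Comparing drinkers and non-drinkers within each smoking stratum
# removes most of the association.
stratified_diffs = {}
for smoker in (True, False):
    stratum = [p for p in people if p[0] == smoker]
    stratified_diffs[smoker] = (risk([p for p in stratum if p[1]])
                                - risk([p for p in stratum if not p[1]]))

print(f"crude risk difference: {crude_diff:.3f}")
print("within-stratum differences:",
      {k: round(v, 3) for k, v in stratified_diffs.items()})
```

Stratifying only works for confounders that were measured, which is why unmeasured confounding remains a concern in any observational study.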

Difficulty in establishing causality

  • Observational studies can demonstrate associations between variables but cannot definitively establish causality
  • This is because the lack of random assignment and control over extraneous variables makes it difficult to determine the direction of the relationship and rule out alternative explanations
  • Example: A study showing an association between high levels of stress and the development of heart disease cannot prove that stress causes heart disease, as other factors (poor diet, lack of exercise) may be contributing to both the stress and the heart disease

Types of experiments

  • Experiments are a key tool in scientific research, allowing researchers to test hypotheses and establish causal relationships between variables
  • Different types of experiments are used depending on the research question, the nature of the variables being studied, and the available resources

Controlled vs natural experiments

  • Controlled experiments are conducted under conditions set by the researcher (often in a laboratory or clinical setting), where the researcher manipulates the independent variable and controls for extraneous variables
  • Natural experiments occur when an event or policy change creates a quasi-experimental situation, allowing researchers to study the effects of the change in a real-world setting
  • Example of a controlled experiment: Testing the effectiveness of a new drug in a randomized controlled trial
  • Example of a natural experiment: Studying the impact of a smoking ban on air quality and respiratory health in a city

Between-subjects vs within-subjects designs

  • Between-subjects designs involve comparing different groups of participants, each exposed to a different level of the independent variable
  • Within-subjects designs involve exposing each participant to all levels of the independent variable, allowing for comparisons within individuals
  • Between-subjects designs are less susceptible to carryover effects but require larger sample sizes, while within-subjects designs are more efficient but may be affected by order effects
  • Example of a between-subjects design: Comparing the effects of three different exercise regimens on weight loss in separate groups of participants
  • Example of a within-subjects design: Testing the effects of different noise levels on task performance, with each participant exposed to all noise levels
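Order effects in a within-subjects design are commonly handled by counterbalancing, a technique the text above does not name. A minimal sketch, assuming three noise levels and twelve hypothetical participants, gives each possible presentation order to the same number of people:

```python
from itertools import permutations

noise_levels = ["quiet", "moderate", "loud"]    # levels of the independent variable
participants = [f"P{i}" for i in range(1, 13)]  # hypothetical participant IDs

# All 3! = 6 possible presentation orders.
orders = list(permutations(noise_levels))

# Cycle through the orders so each one is used equally often.
schedule = {person: orders[i % len(orders)] for i, person in enumerate(participants)}

for person, order in schedule.items():
    print(person, "->", ", ".join(order))
```

In a real study the participant list would normally be shuffled before this step, so that which person receives which order is itself left to chance.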

Factorial vs non-factorial designs

  • Factorial designs involve manipulating two or more independent variables simultaneously, allowing researchers to study the main effects and interactions between the variables
  • Non-factorial designs involve manipulating only one independent variable at a time
  • Factorial designs are more efficient and can provide more information but may be more complex to interpret
  • Example of a factorial design: Studying the effects of both drug dosage and frequency of administration on patient outcomes
  • Example of a non-factorial design: Comparing the effectiveness of a new teaching method to a traditional method, with only the teaching method being manipulated
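To make main effects and interactions concrete, here is a sketch of a 2x2 factorial layout for the dosage and frequency example. The level names and the mean outcomes are invented for illustration:

```python
from itertools import product

dosages = ["low", "high"]
frequencies = ["once daily", "twice daily"]

# Every combination of factor levels is one treatment condition.
conditions = list(product(dosages, frequencies))

# Hypothetical mean outcome in each condition (invented numbers).
mean_outcome = {
    ("low", "once daily"): 10.0,
    ("low", "twice daily"): 12.0,
    ("high", "once daily"): 14.0,
    ("high", "twice daily"): 20.0,
}


def average(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Main effect of dosage: high minus low, averaged over frequency.
dosage_effect = (average(mean_outcome[("high", f)] for f in frequencies)
                 - average(mean_outcome[("low", f)] for f in frequencies))

# Main effect of frequency: twice minus once, averaged over dosage.
frequency_effect = (average(mean_outcome[(d, "twice daily")] for d in dosages)
                    - average(mean_outcome[(d, "once daily")] for d in dosages))

# Interaction: how much the frequency effect changes between dosage levels.
interaction = ((mean_outcome[("high", "twice daily")] - mean_outcome[("high", "once daily")])
               - (mean_outcome[("low", "twice daily")] - mean_outcome[("low", "once daily")]))

print(dosage_effect, frequency_effect, interaction)
```

A nonzero interaction means the effect of one factor depends on the level of the other, which is information two separate single-factor experiments could not provide.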

Designing experiments

  • Proper design is essential for ensuring the validity and reliability of experimental results
  • Key steps in designing an experiment include formulating testable hypotheses, identifying variables, defining experimental units, randomly assigning units to treatments, and controlling for confounding variables

Formulating testable hypotheses

  • A testable hypothesis is a prediction about the relationship between the independent and dependent variables that can be tested through experimentation
  • Hypotheses should be based on existing theory and evidence and should be stated in a clear, concise, and falsifiable manner
  • Example hypothesis: "Participants who receive the new drug will have significantly lower blood pressure compared to those who receive a placebo"

Identifying independent and dependent variables

  • The independent variable is the factor that is manipulated by the researcher, while the dependent variable is the outcome that is measured
  • Clearly defining and operationalizing the variables is essential for ensuring that the experiment is replicable and the results are interpretable
  • Example independent variable: Type of exercise (aerobic, strength training, or flexibility)
  • Example dependent variable: Change in body mass index (BMI) after 12 weeks

Defining the experimental units

  • Experimental units are the entities that are randomly assigned to different treatment conditions
  • Experimental units can be individuals, groups, or even objects, depending on the nature of the research question
  • Example experimental units: Individual participants in a study comparing the effects of different diets on weight loss

Randomization and random assignment

  • Randomization involves using a chance process to assign experimental units to different treatment conditions
  • Random assignment helps to ensure that any differences between the groups are due to the treatment rather than pre-existing differences
  • Example randomization procedure: Using a computer-generated random number sequence to assign participants to either the treatment or control group
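A computer-generated assignment of this kind might look like the sketch below. The participant IDs, group names, and the helper `randomly_assign` are hypothetical; the seed is fixed only so the result can be reproduced and audited:

```python
import random


def randomly_assign(units, groups=("treatment", "control"), seed=None):
    """Shuffle the units, then deal them out to the groups in turn so that
    group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for i, unit in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(unit)
    return assignment


participants = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical participant IDs
assignment = randomly_assign(participants, seed=2025)

for group, members in assignment.items():
    print(group, sorted(members))
```

Shuffling first and then dealing out the units keeps the groups the same size, whereas flipping an independent coin for each participant could leave the groups unbalanced by chance.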

Controlling for confounding variables

  • Confounding variables are factors that are related to both the independent and dependent variables and can obscure the true relationship between them
  • Random assignment guards against confounding on average; additional methods such as blocking (grouping similar units before assignment), matching, stratification, or statistical adjustment further protect the internal validity of the experiment
  • Example confounding variable: Age in a study of the effects of a new medication on blood pressure, as age affects blood pressure and, without random assignment, may also be related to who receives the medication

Advantages of experiments

  • Experiments offer several key advantages over observational studies, particularly in terms of their ability to establish causal relationships and control for extraneous variables
  • These advantages make experiments a powerful tool for testing hypotheses and advancing scientific knowledge

Control over extraneous variables

  • In experiments, researchers have control over extraneous variables that may influence the relationship between the independent and dependent variables
  • This control allows researchers to isolate the effect of the independent variable and minimize the impact of confounding factors
  • Example: In a study of the effects of a new teaching method on student performance, researchers can control for factors such as class size, teacher experience, and student background

Ability to establish causality

  • Experiments are designed to test causal hypotheses and determine the direction of the relationship between variables
  • By manipulating the independent variable and controlling for extraneous factors, researchers can conclude that changes in the dependent variable are caused by changes in the independent variable
  • Example: A randomized controlled trial demonstrating that a new drug causes a reduction in blood pressure, rather than simply being associated with lower blood pressure

High internal validity

  • Internal validity refers to the extent to which the results of a study can be attributed to the manipulation of the independent variable rather than other factors
  • Experiments typically have high internal validity due to their ability to control for extraneous variables and randomly assign participants to treatment conditions
  • Example: A well-designed experiment comparing the effects of different exercise regimens on weight loss, with participants randomly assigned to each regimen and all other factors held constant

Limitations of experiments

  • Despite their strengths, experiments also have several limitations that can affect the generalizability and applicability of their findings
  • These limitations should be carefully considered when designing and interpreting experiments

Artificiality of the setting

  • Experiments are often conducted in highly controlled, artificial settings that may not reflect the complexity and variability of real-world environments
  • This artificiality can limit the external validity of the findings and make it difficult to generalize the results to other populations or settings
  • Example: A laboratory study of the effects of noise on task performance may not fully capture the impact of noise in a real-world office environment

Limited generalizability to real-world populations

  • Experimental samples are often highly selected and may not be representative of the broader population of interest
  • This limited generalizability can make it difficult to apply the findings of an experiment to real-world populations and settings
  • Example: A study of the effects of a new educational intervention on a sample of high-achieving students may not generalize to students with diverse backgrounds and abilities

Ethical considerations and constraints

  • Experiments often involve exposing participants to different treatment conditions, which can raise ethical concerns about the potential risks and benefits to participants
  • Ethical guidelines and institutional review boards place constraints on the types of experiments that can be conducted and the populations that can be studied
  • Example: A proposed experiment exposing participants to a potentially harmful substance may not be approved due to ethical concerns, even if it could provide valuable scientific information

Comparing observational studies and experiments

  • Observational studies and experiments are two key approaches to scientific research, each with its own strengths and limitations
  • Understanding the differences between these approaches is essential for selecting the most appropriate method for a given research question and interpreting the results of scientific studies

Internal vs external validity

  • Internal validity refers to the extent to which the results of a study can be attributed to the manipulation of the independent variable, while external validity refers to the extent to which the results can be generalized to other populations and settings
  • Experiments typically have high internal validity due to their ability to control for extraneous variables, but may have limited external validity due to the artificiality of the setting and the selectivity of the sample
  • Observational studies often have higher external validity as they are conducted in real-world settings with diverse populations, but may have lower internal validity due to the lack of control over extraneous variables

Causal inference vs association

  • Experiments are designed to test causal hypotheses and establish the direction of the relationship between variables, while observational studies can only demonstrate associations between variables
  • The ability to make causal inferences is a key strength of experiments, but the lack of random assignment and control in observational studies makes it difficult to rule out alternative explanations for the observed associations
  • Example: An experiment showing that a new drug causes a reduction in blood pressure vs an observational study showing an association between taking the drug and lower blood pressure

Practicality vs control

  • Observational studies are often more practical and feasible than experiments, as they do not require the manipulation of variables or the creation of controlled conditions
  • However, the lack of control in observational studies can make it difficult to isolate the effects of specific variables and rule out confounding factors
  • Experiments offer greater control over variables and the ability to establish causality, but may be more resource-intensive and less applicable to real-world settings
  • Example: A large-scale observational study of the relationship between diet and heart disease vs a controlled feeding experiment testing the effects of specific dietary factors on cardiovascular risk markers