Correlation is a crucial concept in probability, measuring the strength and direction of linear relationships between variables. It's bounded between -1 and 1, with 0 indicating no linear relationship. Understanding correlation's properties helps interpret data relationships accurately.
Correlation has useful properties such as symmetry and invariance under positive linear rescaling of the variables. However, it also has limitations: it does not imply causation, it misses nonlinear relationships, and it can be distorted by outliers. Knowing these nuances is key to proper statistical analysis.
Correlation Properties
Range and Interpretation
- Correlation coefficients always fall between -1 and 1, inclusive
- -1 signifies a perfect negative linear relationship
- 0 indicates no linear relationship
- 1 represents a perfect positive linear relationship
- Measures strength and direction of linear relationships between two variables
- Typically denoted as ρ (rho) for population correlation or r for sample correlation
- Square of correlation coefficient (r²) shows proportion of variance in one variable explained by linear relationship with other variable
- Example: r² of 0.64 means 64% of variance in Y explained by X
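As a quick illustration (a minimal sketch using made-up study-time and exam-score data), NumPy's corrcoef computes the sample correlation r, and squaring it gives the proportion of variance explained:

```python
import numpy as np

# Hypothetical data: hours studied (X) and exam score (Y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 58.0, 61.0, 70.0, 74.0, 79.0])

r = np.corrcoef(x, y)[0, 1]      # sample correlation coefficient
r_squared = r ** 2               # proportion of variance in Y explained by X

print(f"r = {r:.3f}")            # close to 1: strong positive linear relationship
print(f"r^2 = {r_squared:.3f}")  # share of variance in Y accounted for by X
```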
Symmetry and Invariance
- Exhibits symmetry: correlation between X and Y equals correlation between Y and X
- Remains invariant under positive linear transformations of the variables
- Rescaling by a positive factor or adding constants to either/both variables does not affect correlation (a negative scale factor flips the sign but not the magnitude)
- Example: Correlation between height in inches and weight in pounds same as correlation between height in centimeters and weight in kilograms
- Sensitive to outliers: extreme points can significantly influence the strength and direction of the measured relationship
- Example: A few extreme data points in a scatterplot can dramatically alter the correlation coefficient
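A short sketch (with simulated height/weight data, not real measurements) demonstrates both points: rescaling units leaves r unchanged, while a single extreme point can shift it noticeably:

```python
import numpy as np

rng = np.random.default_rng(0)
height_in = rng.normal(68, 3, size=100)                    # height in inches
weight_lb = 2.0 * height_in + rng.normal(0, 10, size=100)  # weight in pounds

r_imperial = np.corrcoef(height_in, weight_lb)[0, 1]

# Positive linear rescaling: inches -> centimeters, pounds -> kilograms
r_metric = np.corrcoef(height_in * 2.54, weight_lb * 0.4536)[0, 1]
print(r_imperial, r_metric)      # identical up to floating-point error

# A single extreme outlier can noticeably change the coefficient
r_outlier = np.corrcoef(np.append(height_in, 90.0),
                        np.append(weight_lb, 50.0))[0, 1]
print(r_outlier)                 # differs from r_imperial
```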
Correlation and Independence
Relationship Between Correlation and Independence
- Zero correlation does not necessarily imply independence between random variables
- Independence of random variables always results in zero correlation
- Non-zero correlation always indicates dependence between random variables
- For bivariate normal distributions, zero correlation equivalent to independence (special case)
- Absence of linear correlation does not rule out other forms of dependence
- Example: Y = X² with X symmetric about zero has zero linear correlation but a strong nonlinear relationship
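The Y = X² example is easy to verify numerically; the sketch below assumes X is symmetric about zero (here standard normal), which is what makes the linear correlation vanish:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=100_000)  # symmetric about zero
y = x ** 2                              # fully determined by x, but not linearly

print(np.corrcoef(x, y)[0, 1])  # near 0: no linear correlation despite dependence
```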
Practical Considerations
- Correlation measures only linear relationships while independence considers all possible relationships
- Very low correlation values (close to zero) often interpreted as practical independence
- Requires caution in interpretation
- Example: Correlation of 0.05 between shoe size and test scores might be considered practically independent
- In real-world data analysis, weak correlations (|r| < 0.3) often treated as negligible
- Context-dependent interpretation necessary
Correlation Limitations
Nonlinear Relationships and Causality
- Fails to capture nonlinear patterns or complex associations between variables
- Example: A sine-wave relationship sampled over many full periods shows near-zero correlation despite a clear pattern (demonstrated in the sketch after this list)
- Zero correlation does not mean no relationship, only the absence of a linear relationship
- Does not imply causation: a strong correlation does not indicate that one variable causes changes in the other
- Example: Ice cream sales and crime rates may correlate due to shared influence of temperature
- Spurious correlations occur when two variables correlated due to influence of unmeasured third variable
- Example: Correlation between number of pirates and global temperature (both decreasing over time)
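Both pitfalls can be reproduced with simulated data; the sketch below uses a sine wave sampled over many full periods and a made-up temperature confounder that drives two otherwise unrelated series:

```python
import numpy as np

# Clear deterministic pattern, near-zero linear correlation
x = np.linspace(0, 200 * np.pi, 100_000)   # many full periods of a sine wave
y = np.sin(x)
print(np.corrcoef(x, y)[0, 1])             # approximately 0

# Spurious correlation through a shared driver (temperature)
rng = np.random.default_rng(2)
temperature = rng.normal(20, 8, size=5_000)
ice_cream_sales = 3.0 * temperature + rng.normal(0, 10, size=5_000)
crime_rate = 1.5 * temperature + rng.normal(0, 10, size=5_000)
print(np.corrcoef(ice_cream_sales, crime_rate)[0, 1])  # clearly positive, no causal link
```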
Statistical and Methodological Issues
- Presence of outliers or influential points can distort correlation coefficient
- Can lead to misleading conclusions about relationship between variables
- Not robust to monotonic transformations of the data
- Such transformations can change the strength of the correlation and, in extreme cases, even its sign
- Example: Log transformation of positively skewed data may alter correlation with another variable
- Only measures strength of linear relationships
- Misses important nonlinear patterns
- Example: U-shaped relationship between age and happiness shows near-zero correlation
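A short simulation (with invented age/happiness data and a skewed-data example) illustrates both issues: a U-shaped pattern yields a near-zero coefficient, and a log transform of skewed data changes the measured correlation:

```python
import numpy as np

rng = np.random.default_rng(3)

# U-shaped relationship: strong pattern, near-zero linear correlation
age = rng.uniform(18, 80, size=10_000)
happiness = (age - 49.0) ** 2 / 100.0 + rng.normal(0, 1, size=10_000)
print(np.corrcoef(age, happiness)[0, 1])    # near 0

# Monotonic (log) transformation of positively skewed data changes the coefficient
x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)
y = x + rng.normal(0, 1, size=10_000)
print(np.corrcoef(x, y)[0, 1])              # correlation on the raw scale
print(np.corrcoef(np.log(x), y)[0, 1])      # a different value after the transform
```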
Population vs Sample Correlation
Definitions and Calculations
- Population correlation (ρ) describes true relationship between variables in entire population
- Sample correlation (r) estimated from subset of population; subject to sampling variability
- Sample correlation formula involves standardizing variables and taking average product
- Population correlation defined using expected values and standard deviations
- Fisher z-transformation normalizes sampling distribution of correlation coefficients
- Used for constructing confidence intervals and hypothesis testing
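A sketch of these calculations (with simulated data and an assumed 95% confidence level): the sample correlation computed as an average of standardized products, followed by a Fisher z-based confidence interval:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 50
x = rng.normal(size=n)
y = 0.6 * x + 0.8 * rng.normal(size=n)

# Sample correlation as the average product of standardized variables:
# r = (1/(n-1)) * sum(((x_i - x_bar)/s_x) * ((y_i - y_bar)/s_y))
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
r = np.sum(zx * zy) / (n - 1)

# Fisher z-transformation: arctanh(r) is approximately normal
# with standard error 1 / sqrt(n - 3)
z = np.arctanh(r)
se = 1.0 / np.sqrt(n - 3)
z_crit = stats.norm.ppf(0.975)                        # 95% confidence level
lo, hi = np.tanh([z - z_crit * se, z + z_crit * se])  # back-transform to the r scale

print(f"r = {r:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```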
Statistical Properties and Considerations
- Sample correlation biased for small sample sizes
- Tends to underestimate absolute value of population correlation
- Example: Sample of 10 data points likely to produce less accurate estimate than sample of 100
- Confidence intervals constructed for sample correlations estimate range of plausible population correlation values
- As sample size increases, sample correlation converges to population correlation
- Assumes random sampling and absence of systematic biases
- Sample correlation used to estimate unknown population correlation
- Example: Studying correlation between study time and test scores in a class of 30 students to infer relationship for all students
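A final sketch (using a simulated bivariate normal population with an assumed true ρ of 0.5) shows the sample correlation settling toward the population value as the sample grows:

```python
import numpy as np

rng = np.random.default_rng(5)
rho = 0.5                                 # assumed population correlation
cov = [[1.0, rho], [rho, 1.0]]

for n in (10, 100, 10_000):
    sample = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    r = np.corrcoef(sample[:, 0], sample[:, 1])[0, 1]
    print(f"n = {n:>6}: r = {r:.3f}")     # tends toward rho = 0.5 as n grows
```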