1.2 The Language of Variation: Variables

5 min read · December 27, 2022

Lusine Ghazaryan, Jed Quiaoit


Types of Variables

Before taking you on the journey of learning statistics, let's make some sense of data. 🤔

The word "data" is actually plural. Data contain information about individuals, or units, whose characteristics are called variables. The values that variables take are called data. Since variables can be categorical or quantitative, data can likewise be divided into categorical and quantitative. 📦

When a variable takes values that are attributes, we call the variable categorical and its data categorical: for example, the colors of cars or the names of states, districts, and countries. The values for car colors may run from white to black, covering any color you might see on the street. With such data, it makes sense to group the values and compare the groups.

When we measure a characteristic that results in numerical values, we are dealing with a quantitative variable and, in turn, quantitative data: for example, the number of days, the price of a product, or the age of an individual. Quantitative data are divided further into two types: discrete and continuous.

Recall from algebra class that we called numbers discrete when they were whole numbers and continuous when they could fall anywhere within an interval. Price, weight, and age are continuous because they can take any value within an interval. When the data are numbers, it makes sense to find an average.
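As a minimal sketch of the difference between grouping categorical data and averaging quantitative data, the Python snippet below counts made-up car colors and averages made-up ages. The sample values and variable names are invented for illustration only and do not come from a real data set.

```python
from collections import Counter
from statistics import mean

# Made-up sample values, for illustration only.
car_colors = ["white", "black", "red", "white", "blue", "white", "black"]
ages = [16, 17, 15, 18, 17]

# Categorical data: group the values and count each category.
color_counts = Counter(car_colors)

# Quantitative data: the values are numbers, so an average is meaningful.
average_age = mean(ages)

print(color_counts)
print(average_age)
```

Averaging the car colors would make no sense, which is why the counts are the natural summary for them.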

Variables can be measured at different levels: nominal, ordinal, interval, and ratio. Qualitative variables are nominal or ordinal. The difference between the two is that ordinal data have a meaningful order, while nominal data do not. For example, customer satisfaction levels can be ranked from most to least satisfied. Quantitative variables are interval or ratio. The difference between these two is that interval-level data have equal spacing between values but no meaningful zero, whereas ratio-level data have a true zero that means none of the quantity is present.

Variables take different values from one individual to another, and data can also change over time. If we ask different people the same question, we'll get different answers. Statistical tools help us notice the relationships and patterns among individuals. This variability is what makes the study of statistics interesting. ⭐
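One way to picture the four levels is as a ladder, where each level keeps the meaningful summaries of the levels below it and adds new ones. The sketch below is an illustration under that assumption; the operation names and the `allowed_operations` helper are hypothetical labels chosen for this example, not standard terminology.

```python
# Levels of measurement, from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Summaries that first become meaningful at each level (illustrative labels).
NEW_OPERATIONS = {
    "nominal": {"count", "mode"},
    "ordinal": {"rank", "median"},
    "interval": {"difference", "mean"},
    "ratio": {"ratio"},
}


def allowed_operations(level):
    """Return every summary that is meaningful at the given level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    allowed = set()
    # Each level inherits everything from the levels below it.
    for name in LEVELS[: LEVELS.index(level) + 1]:
        allowed |= NEW_OPERATIONS[name]
    return allowed


print(sorted(allowed_operations("ordinal")))
print("ratio" in allowed_operations("interval"))
```

Saying that one value is "twice" another only becomes meaningful at the ratio level, because that is the first level with a true zero.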

Key Vocabulary

  • Individuals

  • Variable

  • Data

  • Quantitative Variable

  • Distribution

Going Deeper: Categorical vs. Quantitative Variables

Earlier, we established that variables refer to characteristics that change from one individual to another: age group, dominant hand, height, you name it! In statistics, one of the ways variables can be classified is as either categorical or quantitative. Let's build upon the definitions we introduced earlier.

  • Categorical variables are variables that can be placed into categories or groups. These variables do not have a numerical value. Some have no natural order (nominal), while others can be ranked (ordinal). Examples: gender, race, and marital status. 🫵

  • Quantitative variables are variables that can be measured or counted and have a numerical value. These variables can be either continuous or discrete. Continuous quantitative variables can take on any value within a given range, such as height or weight. Discrete quantitative variables can only take on certain values, such as the number of children in a household or the number of times a person has been hospitalized. 🔢
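The two definitions above can be turned into a rough rule of thumb in Python. This is only a heuristic sketch: the `classify` function is a hypothetical helper, and it guesses from the recorded values alone, so it will get some cases wrong (numeric codes such as ZIP codes are really categorical, and heights that happen to be recorded as whole numbers are still continuous).

```python
from numbers import Real


def classify(values):
    """Rough guess at a variable's type from its recorded values."""
    if not values:
        raise ValueError("need at least one value")
    # bool is a subclass of int in Python, so exclude it explicitly.
    numeric = all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    if not numeric:
        return "categorical"
    # Whole-number values suggest counting, so guess discrete.
    if all(float(v).is_integer() for v in values):
        return "quantitative (discrete)"
    return "quantitative (continuous)"


print(classify(["single", "married", "single"]))  # marital status
print(classify([0, 2, 1, 3]))                     # number of siblings
print(classify([162.5, 180.0, 171.3]))            # height in cm
```

In practice, the context of the study decides the variable type; the recorded values are only a hint.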

It is important to correctly identify the type of variables in a study because different statistical techniques are appropriate for analyzing data from different types of variables. For example, t-tests are commonly used to analyze data from quantitative variables, while chi-square tests are commonly used to analyze data from categorical variables. Don't worry about the tests for now! We'll talk more about them later in Units 6 to 9 of this course.

Still confused? Here's a list of categorical variables:

  1. Gender (male or female)

  2. Race (white, black, Hispanic, etc.)

  3. Marital status (single, married, etc.)

  4. Employment status (employed, unemployed, self-employed, etc.)

  5. Education level (high school, associate's degree, bachelor's degree, etc.)

  6. Political party (Republican, Democrat, Independent, etc.)

  7. Religion (Christian, Muslim, Hindu, etc.)

  8. Eye color (blue, brown, green, etc.)

  9. Hair color (blonde, brunette, red, etc.)

  10. Birthplace (United States, Canada, Mexico, etc.)

What about quantitative variables? Here's a list of some of them:

  1. Age (8, 16, 34, etc.)

  2. Height (180 cm, 5'2", 2 meters, etc.)

  3. Weight

  4. Income

  5. Body mass index (BMI)

  6. Blood pressure

  7. Heart rate

  8. Hours of sleep (a controversial one for teens)

  9. Distance traveled

  10. Number of siblings

Example Question

Let's dive even deeper by looking at an example to see how well we can distinguish between the two types of variables and data. 😀

Transportation Safety

The table shows the number of job-related injuries in each of the transportation industries in 1998.

Industry         Number of injuries
Railroad         4520
Intercity bus    5100
Subway           6850
Trucking         7144
Airline          9950

1. What are the variables that we are studying?

Looking at the table, we can see that we have two variables: type of industry and number of injuries.

2. Categorize each variable as quantitative or qualitative.

The type of industry is a qualitative variable, as its values are names of transportation industries. The number of job-related injuries is quantitative, as its values are numbers.

3. Categorize each quantitative variable as discrete or continuous.

The number of job-related injuries is discrete.

4. Identify the level of measurement for each variable.

The type of industry is at the nominal level, and the number of job-related injuries is at the ratio level.

5. The railroad is shown as the safest transportation industry. Does that mean railroads have fewer accidents than the other industries? Explain.

This question makes you think about what the numbers really mean. The railroads do show fewer job-related injuries; however, there may be other things to consider. For example, railroads may employ fewer people than the other transportation industries in the study, so a lower count of injuries does not necessarily mean a lower rate of injuries.

6. From the information given, comment on the relationship between the variables. 

We can see that the railroads have the fewest job-related injuries. In contrast, the airline industry has the most job-related injuries (more than twice those of the railroad industry). The numbers of job-related injuries in the subway and trucking industries are fairly comparable. 
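The comparisons in this answer can be checked with a few lines of Python. The sketch below uses only the five counts from the table; the dictionary and variable names are chosen for this example.

```python
# Job-related injuries by transportation industry, 1998 (from the table).
injuries = {
    "Railroad": 4520,
    "Intercity bus": 5100,
    "Subway": 6850,
    "Trucking": 7144,
    "Airline": 9950,
}

# Industries with the fewest and the most injuries.
fewest = min(injuries, key=injuries.get)
most = max(injuries, key=injuries.get)

# How many times larger the airline count is than the railroad count.
airline_to_railroad = injuries["Airline"] / injuries["Railroad"]

# Gap between the two most similar counts mentioned in the answer.
subway_trucking_gap = injuries["Trucking"] - injuries["Subway"]

print(fewest, most)
print(round(airline_to_railroad, 2))
print(subway_trucking_gap)
```

The ratio comes out a little above 2, which matches "more than twice," and the gap between subway and trucking is 294 injuries.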

Bottom line: always look at the data and ask what lies behind the numbers, how the values are related, and how they compare to each other.

🎥 Watch: AP Stats - Unit 1 Streams

Key Terms to Review (8)

Categorical Variable

: A categorical variable is one that represents characteristics or qualities rather than numerical values. It consists of categories or groups into which data can be classified.

Chi-Square Test

: A statistical test used to determine if there is a significant association between two categorical variables. It compares the observed frequencies with the expected frequencies under the assumption of independence.

Continuous

: Continuous data refers to numerical data that can take on any value within a given range. It can be measured and divided into smaller units, and there are infinite possible values between any two points.

Interval Level of Measurement

: Interval level of measurement is a type of measurement scale that not only categorizes data but also allows for meaningful comparisons between the values. It has equal intervals between the numbers, but there is no true zero point.

Nominal Level of Measurement

: Nominal level of measurement is the lowest level of measurement where variables are categorized into distinct groups or categories based on their characteristics or attributes.

Ordinal Level of Measurement

: Ordinal level of measurement is a type of measurement scale where variables are ranked or ordered based on their attributes. The order matters, but the differences between values may not be equal or meaningful.

Ratio Level of Measurement

: Ratio level of measurement is similar to interval level, as it allows for meaningful comparisons and equal intervals. However, ratio level also has a true zero point which represents an absence or complete lack of the measured attribute.

t-test

: A t-test is a statistical test that compares two sample means to determine if they are significantly different from each other.


© 2024 Fiveable Inc. All rights reserved.

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

