2.4 Using Programs with Data

4 min read | January 2, 2023

Minna Chow

Milo Chang

With the current demand for data processing, it's no wonder that so many computer programs exist for that very purpose. In fact, there's even a name for it: the process of examining very large data sets to find useful information, such as patterns or relationships, is known as data mining.

A common example is a spreadsheet program, such as Microsoft Excel or Google Sheets. You can use these programs to record, modify, and organize data. If you're using numbers, you can write equations and perform operations on your data as well.

You can also process text data using text analysis (or text mining) tools. Text analysis looks for patterns within a written piece (anywhere in length from a clause to a novel and beyond) to categorize or classify it. If you've ever had a program tell you what the tone of your writing was, you've seen text analysis at work. It can be used to sort product reviews, detect trends in public opinion, and identify anonymous authors.
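As a rough illustration of how a tone checker might categorize text, here is a minimal Python sketch that counts keyword matches. The word lists, the `classify_tone` function, and the sample reviews are all made up for this example; real text analysis tools rely on far larger vocabularies or trained models.

```python
# Hypothetical word lists, invented for this sketch only.
POSITIVE_WORDS = {"great", "love", "excellent", "good", "happy"}
NEGATIVE_WORDS = {"bad", "broken", "terrible", "hate", "slow"}


def classify_tone(text):
    """Label text as positive, negative, or neutral by counting keyword matches."""
    # Split on whitespace, strip common punctuation, and ignore capitalization.
    words = [word.strip(".,!?").lower() for word in text.split()]
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


# Made-up product reviews to sort by tone.
reviews = [
    "I love this backpack, the zippers are excellent!",
    "Terrible battery and a broken charger.",
    "It arrived on Tuesday.",
]
labels = [classify_tone(review) for review in reviews]
print(labels)
```

The program never "understands" the reviews. It only looks for patterns (here, specific words) and uses them to place each review in a category.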

Data processing programs can also allow you to make tables and diagrams, such as line or bar graphs, to visualize your data. Creating visualizations of data allows you to convey what the data means and to make patterns and trends apparent. It's much easier to see positive or negative trends from a line chart, for instance, than when the data's sitting in a table. This is especially true when a lot of data is involved.

Other examples of data processing programs include search tools, like the ones that Google uses for images:

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-VaLHQMlXwlRy.png?alt=media&token=3c2a3df8-ad1b-491f-894f-3a5d5a252103

Image source: Google Images

You can use these to make finding information easier and faster and to specify what it is that you're looking for. For example, if you want an image in a certain color for a mood-board, you can find it using the color filter. If you want images taken before or after a certain date, you can use the time filter.

Different search engines offer different search tools, depending on what the engine is used for. The search tools for an online academic journal, for instance, differ from the search tools that Google Images uses.

Some programs also have data filtering capabilities, which means that they can create and extract different subsets of data for users to work with. These subsets can be based on time (like only looking at results from the winter) or value (like only looking at values below 30 or only positive values).
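The filters described above can be sketched in a few lines of Python. The temperature readings, the field names, and the `WINTER_MONTHS` set are invented for illustration; the point is only that each filter builds a new subset without changing the original data.

```python
# Made-up temperature readings for this sketch.
readings = [
    {"month": "January", "temperature": 28},
    {"month": "April", "temperature": 55},
    {"month": "July", "temperature": 84},
    {"month": "December", "temperature": -3},
]

# Assumption: "winter" means these three months.
WINTER_MONTHS = {"December", "January", "February"}

# Filter by time: keep only the winter readings.
winter = [r for r in readings if r["month"] in WINTER_MONTHS]

# Filter by value: keep only readings below 30.
below_30 = [r for r in readings if r["temperature"] < 30]

# Filter by value: keep only positive readings.
positive_only = [r for r in readings if r["temperature"] > 0]

print(len(winter), len(below_30), len(positive_only))
```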

Transforming Data

One of the cool things that programs can do with data is to transform it! This is when you edit or modify data in some way to extract more information from it.

Data Transformation Examples:

  • Modifying every element of a data set. This can be an arithmetic modification, although it isn't necessarily one.
    • Ex. Multiplying each number by some constant value (like if you wanted to convert a list of measurements from liters to milliliters).
    • A non-arithmetic example is adding a grade level or class rank to a list of student records.
  • Filtering a data set by category, as mentioned above.
    • Besides time or value, data sets can also be filtered by quality, such as which extracurricular activities a group of students are in.
  • Combining or comparing data in some way.
    • Ex. Comparing the average SAT score of students going to all the colleges in one state and combining that data with average scores from other states.
  • Creating tools.
    • Ex. graphs, charts, and word-bubbles.
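A few of the transformations listed above can be sketched in Python. All of the data here (the measurements, student records, GPAs, and SAT scores) is invented for illustration, and ranking students by GPA is just one assumed way to assign a class rank.

```python
# Arithmetic change to every element: convert liters to milliliters.
liters = [0.5, 1.25, 2.0]
milliliters = [amount * 1000 for amount in liters]

# Non-arithmetic change to every element: add a class rank to each record.
# Assumption for this sketch: rank is decided by GPA, highest first.
students = [
    {"name": "Ana", "gpa": 3.6},
    {"name": "Ben", "gpa": 3.9},
    {"name": "Cy", "gpa": 3.2},
]
ranked = sorted(students, key=lambda student: student["gpa"], reverse=True)
for rank, student in enumerate(ranked, start=1):
    student["class_rank"] = rank

# Combining and comparing: average SAT score per state (made-up scores).
scores_by_state = {
    "Ohio": [1100, 1250, 1180],
    "Utah": [1210, 1190],
}
averages = {
    state: sum(scores) / len(scores)
    for state, scores in scores_by_state.items()
}
highest_state = max(averages, key=averages.get)

print(milliliters)
print([(s["name"], s["class_rank"]) for s in ranked])
print(highest_state)
```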

These tools are often used in an iterative and interactive process by users. You get to choose what filtering tools you want to use or what subsets you want to look at. You can also run data through data processing programs multiple times, depending on what information you want to look for. For example, you can look at data by the date it was collected, then sort it again by the location where it was collected from.
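The date-then-location example can be sketched as two passes over the same data. The water samples below are made up, and the field names are assumptions for this sketch. Dates are written as year-month-day strings so that sorting them as text also sorts them chronologically.

```python
# Made-up water samples for this sketch.
samples = [
    {"date": "2023-03-02", "location": "Lake", "ph": 7.1},
    {"date": "2023-01-15", "location": "River", "ph": 6.8},
    {"date": "2023-02-20", "location": "Lake", "ph": 7.4},
]

# First pass: look at the data in the order it was collected.
by_date = sorted(samples, key=lambda sample: sample["date"])

# Second pass: sort the date-ordered data again, this time by location.
# Python's sort is stable, so samples from the same location stay in date order.
by_location = sorted(by_date, key=lambda sample: sample["location"])

for sample in by_location:
    print(sample["location"], sample["date"], sample["ph"])
```

Each pass answers a different question about the same data, which is what makes the process iterative.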

Manipulating data by combining, clustering, or classifying it can bring out new information and previously unseen patterns in the raw data, making it a helpful tool for data analysis.

Data Analysis Discoveries

Some of the things we can discover by analyzing data are:

  • Patterns
    • What repeats again and again?
    • Ex: Does a product sell particularly well over one season or in one area over many years?
  • Trends
    • Is there a steadily rising trend? A falling trend? Any fluctuations, or variances, in the data?
    • Ex: Here's a graph of the interest over time for Fiveable since its founding in 2018. (Note the two spikes around AP exam season.)
    • https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-RJe7vBW58vru.png?alt=media&token=d657244d-d9a6-448e-b262-261970e569f0

      Data Source: Google Trends

  • Correlations, or relationships
    • Ex: Is there a relationship between extracurricular choice and favorite subject? Time of day and performance on tests? Driving experience and number of traffic tickets?
    • Is there a positive correlation? A negative one? None at all?
    • Remember that correlation does not equal causation! Just because two things correlate with each other doesn't mean that one caused the other.
  • Outliers
    • Are there any outliers? If so, why is there an outlier?
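To make the last two discoveries concrete, here is a minimal Python sketch that measures a correlation and flags outliers. The numbers are invented, the `pearson` helper is written out by hand for this example, and the "more than 2 standard deviations from the mean" rule is only one common convention for spotting outliers, not the only one.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Return the Pearson correlation coefficient, a value between -1 and 1."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up data: more driving experience, fewer traffic tickets.
years_driving = [1, 2, 3, 4, 5]
tickets = [5, 4, 3, 2, 1]
r = pearson(years_driving, tickets)  # close to -1: a strong negative correlation

# Made-up test scores with one value far from the rest.
scores = [50, 52, 48, 51, 49, 50, 53, 47, 50, 120]
center = mean(scores)
spread = stdev(scores)

# Assumed rule for this sketch: flag values over 2 standard deviations from the mean.
outliers = [score for score in scores if abs(score - center) > 2 * spread]

print(round(r, 2), outliers)
```

A value of r near -1 only says that the two lists move in opposite directions. It does not say that experience causes fewer tickets. Likewise, the code can flag 120 as an outlier, but finding out why it is there (a typo? a different grading scale?) is up to you.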

In conclusion...

That's Big Idea 2: Data for you! Our next Big Idea Guide is a crash course in algorithms and (basic) programming.

Key Terms to Review (13)

Correlations

: Correlations are statistical relationships between two or more variables. The strength of a correlation between two variables is measured by a correlation coefficient, which ranges from -1 (perfect negative correlation) to 1 (perfect positive correlation).

Data Filtering

: Data filtering is the process of selectively extracting or removing specific pieces of data from a larger dataset based on certain criteria or conditions. It allows you to focus on relevant information while excluding irrelevant or unwanted data.

Data mining

: Data mining involves extracting useful patterns or knowledge from large datasets using techniques such as statistical analysis, machine learning, and pattern recognition.

Data Visualization

: Data visualization is the process of representing data in a visual format, such as charts, graphs, or maps, to make it easier to understand and analyze.

Google Sheets

: Google Sheets is a web-based spreadsheet program offered by Google as part of its suite of productivity tools. It allows multiple users to collaborate on the same spreadsheet simultaneously, providing real-time updates and cloud storage.

Iterative and Interactive Process

: An iterative and interactive process refers to a method of problem-solving or development where the steps are repeated multiple times, with each repetition building upon the previous one. It involves constant feedback and collaboration between the user and the system.

Microsoft Excel

: Microsoft Excel is a spreadsheet program that allows users to organize, analyze, and manipulate data using formulas, functions, and charts.

Outliers

: Outliers are data points that significantly deviate from the overall pattern or trend of a dataset. They can skew statistical analyses and affect the accuracy of results.

Patterns

: Patterns are regularities that repeat within a data set, such as a product that sells particularly well every winter. Finding them is one of the main goals of data mining and data analysis.

Spreadsheet Program

: A spreadsheet program is software that allows users to organize, analyze, and manipulate numerical data in rows and columns. It provides functions for calculations, graphing capabilities, and tools for creating charts or tables.

Text Analysis

: Text analysis refers to the process of extracting meaningful information from written text by analyzing its content, structure, and context.

Text Mining

: Text mining involves extracting useful patterns or knowledge from large amounts of unstructured textual data using techniques such as natural language processing and machine learning.

Trends

: Trends are patterns that show changes over time. In computer science, analyzing trends can help identify patterns in data or predict future behavior.

© 2024 Fiveable Inc. All rights reserved.

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.


© 2024 Fiveable Inc. All rights reserved.

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.