📊 Business Intelligence Unit 8 Review

8.2 Supervised and Unsupervised Learning Algorithms

Written by the Fiveable Content Team • Last updated September 2025
Machine learning algorithms fall into two main categories: supervised and unsupervised. Supervised learning uses labeled data to predict outcomes, while unsupervised learning uncovers hidden patterns in unlabeled data. These approaches power various applications, from predicting house prices to grouping customers.

Supervised algorithms like linear regression and decision trees tackle specific prediction tasks. Unsupervised techniques such as clustering and dimensionality reduction reveal underlying data structures. Choosing the right algorithm depends on the problem type, data characteristics, and desired outcomes.

Supervised vs unsupervised learning

  • Supervised learning trains models using labeled data with known target variables (house prices) to predict outcomes for new, unseen data
    • Requires input features (square footage, number of bedrooms) and corresponding target variables to learn patterns
    • Enables prediction or classification tasks (predicting house prices, classifying email as spam or not spam)
  • Unsupervised learning discovers hidden patterns or structures in unlabeled data without predefined target variables
    • Identifies inherent groupings (customer segments) or reduces data dimensionality for visualization or feature extraction
    • Includes clustering algorithms (K-means) and dimensionality reduction techniques (Principal Component Analysis)
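
To make the distinction concrete, here is a minimal sketch using scikit-learn on synthetic data (the library choice and the made-up features are illustrative assumptions, not from the original guide): the supervised model is fit on features and labels together, while the unsupervised one sees features only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Supervised: input features AND a labeled target (house prices)
sqft = rng.uniform(500, 3500, size=(100, 1))
prices = 50_000 + 120 * sqft[:, 0] + rng.normal(0, 10_000, size=100)
model = LinearRegression().fit(sqft, prices)      # learns from labels
print(model.predict([[2_000]]))                   # predict an unseen house

# Unsupervised: features only, no target; structure is discovered
young = rng.normal([25, 100], 5, size=(50, 2))    # (age, monthly spend)
older = rng.normal([60, 900], 5, size=(50, 2))
customers = np.vstack([young, older])
segments = KMeans(n_clusters=2, n_init=10).fit_predict(customers)
print(np.bincount(segments))                      # sizes of the two segments
```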

Applications of supervised algorithms

  • Linear regression predicts continuous target variables, assuming a linear relationship between input features and the target
    • Estimates coefficients to minimize the sum of squared errors between predicted and actual values
    • Equation: $y = \beta_0 + \beta_1x_1 + \beta_2x_2 + ... + \beta_nx_n$
  • Logistic regression predicts binary target variables (customer churn) by modeling the probability of the target belonging to a particular class
    • Uses the logistic function to map input features to a probability between 0 and 1
    • Equation: $p(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1x_1 + \beta_2x_2 + ... + \beta_nx_n)}}$ (both regression equations are evaluated numerically in the first sketch after this list)
  • Decision trees recursively split data based on input features to create a tree-like model for prediction or classification (see the second sketch after this list)
    • Internal nodes represent decisions based on features (age > 30), while leaf nodes hold class labels or continuous values
    • Prone to overfitting, which can be mitigated by pruning or capping the maximum depth
  • Random forests combine multiple decision trees to improve performance and reduce overfitting
    • Builds each tree on a bootstrapped sample of the training data, considering a random subset of input features at each split
    • Final prediction is the majority vote (classification) or average (regression) of individual tree predictions
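
To ground the two regression equations above, a minimal sketch that evaluates them directly in NumPy (every coefficient value here is hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical coefficients (beta_0 is the intercept)
beta = np.array([50_000, 120, 15_000])        # intercept, sqft, bedrooms
x = np.array([2_000, 3])                      # one house: 2000 sqft, 3 bedrooms

# Linear regression: y = beta_0 + beta_1*x_1 + ... + beta_n*x_n
y_hat = beta[0] + beta[1:] @ x
print(f"predicted price: {y_hat:,.0f}")       # 335,000

# Logistic regression: p(y=1) = 1 / (1 + e^-(beta_0 + beta_1*x_1 + ...))
beta_churn = np.array([-2.0, 0.8, -1.5])      # hypothetical churn model
x_cust = np.array([3.0, 1.0])                 # e.g., complaints, tenure (scaled)
z = beta_churn[0] + beta_churn[1:] @ x_cust
p_churn = 1.0 / (1.0 + np.exp(-z))
print(f"churn probability: {p_churn:.2f}")
```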
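
And a companion sketch for the tree-based models, again using scikit-learn on synthetic data (both are assumptions for illustration). Note the depth cap on the single tree, matching the overfitting mitigation described above:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(500, 4))
y = (X[:, 0] + X[:, 1] ** 2 > 12).astype(int)   # non-linear decision rule

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Single tree: depth capped to limit overfitting
tree = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

# Random forest: many trees on bootstrapped samples, majority vote
forest = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)

print("tree accuracy:  ", tree.score(X_test, y_test))
print("forest accuracy:", forest.score(X_test, y_test))
```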

Techniques for unsupervised learning

  • Clustering groups similar data points based on intrinsic characteristics
    • K-means partitions data into K clusters based on Euclidean distance between data points and cluster centroids (implemented step by step in the first sketch after this list)
      1. Assigns data points to nearest centroid
      2. Updates centroids based on assigned points
      3. Repeats steps 1-2 until convergence
    • Hierarchical clustering creates a tree-like structure (dendrogram) by iteratively merging or splitting clusters based on similarity
  • Dimensionality reduction reduces the number of input features while preserving essential information
    • Enables visualization of high-dimensional data, noise removal, and computational efficiency
    • Principal Component Analysis (PCA) identifies the directions of maximum variance (principal components) and projects the data onto a lower-dimensional space (see the second sketch after this list)
    • t-SNE (t-Distributed Stochastic Neighbor Embedding) preserves local structure of high-dimensional data in a lower-dimensional representation
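
The three K-means steps above translate almost line for line into NumPy. A minimal from-scratch sketch (random centroid initialization, a fixed iteration cap, and the assumption that no cluster empties out are all simplifications; a library implementation such as `sklearn.cluster.KMeans` handles these cases more carefully):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centroids at k distinct data points chosen at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 1: assign each point to its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: update each centroid to the mean of its assigned points
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Repeat until convergence: stop once centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Three well-separated blobs in 2-D
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(center, 0.5, size=(50, 2))
               for center in ([0, 0], [5, 5], [0, 5])])
labels, centroids = kmeans(X, k=3)
print(centroids.round(2))
```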
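
And a short PCA sketch with scikit-learn (the 10-dimensional synthetic data, generated from two latent factors, is an assumption for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# 200 samples in 10 dimensions, but most variance lies in 2 latent factors
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(200, 10))

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)    # project onto directions of maximum variance

print(X_2d.shape)                              # (200, 2), ready to plot
print(pca.explained_variance_ratio_.round(3))  # variance captured per component
```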

Algorithm selection criteria

  • Problem type dictates the choice of algorithm
    • Regression for predicting continuous target variables
      • Linear regression for linear relationships (housing prices based on square footage)
      • Decision trees or random forests for non-linear relationships (stock prices based on economic indicators)
    • Classification for predicting categorical target variables
      • Logistic regression for binary classification (customer churn)
      • Decision trees or random forests for multi-class classification (image recognition)
  • Data characteristics influence algorithm selection
    • Linear regression suits linear relationships; decision trees or random forests capture non-linear relationships (compared directly in the sketch at the end of this section)
    • Dimensionality reduction (PCA) for high-dimensional data (gene expression data)
    • Decision trees and random forests are less sensitive to outliers than linear and logistic regression
    • Decision trees yield interpretable if-then rules, while linear and logistic regression coefficients indicate the direction and strength of each feature's effect
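
To see these criteria in action, a final sketch fits a linear model and a random forest to the same deliberately non-linear relationship (synthetic data, an illustrative assumption). The forest should score far higher here, while on a truly linear relationship the simpler model would be competitive and easier to interpret:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(0, 0.1, size=400)  # non-linear target

for name, model in [("linear regression", LinearRegression()),
                    ("random forest", RandomForestRegressor(n_estimators=100))]:
    r2 = cross_val_score(model, X, y, cv=5).mean()  # R^2, higher is better
    print(f"{name}: mean R^2 = {r2:.2f}")
```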