Heap Sort is a powerful sorting algorithm that leverages the binary heap data structure. It transforms an array into a max heap, then systematically extracts the maximum element to create a sorted array. This process combines efficiency with in-place sorting.
The algorithm's two main phases, building the max heap and repeatedly extracting the maximum element, result in a time complexity of O(n log n). While not stable, Heap Sort guarantees consistent performance across all cases and operates with O(1) space complexity, making it a valuable tool in algorithm design.
Heap Sort Algorithm
Fundamentals of Heap Sort
- Comparison-based sorting algorithm utilizing binary heap data structure
- Sorts elements in ascending or descending order
- Consists of two main phases
  - Building a max heap from the input array
  - Repeatedly extracting the maximum element to create a sorted array
- Heapify operation maintains heap property by comparing node with children and swapping if necessary
- Transforms input array into max heap with largest element at root
- Repeatedly swaps root (largest element) with last unsorted element and calls heapify on reduced heap
- Performs sorting in-place without additional memory proportional to input size
Heap Sort Process
- Start by building max heap from input array
- After max heap construction, the algorithm repeats these steps:
  - Swap root (maximum element) with last unsorted element
  - Reduce heap size by 1
  - Call heapify on root to restore max heap property
  - Repeat until all elements are sorted
- Example of Heap Sort steps:
  - Initial array: [4, 10, 3, 5, 1]
  - Max heap: [10, 5, 3, 4, 1]
  - First swap: [1, 5, 3, 4, 10]
  - Heapify: [5, 4, 3, 1, 10]
  - Second swap: [1, 4, 3, 5, 10]
  - Continue process until fully sorted
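The steps above can be reproduced with a short tracing sketch. The function name `heap_sort_trace` and the step labels are hypothetical, chosen only for illustration; `heapify` follows the recursive form used later in this guide.

```python
def heapify(arr, n, i):
    """Sift arr[i] down so the subtree rooted at i is a max heap (heap size n)."""
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)


def heap_sort_trace(values):
    """Return (label, snapshot) pairs for each step of heap sort on a copy of values."""
    arr = list(values)
    n = len(arr)
    steps = [("initial", arr.copy())]
    # Phase 1: build the max heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    steps.append(("max heap", arr.copy()))
    # Phase 2: move the maximum behind the heap, then repair the smaller heap.
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        steps.append(("swap", arr.copy()))
        heapify(arr, end, 0)
        steps.append(("heapify", arr.copy()))
    return steps


for label, snapshot in heap_sort_trace([4, 10, 3, 5, 1]):
    print(f"{label:>8}: {snapshot}")
```

The first five printed snapshots match the example above, and the last one is the sorted array [1, 3, 4, 5, 10].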
Implementing Heap Sort
Binary Heap Structure
- Complete binary tree where each node satisfies heap property
  - Parent greater than or equal to children (max heap)
  - Parent less than or equal to children (min heap)
- Typically implemented using array
- Parent and child relationships determined by index calculations (for a node at 0-based index i)
  - Parent index: floor((i - 1) / 2)
  - Left child index: 2i + 1
  - Right child index: 2i + 2
- Example of array representation:
  - Array: [10, 5, 3, 4, 1]
  - Corresponding binary heap:

          10
         /  \
        5    3
       / \
      4   1
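The index arithmetic for an array-backed heap can be sketched as a few small helpers. The names `parent`, `left_child`, `right_child`, and `is_max_heap` are illustrative, and 0-based indexing is assumed.

```python
def parent(i):
    # Only meaningful for i >= 1; the root (i == 0) has no parent.
    return (i - 1) // 2


def left_child(i):
    return 2 * i + 1


def right_child(i):
    return 2 * i + 2


def is_max_heap(arr):
    """True if every non-root element is less than or equal to its parent."""
    return all(arr[parent(i)] >= arr[i] for i in range(1, len(arr)))


heap = [10, 5, 3, 4, 1]
# Children of the node at index 1 (value 5) sit at indices 3 and 4 (values 4 and 1).
print(left_child(1), right_child(1))
print(is_max_heap(heap), is_max_heap([4, 10, 3, 5, 1]))
```

Checking each element against its parent is enough to validate the whole heap, because every parent-child pair in the tree is covered exactly once.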
Key Functions in Heap Sort Implementation
- Heap Sort implementation requires three main functions
  - heapify
  - buildMaxHeap
  - heapSort
- heapify function
  - Compares node with children
  - Recursively ensures heap property for subtree rooted at node
  - Time complexity: O(log n)
- buildMaxHeap function
  - Iteratively calls heapify on all non-leaf nodes
  - Starts from last non-leaf node, moves towards root
  - Time complexity: O(n)
- heapSort function
  - Calls buildMaxHeap
  - Repeatedly extracts maximum element
  - Restores heap property until array sorted
  - Time complexity: O(n log n)
- Example implementation in Python:

    def heapify(arr, n, i):
        # Sift arr[i] down within the first n elements until the
        # subtree rooted at i satisfies the max heap property
        largest = i
        left = 2 * i + 1
        right = 2 * i + 2
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        if largest != i:
            arr[i], arr[largest] = arr[largest], arr[i]
            heapify(arr, n, largest)

    def build_max_heap(arr):
        n = len(arr)
        # Leaves are already heaps, so start at the last non-leaf node
        for i in range(n // 2 - 1, -1, -1):
            heapify(arr, n, i)

    def heap_sort(arr):
        n = len(arr)
        build_max_heap(arr)
        for i in range(n - 1, 0, -1):
            # Move the current maximum behind the heap, then repair the smaller heap
            arr[0], arr[i] = arr[i], arr[0]
            heapify(arr, i, 0)
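The heapify shown above calls itself, so it uses call-stack space proportional to the heap height. As a hedged alternative, the sketch below replaces the recursion with a loop so that auxiliary space stays constant. The names `sift_down` and `heap_sort_iterative` are illustrative and not part of the implementation above.

```python
def sift_down(arr, n, i):
    """Loop-based heapify: sink arr[i] until neither child within arr[:n] is larger."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        if largest == i:
            return
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest  # continue from the child that received the old value


def heap_sort_iterative(arr):
    """Sort arr in place in ascending order without recursion."""
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, n, i)
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(arr, end, 0)


data = [4, 10, 3, 5, 1]
heap_sort_iterative(data)
print(data)  # [1, 3, 4, 5, 10]
```

Like `list.sort`, this sketch mutates its argument and returns nothing, which keeps the in-place nature of the algorithm visible at the call site.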
Correctness of Heap Sort
Loop Invariants and Induction
- Correctness proven using loop invariants and induction on input array size
- buildMaxHeap function loop invariant
  - At start of each iteration, all subtrees rooted at nodes with indices greater than current index satisfy max heap property
- Sorting phase loop invariant
  - At start of each iteration, the last k elements of the array hold the k largest values in their final sorted positions
  - The remaining n-k elements at the front of the array form a max heap
- Base case for induction
  - Array with one element trivially sorted and satisfies heap property
- Inductive step
  - Moving the maximum to the last position leaves a heap of size n-1, so if the algorithm correctly sorts an array of size n-1, it correctly sorts an array of size n
Proving Heapify and Overall Correctness
- Heapify function correctness
  - Shows it maintains max heap property for subtree
  - Assumes child subtrees are already max heaps
- Overall correctness proof combines
  - Correctness of buildMaxHeap
  - Correctness of sorting phase
- Example trace illustrating the invariants on a small array:
  - Initial array: [4, 2, 8, 1]
  - After buildMaxHeap: [8, 2, 4, 1]
  - First iteration (swap, then heapify): [4, 2, 1, 8]
  - Second iteration: [2, 1, 4, 8]
  - Third iteration: [1, 2, 4, 8]
  - At each step, invariants hold and the largest remaining element moves to its correct position
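One way to make the loop invariants concrete is to assert them while sorting. The sketch below is an illustration, not a proof; `heap_property_holds` and `heap_sort_checked` are hypothetical names, and the assertions add overhead that a production sort would not carry.

```python
def heapify(arr, n, i):
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)


def heap_property_holds(arr, n, first):
    """True if every node with index >= first (heap size n) is >= its children."""
    for j in range(first, n):
        for child in (2 * j + 1, 2 * j + 2):
            if child < n and arr[child] > arr[j]:
                return False
    return True


def heap_sort_checked(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):
        # Build invariant: every node with index > i roots a max heap.
        assert heap_property_holds(arr, n, i + 1)
        heapify(arr, n, i)
    for end in range(n - 1, 0, -1):
        tail = arr[end + 1:]
        # Sorting invariant: the front is a max heap, the tail is sorted,
        # and nothing in the heap exceeds the smallest tail element.
        assert heap_property_holds(arr, end + 1, 0)
        assert tail == sorted(tail)
        assert not tail or arr[0] <= tail[0]
        arr[0], arr[end] = arr[end], arr[0]
        heapify(arr, end, 0)


data = [4, 2, 8, 1]
heap_sort_checked(data)
print(data)  # [1, 2, 4, 8]
```

If any invariant failed, the corresponding `assert` would raise before the array was modified further, which points directly at the iteration where the reasoning breaks.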
Time and Space Complexity of Heap Sort
Time Complexity Analysis
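- Overall time complexity: O(n log n) in best, average, and worst cases (assuming distinct keys; an array of identical elements is sorted in O(n) because heapify never swaps)
- Building initial max heap
  - Takes O(n) time
  - Proven by summing the heapify cost over node heights: most nodes sit near the leaves, where heapify is cheap, so the total is O(n) rather than O(n log n)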
- Heapify operation
  - Time complexity: O(log n) in worst case
  - May need to traverse height of heap
- Sorting phase
  - Consists of n-1 extractions and heapify operations
  - Each operation takes O(log n) time
  - Total time: O(n log n)
- Comparison with other sorting algorithms
  - Merge Sort: Also O(n log n), but requires additional space
  - Quick Sort: O(n log n) average case, O(n^2) worst case
  - Insertion Sort: O(n^2), but performs better on small or nearly sorted arrays
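The O(n) bound for building the heap and the O(n log n) bound for the sorting phase can be written out as sums. The derivation below is a sketch that assumes two standard facts: an n-element heap has at most ceil(n / 2^(h+1)) nodes at height h, and heapify on a node of height h costs O(h). The symbols T_build and T_sort are introduced here only for illustration.

```latex
\begin{align*}
T_{\text{build}}(n)
  &= \sum_{h=0}^{\lfloor \log_2 n \rfloor}
     \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
   = O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
   = O(n),
  \qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2, \\
T_{\text{sort}}(n)
  &= \sum_{k=2}^{n} O(\log k)
   = O(\log n!)
   = O(n \log n).
\end{align*}
```

Adding the two phases gives O(n) + O(n log n) = O(n log n) overall, with the sorting phase dominating.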
Space Complexity and Algorithm Characteristics
- Space complexity: O(1) or constant space
  - Performs sorting in-place without additional memory proportional to input size
  - A recursive heapify uses O(log n) call-stack space; an iterative heapify keeps auxiliary space at O(1)
- Not a stable sorting algorithm
  - May change relative order of equal elements in sorted output
- Advantages of Heap Sort
  - Guaranteed O(n log n) performance
  - In-place sorting
- Disadvantages
  - Often has poorer cache performance due to non-local memory accesses
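- Example of cache performance issue:
  - Array: [1, 2, 3, 4, 5, 6, 7, 8]
  - Heap representation (one valid max heap of these values, stored as [8, 7, 6, 4, 5, 2, 3, 1]):

            8
          /   \
         7     6
        / \   / \
       4   5 2   3
      /
     1

  - Moving from a node at index i to its children at indices 2i + 1 and 2i + 2 doubles the stride at each level, so accesses in a large heap jump between distant parts of the array, reducing cache efficiency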
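The access pattern behind the cache issue can be made visible by listing the indices a single sift-down touches. The sketch below assumes, purely for illustration, that the element always descends to the left child; the real path depends on the data. The name `sift_down_path` is hypothetical.

```python
def sift_down_path(n, i=0):
    """Indices visited when a sift-down from i keeps descending to the left child."""
    path = [i]
    while 2 * i + 1 < n:
        i = 2 * i + 1
        path.append(i)
    return path


print(sift_down_path(8))  # [0, 1, 3, 7]

path = sift_down_path(1_000_000)
strides = [b - a for a, b in zip(path, path[1:])]
print(path[:6])      # [0, 1, 3, 7, 15, 31]
print(strides[-3:])  # [65536, 131072, 262144]
```

Near the root the visited indices are adjacent, but the gap doubles with every level, so the last few steps of one sift-down in a million-element array land hundreds of thousands of slots apart.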