Heap Sort and Binary Heap
Prerequisite Knowledge
Before reading this article, you need to first study:
Summary in One Sentence
Heap sort is a sorting algorithm derived from the binary heap structure, with a complexity of . Heap sort mainly consists of two steps: first, creating a binary heap in-place on the array to be sorted (Heapify), and then performing in-place sorting (Sort).
You can open the visualization panel below, click to jump to the let tree = ...
section of the code to see how the array is abstracted into a complete binary tree; repeatedly click on the Heap.swim
part of the code to observe the in-place heap construction process; click on the Heap.sink
part to see the in-place sorting process.
To learn the heap sort algorithm, you must understand the principles of the binary heap structure, otherwise, the sorting process may be completely incomprehensible.
The previous article Basic Binary Heap introduced the binary heap structure, and Binary Heap Implementation of Priority Queue used the binary heap structure to implement a SimpleMinPQ
priority queue, where elements are removed from the queue in ascending order.
This article will introduce the heap sort algorithm, a new sorting algorithm derived from the properties of binary heaps, which is both elegant and efficient.
First, I will revisit several key principles of implementing a priority queue with a binary heap. If there is anything you do not understand, be sure to review the previous article, otherwise, heap sort will be difficult to comprehend.
The binary heap (priority queue) is implemented using an array at its base, but logically it is a complete binary tree, primarily maintained by the
swim
andsink
methods.A priority queue can be a min-heap or a max-heap. A min-heap maintains the smallest element at the top of the heap, while a max-heap maintains the largest element at the top.
When inserting elements into the priority queue, the element is first appended to the bottom of the binary heap, then the
swim
method is called to float the element to the appropriate position, with a time complexity of .When deleting the top element of the priority queue, the last element at the bottom of the heap is first swapped to the top as the new top element, then the
sink
method is called to sink this new top element to the appropriate position, with a time complexity of .
The simplest idea for a heap sort algorithm is to directly use the priority queue: insert all elements into the priority queue and then retrieve them, which would complete the sorting.
// sort the array from smallest to largest using a priority queue
void sort(int[] nums) {
// create a min heap that sorts elements from smallest to largest
SimpleMinPQ pq = new SimpleMinPQ(nums.length);
// first insert all elements into the priority queue
for (int num : nums) {
// push operation automatically builds a binary heap, with time complexity O(logN)
pq.push(num);
}
// then extract all elements, resulting in a sorted order from smallest to largest
for (int i = 0; i < nums.length; i++) {
// pop operation removes the smallest element from
// the top of the binary heap, with time complexity
nums[i] = pq.pop();
}
}
Since the push
and pop
methods of a priority queue have a complexity of , the overall time complexity of sorting is , where N
is the length of the input array.
This approach can yield the correct sorting result, but the space complexity is , as the priority queue we create is an additional data structure that uses an array to store elements.
Therefore, the problem heap sort aims to solve is to avoid using extra auxiliary space and perform sink
and swim
operations directly on the original array, completing the sort in time.
Two Key Steps of Heap Sort
In-place Heap Construction (Heapify): Directly transform the array to be sorted into a binary heap in place.
Sorting: Continuously extract elements from the heap to obtain a sorted result.
Take a few minutes to think about it yourself. Comparing the process of adding and removing elements in a priority queue, it is indeed not difficult to implement these two steps in-place using swim
and sink
methods, and you should be able to figure it out independently.
Before explaining the heap sort code implementation in detail, I'll first present the swim
and sink
methods of a binary heap along with the supporting utility functions, because I will guide you through optimizing the heap sort code step by step later, and I won’t repeat the implementation of these functions.
These functions are extracted from the priority queue implementation in Binary Heap Implementation of Priority Queue, with the array passed in as a function parameter, and the other logic remains unchanged: