Heap Sort and Binary Heap
Prerequisites
Before reading this article, you need to study:
Summary
Heap sort is a sorting algorithm derived from the binary heap structure. Its time complexity is . Heap sort has two main steps: first, build a binary heap in place from the array (Heapify), then sort the array in place (Sort).
You can open the visualization panel below. Click to go to the let heap = ...
part to see how the array is represented as a complete binary tree. Keep clicking Heap.swim
to see the process of building the heap in place. Click Heap.sink
to see how the sorting is done in place.
To learn heap sort, you must understand the principle of binary heap, otherwise the sorting process may not make sense.
Algorithm Visualization
In the previous article Binary Heap Basics, we introduced the structure of a binary heap. In Using Binary Heap to Implement Priority Queue, we used the binary heap to build aSimpleMinPQ
priority queue, where inserted elements are taken out in increasing order.
This article will introduce the heap sort algorithm. It is a new sorting method based on the properties of binary heap. Heap sort is efficient and elegant.
First, I want to review a few key points about using binary heaps for priority queues. If you don't understand any of them, go back and review the previous articles, or you will not understand heap sort.
A binary heap (priority queue) is implemented using an array, but logically it is a complete binary tree. The
swim
andsink
methods are used to maintain the heap property.A priority queue can be a min-heap or a max-heap. A min-heap keeps the smallest element at the top; a max-heap keeps the largest element at the top.
When you insert an element into a priority queue, you add it to the bottom of the heap, then call
swim
to move it up to the correct position. This takes time.When you remove the top element from the priority queue, you swap the last element of the heap to the top, then call
sink
to move it down to the correct place. This also takes time.
So, the simplest idea for heap sort is to use the priority queue directly: put all the elements into the priority queue, then take them out one by one, and you get a sorted array.
// sort the array from smallest to largest using a priority queue
void sort(int[] nums) {
// create a min heap that sorts elements from smallest to largest
SimpleMinPQ pq = new SimpleMinPQ(nums.length);
// first insert all elements into the priority queue
for (int num : nums) {
// push operation automatically builds a binary heap, with time complexity O(logN)
pq.push(num);
}
// then extract all elements, resulting in a sorted order from smallest to largest
for (int i = 0; i < nums.length; i++) {
// pop operation removes the smallest element from the top
// of the binary heap, with time complexity O(logN)
nums[i] = pq.pop();
}
}
// sort the array from smallest to largest using a priority queue
void sort(std::vector<int>& nums) {
// create a min heap that sorts elements from smallest to largest
SimpleMinPQ pq(nums.size());
// first insert all elements into the priority queue
for (int num : nums) {
// push operation automatically builds a binary heap, with time complexity O(logN)
pq.push(num);
}
// then extract all elements, resulting in a sorted order from smallest to largest
for (size_t i = 0; i < nums.size(); i++) {
// pop operation removes the smallest element from the top
// of the binary heap, with time complexity O(logN)
nums[i] = pq.pop();
}
}
# sort the array from smallest to largest using a priority queue
def sort(nums):
# create a min heap that sorts elements from smallest to largest
pq = SimpleMinPQ(len(nums))
# first insert all elements into the priority queue
for num in nums:
# push operation automatically builds a binary heap, with time complexity O(logN)
pq.push(num)
# then extract all elements, resulting in a sorted order from smallest to largest
for i in range(len(nums)):
# pop operation removes the smallest element from the top
# of the binary heap, with time complexity O(logN)
nums[i] = pq.pop()
// sort the array from smallest to largest using a priority queue
func sort(nums []int) {
// create a min heap that sorts elements from smallest to largest
pq := NewSimpleMinPQ(len(nums))
// first insert all elements into the priority queue
for _, num := range nums {
// push operation automatically builds a binary heap, with time complexity O(logN)
pq.Push(num)
}
// then extract all elements, resulting in a sorted order from smallest to largest
for i := 0; i < len(nums); i++ {
// pop operation removes the smallest element from the top
// of the binary heap, with time complexity O(logN)
nums[i] = pq.Pop()
}
}
// sort the array from smallest to largest using a priority queue
function sort(nums) {
// create a min heap that sorts elements from smallest to largest
const pq = new SimpleMinPQ(nums.length);
// first insert all elements into the priority queue
for (let i = 0; i < nums.length; i++) {
// push operation automatically builds a binary heap, with time complexity O(logN)
pq.push(nums[i]);
}
// then extract all elements, resulting in a sorted order from smallest to largest
for (let i = 0; i < nums.length; i++) {
// pop operation removes the smallest element from the top
// of the binary heap, with time complexity O(logN)
nums[i] = pq.pop();
}
}
Since the push
and pop
methods of a priority queue have a complexity of , the overall time complexity of sorting is , where N
is the length of the input array.
This approach can yield the correct sorting result, but the space complexity is , as the priority queue we create is an additional data structure that uses an array to store elements.
Therefore, the problem heap sort aims to solve is to avoid using extra auxiliary space and perform sink
and swim
operations directly on the original array, completing the sort in time.
Two Key Steps of Heap Sort
In-place Heap Construction (Heapify): Directly transform the array to be sorted into a binary heap in place.
Sorting: Continuously extract elements from the heap to obtain a sorted result.
Take a few minutes to think about it yourself. Comparing the process of adding and removing elements in a priority queue, it is indeed not difficult to implement these two steps in-place using swim
and sink
methods, and you should be able to figure it out independently.
Before explaining the heap sort code implementation in detail, I'll first present the swim
and sink
methods of a binary heap along with the supporting utility functions, because I will guide you through optimizing the heap sort code step by step later, and I won’t repeat the implementation of these functions.
These functions are extracted from the priority queue implementation in Binary Heap Implementation of Priority Queue, with the array passed in as a function parameter, and the other logic remains unchanged: