Bubble Sort with Stability
Prerequisite Knowledge
Before reading this article, you should first learn:
Summary in One Sentence
The bubble sort algorithm is an optimization of Selection Sort. It sorts by swapping the inverted pairs to the right of nums[sortedIndex]
, making it a stable sorting algorithm.
You can open the visualization panel, click the play button, and then use the accelerate/decelerate buttons to adjust the speed, allowing you to intuitively understand the process of bubble sort:
The previous article explained Selection Sort, one of the simplest and most straightforward sorting algorithms. It analyzed several issues in selection sort that need optimization:
Selection sort is an unstable sorting algorithm because it swaps the smallest element with the current element each time, which can change the relative order of identical elements.
The time complexity of selection sort is independent of the initial order of the data. Even if the input is an already sorted array, the time complexity remains .
The time complexity of selection sort is , with approximately operations. Conventional optimization strategies cannot reduce the time complexity.
This article focuses on the various shortcomings of selection sort to see if there are ways to address them.
Regaining Sorting Stability
The previous discussion analyzed the reason selection sort loses stability: it swaps the smallest element (nums[minIndex]
) with the current element (nums[sortedIndex]
), potentially altering the relative positions of identical elements.
Upon closely examining this swapping process, its goal is simply to place nums[minIndex]
at nums[sortedIndex]
. The algorithm doesn't concern itself with where the element at nums[sortedIndex]
should go. The swap operation is used because it is the simplest method and doesn't involve data relocation.
In the swapping process, placing nums[minIndex]
at nums[sortedIndex]
does not affect the relative order of identical elements:
[2, 2', 2'', 1, 1']
^ ^
[1, 2', 2'', _, 1']
^ ^
sortedIndex minIndex
The step that truly disrupts stability is moving nums[sortedIndex]
to the position of nums[minIndex]
:
[1, 2', 2'', 2, 1']
^ ^
We can see that the relative order of the elements 2, 2', 2''
has been disrupted.
Therefore, the optimization should focus here. Instead of simply swapping nums[sortedIndex]
with nums[minIndex]
for convenience, you should mimic the operation of inserting an element in the middle of an array. Move the elements from nums[sortedIndex..minIndex]
one position backward as a whole, creating a space at nums[sortedIndex + 1]
for the element nums[sortedIndex]
.
[2, 2', 2'', 1, 1']
^ ^
[1, 2', 2'', _, 1']
^ ^
[1, _, 2', 2'', 1']
^ ^
[1, 2, 2', 2'', 1']
^ ^
sortedIndex minIndex
As you can see, the relative order of 2, 2', 2''
and 1, 1'
remains unchanged, which makes selection sort a stable sorting algorithm.
The specific code is as follows. You only need to modify the part of the selection sort code where elements are swapped:
// Perform the first wave of optimization on selection sort to achieve stability
void sort(int[] nums) {
int n = nums.length;
int sortedIndex = 0;
while (sortedIndex < n) {
// Find the smallest value nums[minIndex] in the unsorted part
int minIndex = sortedIndex;
for (int i = sortedIndex + 1; i < n; i++) {
if (nums[i] < nums[minIndex]) {
minIndex = i;
}
}
// Swap the smallest value with the element at sortedIndex
// int tmp = nums[sortedIndex];
// nums[sortedIndex] = nums[minIndex];
// nums[minIndex] = tmp;
// Optimization: Insert nums[minIndex] into the position of nums[sortedIndex]
// Move the elements of nums[sortedIndex..minIndex] one position backward
int minVal = nums[minIndex];
// Array data moving operation
for (int i = minIndex; i > sortedIndex; i--) {
nums[i] = nums[i - 1];
}
nums[sortedIndex] = minVal;
sortedIndex++;
}
}
You can try submitting this algorithm for LeetCode Problem 912, "Sort an Array." Although it will eventually time out and fail, it demonstrates the correctness of the algorithm.
Compared to the standard selection sort, this algorithm gains stability but at the cost of reduced efficiency. Even though the time complexity of the nested loops remains in Big O notation, the addition of another for loop means the actual execution count will be greater than the standard selection sort's executions.
Let's see if we can further optimize it to avoid this extra for loop.
Optimizing Time Complexity
Carefully observing the code of the algorithm above, the while loop primarily performs two tasks:
The first for loop finds the minimum value in
nums[sortedIndex..]
.The second for loop inserts this minimum value at the position
nums[sortedIndex]
.
Can we combine these two steps? Specifically, while searching for the minimum value in nums[sortedIndex..]
, can we simultaneously perform other necessary actions, so that once the minimum is found, it is already in its correct position without any further data movement?
The answer is yes, let me show you how:
// perform a second wave of optimization on selection sort
// to achieve stability while avoiding additional for loops
// this algorithm has another name, called bubble sort
void sort(int[] nums) {
int n = nums.length;
int sortedIndex = 0;
while (sortedIndex < n) {
// find the minimum value in nums[sortedIndex..]
// simultaneously move this minimum value
// step by step to the position of
for (int i = n - 1; i > sortedIndex; i--) {
if (nums[i] < nums[i - 1]) {
// swap(nums[i], nums[i - 1])
int tmp = nums[i];
nums[i] = nums[i - 1];
nums[i - 1] = tmp;
}
}
sortedIndex++;
}
}
This optimization is quite clever. By iterating through nums[sortedIndex..]
in reverse, if an inversion is found, swap them. This way, the minimum value gradually moves to the position nums[sortedIndex]
.
Since we only swap adjacent inverted pairs and do not touch elements with the same value, this algorithm is a stable sort.
The time complexity of this algorithm remains . The actual number of operations is similar to selection sort, forming an arithmetic series sum, approximately times.
Bubble Sort
This algorithm is called Bubble Sort because its execution resembles bubbles rising from the end of the array to the head, each time pushing the smallest value to its correct position.
Early Termination of the Algorithm
One issue mentioned with selection sort is that its time complexity is unaffected by the initial order of data. Even if the input array is already sorted, selection sort still performs operations.
With the optimizations mentioned above, this issue can be addressed. See the code for details:
// further optimize by terminating the algorithm early when the array is sorted
void sort(int[] nums) {
int n = nums.length;
int sortedIndex = 0;
while (sortedIndex < n) {
// add a boolean variable to record if a swap operation has been performed
boolean swapped = false;
for (int i = n - 1; i > sortedIndex; i--) {
if (nums[i] < nums[i - 1]) {
// swap(nums[i], nums[i - 1])
int tmp = nums[i];
nums[i] = nums[i - 1];
nums[i - 1] = tmp;
swapped = true;
}
}
// if no swap operation is performed, it indicates that the
// array is already sorted, and the algorithm can terminate
if (!swapped) {
break;
}
sortedIndex++;
}
}
In summary, the above series of optimizations for selection sort have resulted in improved stability and the ability to terminate the algorithm early when the array is already sorted. The only downside is that the time complexity remains and has not been reduced.
Next, we will continue to explore other methods that might improve selection sort.