Use Linked List to Enhance Hash Table (LinkedHashMap)
In the previous article Principles of Hash Tables, we discussed that you cannot rely on the order of keys when looping through a hash table. The keys in a hash table are unordered.
But based on your programming experience, you may have some questions.
For example, if you use Python, you might know that since Python 3.7, the standard hash table dict
guarantees that the keys will be returned in the order you inserted them. For example, look at this code:
d = dict()
d['a'] = 1
d['b'] = 2
d['c'] = 3
print(list(d.keys())) # ['a', 'b', 'c']
d['y'] = 4
print(list(d.keys())) # ['a', 'b', 'c', 'y']
d['d'] = 5
print(list(d.keys())) # ['a', 'b', 'c', 'y', 'd']
No matter how many keys you insert, the keys
method returns them in the same order you added them. It feels like adding elements to the end of an array. How is this possible?
If you use Golang, you might see something even more interesting. Look at this test code:
package main
import (
"fmt"
)
func main() {
// initialize the map
myMap := map[string]int{
"1": 1,
"2": 2,
"3": 3,
"4": 4,
"5": 5,
}
// define a function to iterate through the map
printMapKeys := func(m map[string]int) {
for key := range m {
fmt.Print(key, " ")
}
fmt.Println()
}
// iterate through the map multiple times to observe the order of keys
printMapKeys(myMap)
printMapKeys(myMap)
printMapKeys(myMap)
printMapKeys(myMap)
}
// my output is as follows:
// 1 2 3 4 5
// 5 1 2 3 4
// 2 3 4 5 1
// 1 2 3 4 5
In other words, the order of traversal is random each time. But as mentioned before in How Hash Tables Work, even though the keys in a hash table are unordered, if you don't change the table, the traversal order should stay the same. So why does Golang's map give a different order every time? Isn't this strange?
You can think about the reason first. Below is the answer.
Click to see the answer
Let's talk about Golang first. The reason the order is different every time is because it is designed that way on purpose.
This reason is a bit funny. Golang wants to stop developers from depending on the order of hash table traversal, so it returns a different order on purpose each time you loop through a map. This design shows that many developers have not learned the basics of hash tables.
Let's think more deeply. How does it shuffle the order? Is it really random?
Actually, no. If you look closely, the "random" order follows a pattern. Do you remember the Circular Array we talked about before?
| 1 2 3 4 5
5 | 1 2 3 4
2 3 4 5 | 1
| 1 2 3 4 5
Can you see it now? If there is no resizing of the table, the order of traversal is actually fixed. It just doesn't always start from the start of the underlying table
array. Instead, it starts from a random position, then uses the circular array trick to go through the whole table
array. This way, the order looks random each time, but performance is not lost just to make it random.
Now let's talk about Python. Python can keep all the keys in insertion order because it combines a standard hash table with a linked list, creating a new data structure: a hash linked list.
Other languages have similar structures, like Java's LinkedHashMap
. This data structure keeps the performance of hash tables for add, remove, and search, and also keeps the insertion order like an array or linked list.
How does it do that? I will explain more below.