Use Array to Enhance Hash Table (ArrayHashMap)
In the previous chapter Enhancing Hash Tables with Linked Lists, we used doubly linked lists to enhance hash tables, implementing a data structure like LinkedHashMap
that maintains the insertion order of keys in a hash table.
Linked lists can enhance hash tables, and arrays, being good siblings to linked lists, can also enhance hash tables.
Adding randomKey()
API
Now, here's a challenge for you: based on the standard hash table API, add a new randomKey()
API that can return a random key in time complexity:
interface Map<K, V> {
// Get the value corresponding to the key, time complexity O(1)
V get(K key);
// Add/modify key-value pair, time complexity O(1)
void put(K key, V value);
// Remove key-value pair, time complexity O(1)
void remove(K key);
// Check if the key is contained, time complexity O(1)
boolean containsKey(K key);
// Return all keys, time complexity O(N)
List<K> keys();
// New API: Randomly return a key, required time complexity O(1)
K randomKey();
}
Uniform Random
Note that when we generally talk about randomness, we refer to uniform random, meaning each element has an equal probability of being selected. For example, if you have n
elements, your random algorithm must ensure that each element has a probability of 1/n
of being selected to be considered uniform random.
So, can you do it? Don't underestimate this simple requirement; the implementation is quite ingenious.
From previous lessons, you should know that the essence of a hash table is a table
array. Now, if you want to randomly return a key from a hash table, it's easy to think of randomly selecting an element from an array.
In a standard array, randomly getting an element is straightforward. You just use a random number generator to produce a random index in the range [0, size)
, which effectively means you've found a random element:
int randomeElement(int[] arr) {
Random r = new Random();
// generate a random index in the range [0, arr.length)
return arr[r.nextInt(arr.length)];
}
int randomeElement(vector<int>& arr) {
// generate a random index in the range [0, arr.size())
return arr[rand() % arr.size()];
}
import random
def randomeElement(arr: List[int]) -> int:
# generate a random index in the range [0, len(arr))
return arr[random.randint(0, len(arr) - 1)]
import "math/rand"
func randomeElement(arr []int) int {
// generate a random index in the range [0, len(arr))
return arr[rand.Intn(len(arr))]
}
function randomeElement(arr) {
// generate a random index in the range [0, arr.length)
return arr[Math.floor(Math.random() * arr.length)];
}
This algorithm is correct, with a complexity of O(1), and the probability of each element being selected is 1/n
, where n
is the total number of elements in the arr
array.
However, note that this function has a prerequisite: the elements in the array must be stored compactly without any gaps, such as arr = [1, 2, 3, 4]
. This ensures that any random index corresponds to a valid element.
If there are gaps in the array, problems arise. For example, if arr = [1, 2, null, 4]
, where arr[2] = null
represents a gap with no stored element. If you happen to generate a random number that is 2, what should you do?
You might think of doing a linear search to the left or right to find a non-null element to return, like this:
// return a non-null random element (pseudo code)
int randomeElement(int[] arr) {
Random r = new Random();
// generate a random index in the range [0, arr.length)
int i = r.nextInt(arr.length);
while (arr[i] == null) {
// the randomly generated index i happens to be null
// use the circular array technique to probe to the right
// until a non-null element is found
i = (i + 1) % arr.length;
}
return arr[i];
}
This approach is not feasible, as the algorithm has two issues:
There is a loop, causing the worst-case time complexity to rise to , which does not meet the requirement.
The algorithm is not uniformly random, because your search direction is fixed, making the elements on the right side of the gap more likely to be selected. For example, for
arr = [1, 2, null, 4]
, the probabilities of selecting elements1, 2, 4
are1/4, 1/4, 2/4
, respectively.
There might be another way: if you have bad luck once, try randomizing several times until finding a non-null element:
// return a non-empty random element (pseudo code)
int randomeElement(int[] arr) {
Random r = new Random();
// generate a random index in the range [0, arr.length)
int i = r.nextInt(arr.length);
while (arr[i] == null) {
// the randomly generated index i happens to be null
// regenerate a random index
i = r.nextInt(arr.length);
}
return arr[i];
}
Currently, this algorithm is uniformly random, but the issue is apparent: its time complexity unexpectedly depends on random numbers! It is definitely not , which doesn't meet the requirements.
Have you been stumped by the problem of randomly returning an element from an array with gaps?
Don't forget, our goal now is to randomly return a key from a hash table. The table
array underlying the hash table not only contains gaps but is even more complex:

If your hash table uses open addressing to resolve hash collisions, then you're dealing with the scenario of an array with gaps.
If your hash table uses chaining, it's more complicated. Each element in the array is a linked list, so randomly selecting an index is not enough; you also need to randomly choose a node from the linked list.
Also, consider the probabilities. With chaining, even if you randomly choose an array index uniformly and then a node uniformly from the linked list at that index, is the key you get uniformly random?
Actually, it's not. In the image above, the probability of selecting k1, k2, k3
is 1/2 * 1/3 = 1/6
, while the probability of selecting k4, k5
is 1/2 * 1/2 = 1/4
, which is not uniformly random.
About Probabilistic Algorithms
Probabilistic algorithms are a very interesting class of problems. Both in algorithm problems and real-world applications, some classic random algorithms are used. I will discuss these in more detail in Random Algorithms in Games and Weighted Random Selection, but for now, you don't need to master them.
The only solution is to traverse the entire table
array using the keys
method, store all the keys in an array, and then randomly return one key. However, this results in a complexity of , which still doesn't meet the requirements.
Feeling stuck? This is why accumulating experience in designing classic data structures is essential. If you encounter similar issues during interviews or exams, it might be difficult to come up with a solution on the spot. Next, I will introduce how to enhance a hash table with arrays to easily implement the randomKey()
API.