Skip List Basics
Prerequisites
Before reading this article, you should first learn:
In actual interviews, you will rarely be asked to write the code for skip lists by hand, but you might be asked about the basic principles and complexity analysis of skip lists. Therefore, it is necessary for this site to explain this data structure.
This article is in the basic section and will not specifically explain the implementation details of skip lists, only the core principles. Beginners reading this should understand that such a data structure exists, its basic principles, and its time complexity. The specific code implementation will be placed in the data structure design section.
In Basics of Linked Lists, we mentioned that the time complexity for adding, deleting, updating, or searching for an element at a specified index in a singly linked list is .
In fact, if you have the node of the linked list to be operated on, you can complete deletion, modification, or insertion by manipulating a few pointers, and the time complexity is .
The main time consumption is in the lookup operation, because to find the node corresponding to an index, you must start from the head node and traverse to the target node one by one before performing the deletion, modification, or insertion.
So, is there a way to optimize the linked list to support fast lookup operations?
One way is to use key-value mapping, which allows you to access the target node directly in time, avoiding the time-consuming traversal. This approach will be detailed in Hash Linked Lists (LinkedHashMap).
Another way is the skip list, which is introduced in this article. By using the idea of trading space for time, it records extra information in additional space. The time complexity for addition, deletion, search, and update operations can be optimized to .
Core Principles of Skip Lists
Let's take querying a specific index as an example to see how skip lists optimize singly linked lists.
A regular singly linked list looks like this:
index 0 1 2 3 4 5 6 7 8 9
node a->b->c->d->e->f->g->h->i->j
If we want to query the element at index 7, we have to start from index 0 and traverse forward until we reach index 7, finding the target node h
.
A skip list, however, is structured like this:
indexLevel 0-----------------------8-----10
indexLevel 0-----------4-----------8-----10
indexLevel 0-----2-----4-----6-----8-----10
indexLevel 0--1--2--3--4--5--6--7--8--9--10
nodeLevel a->b->c->d->e->f->g->h->i->j->k
A skip list adds multiple layers of indices on top of the original list. Each higher layer reduces the number of index nodes by half, doubling the interval between indices. Hence, the height of the indices is .
When querying the element at index 7, we can start from the highest level index and search downwards layer by layer:
First, the highest level index range is [0, 8]
, confirming that index 7 is within this range, so we begin searching from node 0 on the next level;
On the second level, starting from node 0, the index range [0, 4]
does not include index 7, so we move right to node 4. The index range [4, 8]
includes index 7, so we continue searching from node 4 on the next level;
On the third level, starting from node 4, the index range [4, 6]
does not include index 7, so we move right to node 6. The index range [6, 8]
includes index 7, so we continue searching from node 6 on the next level;
On the fourth level, starting from node 6, the index range [6, 7]
includes index 7, finally locating the target node h
.
Throughout this search process, we traverse levels of indices, with no more than 2 moves per level (since each upper index range is divided in half at the next level), making the query time complexity of skip lists .
Summary
The simplified example above should give you an intuitive understanding of the core principles of skip lists. Skip lists are a classic example of the space-time tradeoff design approach, where additional layers of indexes are maintained to increase space complexity while reducing the time complexity of operations like insertion, deletion, and search.
The actual implementation of skip lists is somewhat more complex and differs from the simplified example above. Here are a few additional points:
The example above only demonstrates the search operation, but skip lists must also support insertion and deletion. This involves dynamically adjusting nodes in the index layers, ensuring that each layer's index intervals are as close to binary as possible. This ensures that the height of the index layers remains , otherwise the time complexity would degrade.
Skip lists are not just for finding nodes corresponding to indexes; they can also be applied to more general scenarios, such as storing and searching ordered key-value pairs. In fact, the use cases for skip lists are very similar to those of binary search trees, which we will study later. However, the code implementation of skip lists is much simpler compared to self-balancing binary search trees.
For the specific implementation of skip lists, I will update the data structure design chapter. Stay tuned.