Array (Sequential Storage)
When we talk about "arrays," there are different contexts because the array types and APIs provided by various programming languages are different. Therefore, let's first unify the terminology to facilitate the subsequent explanation.
I suggest we can temporarily classify "arrays" into two main categories: "static arrays" and "dynamic arrays."
A "static array" is a block of continuous memory space, and we can access the elements in this memory space through indexing. This is the original form of an array.
On the other hand, "dynamic arrays" are built on top of static arrays by programming languages to make them easier to use. They add commonly used APIs such as push
, insert
, remove
, etc. These APIs allow us to manipulate array elements more conveniently without writing the code to implement these operations ourselves.
In this chapter, we will use the most primitive static arrays to implement a dynamic array and create common APIs for adding, deleting, querying, and modifying elements. Once you understand this, you'll know the underlying principles of data structures provided by standard libraries.
With dynamic arrays, more complex data structures like queues, stacks, and hash tables, which will be discussed later, all rely on them for implementation.
Static Arrays
When creating a static array, you need to specify the element type and the number of elements in the array. Only languages like C++, Java, and Golang provide ways to create static arrays. Languages like Python and JavaScript do not offer definitions for static arrays.
The usage of static arrays is quite primitive and rarely used in actual software development or algorithm problems. We typically use dynamic arrays directly. However, to understand the principles, it is necessary to explain them here.
The method to define a static array is as follows:
// define a static array of size 10
int arr[10];
// initialize the array values to 0 using memset function
memset(arr, 0, sizeof(arr));
// assign values using index
arr[0] = 1;
arr[1] = 2;
// retrieve values using index
int a = arr[0];
// define a static array of size 10
int[] arr = new int[10];
// assign values using indices
arr[0] = 1;
arr[1] = 2;
// retrieve values using indices
int a = arr[0];
// define a static array of size 10
var arr [10]int
// assign values using index
arr[0] = 1
arr[1] = 2
// retrieve values using index
a := arr[0]
# Seriously, Python does not have a way to define static arrays
# We can use lists to simulate static arrays for now
# define a static array of size 10
arr = [0] * 10
# assign values using index
arr[0] = 1
arr[1] = 2
# retrieve values using index
a = arr[0]
// Seriously, JavaScript does not have a way to define static arrays
// We can use arrays to simulate static arrays for now
// define a static array of size 10
let arr = new Array(10);
// assign values using index
arr[0] = 1;
arr[1] = 2;
// retrieve values using index
let a = arr[0];
That's it, nothing more.
Let's take C++ as an example. What exactly does the code int arr[10]
do? It mainly involves the following:
It allocates a contiguous block of memory in the memory space, with a size of
10 * sizeof(int)
bytes. Anint
occupies 4 bytes in computer memory, amounting to a total of 40 bytes.It defines an array pointer named
arr
that points to the starting address of this memory block.
What does the code arr[1] = 2
do? It mainly involves the following:
It calculates the offset by adding
1 * sizeof(int)
bytes (4 bytes) to the starting address ofarr
, locating the starting address of the second element in the memory space.It writes the integer
2
into the 4-byte memory space starting from this address.
For Beginners
I remember when I first entered university and started learning the basics of the C language, some classmates struggled to understand the concept of an array of pointers and a pointer to an array. However, once you comprehend this simple process, everything becomes clear.
Why do array indices start from 0? It's to make address calculation easier.
arr[0]
is the starting address ofarr
, and the next 4 bytes store the value of the first element;arr[1]
is the starting address ofarr
plus1 * 4
bytes, which is the starting address of the second element, and the next 4 bytes store the value of the second element. The same logic applies toarr[2], arr[3]
, and so on.Since the array name
arr
points to the starting address of the memory block,arr
is inherently a pointer. Directly obtaining the value at this address gives you the value of the first element. In other words, the value of*arr
isarr[0]
, which is the value of the first element.If you do not use functions like
memset
to initialize the values of an array, the values within the array are indeterminate. The statementint arr[10]
merely requests the operating system to allocate a continuous block of memory. You don't know if this space was previously used and what might be stored within it. Therefore, it's common to use thememset
function to initialize this memory space before use.
Of course, the above content is specific to C/C++, as it is commonly introduced in computer science basics. In languages like Java and Golang, static arrays are automatically initialized with element values set to 0 upon creation, so explicit initialization is unnecessary.
Summary
To summarize the cause-and-effect logic above, a static array is essentially a block of contiguous memory space. From the statement int arr[10]
, we can deduce:
We know the starting address of this memory space (the array name
arr
points to the starting address of this memory space).We know the type of each element (e.g., int), which informs us of the memory space size occupied by each element (e.g., an int occupies 4 bytes, 32 bits).
This memory space is contiguous, with a total size of
10 * sizeof(int)
, which is 40 bytes.
Thus, we gain the array's superpower of "random access": given any array index, we can directly retrieve the corresponding element's value in time.
This is because we can calculate the target element's memory address using the starting address and the index. The memory addressing time of a computer can be considered , so the time complexity for random access in arrays is .
However, the greatest strength of an array, its contiguous memory characteristic, also brings challenges. Below, I will introduce these challenges.
CRUD Operations
The primary functions of data structures are Create, Read, Update, and Delete (CRUD), and nothing else.
In the previous discussion about the underlying principles of arrays, we only covered the "Read" and "Update" operations, which involve accessing and modifying element values using indices. So, how are the "Create" and "Delete" operations implemented?
Create
Adding elements to a static array can be somewhat complex and requires consideration of different scenarios.
Scenario 1: Appending an Element to the End of an Array
For example, if I have an array of size 10 that currently holds 4 elements and I want to append a new element at the end, what should I do?
It's quite simple; you can directly assign a value to the corresponding index. Here's the basic logic in code:
// An array of size 10 already contains 4 elements
int[] arr = new int[10];
for (int i = 0; i < 4; i++) {
arr[i] = i;
}
// Now want to append an element 4 at the end of the array
arr[4] = 4;
// Then append an element 5 at the end of the array
arr[5] = 5;
// And so on
// ...
// An array of size 10 already has 4 elements
int arr[10];
for (int i = 0; i < 4; i++) {
arr[i] = i;
}
// Now I want to add an element 4 at the end of the array
arr[4] = 4;
// Then add an element 5 at the end of the array
arr[5] = 5;
// And so on
// ...
# An array of size 10 already has 4 elements
arr = [0] * 10
for i in range(4):
arr[i] = i
# Now we want to append an element 4 to the end of the array
arr[4] = 4
# Then append an element 5 to the end of the array
arr[5] = 5
# And so on
# ...
// The array of size 10 already has 4 elements
var arr [10]int
for i := 0; i < 4; i++ {
arr[i] = i
}
// Now we want to append an element 4 at the end of the array
arr[4] = 4
// Then append an element 5 at the end of the array
arr[5] = 5
// And so on
// ...
// An array of size 10 already contains 4 elements
var arr = new Array(10);
for (var i = 0; i < 4; i++) {
arr[i] = i;
}
// Now want to add an element 4 to the end of the array
arr[4] = 4;
// Then add an element 5 to the end of the array
arr[5] = 5;
// And so on
// ...
As we can see, since we are only assigning to an index, appending an element at the end of an array has a time complexity of .
Case 2: Inserting an Element in the Middle of an Array
For example, I have an array arr
of size 10, with elements in the first 4 indices. Now, I want to insert a new element at the 3rd position (arr[2]
). How should I proceed?
This involves "data shifting" to make space for the new element, and then we can insert the new element. The basic code logic is as follows:
// An array of size 10 already has 4 elements
int[] arr = new int[10];
for (int i = 0; i < 4; i++) {
arr[i] = i;
}
// Insert element 666 at the 3rd position
// Need to move the elements from the 3rd position onwards one position back
// Note to traverse the array in reverse to avoid overwriting
// existing elements, refer to the visual panel below if unclear
for (int i = 4; i > 2; i--) {
arr[i] = arr[i - 1];
}
// Now the 3rd position is open, and we can insert the new element
arr[2] = 666;
// An array of size 10 already has 4 elements
int arr[10];
for (int i = 0; i < 4; i++) {
arr[i] = i;
}
// Insert element 666 at the 3rd position
// Need to move the elements from the 3rd position onwards one step back
// Note that we should traverse the existing elements in the array backwards
// to avoid overwriting, refer to the visualization panel below if unclear
for (int i = 4; i > 2; i--) {
arr[i] = arr[i - 1];
}
// Now the 3rd position is vacant, we can insert the new element
arr[2] = 666;
# An array of size 10 has already been filled with 4 elements
arr = [0] * 10
for i in range(4):
arr[i] = i
# Insert element 666 at the 3rd position
# Need to move the elements from the 3rd position and onwards one position back
# Note that we should traverse the array in reverse order to avoid
# overwriting, see the visualization panel below if you don't
for i in range(4, 2, -1):
arr[i] = arr[i - 1]
# Now the 3rd position is empty, we can insert the new element
arr[2] = 666
import "fmt"
func main() {
// The array of size 10 already contains 4 elements
var arr [10]int
for i := 0; i < 4; i++ {
arr[i] = i
}
// Insert element 666 at the 3rd position
// Need to move the elements at the 3rd position and after one position back
// Be sure to traverse the existing elements of the array backwards to
// avoid overwriting, if you don't understand, see the visual panel
for i := 4; i > 2; i-- {
arr[i] = arr[i-1]
}
// Now the 3rd position is empty, and the new element can be inserted
arr[2] = 666
// Print the array result
fmt.Println(arr)
}
// An array of size 10 already has 4 elements
var arr = new Array(10);
for (var i = 0; i < 4; i++) {
arr[i] = i;
}
// Insert element 666 at the 3rd position
// Need to move the elements from the 3rd position and onwards one position back
// Note that you should iterate the array backwards to avoid
// overwriting existing elements, refer to the visual panel below if
for (var i = 4; i > 2; i--) {
arr[i] = arr[i - 1];
}
// Now the 3rd position is free, you can insert the new element
arr[2] = 666;
In summary, the time complexity of inserting an element in the middle of an array is due to the need for data shifting to make space for the new element.
Scenario Three: Array Space is Full
When you create a static array, its size is fixed. For example, if I create an array int arr[10]
(a continuous 40-byte memory space) and store 10 elements in it, what happens if I want to insert another element? There's no space left for a new element, whether you wish to append it to the end or insert it in the middle.
Some might suggest simply adding 4 more bytes of continuous memory space after the existing 40 bytes to store the new element. However, that's not possible.
Continuous memory must be allocated all at once and cannot be arbitrarily increased or decreased afterward. The memory space following your allocated block might already be in use by other programs, and you can't claim it just because you need it.
So, what can be done? You must request a larger block of memory, copy the existing data over, and then insert the new element. This is known as the "expansion" operation for arrays.
For instance, you could create a larger array int arr[20]
and copy the original 10 elements into it, creating space for new elements.
The basic logic is as follows:
// the array of size 10 is already full
int[] arr = new int[10];
for (int i = 0; i < 10; i++) {
arr[i] = i;
}
// now want to append an element 10 at the end of the array
// need to expand the array first
int[] newArr = new int[20];
// copy the original 10 elements
for (int i = 0; i < 10; i++) {
newArr[i] = arr[i];
}
// the old array's memory space will be handled by the garbage collector
// ...
// append the new element in the new larger array
newArr[10] = 10;
// The array of size 10 is already filled
int arr[10];
for (int i = 0; i < 10; i++) {
arr[i] = i;
}
// Now want to append an element 10 to the end of the array
// Need to expand the array first
int newArr[20];
// Copy the original 10 elements over
for (int i = 0; i < 10; i++) {
newArr[i] = arr[i];
}
// Release the memory space of the old array
// ...
// Append the new element in the new larger array
newArr[10] = 10;
# The array of size 10 is already full
arr = [i for i in range(10)]
# Now want to append an element 10 to the end of the array
# Need to expand the array first
newArr = [0] * 20
# Copy the original 10 elements over
for i in range(10):
newArr[i] = arr[i]
# Free the memory space of the old array
# ...
# Append the new element in the new large array
newArr[10] = 10
// The array of size 10 is already full
arr := make([]int, 10)
for i := 0; i < 10; i++ {
arr[i] = i
}
// Now want to append an element 10 to the end of the array
// Need to expand the array first
newArr := make([]int, 20)
// Copy the original 10 elements over
for i := 0; i < 10; i++ {
newArr[i] = arr[i]
}
// Release the memory space of the old array
// ...
// Append the new element to the new larger array
newArr[10] = 10
// The array of size 10 is already filled
var arr = new Array(10);
for (var i = 0; i < 10; i++) {
arr[i] = i;
}
// Now we want to append an element 10 at the end of the array
// We need to expand the array first
var newArr = new Array(20);
// Copy the original 10 elements over
for (var i = 0; i < 10; i++) {
newArr[i] = arr[i];
}
// Release the memory space of the old array
// ...
// Append the new element in the new large array
newArr[10] = 10;
In summary, when an array expands, it involves creating a new array and copying data, resulting in a time complexity of .
Deletion
The operation of deleting an element is similar to adding an element and also requires a discussion of different cases.
Case 1: Deleting the Last Element
For instance, if I have an array of size 10 containing 5 elements and I want to delete the last element, what should I do?
It's simple. Just mark the last element with a special value to indicate it has been deleted. For this example, we'll use -1 as the special value to represent deletion. When implementing dynamic arrays later, there will be more robust methods for deleting array elements. Here, the point is to explain that the essence of deleting the last element of an array is a single random access operation, with a time complexity of .
The basic code logic is as follows:
// An array of size 10 already contains 5 elements
int[] arr = new int[10];
for (int i = 0; i < 5; i++) {
arr[i] = i;
}
// Delete the last element, temporarily use -1 to represent the deleted element
arr[4] = -1;
// An array of size 10 already has 5 elements
int arr[10];
for (int i = 0; i < 5; i++) {
arr[i] = i;
}
// Remove the last element, temporarily use -1 to represent the element has been deleted
arr[4] = -1;
# An array of size 10 already contains 5 elements
arr = [0] * 10
for i in range(5):
arr[i] = i
# Remove the last element, temporarily use -1 to represent the deleted element
arr[4] = -1
// The array of size 10 already contains 5 elements
var arr [10]int
for i := 0; i < 5; i++ {
arr[i] = i
}
// Remove the last element, temporarily using -1 to represent the deleted element
arr[4] = -1
// An array of size 10 already contains 5 elements
var arr = new Array(10);
for (var i = 0; i < 5; i++) {
arr[i] = i;
}
// Remove the last element, temporarily use -1 to represent the deleted element
arr[4] = -1;
Case 2: Deleting an Intermediate Element
For instance, I have an array of size 10 that contains 5 elements. Now, I want to delete the second element (arr[1]
). How should I proceed?
This involves "shifting data," where all elements after the one being deleted are moved one position forward to maintain the continuity of the array elements.
The basic code logic is as follows:
// An array of size 10 already contains 5 elements
int[] arr = new int[10];
for (int i = 0; i < 5; i++) {
arr[i] = i;
}
// delete arr[1]
// need to move all elements after arr[1] one position forward
// note that you should traverse the array forward to avoid
// overwriting, refer to the visualization panel below if you don't
for (int i = 1; i < 4; i++) {
arr[i] = arr[i + 1];
}
// set the last element to -1 to indicate deletion
arr[4] = -1;
// An array of size 10 already contains 5 elements
int arr[10];
for (int i = 0; i < 5; i++) {
arr[i] = i;
}
// Delete arr[1]
// Elements after arr[1] need to be shifted forward by one position
// Make sure to traverse the existing elements of the array in order to avoid
// overwriting. If unclear, please refer to the visualization panel below.
for (int i = 1; i < 4; i++) {
arr[i] = arr[i + 1];
}
// Set the last element to -1 to indicate deletion
arr[4] = -1;
# An array of size 10 has already been filled with 5 elements
arr = [0] * 10
for i in range(5):
arr[i] = i
# Delete arr[1]
# Need to move all elements after arr[1] one position forward
# Note that we should traverse the existing elements in the array forward to
# avoid overwriting, see the visualization panel below if you don't understand
for i in range(1, 4):
arr[i] = arr[i + 1]
# Set the last element to -1 to indicate deletion
arr[4] = -1
// an array of size 10 already contains 5 elements
var arr [10]int
for i := 0; i < 5; i++ {
arr[i] = i
}
// delete arr[1]
// need to move all elements after arr[1] one position forward
// be sure to traverse the array of existing elements forward to avoid
// overwriting, if you don't understand, please see the visualization panel
for i := 1; i < 4; i++ {
arr[i] = arr[i + 1]
}
// set the last element to -1 to indicate deletion
arr[4] = -1
// An array of size 10 has 5 elements already
var arr = new Array(10);
for (var i = 0; i < 5; i++) {
arr[i] = i;
}
// delete arr[1]
// need to move the elements after arr[1] one position forward
// note that we should traverse the existing elements of the array from
// left to right to avoid overwriting. If unclear, see the visual panel
for (var i = 1; i < 4; i++) {
arr[i] = arr[i + 1];
}
// set the last element to -1 to indicate deletion
arr[4] = -1;
In summary, the time complexity of removing an element from the middle of an array is due to data shifting.
Summary
In conclusion, the time complexities for operations on static arrays are as follows:
- Addition:
- Appending an element at the end: .
- Inserting an element in the middle (not at the end): .
- Deletion:
- Removing an element from the end: .
- Removing an element from the middle (not at the end): .
- Access: Given a specific index, retrieving the value of the element at that index has a time complexity of .
- Update: Given a specific index, modifying the value of the element at that index has a time complexity of .
Some readers might ask, didn’t we discuss the operation of array expansion earlier? Expansion involves allocating new array space and copying data, which has a time complexity of . Why isn't this complexity included in the "addition" complexity?
This is a good question, but array expansion is not triggered every time an element is added. Therefore, the complexity of expansion should be analyzed using "amortized time complexity." I have explained this concept in detail in the article Analyzing Time and Space Complexity, so I won't elaborate here.
Another point beginners should note is that when we say the complexity of accessing or updating an array is , this applies only when a specific index is given. Conversely, if you are given a value and need to find its corresponding index in the array, you would have to search through the entire array. This complexity would be .
Therefore, it is important to understand the principles rather than memorize concepts. Once you understand the principles, you can derive the concepts yourself.
Dynamic Arrays
Previously, we discussed the capabilities and limitations of static arrays. Now, let's talk about dynamic arrays. Dynamic arrays are an enhanced version of static arrays and are one of the most commonly used data structures in software development and algorithm challenges.
First, do not assume that dynamic arrays solve the inefficiency of inserting and deleting elements from the middle of static arrays. That issue cannot be resolved. The capability of arrays for random access comes from their contiguous memory space, which inevitably leads to challenges with data relocation and resizing.
The underlying structure of dynamic arrays is still a static array, but it automatically handles resizing and encapsulates operations like adding, deleting, searching, and modifying, making it more convenient for us to use.
Here are some simple examples of how dynamic arrays are used in different programming languages:
// create a dynamic array
// no need to explicitly specify array size, it will
// automatically resize based on the number of elements
ArrayList<Integer> arr = new ArrayList<>();
for (int i = 0; i < 10; i++) {
// append elements at the end, time complexity O(1)
arr.add(i);
}
// insert elements in the middle, time complexity O(N)
// insert element 666 at index 2
arr.add(2, 666);
// insert elements at the beginning, time complexity O(N)
arr.add(0, -1);
// remove the last element, time complexity O(1)
arr.remove(arr.size() - 1);
// remove elements in the middle, time complexity O(N)
// remove element at index 2
arr.remove(2);
// query element by index, time complexity O(1)
int a = arr.get(0);
// modify element by index, time complexity O(1)
arr.set(0, 100);
// find index by element value, time complexity O(N)
int index = arr.indexOf(666);
// create a dynamic array
// no need to specify the size explicitly, it will
// automatically resize based on the number of stored
vector<int> arr;
for (int i = 0; i < 10; i++) {
// append elements at the end, time complexity O(1)
arr.push_back(i);
}
// insert elements in the middle, time complexity O(N)
// insert element 666 at index 2
arr.insert(arr.begin() + 2, 666);
// insert elements at the beginning, time complexity O(N)
arr.insert(arr.begin(), -1);
// remove elements from the end, time complexity O(1)
arr.pop_back();
// remove elements from the middle, time complexity O(N)
// remove the element at index 2
arr.erase(arr.begin() + 2);
// query elements by index, time complexity O(1)
int a = arr[0];
// modify elements by index, time complexity O(1)
arr[0] = 100;
// find index by element value, time complexity O(N)
int index = find(arr.begin(), arr.end(), 666) - arr.begin();
# create a dynamic array
# no need to explicitly specify the array size, it will
# automatically expand and shrink based on the actual number of
arr = []
for i in range(10):
# append elements to the end, time complexity O(1)
arr.append(i)
# insert elements in the middle, time complexity O(N)
# insert element 666 at index 2
arr.insert(2, 666)
# insert elements at the beginning, time complexity O(N)
arr.insert(0, -1)
# delete the last element, time complexity O(1)
arr.pop()
# delete elements in the middle, time complexity O(N)
# delete the element at index 2
arr.pop(2)
# query elements by index, time complexity O(1)
a = arr[0]
# modify elements by index, time complexity O(1)
arr[0] = 100
# find index by element value, time complexity O(N)
index = arr.index(666)
import (
"fmt"
)
func main() {
// create a dynamic array
// no need to explicitly specify array size, it will auto
// expand/contract based on the actual number of elements
arr := make([]int, 0)
for i := 0; i < 10; i++ {
// append elements to the end, time complexity O(1)
arr = append(arr, i)
}
// insert elements in the middle, time complexity O(N)
// insert element 666 at index 2
arr = append(arr[:2], append([]int{666}, arr[2:]...)...)
// insert elements at the beginning, time complexity O(N)
arr = append([]int{-1}, arr...)
// delete the last element, time complexity O(1)
arr = arr[:len(arr)-1]
// delete elements in the middle, time complexity O(N)
// delete the element at index 2
arr = append(arr[:2], arr[3:]...)
// query element by index, time complexity O(1)
a := arr[0]
// modify element by index, time complexity O(1)
arr[0] = 100
// find index by element value, time complexity O(N)
index := -1
for i, v := range arr {
if v == 666 {
index = i
break
}
}
}
// create a dynamic array
// No need to explicitly specify the array size, it will
// automatically resize based on the number of elements
var arr = [];
for (var i = 0; i < 10; i++) {
// Append elements to the end, time complexity O(1)
arr.push(i);
}
// Insert elements in the middle, time complexity O(N)
// Insert element 666 at index 2
arr.splice(2, 0, 666);
// Insert elements at the beginning, time complexity O(N)
arr.unshift(-1);
// Remove the last element, time complexity O(1)
arr.pop();
// Remove elements in the middle, time complexity O(N)
// Remove the element at index 2
arr.splice(2, 1);
// Query elements by index, time complexity O(1)
var a = arr[0];
// Modify elements by index, time complexity O(1)
arr[0] = 100;
// Find index by element value, time complexity O(N)
var index = arr.indexOf(666);
In the following sections, I will guide you step-by-step to implement a dynamic array, helping you gain a deeper understanding of the principles behind dynamic arrays.