Array (Sequential Storage)

labuladongOriginalAbout 4950 words

When we talk about "arrays," there are different contexts because the array types and APIs provided by various programming languages are different. Therefore, let's first unify the terminology to facilitate the subsequent explanation.

I suggest we can temporarily classify "arrays" into two main categories: "static arrays" and "dynamic arrays."

A "static array" is a block of continuous memory space, and we can access the elements in this memory space through indexing. This is the original form of an array.

On the other hand, "dynamic arrays" are built on top of static arrays by programming languages to make them easier to use. They add commonly used APIs such as push, insert, remove, etc. These APIs allow us to manipulate array elements more conveniently without writing the code to implement these operations ourselves.

In this chapter, we will use the most primitive static arrays to implement a dynamic array and create common APIs for adding, deleting, querying, and modifying elements. Once you understand this, you'll know the underlying principles of data structures provided by standard libraries.

With dynamic arrays, more complex data structures like queues, stacks, and hash tables, which will be discussed later, all rely on them for implementation.

Static Arrays

When creating a static array, you need to specify the element type and the number of elements in the array. Only languages like C++, Java, and Golang provide ways to create static arrays. Languages like Python and JavaScript do not offer definitions for static arrays.

The usage of static arrays is quite primitive and rarely used in actual software development or algorithm problems. We typically use dynamic arrays directly. However, to understand the principles, it is necessary to explain them here.

The method to define a static array is as follows:

cpp

// define a static array of size 10
int arr[10];

// initialize the array values to 0 using memset function
memset(arr, 0, sizeof(arr));

// assign values using index
arr[0] = 1;
arr[1] = 2;

// retrieve values using index
int a = arr[0];

java

// define a static array of size 10
int[] arr = new int[10];

// assign values using indices
arr[0] = 1;
arr[1] = 2;

// retrieve values using indices
int a = arr[0];

// define a static array of size 10
var arr [10]int

// assign values using index
arr[0] = 1
arr[1] = 2

// retrieve values using index
a := arr[0]

python

# Seriously, Python does not have a way to define static arrays
# We can use lists to simulate static arrays for now

# define a static array of size 10
arr = [0] * 10

# assign values using index
arr[0] = 1
arr[1] = 2

# retrieve values using index
a = arr[0]

javascript

// Seriously, JavaScript does not have a way to define static arrays
// We can use arrays to simulate static arrays for now

// define a static array of size 10
let arr = new Array(10);

// assign values using index
arr[0] = 1;
arr[1] = 2;

// retrieve values using index
let a = arr[0];

That's it, nothing more.

Let's take C++ as an example. What exactly does the code int arr[10] do? It mainly involves the following:

It allocates a contiguous block of memory in the memory space, with a size of 10 * sizeof(int) bytes. An int occupies 4 bytes in computer memory, amounting to a total of 40 bytes.
It defines an array pointer named arr that points to the starting address of this memory block.

What does the code arr[1] = 2 do? It mainly involves the following:

It calculates the offset by adding 1 * sizeof(int) bytes (4 bytes) to the starting address of arr, locating the starting address of the second element in the memory space.
It writes the integer 2 into the 4-byte memory space starting from this address.

For Beginners

I remember when I first entered university and started learning the basics of the C language, some classmates struggled to understand the concept of an array of pointers and a pointer to an array. However, once you comprehend this simple process, everything becomes clear.

Why do array indices start from 0? It's to make address calculation easier. arr[0] is the starting address of arr, and the next 4 bytes store the value of the first element; arr[1] is the starting address of arr plus 1 * 4 bytes, which is the starting address of the second element, and the next 4 bytes store the value of the second element. The same logic applies to arr[2], arr[3], and so on.
Since the array name arr points to the starting address of the memory block, arr is inherently a pointer. Directly obtaining the value at this address gives you the value of the first element. In other words, the value of *arr is arr[0], which is the value of the first element.
If you do not use functions like memset to initialize the values of an array, the values within the array are indeterminate. The statement int arr[10] merely requests the operating system to allocate a continuous block of memory. You don't know if this space was previously used and what might be stored within it. Therefore, it's common to use the memset function to initialize this memory space before use.

Of course, the above content is specific to C/C++, as it is commonly introduced in computer science basics. In languages like Java and Golang, static arrays are automatically initialized with element values set to 0 upon creation, so explicit initialization is unnecessary.

Summary

To summarize the cause-and-effect logic above, a static array is essentially a block of contiguous memory space. From the statement int arr[10], we can deduce:

We know the starting address of this memory space (the array name arr points to the starting address of this memory space).
We know the type of each element (e.g., int), which informs us of the memory space size occupied by each element (e.g., an int occupies 4 bytes, 32 bits).
This memory space is contiguous, with a total size of 10 * sizeof(int), which is 40 bytes.

Thus, we gain the array's superpower of "random access": given any array index, we can directly retrieve the corresponding element's value in $O(1)$ time.

This is because we can calculate the target element's memory address using the starting address and the index. The memory addressing time of a computer can be considered $O(1)$ , so the time complexity for random access in arrays is $O(1)$ .

However, the greatest strength of an array, its contiguous memory characteristic, also brings challenges. Below, I will introduce these challenges.

CRUD Operations

The primary functions of data structures are Create, Read, Update, and Delete (CRUD), and nothing else.

In the previous discussion about the underlying principles of arrays, we only covered the "Read" and "Update" operations, which involve accessing and modifying element values using indices. So, how are the "Create" and "Delete" operations implemented?

Create

Adding elements to a static array can be somewhat complex and requires consideration of different scenarios.

Scenario 1: Appending an Element to the End of an Array

For example, if I have an array of size 10 that currently holds 4 elements and I want to append a new element at the end, what should I do?

It's quite simple; you can directly assign a value to the corresponding index. Here's the basic logic in code:

java

// An array of size 10 already contains 4 elements
int[] arr = new int[10];
for (int i = 0; i < 4; i++) {
    arr[i] = i;
}

// Now want to append an element 4 at the end of the array
arr[4] = 4;

// Then append an element 5 at the end of the array
arr[5] = 5;

// And so on
// ...

cpp

// An array of size 10 already has 4 elements
int arr[10];
for (int i = 0; i < 4; i++) {
    arr[i] = i;
}

// Now I want to add an element 4 at the end of the array
arr[4] = 4;

// Then add an element 5 at the end of the array
arr[5] = 5;

// And so on
// ...

python

# An array of size 10 already has 4 elements
arr = [0] * 10
for i in range(4):
    arr[i] = i

# Now we want to append an element 4 to the end of the array
arr[4] = 4

# Then append an element 5 to the end of the array
arr[5] = 5

# And so on
# ...

// The array of size 10 already has 4 elements
var arr [10]int
for i := 0; i < 4; i++ {
    arr[i] = i
}

// Now we want to append an element 4 at the end of the array
arr[4] = 4

// Then append an element 5 at the end of the array
arr[5] = 5

// And so on
// ...

javascript

// An array of size 10 already contains 4 elements
var arr = new Array(10);
for (var i = 0; i < 4; i++) {
    arr[i] = i;
}

// Now want to add an element 4 to the end of the array
arr[4] = 4;

// Then add an element 5 to the end of the array
arr[5] = 5;

// And so on
// ...

As you can see, since it's just assigning a value to the index, the time complexity of appending an element to the end of the array is $O(1)$ .

Case 2: Inserting Elements in the Middle of the Array

For example, I have an array arr of size 10, with elements in the first 4 indices, and now I want to insert a new element at the 3rd position (index 2 arr[2]), what should I do?

This involves "data shifting", making room for the new element before it can be inserted. The general logic of the code is as follows:

java

// An array of size 10 already has 4 elements
int[] arr = new int[10];
for (int i = 0; i < 4; i++) {
    arr[i] = i;
}

// Insert element 666 at the index 2
// Need to move the elements from the index 2 and onwards one step back
// Note that we should traverse the array backwards to avoid
// overwriting existing elements, refer to the visual panel below if unclear
for (int i = 4; i > 2; i--) {
    arr[i] = arr[i - 1];
}

// Now the 3rd position is open, and we can insert the new element
arr[2] = 666;

cpp

// An array of size 10 already has 4 elements
int arr[10];
for (int i = 0; i < 4; i++) {
    arr[i] = i;
}

// Insert element 666 at the index 2
// Need to move the elements from the index 2 and onwards one step back
// Note that we should traverse the existing elements in the array backwards
// to avoid overwriting, refer to the visualization panel below if unclear
for (int i = 4; i > 2; i--) {
    arr[i] = arr[i - 1];
}

// Now the 3rd position is vacant, we can insert the new element
arr[2] = 666;

python

# An array of size 10 has already been filled with 4 elements
arr = [0] * 10
for i in range(4):
    arr[i] = i

# Insert element 666 at the index 2
# Need to move the elements from the index 2 and onwards one step back
# Note that we should traverse the array in reverse order to avoid
# overwriting, see the visualization panel below if you don't understand
for i in range(4, 2, -1):
    arr[i] = arr[i - 1]

# Now the 3rd position is empty, we can insert the new element
arr[2] = 666

import "fmt"

func main() {
    // The array of size 10 already contains 4 elements
    var arr [10]int
    for i := 0; i < 4; i++ {
        arr[i] = i
    }

    // Insert element 666 at the index 2
    // Need to move the elements from the index 2 and onwards one step back
    // Be sure to traverse the existing elements of the array backwards to
    // avoid overwriting, if you don't understand, see the visual panel below
    for i := 4; i > 2; i-- {
        arr[i] = arr[i-1]
    }

    // Now the 3rd position is empty, and the new element can be inserted
    arr[2] = 666

    // Print the array result
    fmt.Println(arr)
}

javascript

// An array of size 10 already has 4 elements
var arr = new Array(10);
for (var i = 0; i < 4; i++) {
    arr[i] = i;
}

// Insert element 666 at the index 2
// Need to move the elements from the index 2 and onwards one step back
// Note that you should iterate the array backwards to avoid
// overwriting existing elements, refer to the visual panel below if unclear
for (var i = 4; i > 2; i--) {
    arr[i] = arr[i - 1];
}

// Now the 3rd position is free, you can insert the new element
arr[2] = 666;

Algorithm Visualization

In summary, the time complexity of inserting an element in the middle of an array is $O(N)$ due to data shifting to make room for the new element.

Scenario Three: Array Space is Full

When creating a static array, its size must be determined initially. For example, if I create an array int arr[10] (a continuous memory space of 40 bytes) and store 10 elements in it, what should I do if I want to insert another element? Whether appending at the end or inserting in the middle, there is no space left for the new element.

Some readers might suggest simply adding 4 more bytes of continuous memory space after the current 40 bytes to store the new element. Isn’t that simple?

That won't work. Continuous memory must be allocated all at once and cannot be arbitrarily adjusted afterward. The memory space following your continuous block may already be occupied by other programs, and you cannot claim it as you wish.

What should be done then? You have to request a larger memory space, copy the original data into it, and then insert the new element. This process is known as array "expansion".

For instance, you could create a larger array int arr[20] and copy the original 10 elements into it, providing extra space for inserting new elements.

The general logic is as follows:

java

// the array of size 10 is already full
int[] arr = new int[10];
for (int i = 0; i < 10; i++) {
    arr[i] = i;
}

// now want to append an element 10 at the end of the array
// need to expand the array first
int[] newArr = new int[20];
// copy the original 10 elements
for (int i = 0; i < 10; i++) {
    newArr[i] = arr[i];
}

// the old array's memory space will be handled by the garbage collector
// ...

// append the new element in the new larger array
newArr[10] = 10;

cpp

// The array of size 10 is already filled
int arr[10];
for (int i = 0; i < 10; i++) {
    arr[i] = i;
}

// Now want to append an element 10 to the end of the array
// Need to expand the array first
int newArr[20];
// Copy the original 10 elements over
for (int i = 0; i < 10; i++) {
    newArr[i] = arr[i];
}

// Release the memory space of the old array
// ...

// Append the new element in the new larger array
newArr[10] = 10;

python

# The array of size 10 is already full
arr = [i for i in range(10)]

# Now want to append an element 10 to the end of the array
# Need to expand the array first
newArr = [0] * 20

# Copy the original 10 elements over
for i in range(10):
    newArr[i] = arr[i]

# Free the memory space of the old array
# ...

# Append the new element in the new large array
newArr[10] = 10

// The array of size 10 is already full
arr := make([]int, 10)
for i := 0; i < 10; i++ {
    arr[i] = i
}

// Now want to append an element 10 to the end of the array
// Need to expand the array first
newArr := make([]int, 20)
// Copy the original 10 elements over
for i := 0; i < 10; i++ {
    newArr[i] = arr[i]
}

// Release the memory space of the old array
// ...

// Append the new element to the new larger array
newArr[10] = 10

javascript

// The array of size 10 is already filled
var arr = new Array(10);
for (var i = 0; i < 10; i++) {
    arr[i] = i;
}

// Now we want to append an element 10 at the end of the array
// We need to expand the array first
var newArr = new Array(20);
// Copy the original 10 elements over
for (var i = 0; i < 10; i++) {
    newArr[i] = arr[i];
}

// Release the memory space of the old array
// ...

// Append the new element in the new large array
newArr[10] = 10;

In summary, when an array expands, it involves creating a new array and copying data, resulting in a time complexity of $O(N)$ .

Deletion

The operation of deleting an element is similar to adding an element and also requires a discussion of different cases.

Case 1: Deleting the Last Element

For instance, if I have an array of size 10 containing 5 elements and I want to delete the last element, what should I do?

It's simple. Just mark the last element with a special value to indicate it has been deleted. For this example, we'll use -1 as the special value to represent deletion. When implementing dynamic arrays later, there will be more robust methods for deleting array elements. Here, the point is to explain that the essence of deleting the last element of an array is a single random access operation, with a time complexity of $O(1)$ .

The basic code logic is as follows:

java

// An array of size 10 already contains 5 elements
int[] arr = new int[10];
for (int i = 0; i < 5; i++) {
    arr[i] = i;
}

// Delete the last element, temporarily use -1 to represent the deleted element
arr[4] = -1;

cpp

// An array of size 10 already has 5 elements
int arr[10];
for (int i = 0; i < 5; i++) {
    arr[i] = i;
}

// Remove the last element, temporarily use -1 to represent the element has been deleted
arr[4] = -1;

python

# An array of size 10 already contains 5 elements
arr = [0] * 10
for i in range(5):
    arr[i] = i

# Remove the last element, temporarily use -1 to represent the deleted element
arr[4] = -1

// The array of size 10 already contains 5 elements
var arr [10]int
for i := 0; i < 5; i++ {
    arr[i] = i
}

// Remove the last element, temporarily using -1 to represent the deleted element
arr[4] = -1

javascript

// An array of size 10 already contains 5 elements
var arr = new Array(10);
for (var i = 0; i < 5; i++) {
    arr[i] = i;
}

// Remove the last element, temporarily use -1 to represent the deleted element
arr[4] = -1;

Case 2: Deleting an Intermediate Element

For instance, I have an array of size 10 that contains 5 elements. Now, I want to delete the second element (arr[1]). How should I proceed?

This involves "shifting data," where all elements after the one being deleted are moved one position forward to maintain the continuity of the array elements.

The basic code logic is as follows:

java

// An array of size 10 already contains 5 elements
int[] arr = new int[10];
for (int i = 0; i < 5; i++) {
    arr[i] = i;
}

// delete arr[1]
// need to move all elements after arr[1] one position forward
// note that you should traverse the array forward to avoid
// overwriting, refer to the visualization panel below if you don't understand
for (int i = 1; i < 4; i++) {
    arr[i] = arr[i + 1];
}

// set the last element to -1 to indicate deletion
arr[4] = -1;

cpp

// An array of size 10 already contains 5 elements
int arr[10];
for (int i = 0; i < 5; i++) {
    arr[i] = i;
}

// Delete arr[1]
// Elements after arr[1] need to be shifted forward by one position
// Make sure to traverse the existing elements of the array in order to avoid
// overwriting. If unclear, please refer to the visualization panel below.
for (int i = 1; i < 4; i++) {
    arr[i] = arr[i + 1];
}

// Set the last element to -1 to indicate deletion
arr[4] = -1;

python

# An array of size 10 has already been filled with 5 elements
arr = [0] * 10
for i in range(5):
    arr[i] = i

# Delete arr[1]
# Need to move all elements after arr[1] one position forward
# Note that we should traverse the existing elements in the array forward to
# avoid overwriting, see the visualization panel below if you don't understand
for i in range(1, 4):
    arr[i] = arr[i + 1]

# Set the last element to -1 to indicate deletion
arr[4] = -1

// an array of size 10 already contains 5 elements
var arr [10]int
for i := 0; i < 5; i++ {
    arr[i] = i
}

// delete arr[1]
// need to move all elements after arr[1] one position forward
// be sure to traverse the array of existing elements forward to avoid
// overwriting, if you don't understand, please see the visualization panel below
for i := 1; i < 4; i++ {
    arr[i] = arr[i + 1]
}

// set the last element to -1 to indicate deletion
arr[4] = -1

javascript

// An array of size 10 has 5 elements already
var arr = new Array(10);
for (var i = 0; i < 5; i++) {
    arr[i] = i;
}

// delete arr[1]
// need to move the elements after arr[1] one position forward
// note that we should traverse the existing elements of the array from
// left to right to avoid overwriting. If unclear, see the visual panel below
for (var i = 1; i < 4; i++) {
    arr[i] = arr[i + 1];
}

// set the last element to -1 to indicate deletion
arr[4] = -1;

Algorithm Visualization

In summary, the time complexity of removing an element from the middle of an array is $O(N)$ due to data shifting.

Summary

In conclusion, the time complexities for operations on static arrays are as follows:

Addition:
- Appending an element at the end: $O(1)$ .
- Inserting an element in the middle (not at the end): $O(N)$ .
Deletion:
- Removing an element from the end: $O(1)$ .
- Removing an element from the middle (not at the end): $O(N)$ .
Access: Given a specific index, retrieving the value of the element at that index has a time complexity of $O(1)$ .
Update: Given a specific index, modifying the value of the element at that index has a time complexity of $O(1)$ .

Some readers might ask, didn’t we discuss the operation of array expansion earlier? Expansion involves allocating new array space and copying data, which has a time complexity of $O(N)$ . Why isn't this complexity included in the "addition" complexity?

This is a good question, but array expansion is not triggered every time an element is added. Therefore, the complexity of expansion should be analyzed using "amortized time complexity." I have explained this concept in detail in the article Analyzing Time and Space Complexity, so I won't elaborate here.

Another point beginners should note is that when we say the complexity of accessing or updating an array is $O(1)$ , this applies only when a specific index is given. Conversely, if you are given a value and need to find its corresponding index in the array, you would have to search through the entire array. This complexity would be $O(N)$ .

Therefore, it is important to understand the principles rather than memorize concepts. Once you understand the principles, you can derive the concepts yourself.

Dynamic Arrays

Previously, we discussed the capabilities and limitations of static arrays. Now, let's talk about dynamic arrays. Dynamic arrays are an enhanced version of static arrays and are one of the most commonly used data structures in software development and algorithm challenges.

First, do not assume that dynamic arrays solve the inefficiency of inserting and deleting elements from the middle of static arrays. That issue cannot be resolved. The capability of arrays for random access comes from their contiguous memory space, which inevitably leads to challenges with data relocation and resizing.

The underlying structure of dynamic arrays is still a static array, but it automatically handles resizing and encapsulates operations like adding, deleting, searching, and modifying, making it more convenient for us to use.

Here are some simple examples of how dynamic arrays are used in different programming languages:

java

// create a dynamic array
// no need to explicitly specify array size, it will
// automatically resize based on the number of elements stored
ArrayList<Integer> arr = new ArrayList<>();

for (int i = 0; i < 10; i++) {
    // append elements at the end, time complexity O(1)
    arr.add(i);
}

// insert elements in the middle, time complexity O(N)
// insert element 666 at index 2
arr.add(2, 666);

// insert elements at the beginning, time complexity O(N)
arr.add(0, -1);

// remove the last element, time complexity O(1)
arr.remove(arr.size() - 1);

// remove elements in the middle, time complexity O(N)
// remove element at index 2
arr.remove(2);

// query element by index, time complexity O(1)
int a = arr.get(0);

// modify element by index, time complexity O(1)
arr.set(0, 100);

// find index by element value, time complexity O(N)
int index = arr.indexOf(666);

cpp

// create a dynamic array
// no need to specify the size explicitly, it will
// automatically resize based on the number of stored elements
vector<int> arr;

for (int i = 0; i < 10; i++) {
    // append elements at the end, time complexity O(1)
    arr.push_back(i);
}

// insert elements in the middle, time complexity O(N)
// insert element 666 at index 2
arr.insert(arr.begin() + 2, 666);

// insert elements at the beginning, time complexity O(N)
arr.insert(arr.begin(), -1);

// remove elements from the end, time complexity O(1)
arr.pop_back();

// remove elements from the middle, time complexity O(N)
// remove the element at index 2
arr.erase(arr.begin() + 2);

// query elements by index, time complexity O(1)
int a = arr[0];

// modify elements by index, time complexity O(1)
arr[0] = 100;

// find index by element value, time complexity O(N)
int index = find(arr.begin(), arr.end(), 666) - arr.begin();

python

# create a dynamic array
# no need to explicitly specify the array size, it will
# automatically expand and shrink based on the actual number of stored elements
arr = []

for i in range(10):
    # append elements to the end, time complexity O(1)
    arr.append(i)

# insert elements in the middle, time complexity O(N)
# insert element 666 at index 2
arr.insert(2, 666)

# insert elements at the beginning, time complexity O(N)
arr.insert(0, -1)

# delete the last element, time complexity O(1)
arr.pop()

# delete elements in the middle, time complexity O(N)
# delete the element at index 2
arr.pop(2)

# query elements by index, time complexity O(1)
a = arr[0]

# modify elements by index, time complexity O(1)
arr[0] = 100

# find index by element value, time complexity O(N)
index = arr.index(666)

import (
	"fmt"
)

func main() {
	// create a dynamic array
	// no need to explicitly specify array size, it will auto
// expand/contract based on the actual number of elements
	arr := make([]int, 0)

	for i := 0; i < 10; i++ {
		// append elements to the end, time complexity O(1)
		arr = append(arr, i)
	}

	// insert elements in the middle, time complexity O(N)
	// insert element 666 at index 2
	arr = append(arr[:2], append([]int{666}, arr[2:]...)...)

	// insert elements at the beginning, time complexity O(N)
	arr = append([]int{-1}, arr...)

	// delete the last element, time complexity O(1)
	arr = arr[:len(arr)-1]

	// delete elements in the middle, time complexity O(N)
	// delete the element at index 2
	arr = append(arr[:2], arr[3:]...)

	// query element by index, time complexity O(1)
	a := arr[0]

	// modify element by index, time complexity O(1)
	arr[0] = 100

	// find index by element value, time complexity O(N)
	index := -1
	for i, v := range arr {
		if v == 666 {
			index = i
			break
		}
	}
}

javascript

// create a dynamic array
// No need to explicitly specify the array size, it will
// automatically resize based on the number of elements stored
var arr = [];

for (var i = 0; i < 10; i++) {
    // Append elements to the end, time complexity O(1)
    arr.push(i);
}

// Insert elements in the middle, time complexity O(N)
// Insert element 666 at index 2
arr.splice(2, 0, 666);

// Insert elements at the beginning, time complexity O(N)
arr.unshift(-1);

// Remove the last element, time complexity O(1)
arr.pop();

// Remove elements in the middle, time complexity O(N)
// Remove the element at index 2
arr.splice(2, 1);

// Query elements by index, time complexity O(1)
var a = arr[0];

// Modify elements by index, time complexity O(1)
arr[0] = 100;

// Find index by element value, time complexity O(N)
var index = arr.indexOf(666);

In the following sections, I will guide you step-by-step to implement a dynamic array, helping you gain a deeper understanding of the principles behind dynamic arrays.