Graph Structure DFS/BFS Traversal
Prerequisite Knowledge
Before reading this article, you should first study:
Summary in One Sentence
Graph traversal is an extension of Multi-way Tree Traversal, primarily using Depth-First Search (DFS) and Breadth-First Search (BFS).
The key difference is that tree structures do not have cycles, while graph structures might. Therefore, we need to mark nodes that have been visited to avoid infinite loops within cycles.
The traversal of "nodes" and "paths" in graphs differs slightly. When traversing "nodes," a visited
array is used to mark nodes in the pre-order position. When traversing all "paths" in a graph, an onPath
array is used to mark nodes in the pre-order position and unmark them in the post-order position.
The Visualization Panel supports creating graph structures and visualizing the paths of DFS/BFS traversals. You can see that although graph structures appear more complex than tree structures, graph traversal is essentially still tree traversal.
Below, I will explain the above summary in detail with code and the visualization panel.
Depth-First Search (DFS)
In the previous article Graph Structure Basics and General Implementation, we mentioned that we generally do not use classes like Vertex
to store graphs. However, I will use this class here to help you compare graph traversal with multi-way tree traversal. Later, I will provide traversal code based on adjacency lists/adjacency matrices.
Traversing All Nodes (visited
Array)
Let's compare the framework of multi-way tree traversal with that of graph traversal:
// N-ary tree node
class Node {
int val;
List<Node> children;
}
// Traversal framework for N-ary tree
void traverse(Node root) {
// base case
if (root == null) {
return;
}
// Pre-order position
System.out.println("visit " + root.val);
for (Node child : root.children) {
traverse(child);
}
// Post-order position
}
// Graph node
class Vertex {
int id;
Vertex[] neighbors;
}
// Traversal framework for graph
// Need a visited array to record visited nodes
// Avoid going back to prevent infinite loops
void traverse(Vertex s, boolean[] visited) {
// base case
if (s == null) {
return;
}
if (visited[s.id]) {
// Prevent infinite loops
return;
}
// Pre-order position
visited[s.id] = true;
System.out.println("visit " + s.id);
for (Vertex neighbor : s.neighbors) {
traverse(neighbor, visited);
}
// Post-order position
}
// N-ary tree node
class Node {
public:
int val;
std::vector<Node*> children;
};
// traversal framework for N-ary tree
void traverse(Node* root) {
// base case
if (root == nullptr) {
return;
}
// pre-order position
std::cout << "visit " << root->val << std::endl;
for (auto child : root->children) {
traverse(child);
}
// post-order position
}
// graph vertex
class Vertex {
public:
int id;
std::vector<Vertex*> neighbors;
};
// traversal framework for graph
// a visited array is needed to record the visited vertices
// avoid going back and falling into an infinite loop
void traverse(Vertex* s, std::vector<bool>& visited) {
// base case
if (s == nullptr) {
return;
}
if (visited[s->id]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s->id] = true;
std::cout << "visit " << s->id << std::endl;
for (auto neighbor : s->neighbors) {
traverse(neighbor);
}
// post-order position
}
# N-ary tree node
class Node:
def __init__(self, val=0, children=None):
self.val = val
self.children = children if children is not None else []
# Traversal framework for N-ary tree
def traverse(root):
# base case
if root is None:
return
# pre-order position
print(f"visit {root.val}")
for child in root.children:
traverse(child)
# post-order position
# Graph node
class Vertex:
def __init__(self, id=0, neighbors=None):
self.id = id
self.neighbors = neighbors if neighbors is not None else []
# Traversal framework for graph
# Need a visited array to record visited nodes
# Avoid going backwards and falling into infinite loop
def traverse_graph(s, visited):
# base case
if s is None:
return
if visited.get(s.id, False):
# Prevent infinite loop
return
# pre-order position
visited[s.id] = True
print(f"visit {s.id}")
for neighbor in s.neighbors:
traverse_graph(neighbor)
# post-order position
// Node multi-way tree node
type Node struct {
val int
children []*Node
}
// Framework for traversing a multi-way tree
func traverse(root *Node) {
// base case
if root == nil {
return
}
// Pre-order position
fmt.Println("visit", root.val)
for _, child := range root.children {
traverse(child)
}
// Post-order position
}
// Vertex graph node
type Vertex struct {
id int
neighbors []*Vertex
}
// Framework for graph traversal
// Need a visited array to keep track of visited nodes
// Avoid going back and falling into an infinite loop
func traverse(s *Vertex, visited []bool) {
// base case
if s == nil {
return
}
if visited[s.id] {
// Prevent infinite loop
return
}
// Pre-order position
visited[s.id] = true
fmt.Println("visit", s.id)
for _, neighbor := range s.neighbors {
traverse(neighbor, visited)
}
// Post-order position
}
// N-ary tree node
class Node {
constructor(val) {
this.val = val;
this.children = [];
}
}
// N-ary tree traversal framework
var traverseTree = function(root) {
// base case
if (root === null) {
return;
}
// pre-order position
console.log("visit " + root.val);
for (var i = 0; i < root.children.length; i++) {
traverseTree(root.children[i]);
}
// post-order position
};
// graph node
class Vertex {
constructor(id) {
this.id = id;
this.neighbors = [];
}
}
// graph traversal framework
// need a visited array to record visited nodes
// avoid backtracking into a loop
var traverseGraph = function(s, visited) {
// base case
if (s === null) {
return;
}
if (visited[s.id]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s.id] = true;
console.log("visit " + s.id);
for (var i = 0; i < s.neighbors.length; i++) {
traverseGraph(s.neighbors[i]);
}
// post-order position
};
As you can see, graph traversal requires an additional visited
array compared to multi-way tree traversal. This array is used to record nodes that have been visited to avoid infinite loops when encountering cycles.
Why Cycles Cause Infinite Loops
Let’s look at the simplest cycle scenario: there is an edge 1 -> 2
and another edge 2 -> 1
, forming a cycle between nodes 1
and 2
:
1 <=> 2
If we don't mark visited nodes, starting from 1
, we will move to 2
, then back to 1
, then to 2
again, and so on, resulting in an infinite loop 1->2->1->2->...
.
With a visited
array, the first time we visit 1
, we mark it as visited. When we encounter 1->2->1
, we see that 1
has already been visited and stop the recursion, thus avoiding the infinite loop.
With this understanding, we can write the graph traversal code based on an adjacency list or adjacency matrix. Although the underlying storage of adjacency lists and matrices differs, they provide a unified API. So, we can directly use the methods from the Graph
interface in Graph Basics and General Implementation:
// traverse all nodes of the graph
void traverse(Graph graph, int s, boolean[] visited) {
// base case
if (s < 0 || s >= graph.size()) {
return;
}
if (visited[s]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s] = true;
System.out.println("visit " + s);
for (Edge e : graph.neighbors(s)) {
traverse(graph, e.to, visited);
}
// post-order position
}
// traverse all nodes of the graph
void traverse(const Graph& graph, int s, std::vector<bool>& visited) {
// base case
if (s < 0 || s >= graph.size()) {
return;
}
if (visited[s]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s] = true;
std::cout << "visit " << s << std::endl;
for (const Graph::Edge& e : graph.neighbors(s)) {
traverse(graph, e.to, visited);
}
// post-order position
}
# traverse all nodes of the graph
def traverse(graph, s, visited):
# base case
if s < 0 or s >= len(graph):
return
if visited[s]:
# prevent infinite loop
return
# pre-order position
visited[s] = True
print("visit", s)
for e in graph.neighbors(s):
traverse(graph, e.to, visited)
# post-order position
// traverse all nodes of the graph
func traverse(graph Graph, s int, visited []bool) {
// base case
if s < 0 || s >= len(graph) {
return
}
if visited[s] {
// prevent infinite loop
return
}
// pre-order position
visited[s] = true
fmt.Println("visit", s)
for _, e := range graph.neighbors(s) {
traverse(graph, e.to, visited)
}
// post-order position
}
// traverse all nodes of the graph
var traverse = function(graph, s, visited) {
// base case
if (s < 0 || s >= graph.size()) {
return;
}
if (visited[s]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s] = true;
console.log("visit " + s);
for (var e of graph.neighbors(s)) {
traverse(graph, e.to, visited);
}
// post-order position
};
Due to the pruning effect of the visited
array, this traversal function will visit all nodes in the graph once and attempt to traverse all edges once. Therefore, the algorithm's time complexity is , where E
is the total number of edges, and V
is the total number of nodes.
Why is the time complexity ?
Previously, when we discussed Binary Tree Traversal, we stated that the time complexity of a binary tree traversal function is , where is the total number of nodes.
Since a graph structure is an extension of a tree structure, why is the time complexity of graph traversal , including the number of edges , instead of ?
This is an excellent question. Take a couple of minutes to think about it. I've written the answer below.
Click to view the answer
In fact, for binary trees/multi-way trees, the traversal function should also account for the number of edges. However, for tree structures, the number of edges and the number of nodes are approximately equal, so the time complexity remains .
In a tree structure, edges only connect parent nodes to child nodes. Thus, except for the root node, you can pair each node with the edge coming from its parent node, making it evident that the number of edges and nodes are approximately equal.
In contrast, in a graph structure, any two nodes can be connected by an edge, so the number of edges and nodes do not have a specific relationship. Therefore, we state that the time complexity of graph traversal is .
Traversing All Paths (Using the onPath
Array)
For a tree structure, there is no difference between traversing all "paths" and traversing all "nodes." However, for a graph structure, there is a slight distinction between traversing all "paths" and traversing all "nodes."
In a tree structure, each node can only be reached from its parent node, making the path from the root node root
to any node targetNode
unique. In other words, by traversing all nodes of the tree structure once, you can inevitably find the unique path from root
to targetNode
: