Graph Structure DFS/BFS Traversal
Prerequisites
Before reading this article, you should first study:
In a Nutshell
Graph traversal is an extension of N-ary tree traversal, and the main traversal methods are Depth-First Search (DFS) and Breadth-First Search (BFS).
The only difference is that trees do not have cycles, while graphs may have cycles. Therefore, we need to mark visited nodes to avoid infinite loops caused by cycles during traversal.
There are differences between traversing all "nodes" and all "paths" in a graph. When traversing all "nodes", use a visited
array to mark nodes before recursion; when traversing all "paths", use an onPath
array to mark nodes before recursion and remove the mark after recursion.
The visualization panel supports creating graph structures and visualizing the traversal paths of DFS/BFS. You can clearly see that although graphs look more complex than trees, the essence of graph traversal is still tree traversal.
Let's look at the DFS algorithm first. You can open the visualization panel below and click the console.log
line multiple times to see the DFS traversal process. The traverse
function is essentially traversing a recursive N-ary tree:
Algorithm Visualization
Now let's look at the BFS algorithm. You can open the visualization panel below and click the console.log
line multiple times to see the BFS traversal process. Essentially, it is a level order traversal of an N-ary tree:
Algorithm Visualization
The details are explained below.
Depth-First Search (DFS)
In the previous article Basic Graph Structures and Common Implementations, we mentioned that we typically do not use a class like Vertex
to store graphs. However, for comparison with tree traversal, we will use this class here first. Later, we will provide traversal code based on adjacency lists and adjacency matrices.
Traversing All Nodes (visited
array)
Let's compare the traversal framework of a graph with that of a tree:
// N-ary tree node
class Node {
int val;
List<Node> children;
}
// Traversal framework for N-ary tree
void traverse(Node root) {
// base case
if (root == null) {
return;
}
// Pre-order position
System.out.println("visit " + root.val);
for (Node child : root.children) {
traverse(child);
}
// Post-order position
}
// Graph node
class Vertex {
int id;
Vertex[] neighbors;
}
// Traversal framework for graph
// Need a visited array to record visited nodes
// Avoid going back to prevent infinite loops
void traverse(Vertex s, boolean[] visited) {
// base case
if (s == null) {
return;
}
if (visited[s.id]) {
// Prevent infinite loops
return;
}
// Pre-order position
visited[s.id] = true;
System.out.println("visit " + s.id);
for (Vertex neighbor : s.neighbors) {
traverse(neighbor, visited);
}
// Post-order position
}
// N-ary tree node
class Node {
public:
int val;
std::vector<Node*> children;
};
// traversal framework for N-ary tree
void traverse(Node* root) {
// base case
if (root == nullptr) {
return;
}
// pre-order position
std::cout << "visit " << root->val << std::endl;
for (auto child : root->children) {
traverse(child);
}
// post-order position
}
// graph vertex
class Vertex {
public:
int id;
std::vector<Vertex*> neighbors;
};
// traversal framework for graph
// a visited array is needed to record the visited vertices
// avoid going back and falling into an infinite loop
void traverse(Vertex* s, std::vector<bool>& visited) {
// base case
if (s == nullptr) {
return;
}
if (visited[s->id]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s->id] = true;
std::cout << "visit " << s->id << std::endl;
for (auto neighbor : s->neighbors) {
traverse(neighbor);
}
// post-order position
}
# N-ary tree node
class Node:
def __init__(self, val=0, children=None):
self.val = val
self.children = children if children is not None else []
# Traversal framework for N-ary tree
def traverse(root):
# base case
if root is None:
return
# pre-order position
print(f"visit {root.val}")
for child in root.children:
traverse(child)
# post-order position
# Graph node
class Vertex:
def __init__(self, id=0, neighbors=None):
self.id = id
self.neighbors = neighbors if neighbors is not None else []
# Traversal framework for graph
# Need a visited array to record visited nodes
# Avoid going backwards and falling into infinite loop
def traverse_graph(s: Vertex, visited: List[bool]):
# base case
if s is None:
return
if visited[s.id]:
# Prevent infinite loop
return
# pre-order position
visited[s.id] = True
print(f"visit {s.id}")
for neighbor in s.neighbors:
traverse_graph(neighbor, visited)
# post-order position
// Node multi-way tree node
type Node struct {
val int
children []*Node
}
// Framework for traversing a multi-way tree
func traverse(root *Node) {
// base case
if root == nil {
return
}
// Pre-order position
fmt.Println("visit", root.val)
for _, child := range root.children {
traverse(child)
}
// Post-order position
}
// Vertex graph node
type Vertex struct {
id int
neighbors []*Vertex
}
// Framework for graph traversal
// Need a visited array to keep track of visited nodes
// Avoid going back and falling into an infinite loop
func traverse(s *Vertex, visited []bool) {
// base case
if s == nil {
return
}
if visited[s.id] {
// Prevent infinite loop
return
}
// Pre-order position
visited[s.id] = true
fmt.Println("visit", s.id)
for _, neighbor := range s.neighbors {
traverse(neighbor, visited)
}
// Post-order position
}
// N-ary tree node
class Node {
constructor(val) {
this.val = val;
this.children = [];
}
}
// N-ary tree traversal framework
var traverseTree = function(root) {
// base case
if (root === null) {
return;
}
// pre-order position
console.log("visit " + root.val);
for (var i = 0; i < root.children.length; i++) {
traverseTree(root.children[i]);
}
// post-order position
};
// graph node
class Vertex {
constructor(id) {
this.id = id;
this.neighbors = [];
}
}
// graph traversal framework
// need a visited array to record visited nodes
// avoid backtracking into a loop
var traverseGraph = function(s, visited) {
// base case
if (s === null) {
return;
}
if (visited[s.id]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s.id] = true;
console.log("visit " + s.id);
for (var i = 0; i < s.neighbors.length; i++) {
traverseGraph(s.neighbors[i]);
}
// post-order position
};
As you can see, graph traversal introduces an extra visited
array compared to tree traversal. This array is used to record nodes that have already been visited, preventing infinite loops when encountering cycles.
Why do cycles cause infinite loops?
Consider the simplest case of a cycle. Suppose there is an edge 1 -> 2
and another edge 2 -> 1
, then nodes 1
and 2
form a cycle:
1 <=> 2
If we do not mark visited nodes, starting from 1
will lead to 2
, then back to 1
, then to 2
again, and so on. The traversal would go 1->2->1->2->...
and recurse infinitely.
With the visited
array, the first time we visit 1
, we mark it as visited. When we encounter 1->2->1
, we see that 1
has already been visited, so we return immediately, stopping the recursion and avoiding the infinite loop.
With this understanding, we can write graph traversal code based on adjacency lists or adjacency matrices. Although their underlying storage is different, they provide a unified API, so we can directly use the Graph
interface from Basic Graph Structures and Common Implementations:
// traverse all nodes of the graph
void traverse(Graph graph, int s, boolean[] visited) {
// base case
if (s < 0 || s >= graph.size()) {
return;
}
if (visited[s]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s] = true;
System.out.println("visit " + s);
for (Edge e : graph.neighbors(s)) {
traverse(graph, e.to, visited);
}
// post-order position
}
// traverse all nodes of the graph
void traverse(const Graph& graph, int s, std::vector<bool>& visited) {
// base case
if (s < 0 || s >= graph.size()) {
return;
}
if (visited[s]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s] = true;
std::cout << "visit " << s << std::endl;
for (const Graph::Edge& e : graph.neighbors(s)) {
traverse(graph, e.to, visited);
}
// post-order position
}
# traverse all nodes of the graph
def traverse(graph, s, visited):
# base case
if s < 0 or s >= len(graph):
return
if visited[s]:
# prevent infinite loop
return
# pre-order position
visited[s] = True
print("visit", s)
for e in graph.neighbors(s):
traverse(graph, e.to, visited)
# post-order position
// traverse all nodes of the graph
func traverse(graph Graph, s int, visited []bool) {
// base case
if s < 0 || s >= len(graph) {
return
}
if visited[s] {
// prevent infinite loop
return
}
// pre-order position
visited[s] = true
fmt.Println("visit", s)
for _, e := range graph.neighbors(s) {
traverse(graph, e.to, visited)
}
// post-order position
}
// traverse all nodes of the graph
var traverse = function(graph, s, visited) {
// base case
if (s < 0 || s >= graph.size()) {
return;
}
if (visited[s]) {
// prevent infinite loop
return;
}
// pre-order position
visited[s] = true;
console.log("visit " + s);
for (var e of graph.neighbors(s)) {
traverse(graph, e.to, visited);
}
// post-order position
};
You can open the visualization panel below and click the line containing console.log
multiple times to see the DFS traversal process on the graph:
Algorithm Visualization
Thanks to the pruning effect of the visited
array, this traversal function will visit every node in the graph once and attempt to traverse every edge once. Therefore, the time complexity of the algorithm is , where E
is the total number of edges and V
is the total number of nodes.
Why is the time complexity ?
Previously, when discussing Binary Tree Traversal, we said that the time complexity for traversing a binary tree is , where is the total number of nodes.
Since a graph is an extension of a tree, why is the time complexity for graph traversal , including the number of edges ? Why not just ?
This is a great question. Take a couple of minutes to think about it. The answer is below.
Click to see the answer
Actually, for binary trees and multi-way trees, the traversal function should also count the number of edges. However, in a tree structure, the number of edges and the number of nodes are approximately equal, so the time complexity remains .
In a tree, edges only go from parent to child. Except for the root node, each node can be paired with the edge from its parent, making it clear that the numbers of nodes and edges are similar.
In a graph, however, any two nodes can be connected by an edge. The number of edges and nodes are not necessarily related, so the time complexity for graph traversal is .
Traversing All Paths (onPath
array)
For tree structures, traversing all "paths" is essentially the same as traversing all "nodes." For graphs, however, traversing all "paths" is slightly different from traversing all "nodes."
In a tree, edges only go from parent to child, so there is only one unique path from the root root
to any target node targetNode
. In other words, after traversing all the nodes in a tree, you will always find the unique path from root
to targetNode
: