Binary Tree in Action (Serialization)
This article will resolve
LeetCode | Difficulty |
---|---|
297. Serialize and Deserialize Binary Tree | 🔴 |
Prerequisites
Before reading this article, you need to understand:
This article is the third part following Binary Tree Principles (Guidelines). The previous article Binary Tree Principles (Construction) taught you techniques for constructing binary trees. In this article, we will increase the difficulty and cover both "serialization" and "deserialization" of binary trees.
To understand serialization and deserialization, we must first talk about the JSON data format.
JSON is widely used. For example, we often serialize structures in programming languages into JSON strings, store them in caches, or send them over the network to remote services. The consumer receives the JSON string and deserializes it back into the original data.
The purpose of serialization and deserialization is to organize data in a specific format so that it can be independent of the programming language.
Now, suppose we have a binary tree implemented in Java, and we want to store it in some way, then read and reconstruct the binary tree structure in C++. How can we do this? This requires the serialization and deserialization of the binary tree.
Zero, Pre/In/Post-order and the Uniqueness of Binary Trees
Before discussing specific problems, let's consider a question: What kind of serialized data can uniquely deserialize into a binary tree?
For example, if you are given the preorder traversal result of a binary tree, can you reconstruct the binary tree from it?
The answer is maybe yes, maybe no. It depends on whether the preorder traversal result includes information about null pointers. If it includes null pointers, then a unique binary tree can be determined; otherwise, it cannot.
For instance, if given a preorder traversal result [1,2,3,4,5]
without null pointers, both of the following binary trees satisfy this preorder traversal:

Therefore, without null pointer information in the preorder traversal, a unique binary tree cannot be reconstructed.
However, if the preorder traversal includes null pointer information, a unique binary tree can be reconstructed. For example, using #
to represent a null pointer, the preorder traversal of the left binary tree is [1,2,3,#,#,4,#,#,5,#,#]
, and the right one is [1,2,#,3,#,#,4,5,#,#,#]
, which differentiates them.
Some clever readers might think: binary tree fundamentals. While this is a good thought process, unfortunately, the correct answer is that even if you include null pointer information, only preorder and postorder traversal results can uniquely reconstruct a binary tree, while inorder traversal cannot.
This will be discussed in detail later, but briefly, it's because preorder/postorder results can determine the position of the root node, whereas inorder results cannot.
For instance, the following two binary trees obviously have different structures, yet their inorder traversal results are both [#,1,#,1,#]
, which cannot differentiate them:

To sum up, when the values of nodes in a binary tree are unique:
If your serialization result does not include null pointer information, and you provide only one traversal order, you cannot reconstruct a unique binary tree.
If your serialization result does not include null pointer information, and you provide two traversal orders, according to the previous Binary Tree Fundamentals (Construction):
2.1. If you provide preorder and inorder, or postorder and inorder, you can reconstruct a unique binary tree.
2.2. If you provide preorder and postorder, you cannot reconstruct a unique binary tree.
If your serialization result includes null pointer information, and you provide only one traversal order, there are two cases:
3.1. If you provide preorder or postorder, you can reconstruct a unique binary tree.
3.2. If you provide inorder, you cannot reconstruct a unique binary tree.
I mention these summaries at the beginning for conceptual understanding, which can be revisited for deeper insights when encountering relevant problems in the future. Now, let's look at specific problems.
One, Problem Description
LeetCode Problem 297 "Serialize and Deserialize Binary Tree" provides the root node root
of a binary tree and requires you to implement the following class:
public class Codec {
// serialize a binary tree into a string
public String serialize(TreeNode root) {}
// deserialize a string into a binary tree
public TreeNode deserialize(String data) {}
}
class Codec {
public:
// serialize a binary tree into a string
string serialize(TreeNode* root);
// deserialize a string into a binary tree
TreeNode* deserialize(string data);
};
class Codec:
# serialize a binary tree into a string
def serialize(self, root: TreeNode) -> str:
pass
# deserialize a string into a binary tree
def deserialize(self, data: str) -> TreeNode:
pass
type Codec struct{}
// serialize a binary tree into a string
func (codec *Codec) serialize(root *TreeNode) string {}
// deserialize a string into a binary tree
func (codec *Codec) deserialize(data string) *TreeNode {}
var Codec = function () {
// serialize a binary tree into a string
this.serialize = function (root) {}
// deserialize a string into a binary tree
this.deserialize = function (data) {}
};
We can use the serialize
method to convert a binary tree into a string and the deserialize
method to convert the serialized string back into a binary tree. The format for serialization and deserialization is entirely up to you.
For example, given a binary tree like the one below:

The serialize
method might convert it into the string 2,1,#,6,#,#,3,#,#
, where #
represents a null
pointer. Then, by passing this string into the deserialize
method, the original binary tree can be reconstructed.
In other words, these two methods should be used together, and you only need to ensure that they are consistent with each other.
Imagine a binary tree as a structure within a two-dimensional plane, while the serialized string is a linear one-dimensional structure. Serialization essentially means "flattening" structured data, fundamentally examining the traversal methods of a binary tree.
What are the traversal methods for a binary tree? Recursive traversal methods include pre-order, in-order, and post-order traversal; the iterative method is generally level-order traversal. This article will explore each of these methods to implement the serialize
and deserialize
methods.
2. Pre-order Traversal Solution
In the previous article Basics of Binary Tree Traversal, we discussed several traversal methods for binary trees. The framework for pre-order traversal is as follows:
void traverse(TreeNode root) {
if (root == null) return;
// code in the pre-order position
traverse(root.left);
traverse(root.right);
}
void traverse(TreeNode* root) {
if (root == nullptr) return;
// code at pre-order position
traverse(root->left);
traverse(root->right);
}
def traverse(root):
if root is None:
return
# code at the pre-order position
traverse(root.left)
traverse(root.right)
func traverse(root *TreeNode) {
if root == nil {
return
}
// code at the pre-order position
traverse(root.left)
traverse(root.right)
}
var traverse = function(root) {
if (root == null) {
return;
}
// code in the pre-order position
traverse(root.left);
traverse(root.right);
};
It's really simple. The code written before recursively traversing the two subtrees is the pre-order traversal code. So, take a look at the following pseudocode:
LinkedList<Integer> res;
void traverse(TreeNode root) {
if (root == null) {
// temporarily use the number -1 to represent the null pointer
res.addLast(-1);
return;
}
// ****** pre-order position ********
res.addLast(root.val);
// ***********************
traverse(root.left);
traverse(root.right);
}
list<int> res;
void traverse(TreeNode* root) {
if (root == nullptr) {
// temporarily use the number -1 to represent the null pointer
res.push_back(-1);
return;
}
// ****** pre-order position ********
res.push_back(root->val);
// ***********************
traverse(root->left);
traverse(root->right);
}
res = []
def traverse(root):
if root is None:
# temporarily use the number -1 to represent the null pointer
res.append(-1)
return
# ****** pre-order position ******
res.append(root.val)
# **********************
traverse(root.left)
traverse(root.right)
var res *list.List
func traverse(root *TreeNode) {
if root == nil {
// temporarily use the number -1 to represent a null pointer
res.PushBack(-1)
return
}
// ****** pre-order position ********
res.PushBack(root.Val)
// ***********************
traverse(root.Left)
traverse(root.Right)
}
var res;
var traverse = function(root) {
if (root === null) {
// temporarily use the number -1 to represent the null pointer
res.push(-1);
return;
}
// ****** pre-order position ********
res.push(root.val);
// ***********************
traverse(root.left);
traverse(root.right);
}
After calling the traverse
function, can you immediately determine the order of elements in the res
list? For example, consider the following binary tree (where #
represents a null pointer). The pre-order traversal's actions are visually apparent:

Thus, res = [1,2,-1,4,-1,-1,3,-1,-1]
, which flattens the binary tree into a list, where -1 represents null.
Flattening the binary tree into a string follows the same logic:
// character representing separator
String SEP = ",";
// character representing null pointer
String NULL = "#";
// for concatenating strings
StringBuilder sb = new StringBuilder();
// flatten the binary tree to a string
void traverse(TreeNode root, StringBuilder sb) {
if (root == null) {
sb.append(NULL).append(SEP);
return;
}
// ***** pre-order position *****
sb.append(root.val).append(SEP);
// *********************
traverse(root.left, sb);
traverse(root.right, sb);
}
// represents the separator character
string SEP = ",";
// represents the null pointer character
string NULL_CHAR = "#";
// used for concatenating strings
std::ostringstream os;
// flatten the binary tree into a string
void traverse(TreeNode* root, std::ostringstream& os) {
if (root == nullptr) {
os << NULL_CHAR << SEP;
return;
}
// ***** pre-order position *****
os << root->val << SEP;
// *******************
traverse(root->left, os);
traverse(root->right, os);
}
# represents the separator character
SEP = ","
# represents the null pointer character
NULL = "#"
# used for concatenating strings
sb = []
# flatten the binary tree into a string
def traverse(root, sb):
if root == None:
sb.append(NULL).append(SEP)
return
# ***** pre-order position *****
sb.append(str(root.val)).append(SEP)
# *******************
traverse(root.left, sb)
traverse(root.right, sb)
const (
// character representing the separator
SEP = ","
// character representing null pointer
NULL = "#"
)
func traverse(root *TreeNode, sb *strings.Builder) {
if root == nil {
sb.WriteString(NULL + SEP)
return
}
// ***** pre-order position *****
sb.WriteString(strconv.Itoa(root.Val) + SEP)
// *******************
traverse(root.Left, sb)
traverse(root.Right, sb)
}
// character representing the delimiter
var SEP = ",";
// character representing null pointer
var NULL = "#";
// used for concatenating strings
var sb = "";
// flatten the binary tree into a string
var traverse = function(root) {
if (root == null) {
sb += NULL + SEP;
return;
}
// ***** pre-order position *****
sb += root.val + SEP;
// *******************
traverse(root.left);
traverse(root.right);
}
StringBuilder
can be used for efficient string concatenation, so you can also think of it as a list. Use ,
as a separator and #
to represent a null pointer. After calling the traverse
function, the string in sb
should be 1,2,#,4,#,#,3,#,#,
.
Now, we can write the code for the serialization function serialize
:
class Codec {
String SEP = ",";
String NULL = "#";
// main function, serialize the binary tree to a string
public String serialize(TreeNode root) {
StringBuilder sb = new StringBuilder();
_serialize(root, sb);
return sb.toString();
}
// helper function, store the binary tree into StringBuilder
void _serialize(TreeNode root, StringBuilder sb) {
if (root == null) {
sb.append(NULL).append(SEP);
return;
}
// ****** pre-order position ********
sb.append(root.val).append(SEP);
// ***********************
_serialize(root.left, sb);
_serialize(root.right, sb);
}
}
class Codec {
public:
string SEP = ",";
string NULLSYM = "#";
string serialize(TreeNode* root) {
string sb;
_serialize(root, sb);
return sb;
}
// Helper function to serialize the binary tree into the StringBuilder
void _serialize(TreeNode* root, string& sb) {
if (root == NULL) {
sb.append(NULLSYM).append(SEP);
return;
}
// ****** Pre-order position ********
sb.append(to_string(root->val)).append(SEP);
// ************************
_serialize(root->left, sb);
_serialize(root->right, sb);
}
};
class Codec:
SEP = ","
NULL = "#"
# Main function, serialize the binary tree into a string
def serialize(self, root):
sb = []
self._serialize(root, sb)
return "".join(sb)
# Helper function, store the binary tree into StringBuilder
def _serialize(self, root, sb):
if root is None:
sb.append(self.NULL)
sb.append(self.SEP)
return
# ****** Preorder position ********
sb.append(str(root.val))
sb.append(self.SEP)
# ***********************
self._serialize(root.left, sb)
self._serialize(root.right, sb)
type Codec struct {
SEP string
NULL string
}
func Constructor() Codec {
return Codec{
SEP: ",",
NULL: "#",
}
}
// Main function, serialize the binary tree to a string
func (this Codec) serialize(root *TreeNode) string {
var sb strings.Builder
this._serialize(root, &sb)
return sb.String()
}
// Helper function, store the binary tree into StringBuilder
func (this Codec) _serialize(root *TreeNode, sb *strings.Builder) {
if root == nil {
sb.WriteString(this.NULL)
sb.WriteString(this.SEP)
return
}
// ****** Preorder position ********
sb.WriteString(strconv.Itoa(root.Val))
sb.WriteString(this.SEP)
// ***********************
this._serialize(root.Left, sb)
this._serialize(root.Right, sb)
}
const SEP = ",";
const NULL = "#";
var serialize = function(root) {
var sb = [];
_serialize(root, sb);
return sb.join('');
};
var _serialize = function(root, sb){
if(root === null) {
sb.push(NULL);
sb.push(SEP);
return;
}
// ****** Preorder position ********
sb.push(root.val);
sb.push(SEP);
// ***********************
_serialize(root.left, sb);
_serialize(root.right, sb);
};
现在,思考一下如何写 deserialize
函数,将字符串反过来构造二叉树。
首先我们可以把字符串转化成列表:
String data = "1,2,#,4,#,#,3,#,#,";
String[] nodes = data.split(",");
string data = "1,2,#,4,#,#,3,#,#,";
vector<string> nodes;
stringstream ss(data);
string item;
while(getline(ss, item, ',')) {
nodes.push_back(item);
}
data = "1,2,#,4,#,#,3,#,#,"
nodes = data.split(",")
data := "1,2,#,4,#,#,3,#,#,"
nodes := strings.Split(data, ",")
var data = "1,2,#,4,#,#,3,#,#,";
var nodes = data.split(",");
Thus, the nodes
list represents the pre-order traversal of a binary tree. The problem now is: how to reconstruct a binary tree from its pre-order traversal results?
Tip
As mentioned in the previous article Binary Tree Mastery (Construction), typically, you need at least two types of traversal results (pre-order, in-order, or post-order) to reconstruct a binary tree. This is because the traversal results in the previous article did not include information about null pointers. However, the nodes
list here includes null pointer information, allowing us to reconstruct the binary tree using only the nodes
list.
Based on our analysis, the nodes
list represents a flattened binary tree:

Therefore, the deserialization process is similar: first determine the root node root
, and then follow the pre-order traversal rules to recursively generate the left and right subtrees: