Vue double-ended diff algorithm

diff algorithm


We know that the complexity of diffing two trees is O(n^3), because each node has to be compared with all the nodes of the other tree, which is n. If a node with change is found, Performing insertion, deletion, modification is also n complexity. This is true for all nodes, multiplied by n, so O(n * n * n) complexity.

Such complexity is unacceptable for front-end frameworks, which means that for 1000 nodes, it takes 1000 * 1000 * 1000 to render once, a total of 1 billion times.
Therefore, the diff of the front-end framework stipulates two processing principles: only the comparison of the same layer is performed, and the sub-nodes are no longer compared when the type changes.
Because it is rare for dom nodes to move across levels, in general, it is the addition, deletion and modification of dom at the same level.
In this way, you only need to traverse once and compare the type, which is O(n) complexity, and if the type changes, the child nodes will no longer be compared, which can save a large number of node traversal. In addition, because the associated dom nodes are recorded in vdom, the addition, deletion and modification of dom does not need to be traversed, which is O(1), and the overall diff algorithm complexity is O(n) complexity.

1000 nodes are rendered at most 1000 times at a time, and this complexity is an acceptable range.
However, although the complexity of such an algorithm is low, there are still problems.
For example, a group of nodes, suppose there are 5 nodes, the type is ABCDE, and the next rendering is EABCD. At this time, compare them one by one and find that the types are different, and the 5 nodes will be re-rendered.
And according to the principle of different types, the child nodes are no longer compared. If these nodes have child nodes, they will also be re-rendered.
The dom operation is relatively slow, so although the algorithm complexity of diff is low, the performance of re-rendering is not high.
Therefore, in addition to considering its own time complexity, the diff algorithm also considers a factor: the number of dom operations.
In the above example, ABCDE becomes EABCD, which obviously just needs to move the E a bit, without creating a new element at all.
But how to compare that the same node has moved?
Determine the type? That doesn't work. There may be many nodes of the same type, which cannot be distinguished.
Preferably each node is uniquely identified.
Therefore, when rendering a group of nodes, the front-end framework will ask the developer to specify the key, and use the key to determine whether some nodes have just moved, so that they can be reused directly.
In this way, when the diff algorithm processes the comparison of a group of nodes, it is necessary to optimize the algorithm again according to the key.
We will call the key-based diff algorithm of two sets of nodes a multi-node diff algorithm, which is part of the entire vdom diff algorithm.
Next, let's learn about the multi-node diff algorithm:

simple diff


Assuming that a set of nodes in ABCD is rendered, and DCAB is rendered again, what should be done at this time?
The purpose of the multi-node diff algorithm is to reuse nodes as much as possible, instead of creating them by moving nodes.
Therefore, for each node of the new vnode array, we have to find out whether there is a corresponding key in the old vnode array. If there is, move it to the new position, and if not, create a new one.
That's it:
const oldChildren = n1.children
const newChildren = n2.children

let lastIndex = 0
// loop over new children
for (let i = 0; i < newChildren.length; i++) {
const newVNode = newChildren[i]
let j = 0
let find = false
// loop over old children
for (j; j < oldChildren.length; j++) {
const oldVNode = oldChildren[j]
// If two nodes with the same key value are found, call the patch function to update
if (newVNode.key === oldVNode.key) {
find = true
patch(oldVNode, newVNode, container)

Handling mobile...

break //Break out of the loop and process the next node
}
}
// If it is not found, it is added
if (!find) {
const prevVNode = newChildren[i - 1]
let anchor = null
if (prevVNode) {
anchor = prevVNode.el.nextSibling
} else {
anchor = container.firstChild
}
patch(null, newVNode, container, anchor)
}
}

The role of the patch function here is to update the properties of the node and reset the event listener. If there is no corresponding old node, the node is inserted, and a node after it needs to be passed in as the anchor point anchor.
We iterate over the new vnodes:
First, find the corresponding node from the old vnode array. If you find it, it means that it can be reused, and then you just need to move it.
If not found, insert is performed, the anchor is the nextSibling of the previous node.

Then if a reusable node is found, where should it be moved?
In fact, the order of records in the new vnode array is the order of the targets. So just move the corresponding nodes in the order of the new vnode array.
const prevVNode = newChildren[i - 1]
if (prevVNode) {
const anchor = prevVNode.el.nextSibling
insert(newVNode.el, container, anchor)
}

To insert at the position i, then take the nextSibling of the node at position i-1 as the anchor point to insert the current node.

But not all nodes need to be moved. For example, when the second new vnode is processed, it is found that its subscript in the old vnode array is 4, which means that it is originally in the back, so there is no need to move. Conversely, if the corresponding old vnode found by the vnode needs to be moved before the current index.
That's it:
let j = 0
let find = false
// loop over old children
for (j; j < oldChildren.length; j++) {
const oldVNode = oldChildren[j]
// If two nodes with the same key value are found, call the patch function to update them
if (newVNode.key === oldVNode.key) {
find = true
patch(oldVNode, newVNode, container)

if (j < lastIndex) { // The index of the old vnode array is before the previous index and needs to be moved
const prevVNode = newChildren[i - 1]
if (prevVNode) {
const anchor = prevVNode.el.nextSibling
insert(newVNode.el, container, anchor)
}
} else {// no need to move
// update lastIndex
lastIndex = j
}
break
}
}

Find the subscript of the new vnode in the old vnode array. If found, it means that the corresponding dom can be reused. Patch it first, and then move it.
If you move, judge whether the subscript is after the lastIndex. If it is already behind, then you don't need to move, just update the lastIndex.
If the subscript is before lastIndex, it means that it needs to be moved, and the moved position has been analyzed before, which is the back of the new vnode array i-1.
In this way, we have completed the reuse and movement of dom nodes.
After the new vnode array is all processed, there may be some left over from the old vnode array that are no longer needed, so delete them:
// Traverse old nodes
for (let i = 0; i < oldChildren.length; i++) {
const oldVNode = oldChildren[i]
// Take the old VNode and go to the new children to find the same node
const has = newChildren.find(
vnode => vnode.key === oldVNode.key
)
if (!has) {
// if no identical node is found, remove
unmount(oldVNode)
}
}

In this way, we have completed the diff of the two groups of vnodes and the addition, deletion and modification of the corresponding dom.
To summarize:
The purpose of the diff algorithm is to reuse dom nodes based on keys, reducing dom operations by moving nodes instead of creating new ones.
For each new vnode, search according to the key in the old vnode. If it is not found, add a dom node. If it is found, it can be reused.
If it is reused, it is necessary to judge whether to move the subscript. If the subscript is after the lastIndex, there is no need to move it, because it is originally behind, otherwise it needs to be moved.
Finally, the nodes in the old vnode that are not in the new vnode are removed from the dom tree.
This is the implementation of a complete diff algorithm.

This diff algorithm is processed one by one from one end, which is called the simple diff algorithm.
In fact, the performance of the simple diff algorithm is not the best. For example, the old vnode array is ABCD, and the new vnode array is DABC. According to the simple diff algorithm, A, B, and C all need to be moved.
So how to optimize this algorithm?
There will be this problem in sequential processing from one direction. What about comparing from two directions at the same time?
This is the double-ended diff algorithm:

double-ended diff


Simple diff algorithm can realize the reuse of dom nodes, but sometimes it will make some unnecessary moves. The double-ended diff algorithm solves this problem, it compares from both ends.
We need 4 pointers to the head and tail of the new and old vnode arrays:

The head and tail pointers move to the middle until oldStartIdx <= oldEndIdx and newStartIdx <= newEndIdx, indicating that all nodes have been processed.
Each time the nodes pointed to by the two head pointers, the nodes pointed to by the two tail pointers, and the nodes pointed to by the head and the tail are compared, whether the keys are the same, that is, reusable.
If it is reusable, use it directly, call patch to update it, and if it is the head and tail, move the position.
That's it:
while (oldStartIdx <= oldEndIdx && newStartIdx <= newEndIdx) {
if (oldStartVNode.key === newStartVNode.key) { // header
patch(oldStartVNode, newStartVNode, container)
oldStartVNode = oldChildren[++oldStartIdx]
newStartVNode = newChildren[++newStartIdx]
} else if (oldEndVNode.key === newEndVNode.key) {//tail
patch(oldEndVNode, newEndVNode, container)
oldEndVNode = oldChildren[--oldEndIdx]
newEndVNode = newChildren[--newEndIdx]
} else if (oldStartVNode.key === newEndVNode.key) {//Head and tail, need to move
patch(oldStartVNode, newEndVNode, container)
insert(oldStartVNode.el, container, oldEndVNode.el.nextSibling)

oldStartVNode = oldChildren[++oldStartIdx]
newEndVNode = newChildren[--newEndIdx]
} else if (oldEndVNode.key === newStartVNode.key) {//The end needs to be moved
patch(oldEndVNode, newStartVNode, container)
insert(oldEndVNode.el, container, oldStartVNode.el)

oldEndVNode = oldChildren[--oldEndIdx]
newStartVNode = newChildren[++newStartIdx]
} else {

// No reusable node found at head and tail
}
}

The comparison between head and tail is relatively simple, and the comparison between head and tail needs to move the next node.
For example, if the head node of the old vnode is the tail node of the new vnode, then move it to the position of the tail node of the old vnode.
That is:
insert(oldStartVNode.el, container, oldEndVNode.el.nextSibling)

The anchor node of the inserted node is the nextSibling of the dom node corresponding to oldEndVNode.
If the tail node of the old vnode is the head node of the new vnode, move it to the position of the head node of the old vnode.
That is:
insert(oldEndVNode.el, container, oldStartVNode.el)

The anchor node of the inserted node is the dom node corresponding to oldStartVNode (because it is inserted before it).
Comparing from both ends can reduce the number of node moves as much as possible.
Of course, it is also necessary to deal with the case where there are no reusable nodes at both ends:
If there is no reusable node at both ends, then look for it in the old node array, move it over if it is found, and set the original position to undefined. If not found, insert a new node.
That's it:
const idxInOld = oldChildren.findIndex(
node => node.key === newStartVNode.key
)
if (idxInOld > 0) {
const vnodeToMove = oldChildren[idxInOld]
patch(vnodeToMove, newStartVNode, container)
insert(vnodeToMove.el, container, oldStartVNode.el)
oldChildren[idxInOld] = undefined
} else {
patch(null, newStartVNode, container, oldStartVNode.el)
}

Because there are some undefined nodes, we need to add the processing logic of empty nodes:
if (!oldStartVNode) {
oldStartVNode = oldChildren[++oldStartIdx]
} else if (!oldEndVNode) {
oldEndVNode = newChildren[--oldEndIdx]
}

In this way, the logic of multiplexing and moving of nodes is completed.
What about those nodes that really don't have reusable nodes?
After the previous move, the remaining nodes are moved to the middle. If the new vnode has remaining, it will be added in batches, and if the old vnode has remaining, it will be deleted in batches.
Because the judgment condition of the previous loop is oldStartIdx <= oldEndIdx && newStartIdx <= newEndIdx, so if there are too many old vnodes, finally newStartIdx will be smaller than newEndIdx. If there are too many new vnodes, the last oldStartIdx will be less than oldEndIdx.
So the judgment condition is as follows:
if (oldEndIdx < oldStartIdx && newStartIdx <= newEndIdx) {
// add new node
for (let i = newStartIdx; i <= newEndIdx; i++) {
patch(null, newChildren[i], container, oldStartVNode.el)
}
} else if (newEndIdx < newStartIdx && oldStartIdx <= oldEndIdx) {
// remove operation
for (let i = oldStartIdx; i <= oldEndIdx; i++) {
unmount(oldChildren[i])
}
}

This is a complete diff algorithm, including finding reusable nodes and moving nodes, adding and deleting nodes.
And because it looks for nodes from both sides, it will perform better than the simple diff algorithm.
For example, from ABCD to DABC, the simple diff algorithm needs to move ABC three nodes, while the double-ended diff algorithm only needs to move D one node.
To summarize:
Double-ended diff is when the head and tail pointers move to the middle, and compare whether the head, tail, head and tail can be reused, and if so, move the corresponding dom node.
If the reusable node is not found at the head and tail, it traverses the vnode array to find it, and then moves the node corresponding to the subscript to the head.
Finally, if there are still old vnodes left, they will be deleted in batches, and the remaining new vnodes will be added in batches.

The double-ended diff algorithm is the diff algorithm adopted by Vue2, and its performance is not bad.
Later, Vue3 upgraded the diff algorithm again, called the fast diff algorithm. More on this later.
Summarize
React and Vue are both front-end frameworks based on vdom. The component generates vdom, and the renderer updates the vdom to the dom through the added, deleted, and modified dom api.
When the vdom is rendered again, the new and old vdom trees are diffed, and only the changed dom nodes are updated.
The diff of the two trees is O(n^3), and the time complexity is too high. Therefore, the front-end framework stipulates that only the diff of the same layer is performed. If the type is different, the nodes are considered different, and the child nodes are no longer compared. In this way, the time complexity is suddenly reduced to O(n).
However, for the diff of multiple sub-byte points, it cannot be rudely deleted and added. It is necessary to reuse the existing nodes as much as possible, that is, to replace the new addition by moving.
So when there are multiple nodes, you need to specify the key, and then the diff algorithm finds and reuses nodes according to the key.
The simple diff algorithm finds the old node according to the key in turn. If it is moved, it is judged by the lastIndex. If it is greater than it, it does not need to be moved, and if it is smaller than it, it needs to be moved. The remaining nodes are deleted and added in batches.
However, the limitations of the simple diff algorithm are still relatively large. In some cases, the performance is not good, so vue2 uses the double-ended diff algorithm.
The double-ended diff algorithm moves the head and tail pointers to the middle, and judges whether the head and tail nodes can be reused. If no reusable node is found, traverse to find the subscript of the corresponding node, and then move. After all the processing is completed, the remaining nodes should be added and deleted in batches.
In fact, the most important thing for the diff algorithm is to find reusable nodes and move them to the correct position. It's just that the search order of different algorithms is different.
Vue2 uses the double-ended diff algorithm, while vue3 is further optimized by the longest increasing subsequence algorithm. We will talk about the optimized diff algorithm later.

Related Articles

Explore More Special Offers

  1. Short Message Service(SMS) & Mail Service

    50,000 email package starts as low as USD 1.99, 120 short messages start at only USD 1.00