Structure and balancing
In a 2-3-4 tree, every node has capacity to store 1, 2, or 3 keys. Internal nodes with \(k\) keys have \(k+1\) children (2, 3, or 4) and we call them accordingly, 2-nodes, 3-nodes, and 4-nodes:
All leaves must have the same depth - they have to be at the same level. This is achieved by splitting nodes that overflow and joining or reorganizing nodes that underflow.
Consequently, a 2-3-4 tree of height \(h\) stores at least \(2^h-1\) keys (if all nodes are 2-nodes) and at most \((4^h-1)/3\) keys (if all nodes are 4-nodes). This implies height between \(\log_4(3n+1)\approx\frac12\log n\) and \(\log (n+1)\).
When we insert a new key into a 2-node or 3-node, we just get a 3- or 4-node, which is OK.
However, when we insert a key into a 4-node, it overflows and we need to split it:
The procedure continues by inserting \(z\) to the parent node. This may even start a cascade of overflown 4-nodes which ends either when the parent has enough capacity or we get to the top of the tree and create a new root (at this point the height of the whole tree increases by 1).
Assume we are deleting a key from leaf. (If not, we replace the key by its successor, which is in a leaf - the same trick we used in BST). When we delete from a 3- or 4-node, we just get 2- or 3-node, which is OK.
A problem occurs, when deleting from a 2-node which only has a single key, so after deletion, the node is empty and this is not allowed. There are two ways to fix it.
If a sibling next to it is a 3- or 4-node, we can borrow a key from it:
(A node may have two adjacent siblings - left and right. If any one of them is a 3- or 4-node, i.e. is full enough, we can borrow a key from it.)
We can’t use this trick if an adjacent sibling is a 2-node. In this case, we join it with a key from the parent node and recursively deal with deleting from the parent node (this may start a cascade of joins):
Here is the situation with 2 adjacent siblings, both 2-nodes:
In this case we join \(x\) and \(y\); the parent node keeps \(z\) and we can stop.
Thus, the entire
delete operation may involve a series of joins. It ends either with a share, or with join when parent is a 3- or 4-node, or continues all the way up to the root. In that case the final join deletes root and height of the tree decreases by 1.