Red-black tree

Invented by	Bayer (1972), Guibas, Sedgewick (1978)
Idea:	mimic 2-3-4 tree by a binary tree with red and black nodes each black node may have 0, 1, or 2 red children (these correspond to 2-, 3-, and 4-nodes of a 2-3-4 tree respectively), but red nodes have black children only root is always black each path from root to leaf contains the same number of black vertices (this corresponds to the rule that every leaf of a 2-3-4 tree is at the same level) during insertions and deletions to a 2-3-4 tree, nodes may overflow or underflow and must be split or merged; this is mimicked by rotating and recoloring nodes in the red-black tree
Advantages:	guaranteed \(O(\log n)\) time (in the worst case) for each dictionary operation in each operation we do at most 3 rotations (and \(O(\log n)\) color changes, which are quick); this allows creation of efficient persistent data structures
Variations:	left leaning red-black tree: when mimicking a 3-node, the red node is a left child
Disadvantages:	hard to implement
Implementation:	STL, Java

[You can download the application for offline viewing.]

Correspondence with 2-3-4 trees

Red-black trees are sometimes defined with these three invariants:

The root is black.
Children of red nodes are black.
Every path from root to leaf has the same number of black nodes.

Where the hack did these come from?

Well, a red-black tree is just a binary search tree representation of a 2-3-4 tree.

Recall that in 2-3-4 tree, all nodes contain 1, 2, or 3 keys and have 2, 3, or 4 children, respectively.

In a red-black tree,

a 2-node corresponds to a black node with black children,
a 3-node corresponds to a black node with one red child,
and a 4-node corresponds to a black node with two red children.

This idea can be generalized. We could represent any B-tree (or any (a,b)-tree) with a binary tree by representing each B-node with \(k\) keys as a binary tree with \(k\) nodes and marking roots of these subtrees black. For example, here is a B-tree of order 8 represented as a binary search tree:

A red-black tree is just a special case for B trees of order 4, a.k.a. 2-3-4 tree.

The invariants make sure that this correspondence holds. Namely, the first two invariants make sure, that if we take a black node together with its red children and call it a “supernode”, then since red nodes can only have black children, these supernodes consist of 1, 2, or 3 regular nodes and their children are again supernodes. Invariant #3 says that each path from root to leaf goes through the same number of supernodes. This corresponds to the invariant of 2-3-4 trees (or B-trees in general) that all the leaves must be at the same level.

Balancing

Insertion

Nodes we insert are red so the only invariant we might break is #2 (red node with red parent). Let \(x\) be a red node with red parent \(y\). There are 3 cases we need to consider, however, if we keep in mind the correspondence with 2-3-4 tree, it’s actually easy.

Case 1: Red uncle

Uncle = parent’s brother. In the figure below, \(x\)’s uncle is node \(w\) (it’s the sibling of \(x\)’s parent, \(y\)). If this node is red, it means that the supernode \((x,y,z,w)\) overflowed. We simply recolor the nodes as follows:

The case when \(x\) is a right child is the same:

(You can think of it as pushing black from \(z\) down onto its children \(y\) and \(w\).)

Note that this corresponds exactly to splitting operation in 2-3-4 trees:

The 5-node \((x,y,z,w)\) is split into 3-node \((x,y)\) and 2-node \(w\) and \(z\) is moved to the supernode above. Now parent of \(z\) might be red, so we need to deal with that recursively. There might be a cascade of recolorings similar to a cascade of splittings in a 2-3-4 tree.

Case 2, 3: Black uncle

If \(x\)’s uncle is black (\(\delta\) in the figures below), the \((x,y,z)\) supernode is a 4-node, which is OK, we just need to rotate it so it has a proper shape:

Note that in these cases, we can stop.

A red-black tree insertion may involve several (at most \(O(\log n)\)) recolorings, but only 2 rotations at most.

References

R. Sedgewick. Left-leaning red-black trees. In Dagstuhl Workshop on Data Structures, page 17, 2008. [bib] [pdf]
R. Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta informatica, 1(4):290-306, 1972. [bib]
L.J. Guibas and R. Sedgewick. A dichromatic framework for balanced trees. In Foundations of Computer Science, 1978., 19th Annual Symposium on, pages 8-21. IEEE, 1978. [bib]
R.E. Tarjan. Updating a balanced search tree in O(1) rotations. Information Processing Letters, 16(5):253-257, 1983. [bib]
A. Andersson, C. Icking, R. Klein, and T. Ottmann. Binary search trees of almost optimal height. Acta Informatica, 28(2):165-178, 1990. [bib] [pdf]