# AA tree

Invented by Bayer (1971), Andersson (1993) mimic a 2-3 tree by a binary tree each node has assigned a small integer called level level of a node is always $\geq$ level of children and when equal, the nodes belong to the same pseudonode for simpler implementation, only right child may have level equal to its parent pseudonodes consisting of 1 or 2 nodes correspond to nodes in a 2-3 tree when larger pseudonodes arise during the runtime, they must be split (mimic splitting overflow nodes in a 2-3 tree) much simpler implementation than red-black tree (or other worst-case logarithmic BSTs)

## Correspondence with 2-3 trees

Recall that in a 2-3 tree, all nodes contain 1 or 2 keys and have 2 or 3 children, respectively.

An AA tree is a binary search tree representation of a 2-3 tree. Here is an example of a 2-3 tree (left) and the corresponding AA tree (right):

For each node, we store its “level” (the small numbers next to nodes in the figure above), which corresponds to the height of a corresponding node in the 2-3 tree. In the example above, the trees have 3 levels.

We represent a 2-node by a single node and a 3-node by a node + its right child on the same level.

In other words, let’s call a connected set of nodes at the same level a pseudonode. In an AA tree, only two kinds of pseudonodes are allowed: a single node (corresponding to a 2-node) and a node + its right child (corresponding to a 3-node).

Note that all leaves in a 2-3 tree have the same depth. Leaves in an AA tree are at level 1 and every internal node at level $\ell$ has either two children at level $\ell-1$, or its left child is at level $\ell-1$ and the right child is at level $\ell$ and then its children must be at level $\ell-1$.

Skew: We only allow right-leaning pseudonodes. If $y$ has the same level as its left child, we do a right rotation to fix it. We call this operation skew($y$):

Split: If a pseudonode is too big (nodes $x$$y$$z$ in the figure below), we split it:

Operation split($x$): Let $y \gets x.right;$ $z \gets y.right;$ if $x.level=z.level$, then rotate($y$) and increase $y.level$.

This corresponds to a split operation in 2-3 trees:

## Balancing

### Insertion

First, insert a new node the same way as in BST (its level will be 1). Then follow the path from the new node back to the root and use skew and split to split large pseudonodes.

### Deletion

The simplicity of the AA tree is that during deletion, there are a couple of different cases, but all of them are treated the same way.

Remove a node as in BST. Then follow the path from the deleted node back to the root and:

1. if a node $x$ has a child on level $< x.level-1$, decrease $x$’s level; if $x.right$ had the same level, also decrease its level
2. call skew 3 times: $skew(x);$ $skew(x.right);$ $skew(x.right.right);$ and
3. split twice: $r\gets x.right.right;$ $split(x);$ $split(r);$
For example, imagine we just removed a node from the subtree $\alpha$ (nodes $\alpha,\ldots,\eta$ are at level $\ell-1$; note that there is a missing node at level $\ell$ above $\alpha$):

The first step (decreasing levels) corresponds to joining a pseudonode $u$$v$ with its children on a lower level ($w$$x$ and $y$$z$ in our example). In the worst case we get a pseudonode consisting of 6 nodes as in our example. In the second step, we use the skew operation to “flatten” the pseudonode:

Finally, in the third step, we split the right chain into:

Let us reiterate that the simplicity is in treating all the cases this same way: decrease level(s), $3\times$skew, $2\times$split. Compare this with 2-3 trees, where we need to check left and right siblings, share keys, if a sibling is a 3-node and join nodes, if a sibling is a 2-node. Also compare this with red-black trees, a binary tree representation of 2-3-4 trees, which has 3–4 cases for insert and 4–6 cases for delete.

## References

• A. Andersson. Balanced search trees made simple. Algorithms and Data Structures, pages 60-71, 1993. [bib] [pdf]
• R. Bayer. Binary B-trees for virtual memory. In Proceedings of the 1971 ACM SIGFIDET Workshop on Data Description, Access and Control, pages 219-235. ACM, 1971. [bib]