Treap

Invented by Aragon, Seidel (1989)
Idea:
  • each inserted node is assigned its priority, which is a random number
  • we require that priorities of nodes in a treap have heap order: the parent’s priority is always greater than or equal to the priorities of its children
  • we have mentioned that BST is balanced with high probability on a random input; however, we cannot simply choose in which order the items are inserted or deleted; the trick of maintaining a heap order of priorities is used to get a random BST, which is balanced on average (this way we turn average case into an expected case)
Advantages: once the rotations are implemented, the implementation is very easy; deletion from a treap is even simpler than deletion from a BST
Disadvantages: a minor one: the \(O(\log n)\) time is “only” expected case with high probability
[You can download the application for offline viewing.]

Randomizing trees

It is well known (see section below) that if we insert elements into BST in random order, on average (and even with high probability), its height will be logarithmic. In other words, with high probability, it will be fine and we don’t even need to rebalance it. The problem of course is that we usually don’t have any control of the elements and insertion order.

Idea: Assign random priority to each node. Treap (a tree+heap) is a binary search tree by key (i.e., smaller keys in a left subtree, bigger keys in a right subtree) and is heap-ordered by priorities (i.e., parent has always higher priority than its children).

Here is an example of a treap (letters are keys, the small numbers are priorities):

A treap is ordered from left to right by key (A, B, C, F, J, L) and heap-ordered by priorities (e.g., 87 > 59 and 77; 77 > 42 and 10).

Uniqueness: It turns out that there is exactly one treap for any given set of (distinct) keys and their (distinct) priorities.

Proof: The root has to be the node with the highest priority (e.g., node C in the example above). Now split the remaining nodes: smaller than root will go left, bigger than root right. Build left and right subtrees (sub-treaps) recursively.

Insert: Same as in BST + bubble up to restore heap order.

Delete: Bubble down until it’s a leaf - then just cut the parent link.

Analysing height of a random tree

Imagine a simple BST where we add \(n\) elements in random order. What will be the height of such a tree?

It turns out that with high probability, it will be logarithmic.

Proof: Let \(A^i_k\) be a random variable such that \(A^i_k=1\) if \(i\) is an ancestor of \(k\) and 0 otherwise. Let \(i\neq k\). Then \(i\) is ancestor of \(k\) if and only if \(i\) is the first value inserted from the interval \([i,k]\). (If \(j\) is between \(i\) and \(k\) and we insert it first, then \(i\) and \(k\) end up in different subtrees of \(j\).)

Since each element from \([i,k]\) has the same probability of going first, \(\Pr[A^i_k = 1] = E[A^i_k] = 1 / (|i-k|+1)\) (for \(i\neq k\)). Then \(E[depth(k)] = E[\sum_{i\neq k} A^i_k] = \sum_{i\neq k} E[A^i_k] = H_k + H_{n-k+1} - 2 \leq 2\ln n\) where \(H_n\) is the \(n\)-the Harmonic number.

Moreover, we can prove, that say \(\Pr[depth(k) > 8\ln n] < 2/n^2\) (it is very improbable, that a node would end up with depth more than 8 times \(\ln n\)). Consequently, \(\Pr[\text{tree height} > 8\ln n]\) \(\leq \Pr[\text{any node has depth} > 8\ln n]\) \(\leq n\times \Pr[\text{fixed node has depth} > 8\ln]\) \(\leq 2/n\). Let’s split \(depth(k)=\sum_{i\neq k} A^i_k\) into \(d_<(k)=\sum_{i<k} A^i_k\) and \(d_>(k)=\sum_{i>k} A^i_k\). Since variables \(A^i_k\) for \(i<k\) are independent, we can use Chernoff bound to prove \[ \Pr[d_<(k) > 4\ln n] < (e^{3}/4^4)^{\ln n} = n^{3-4\ln 4} < 1/n^2 \]

(Aside: Using much more advanced mathematics, it has been proven that \(E[H_n] = \alpha\ln n - \beta\ln\ln + O(1)\) where \(\alpha=4.311\cdots\) and \(\beta=1.953\cdots\), see Reed (2003).)

References

  • C.R. Aragon and R.G. Seidel. Randomized search trees. In Foundations of Computer Science, 1989, 30th Annual Symposium on, pages 540-545. IEEE, 1989. [bib] [pdf]
  • C. Martínez and S. Roura. Randomized binary search trees. Journal of the ACM (JACM), 45(2):288-323, 1998. [bib] [pdf]
  • B. Reed. The height of a random binary search tree. Journal of the ACM, 50(3):306-332, 2003. [bib] [pdf]