# B-tree

Invented by Bayer, McCreight (1972) a generalization of 2-3 and 2-3-4 trees: internal nodes of a B-tree of order t have between $t/2$ and $t$ children (for $t=3$ and 4, we get 2-3 and 2-3-4 trees) B-trees are often used as an external data structure, i.e., for data stored on disk since hard disks are 1000 – 100 000-times slower than RAM, it is crucial to minimize the number of disk accesses; this is achieved by choosing very large $t$ (usually so that one node fills one disk page) when a node overflows, it is split in two and the middle element is inserted to parent (a cascade of splits may occur) when a node underflows, it may borrow some elements from neighbouring nodes or, if the nodes are small, they may be merged great for storing large amounts of data which don’t fit into the main memory, while it minimizes the number of disk accesses B+ tree: items are only stored at leaves; the keys in internal vertices only serve as guides during the search; there are two advantages: 1. since internal nodes do not store any additional data, this allows for higher degree, smaller height, and less disk accesses; 2. data may be easily traversed sequentially from left to right (contrast this with in-order traversal of a B-tree) pre-emptive splitting and merging: split/merge nodes on the way down the tree; advantage: no need to return (and thus less disk accesses); disadvantage: sometimes nodes are split even when it was not necessary (which leads to slightly increased height and memory consumption) B* tree: non-root nodes have to be at least 2/3 full; advantage: better use of memory; disadvantage: more splitting and merging is needed
[You can download the application for offline viewing.]

## References

• R. Bayer and E.M. McCreight. Organization and maintenance of large ordered indexes. Acta informatica, 1(3):173-189, 1972. [bib] [pdf]
• D. Comer. Ubiquitous B-tree. ACM Computing Surveys (CSUR), 11(2):121-137, 1979. [bib]
• A.C.C. Yao. On random 2-3 trees. Acta Informatica, 9(2):159-170, 1978. [bib]
• R.A. Baeza-Yates. Fringe analysis revisited. ACM Computing Surveys (CSUR), 27(1):109-119, 1995. [bib] [pdf]