Chapter 18 - B-trees
|
Document last modified: |
The height of a B-Tree:
Number of disk accesses on a B-tree is often proportional to the B-tree height.
Theorem 18.1
If n ³ 1 then h ≤ logt (n + 1)/2 Note: log2t n ≤ logt n
for any n-key B-Tree T of height h, and minimum degree t ³ 2:
2t-1 is maximum number of keys, 2t is the maximum number of children
t-1 is the minimum number of keys, t is the minimum number of children
Proof
By B-tree definition, root contains at least one key and all other nodes at least t-1 keys.
A node of t-1 keys must have t children nodes.
Since the root at depth 0 must have at least 1 key
- depth 1 at least 2 nodes
- depth 2 at least 2t nodes
- depth 3 at least 2t2 nodes
:
- depth h at least 2th-1 nodes
The number n of keys satisfies (by A. 5 of text):
- 1 is the root key
- t-1 is minimum keys per node of non-root nodes
- summation of total number of non-root nodes
Theorem 18.1 Proof
n ³ 2th-1 th ≤ (n+1)/2 logt th ≤ logt (n+1)/2 h ≤ logt (n + 1)/2
Example
t = 2, n = 65536
h ≤ log2 (65536 + 1)/2 = log2 32768 = log2 215 = 15
t = 8, n = 65536
h ≤ log8 (65536 + 1)/2 = log8 32768 = log8 85 = 5
Question 18.6
- What is the practical significance of this proof?
- What is the approximate maximum height of a B-tree with 2000 keys and t=10? t=2?
- What is your estimate for upper bound on B-tree search? insertion? deletion?
- For question b. above, which t is better for reducing the upper bound?
- Normally, we are not concerned about O(3) versus O(10), should we be now? Why?
- Is bigger t better? What is the limiting factor on t?
- How many nodes would be the minimum working set (i.e. held in memory for active use)?