Chapter 18 Answers


Document last modified: 

Overview

Question 18.1 - See B-tree below.

18.1 Definition of B-Tree

Question 18.2 Does this hold for QTX? Yes.

Question 18.3

  1. Is the figure at right a B-tree? Yes. How do you know?
    • Root node has 1 key.
    • For internal nodes, t-1£ keys£ 2t-1 for t=2 or 3.
    • k ≤  x.key1 ≤  k ≤  x.key2  ≤...≤  x.keyx.n ≤  kx.n + 1 

    • All leaf nodes at same level.
  2. What is the minimum degree of the B-tree at right?  2
  3. Could FG have an additional key? No, unless DH were rearranged. JKL? No, unless M were rearranged.
  4. For minimum degree t=5, what is:
    • the fewest keys other than the root? t-1=4
    • the most keys? 2t-1=9
    • the fewest children of an internal node other than the root? 5
    • the most children? 10
  5. Why must the root be allowed to have as few as one key? For practical reasons: when building the tree would start with root and have no children.
  6. What must happen when a node is full (2t-1 keys) before another key can be added within the same range? Consider inserting I into JKL.
    • Must move the median key K to DH,
    • split remaining JL into two nodes, J and L,
    • insert I into J.
    • More details later.
  7. Give one factor used in determining t. Disk parameters such as single read/write capacity determine maximum size of a node. Suppose capacity is 70 bytes, a key and a child pointer each requires 10 bytes, could have t=2 for a maximum of 3 keys (2t-1) and 4 children (2t). 

Question 18.4

  1. Is this a B-tree? Yes
  2. What is t?

    Min of 2 keys (t-1=2), max of 5 (2t-1=5).

    Determine if equations equivalent (give same answer for t).

    t-1=2  or  t=3

    2t-1=5   or  t=3

    t=3
     

  3. What is the minimum number of keys/node? t-1=2
  4. What is the maximum number of keys/node? 2t-1=5
  5. The root cannot have only 1 child. Why? Node x with x.n keys has x.n + 1 children
  6. For a tree with 2t keys:
    1. How many keys must be at the root? 1
    2. How many children? 2
    3. How many keys at each child? t-1 and t
Root Node Internal Nodes
t # keys
[1..2t-1]
# children
[0, 2 .. 2t]
3 [1..5] [0, 2..6]
t # keys
[t-1 .. 2t-1]
# children
[t .. 2t]
3 [2..5] [3..6]

Question 18.5

  1. Is this a B-tree? Yes
  2. What are legal values of t? Min # keys in tree is 2, max is 4.

    t-1=2 or t=3

    2t-1=5 or t=é5/2ù =3
     

  3. Does the root satisfy the B-tree definition? Yes
  4. What is the maximum number of keys at the root? 5
  5. What is the maximum number of keys anywhere other than the root? 5
  6. What is the maximum number of children anywhere? 6
  7. What is the minimum number of children anywhere other than the root? 3

 

B-tree height Proof

Question 18.6

  1. What is the practical significance of this proof? Know the upper bound, based on n, of B-tree operations.
  2. What is the approximate maximum height of a B-tree with 2000 keys and t=10?

    h ≤ logt (n + 1)/2 = log10 1000 = 3

    t=2? log2 1000 < log2 210 = 10

  3. What is your estimate for upper bound on B-tree search? O(h) insertion? O(h) deletion? O(h)
  4. For question b. above, which t is better for reducing the upper bound? 10
  5. Normally, we are not concerned about O(3) versus O(10), should we be now? Yes. Why? Because there are 7 additional disk accesses/operation.
  6. Is bigger t better? Generally yes. What is the limiting factor on t? Buffer size used during disk read/write. Desirable to read/write all 2t-1 keys + 2t children in one access.
  7. How many nodes would be the minimum working set (i.e. held in memory for active use)? Root (2t-1 keys + 2t children) + current (2t-1 keys + 2t children)

B-tree Operations

Question 18.7

Question  18.8  For the B-tree T above that contains the 26 lower-case letter keys, what is:

  1. T.key1                                                      l
  2. T.c2.key3                                                 w
  3. T.c1.n                                                       2
  4. T.c1.c1.n                                                  3
  5. T.c1.c1.key3                                             c
  6. The minimum degree of the tree is t=3.
    • What is the actual height?                  2
    • What is n?                                         26
    • Is that consistent with h ≤ logt (n + 1)/2?    2 ≤ logt (26 + 1)/2 = logt 13 = 2.33 Yes

Searching

(Node, Integer) B_Tree_Search (preserves Node x, preserves Item k)
   i ¬ 1
   while i ≤ x.n  and  k > x.keyi do
      i ¬ i + 1

   if i ≤ x.n and k = x.keyi then
      return (x, i)

   if x.leaf then
      return NULL
   else
      Disk_Read (x.ci)
      return B_Tree_Search (x.ci, k)

Question 18.9

  1. Trace the algorithm for tree at right using B_Tree_Search (T, a).
    • Keep count of the number of comparisons made in the while. 2
    • How many times is Disk_Read (x.ci) executed? 2
  2. Trace the algorithm for above tree using B_Tree_Search (T, z).
    • Keep count of the number of comparisons made in the while. 7
    • How many times is Disk_Read (x.ci) executed? 2
  3. Give an upper-bounds estimate of recursions. O(h)
  4. Give an upper-bounds estimate of comparisons in each whileO(2t-1)
  5. Give an upper-bounds estimate of comparisons for the whiles in all recursions. O(h*2t-1)
  6. Which dominates where it matters - the Disk_Read (x.ci) or the whiles? Explain. Disk reads which are about an order of magnitude slower than memory.
  7. The greater the value of t the ___lower_____ tree depth and the _fewer__ the number of disk accesses.
  8. Search is a one-pass operation, what would you estimate the upper-bounds for other one-pass B-tree operations? O(h) disk reads.
  9. Explain finding the minimum B-tree key. Left-most leaf.
  10. Explain finding the successor of B-tree keys l, p and r.
    • l, having a child, is minimum of right subtree.
    • p, having no child, is next key.
    • r, having no child or next key, is parent.keyi.
  11. Explain finding the predecessor of B-tree keys n, p and m.
    • n, having no child, is previous key.
    • p, having no child or previous key, is parent.keyi-1         Note keyi-1 due to i advanced to right of predecessor.
    • m, having no child or previous key, is parent.keyi-1.keyi-1

Insertion

  1. The new key k must be inserted into an existing leaf node so that the search tree property is maintained.
  2. When inserting a new key k, a new leaf cannot be created (as is the case with a binary tree), because this would violate the property that all leaf nodes must be at the same depth.
  3. Splitting a Full Node:
  4. Aggressive Node Splitting

 

Question 18.14

  1. Try inserting a -1 and -2 without splitting the root node. End up with -2 -1 0 1 node which is too full.

  2. Repeat, inserting a -1 and -2 with Aggressive Node Splitting.

           

  3. What purpose is served by Aggressive Node Splitting in Step 10?

    Room exists to insert 4 with 3 but this is not known until node with 3 is input. Since a full child node may be several levels below a full ancestor, insertion would force backing up through parents, duplicating previous storage accesses. By aggressively splitting any full nodes encountered on the way down during insertion, ensures that backing up only occurs between direct parent of a full child, both of which are already in memory.

 

Question 18.15 What is the key difference between splitting the root and any other node? The tree height increases by one.

Question 18.16 t=2

  1. Explain why t=index of median key in node to split? The node to split is full with 2t-1 keys, always an odd number. 2t-1 = (t-1) + 1 + (t-1) so there will always be t-1 keys in the left and right node, the 1 median node is moved to the parent.
  2. Why the Disk_Writes? To update the 3 nodes changed or created by the split.
  3. Perform split necessary to insert 6.

  4. Perform the additional split necessary to insert 4.

Question 18.17

Question 18.18

  1. When are Disk_Reads performed? When accessing a child.
  2. When are Disk_Writes performed? When splitting or inserting into a leaf.
  3. Where is Aggressive Node Splitting performed? Anywhere a full (2t-1 keys) is encountered.
  4. What is the best-case execution? W(h) inserting the smallest key (at leaf). the worst-case? O((2t-1)*h) inserting the largest key (at leaf).

Deletion

Case 2a: If the key k is an internal node x and the child y that precedes k in node x has at least t keys then find the predecessor k' of k in the subtree rooted at y. Recursively delete k' and replace k by k' in x. Finding k' and deleting in a single pass.

k=M        x=CGM         y=JKL         k'=L predecessor of M        

Question 18.19 Draw the steps involving nodes x and y.    

Locate and delete k'=L

Case 2c: If both y and z have only t-1 keys, merge k and all of z into y, so x loses both k and the pointer to z, and y now contains 2t-1 keys. Then free z and recursively delete k from y.

k=G       x=CGL          y=DE        z=JK

Question 18.20 Draw the steps involving x, y and z nodes.

Merge y, k and z to y. DE, G, JK

Delete G from y=DEGJK

Question 18.21 Why is it necessary to only descend into nodes that have at least t keys? Consider Cases 1, 2a and 2c.

Descending into a node with only t-1 keys, it would not be possible to delete a key from that node without backing up. Consider what would occur when going down a chain of nodes with t-1 keys until finding the key to delete above the leafs, also having t-1 keys. Case 2c would need to merge two leaves with one of the parent's t-1 keys, leaving the parent with t-2 keys.

Question 18.22

Question 18.23

Question 18.24 What does the t describe? The 2t-1 maximum number of keys in each node.