Chapter 18 
B-tree
Operations


Document last modified: 

Overview

Question 18.7

 

Cormen Implementation as C++

class Node_Rep;
typedef Node_Rep * Node;

class Node_Rep {
    public:
        Item key[2t - 1];        -- keys
        Node c[ 2t ];               -- child nodes
        Boolean leaf;         
        Integer n;                   -- number of keys
};

variable Node T;

Example

Question 18.8 For the B-tree T at right that contains the 26 lower-case letter keys, what is:

  1. T.key1
     
  2. T.c2.key3
     
  3. T.c1.n
     
  4. T.c1.c1.n
     
  5. T.c1.c1.key3
     
  6. The minimum degree of the tree is t=3.
    • What is the height, h?
    • What is n?
    • Is that consistent with
       
          h ≤ logt (n + 1)/2?

Searching

(Node, Integer) B_Tree_Search (preserves Node x, preserves Item k)
   i ¬ 1
   while i ≤ x.n  and  k > x.keyi do
      i ¬ i + 1

   if i ≤ x.n and k = x.keyi then
      return (x, i)

   if x.leaf then
      return NULL
   else
      Disk_Read (x.ci)
      return B_Tree_Search (x.ci, k)

Question 18.9

  1. Trace the algorithm for tree at right using B_Tree_Search (T, a).
  2. Trace the algorithm for above tree using B_Tree_Search (T, z).
  3. Give an upper-bounds estimate of recursions. Use h.
     
  4. Give an upper-bounds estimate of comparisons in each while. Use t.
     
  5. Give an upper-bounds estimate of comparisons for the whiles in all recursions. Use t and h.
     
  6. Which dominates where it matters - the Disk_Read (x.ci) or the whiles? Explain.
     
  7. The greater the value of t the ____________ tree depth and the ________ the number of disk accesses.
     
  8. Search is a one-pass operation, what would you estimate the upper-bounds for other one-pass B-tree operations? Use h.
     
  9. Explain finding the minimum B-tree key.
     
  10. Explain finding the successor of B-tree keys l, p and r.
     
  11. Explain finding the predecessor of B-tree keys n, p and m.
(Node, Integer) B_Tree_Search (preserves Node x, preserves Item k)
   i ¬ 1
   while i ≤ x.n  and  k > x.keyi do
      i ¬ i + 1

   if i ≤ x.n and k = x.keyi then
      return (x, i)

   if x.leaf then
      return NULL
   else
      Disk_Read (x.ci)
      return B_Tree_Search (x.ci, k)

 

Insertion

Assume the B-tree at right has t=2.

  1. The new key k must be inserted into an existing leaf node so that the search tree property is maintained.
  2. When inserting a new key k, a new leaf cannot be created (as is the case with a binary tree), because this would violate the property that all leaf nodes must be at the same depth.
  3. Splitting a Full Node:
     
  4. Aggressive Node Splitting

Example of Insertion with Splitting

We will apply the stated rules above first and examine the insertion algorithms later.

For our sanity, let t = 2. A node is then full if it has 2(2)-1 = 3 keys in it, and each node can have up to 4 children.

Insert: 5 9 3 7 1 2 8 6 0 4:

Step 1: Insert 5

Step 2: Insert 9

Root node not full.
Put 9 to the right of 5

Step 3: Insert 3

Root node not full.
Put 3 to the left of 5

Step 4: Insert 7
Root node is full, do aggressive node splitting. Allocate a new (empty) node; make it the root, split pulling median key 5 into the new root:

Then insert 7; with 9

 

Step 5: Insert 1 with 3

Step 6: Insert 2 with 3

Step 7: Insert 8 with 9

 

Step 8: Insert 6
It would go in with 7 8 9, but that node is full.  So split it, bringing its median child 8 into the root:

Then insert 6 with 7:

 

Step 9: Insert 0
0 would go in with 1 2 3, which is full, so split it, sending the median child 2 up to the root:

Now put 0 in with 1

 

Step 10: Insert 4
It would be nice to just stick 4 in with 3, but the B-Tree algorithm requires splitting the full root (or any full node encountered - Aggressive Node Splitting). Note that, if we don't and one of the leaves becomes full, there would be nowhere to put the median key of that split since the root would be full, thus, this split of the root is necessary to prevent backing the median key up more than one level:

Now assured there is room for the median key in case of a lower-level split, we can insert 4, :

Question 18.14

  1. Try inserting a -1 and -2 without splitting the root node.

  2. Repeat, inserting a -1 and -2 with Aggressive Node Splitting.

  3. What purpose is served by Aggressive Node Splitting in Step 10?

 

 

Splitting a Full Node

Figure 18.5 for example of node splitting when t=4.

Figure 18.6 example of root node splitting when t=4

Question 18.15 What is the key difference between splitting the root and any other node?

 

                     t = 2, i = 2

void B_Tree_Split_Child (Node x, Integer i)
-- pre: x.n < (2t - 1)

-- i index where parent should receive median key from y
-- x pointer to parent node of y
-- y pointer to node to be split, y = x.ci 
-- z new node that receives "1/2" of y node's keys and children
-- t is the index of the median node in a full node

   z ¬ Allocate_Node ()                        -- initialize new node z
   y ¬ x.ci
   z.leaf ¬ y.leaf
   z.n ¬ (t - 1)

   for j ¬ 1 to (t - 1) do                       -- copy 1/2 keys from y (node being split) to new z node
      z.keyj ¬ y.key(j + t)

   if not y.leaf then
      for j ¬ 1 to t do                            -- y (node being split) is not a leaf, copy children to z
         z.cj ¬ y.c(j + t)

   y.n ¬ (t - 1)
   for j ¬ (x.n + 1) downto (i + 1) do  -- In parent node x, make room for child pointer to z node
      x.c(j + 1) ¬ x.cj                              -- i is index where new child goes, slide existing children up

   x.c(i + 1) ¬ z
   for j ¬ x.n downto i do                   -- In parent node x, make room for the median key from y node
      x.key(j + 1) ¬ x.keyj                     -- i is index where new key must go, slide existing keys up

   x.keyi ¬ y.keyt                                -- t is index of median node of full node y, copy to y's parent
   x.n ¬ x.n + 1                                  -- one more key in x

   Disk_Write (y)
   Disk_Write (z)
   Disk_Write (x)

Question 18.16 t=2

  1. Explain why t=index of median key in node to split?
  2. Why the Disk_Writes?
  3. Perform split necessary to insert 6.
  4. Perform the additional split necessary to insert 4.

Inserting a key in a single pass down the tree

Figure 18.7 Inserting keys

t = 3, the minimum degree, maximum of 5 keys

Question 18.17

 

t=2

B_Tree_Insert( T, 0)

void B_Tree_Insert (Tree T, Item k)
   r ¬ T.root
   if r.n = (2t - 1) then      -- root is full, create new root,
      s ¬ Allocate_Node ()   -- split old root into 2 leaves
      T.root ¬ s
      s.leaf ¬ FALSE
      s.n ¬ 0
      s.c1 ¬ r
      B_Tree_Split_Child (s, 1)
      B_Tree_Insert_Nonfull (s, k)
   else
      B_Tree_Insert_Nonfull (r, k)

B_Tree_Insert(T, 7)

void B_Tree_Insert_Nonfull (Node x, Item k)

   i ¬ x.n

   if x.leaf then                                  -- at a non-full leaf, insert k

      while i ³and  k < x.keyi do   -- search for k's insertion point,
         x.key(i + 1) ¬ x.keyi                -- and slide existing keys up
         i ¬ i - 1                                   
      x.key(i + 1) ¬ k
      x.n ¬ x.n + 1
      Disk_Write (x)

B_Tree_InsertNonfull(T, 2)

   else                                              
      while i ³and  k < x.keyi do   -- find index to correct child subtree key
         i ¬ i - 1

      i ¬ i + 1
      Disk_Read (x.ci)                          -- read child node into memory

      if x.ci.n = (2t - 1) then               -- if full child node
         B_Tree_Split_Child (x, i)           -- split before inserting

         if k > x.keyi then                    -- x.keyi holds median key from child x.ci
            i ¬ i + 1                                -- index to child where to insert key k 

      B_Tree_Insert_Nonfull (x.ci, k)    -- assured that child i is not full, insert k

B_Tree_InsertNonfull(T, 6)

Question 18.18

  1. When are Disk_Reads performed?
  2. When are Disk_Writes performed?
  3. Where is Aggressive Node Splitting performed?
  4. What is the best-case? the worst-case?

 

Deletion

Example

Figure 18.8 Deleting keys

t = 3, the minimum degree

maximum of 5 keys

minimum 2 keys


Case 1: If key k is in node x and x is a leaf, delete the key k from x. Assumes more than t-1 keys in leaf x.

k=F        x = DEF

Question 18.19a - Draw the tree after deleting K.


Case 2a: If the key k is an internal node x and the child y that precedes k in node x has at least t keys then find the predecessor k' of k in the subtree rooted at y.

  • Recursively delete k' and replace k by k' in x.
  • Finding k' and deleting in a single pass.

k=M        x=CGM         y=JKL         k'=L predecessor of M            

 

Question 18.19 Draw the tree (a) after deleting G.


Case 2c: If both y and z have only t-1 keys, merge k and all of z into y, so x loses both k and the pointer to z, and y now contains 2t-1 keys.

  • Then free z and recursively delete k from y.

k=G       x=CGL          y=DE        z=JK

Question 18.20a - Draw the tree (c) after deleting C.


Case 2b: Use symmetry of predecessor of 2a with successor.

If the key k is an internal node x and the child y that follows k in node x has at least t keys then find the successor k' of k in the subtree rooted at y.

  • Recursively delete k' and replace k by k' in x.

  • Finding k' and deleting in a single pass.

Question 18.20b - Draw the tree after deleting G.


Case 3: If the key k is not present in internal node x, determine root  x.ci  of the appropriate child subtree that must contain k, if k is in the tree at all.

  • If x.ci has only t-1 keys, execute steps 3a or 3b as necessary to guarantee descent to a node with at least t keys.
  • Then finish by recursing on the appropriate x.

Question 18.21 Why is it necessary to only descend into nodes that have at least t keys? Consider Cases 1, 2a and 2c.


Case 3b: Merge.

  • If x.ci and both of x.ci's immediate siblings have t-1 keys.
  • Merge x.ci with one sibling which involves moving a key for x down into the new merge node to become the median key for that node.

k=D     x=P     

Recursion cannot descend into CL with only t-1 keys. CL's immediate siblings, TX, also has t-1 keys.

  • Merge CL with one sibling, TX, creating node with 2t-2 keys.
  • Move key from x that was between the two siblings, P, to the median of the new node; now with 2t-1 keys or full.
  • New node:         x=CLPTX
  • Recursively descend into node, DEJK.

Case 1: If key k is in node x and x is a leaf delete the key k from x.

k=D    x=DEJK

Question 18.22

  • How do B-trees shrink?

  • Suppose there were 2 keys in the root, would the tree shrink?

  • Can we be certain deleting D from DEJK does not violate B-tree requirement: x.n ³ t-1?

  • Draw the tree after deleting N.


Case 3a: Borrow.

  • If x.ci has only t-1 keys but has an immediate sibling with at least t keys.

  • Give x.ci an extra key by moving a key from x down into x.ci.

  • Moving a key from x.ci immediate left or right sibling up into x and moving the appropriate child pointer from the sibling into x.ci.

k=B     x=CLPTX      CLPTX.c1=AB

Move key C (not the child pointers) into AB to give ABC, now has t keys, delete B.

Move key E from EJK (with t keys) to x giving ELPTX.

Note that child pointer ELPTX.c1 remains to AC.
ELPTX.c1=AC

Move child pointer of EJK to AC. The old pointer from the left of E is now the pointer from the right of C.
AC.c
3 = EJK.c1

Question 18.23

  • Does it make any difference whether borrow is from left or right sibling of x.ci in general. That is, do we ever need to change child pointers in x?
  • Why is the last step necessary?
  • Draw the tree (f) after deleting U.

 

Analysis

Question 18.24 What does the t describe?

 

Delete Implementation

The text does not provide an algorithm for deletion, below is one presented to provide some insight to delete operation.

void B_Tree_Delete ( Node root, Item k )
-- pre: root node of B-tree
-- post: if k in B-tree, remove corresponding node

   remove( root, k )

   if root ¹ NIL and root.n = 0 then
 
     root ¬ root.c0
      Disk_Write ( root )

void remove (Node x, Item k)
-- pre: x points to root node of subtree
-- post: if k in subtree, remove corresponding node from subtree

   i ¬ x.n
   while i ³and  k < x.keyi do    -- search for k
         i ¬ i - 1

   if i ³ 1 and k ¹ x.keyi then           -- not found, check child
 
     Disk_Read ( x.ci)
      remove( x.ci, k )
   else                                             -- found
     
      if x.leaf then                            -- at a leaf, delete k
         while i < x.n do                    -- copy over k
            x.keyi ¬ x.keyi+1
            i ¬ i + 1
         x.n ¬ x.n - 1
      else                                          -- not at a leaf, find pointer to correct 
     
   copy_predecessor( x, i)          -- subtree and recurse
    
     Disk_Read ( x.ci )
         remove( x.ci , x.keyi )

   if not x.leaf and x.ci.n < t - 1 then                   
         restore( x, i )                         -- child < minimum degree, restore

   Disk_Write (x)

 

void copy_predecessor (Node x, int i)
-- pre: x points to non-leaf node with entry at i
-- post: if k in subtree, remove corresponding node from subtree

   leaf ¬ x.ci                             -- x.ci is node to the left of x

   while leaf.cleaf.n ¹ NIL do    -- Go as far right to a leaf
         leaf ¬ leaf.cleaf.n
         Disk_Read ( leaf )

   x.keyi ¬ leaf.keyleaf.n-1       -- Copy predecessor key

   Disk_Write (x)

void restore (Node x, int i)
-- pre: x points to non-leaf node with entry at x.ci has one too few entries
-- post: An entry is taken from elsewhere to restore minimum number
--         of entries in the node to which x.ci points 

   if i = x.n then                           -- rightmost child
        if x.ci-1.n > t - 1 then
          
move_right( x, i - 1)
        else
          
combine( x, i )
   else
           if
i = 1 then                      -- leftmost child  
               if
x.c2.n > t - 1 then
                  
move_left( x, 2 )
               else
                  
combine( x, 2 )
           else                                  -- remaining cases: intermediate branches
                if x.ci-1.n > t - 1 then
                  
move_right( x, i-1 )
               else
                   if
x.ci+1.n > t - 1 then
                      
move_left(x, i+1)
                   else
                      
combine( x, i )

void move_left (Node x, int i)
-- pre: x points to node with more than minimum number of entries in child i
--        and one too few in child i-1
-- post: leftmost entry of child i is moved to x, which has sent an entry to child i-1

   left ¬ x.ci-1 
   right ¬ x.ci 

   left.keyleft.n ¬ x.keyi-1             -- take parent entry

   left.n ¬  left.n + 1

   left.cleft.n ¬ right.c1      

   left.keyi-1 ¬ right.key1             -- add right to parent  

   right.n ¬ right.n - 1

   for j ¬ 1 to right.n-1 do           -- move right entries to fill hole
         right.keyj ¬ right.keyj+1
         right.cj ¬ right.cj+1    

    right.cright.n ¬ right.cright.n+1

    Disk_Write( left )
    Disk_Write( right )

void move_right (Node x, int i)
-- pre: x points to node with more than minimum number of entries in child i
--        and one too few in child i+1
-- post: rightmost entry of child i is moved to x, which has sent an entry to child i+1

   left ¬ x.ci 
   right ¬ x.ci+1 

   right.cright.n+1 ¬ right.cright.n

   for j ¬ right.n downto 1 do         -- make room for new entry
         right.keyj ¬ right.keyj-1
         right.cj ¬ right.cj-1       

   right.n ¬ right.n + 1

   right.c1 ¬ left.cleft.n     

    left.n ¬  left.n - 1

   x.keyi ¬ left.keyleft.n

    Disk_Write( x )
    Disk_Write( left )
    Disk_Write( right )               

void combine (Node x, int i)
-- pre: x points to a node with child i and i-1 with too few entries to move
-- post: nodes at i and i-1 have been combined into one node 

   left ¬ x.ci-1 
   right ¬ x.ci 

   left.keyleft.n ¬ x.keyi-1     

    left.n ¬ left.n + 1

   left.cleft.n ¬ right.c1      

   for j ¬ 1 to right.n-1 do        
         left.keyleft.n ¬ right.keyj
         left.n ¬  left.n + 1
         left.cleft.n ¬ right.cj+1    

   x.n ¬ x.n - 1

   for j ¬ i-1 to x.n-1 do       
         x.keyj ¬ x.keyj+1
         x.ci+1 ¬ x.ci+2    

    right = NIL

    Disk_Write( x )
    Disk_Write( left )
    Disk_Write( right )