Chapter 18  Btrees 
Document last modified: 
Overview  Dealing with maps containing very large amounts of data
search, insert a mapping,
delete an
existing mapping in O(lg n) time.
Basic structure
 Similar to binarysearchtree but has nary children.
 All leafs at same depth (very important).
 Node x with x.n keys has x.n + 1 children
 Keys are stored in nondecreasing order within a node
 Keys at root node of each subtree divide children into ranges of keys.
For any key k_{i} in a subtree of x:
k_{1 } ≤ x.key_{1} ≤ k_{2 } ≤ x.key_{2} ≤...≤ x.key_{x.n} ≤ k_{x.n + 1}
Example: Nodes examined in search for key R.
Btrees useful when multilevel data storage required such as database that is too large to be in memory, some must be in some larger scale storage.
Example  memory for some working set of keys, magnetic for nonworking set.
Access time:
memory about 5*10^{9} seconds
magnetic about 13*10^{3} seconds
due to RPM of disk and positioning time of read/write head
typically read/write large number of bytes (2^{16} bytes) at a time
About Secondary Storage
 Currently disk drives are the most commonly used secondary storage.
 Hardware: spindle, platters, read/write heads
 Logical: track, cylinder, page
 Accessing data requires  head movement and rotational delay
 Average disk access time: 3 to 9 milliseconds
 Average main memory access time: 100 nanoseconds
 Parts of a second:
 millisecond  10^{3}, or 1000^{th} of a second
 microsecond  10^{6}, or 1,000,000^{th} of a second
 nanosecond  10^{9}, or 1,000,000,000^{th} of a second
 Page
 because disk access takes so long, reads and writes are done in big chunks called pages
 typical page sizes range: 2^{11} to 2^{14} bytes
 BTree node sizes are fixed to be equivalent to the machine's disk page size
Btrees design goals
 minimize disk accesses
 mary with large m for large branching factor but small tree height
 root always in memory
 children may be on disk, each level of Btree requires 1 disk access
 Insert, delete and search operations on Btree of height h should require at most h1 disk accesses, implies one pass with no backtracking.
Example: Btree of height two with more than one billion keys requires at most two disk reads/writes to access any key.
Question 18.1  See Btree below.
 How many nodes are examined to search for the R key?
 What do you estimate worstcase search to be in terms of height?
 Which keys are examined to search for the Z key?
 If accessing a new level requires a disk access, how should you design a tree?
 Why is it reasonable to always keep the root in memory?
18.1 Two definitions of BTree
From Data Structures and Program Design by Chapter 11 of Kruse:
A Btree of order m is an mway tree in which:
 All leaves are on the same level.
 All internal nodes except the root have at most m nonempty children and at least ⌈m/2⌉ nonempty children.
 The number of keys in each internal node is one less than the number of its nonempty children, and these keys partition the keys in the children in the fashion of a search tree.
 The root has at most m children but nodes may also have as few as 2 if it is not a leaf, or none if the tree consists of the root alone.
Not a Btree

From Chapter 18.1 of Corman (we'll use his terminology but the graphics aren't as pretty):
BTree T is a rooted tree of nodes:
 Every node x has the following fields:
 x.n is an integer representing the number of keys currently stored in node x (eg. JKL.n is 3)
 x.key_{i} is the i^{th} key of node x, i = 1, ..., x.n (eg. JKL.key_{2} is K)
the x.n keys are stored in nondecreasing order so that:
x.key_{1} ≤ x.key_{2} ≤ ... ≤ x.key_{x.n} (eg. J≤K≤L)
 x.leaf is type boolean, is TRUE if x is a leaf, is FALSE if x is an internal node (eg. JKL.leaf is true)
 x.c_{i} points to child nodes one level down tree, each node x has 1 .. x.n+1 child pointers (eg. DH.c_{2} is FG)
 Child Pointers:
 If x.leaf = FALSE, then x.c_{i} contain the pointers to x's children
 If x.leaf = TRUE, then x.c_{i} are undefined
 Keys x.key_{i} separate the ranges of keys stored in each subtree:
 let k_{i} be any key stored in the subtree with root x.c_{i} then:
k_{1 } ≤ x.key_{1} ≤ k_{2 } ≤ x.key_{2} ≤...≤ x.key_{x.n} ≤ k_{x.n + 1}
Example: At right, for x=DH
DH.c_{1}=BC
DH.c_{2}=FG
DH.c_{3}=JKL
BC_{1 } ≤ D_{1} ≤ FG_{2 } ≤ H_{2} ≤ JKL_{3}
Question 18.2 Does this hold for QTX?
 All leaves have the same depth, which is the tree's height. (eg. upperright Btree, height=2)
 Minimum Degree: t ≥ 2
 minimum number of keys/node:
 root node: 1 (for a nonempty tree)
 nonroot node: t  1
 maximum number of keys/node:
 all nodes: 2t  1
 minimum number of children/node:
 root node (see Figure 18.6):
 0 children, all the keys can be in the root when the number of keys in the tree is [1..2t  1]
 2 children if 2t or more keys in the tree: 1 key at root, t keys at one child, t1 keys at the other.
 nonroot internal nodes: always a minimum of t children
 leaf nodes: always 0 children
 maximum number of children:
 node is full when it contains 2t  1 keys
 internal nodes including root: 2t
 leaf nodes: always 0 children
Question 18.3
 Is the figure at right a Btree? How do you know?
 What is t, the minimum degree, of the Btree at right?
 Can FG have an additional key? Can JKL?
 For minimum degree t=5, what is:
 the fewest keys other than the root?
 the most keys?
 the fewest children of an internal node other than the root?
 the most children?
 Why must the root be allowed to have as few as one key?
 What must happen when a node is full (2t1 keys) before another key can be added within the same range? Consider inserting I into JKL.
 Give one factor used in determining t
Table of possible values for t:
Root Node  Internal Nodes  


Question 18.4
 Is this a Btree?
 What is the minimum number of keys/node? t1=?
 What is the maximum number of keys/node? 2t1=?
 What is t? Solve for t in the following:
t1 = Minimum number of keys/node
2t1 = Maximum number of keys/node
 The root cannot have only 1 child. Why?
 For a tree with 2t keys:
 How many keys must be at the root?
 How many children of root?
 How many keys at each child?
Root Node 
Internal Nodes 



Question 18.5
 Is this a Btree?
 What are legal values of t?
Minimum keys/node t1=?
Maximum keys/node 2t1=?
 Does the root satisfy the Btree definition?
 What is the maximum number of keys at the root?
 What is the maximum number of keys anywhere other than the root?
 What is the maximum number of children anywhere?
 What is the minimum number of children anywhere other than the root?