Chapter 12 - Binary Search Trees |
Document last modified: |
Overview
Binary tree - recursively defined on a finite set of nodes that either
- contains no nodes
- is composed of three disjoint sets of nodes:
root node
binary tree called the left subtree
binary tree called the right subtree
Examples
(a)
(b)
(c)
(d) 2
/
3
/
4
/
6
/
7(e) 4
/ \
2 7
\ /
3 6(f) 2
\
3
\
4
\
7
/
6(g) 4
/ \
2 6
/ \ / \
1 3 5 7(h) 4
/ \
2 6
/
1Question 12.1
- Are each of the trees binary?
- Are each in sorted order? What does that mean?
- What is the height of (d)? (e)?
- For full binary trees, decision trees, and heaps we made and proved general claims that the height h of a binary tree is related to the number of nodes. Why was that useful?
- What are some of the relationships between the height of an arbitrary binary tree such as those above and the number of nodes?
A few:
- For height h there are at least h+1 nodes (see d).
- For height h there are at most 2h+1-1 nodes (see g).
- For n nodes the height is at least ëlg nû (see g).
Binary search trees are an important data structure for dynamic-sets.
• Recall that a dictionary is a dynamic-set with delete, insert or search operations.
• Accomplish many dynamic-set operations in O(h) time, where h = height of tree.
Uses
- To implement a concept (component) that permits the following:
- search on a key value
- dynamic insertions and deletions
- updates of satellite data associated with a key value
- locating predecessors and successors to a particular key value
- maintain data in sorted order
- locate minimum and maximum
Question 12.2
- Can the minimum and maximum be found in the tree at right in O(h) time?
- Can the minimum and maximum be found in the heap at right in O(h) time?
- Which of the above operations can a hash table efficiently support, O(n) time?
- Which of the above operations can a sorted array efficiently support, O(n) time?
Representation
• Represent a binary tree by a linked data structure in which each node is an object.
• T.root points to the root of tree T
• Each node contains the fields:
- key: element (and possibly other satellite data).
- left: points to left child.
- right: points to right child.
- p: points to parent.
T.root.p = NIL
Binary-tree-search property
• Stored keys must satisfy the binary-search-tree property.
If l is any node in the left subtree of x,
then l .key ≤ x.key
If r is any node in the right subtree of x,
then x.key < r.key
Question 12.3 - Does the tree at right satisfy the binary-search-tree property?
Cormen Implementation
class Node_Rep;
typedef Node_Rep * Node;
class Node_Rep {
public:
Item key; // element
Node p; // parent
Node left; // left child
Node right; // right child
};
Node root;Note: the satellite data associated with the key is generally ignored in our discussion.
Java OO implementation
|
|
Binary-search-tree property
![]() |
Let x be a Node in a binary search tree.
|
Question 12.5
- Do both of the above trees satisfy the binary-search-tree property?
- What is the implication of having non-unique keys?
Inorder Traversal
void Inorder_Tree_Walk ( Node x)
if x ¹ NIL
Inorder_Tree_Walk ( x.left )
print x.key
Inorder_Tree_Walk ( x.right )Q(n) Performance:
If Inorder_Tree_Walk is called with the root of an n-node tree, then Inorder_Tree_Walk takes Q(n) time.
The recurrence is:
T(0) = c Empty tree T(n) = T(k) + T(n - k - 1) + d k left subnodes, n - k - 1 right subnodes
d recursion/print costwhere
- d = cost to recurse, not the time spent in the recursion down a branch
- c = cost of test for empty subtree
- n = # of nodes from the subtree root (11 at the tree root)
- k = # nodes in the left subtree (7 at the tree root)
- n - k - 1 = # nodes in the right subtree ( 11 - 7 - 1 = 3 at the tree root)
- T( k ) = cost to recurse on k nodes
- T( n-k-1 ) = cost to recurse on n-k-1 nodes
Example at root
T(n) = T(k) + T(n - k - 1) + d
T(11) = T(7) + T(11-7-1) + d = T(7) + T(3) + d
Theorem T(n) = Q(n)
Given (closed-end equation from the text)
T(n) = (c + d) n + c
(c + d) n = (time to test for NIL and recurse) n times
c = time to test for NIL
æ T(0) = c
T(n) = í
è T(k) + T(n - k - 1) + dProof: T(n) = (c + d) n + c by substitution using text provided closed-end equation
Base T(0) = (c + d) 0 + c = c Substitution T(n) = T(k) + T(n - k - 1) + d = (c+d)k + c + (c+d)(n-k-1)+c + d = ck+dk+c + cn-ck-c+dn-dk-d+c = (c+ d)n + c -(c+d) + (c+d) = (c+ d)n + c Proof: T(n) = (c + d) n + c = Q(n) of upper and lower bound
b1n ≤ (c + d) n + c ≤ b2n
b1 ≤ (c + d) + c/n ≤ b2
(c + d) + c/n maximum at n0 = 1 since limit c/n ® 0 as n ® ¥
(c + d) + c/1 = 2c + d
0 ≤ b1 ≤ 2c + d for n ³ n0 W(n)
0 ≤ 2c + d ≤ b2 for n ³ n0 O(n)
Question 12.6
- Are c and d > 0?
- Can we pick any b1 as long as we satisfy 0 ≤ b1 ≤ 2c + d?
- Does picking 0 ≤ b1 ≤ 2c + d satisfy the relation?
- Have we proven T(n) = Q(n)?
- Give the pre, post and in-order traversal of the tree at right.
- What is the maximum memory use for the tree (number of activation records) during in-order traversal?
- Is it better, worse or the same for pre and post order?
- What is the upper-bound of memory use of in-order traversal?
Search
| Node Tree_Search ( Node x, Item& k) if
x = NIL
or k = x.key |
Question 12.7
- What are the preconditions? postconditions?
- Keep track of the activation records while tracing the nodes searched for 9. For 8.
- Does the tree need to be balanced?
Balanced binary trees (AVL) subtrees of every node never differ in height by more than one. The tree above is not balanced because the left subtree of node 15 has height 4 and the right height 2.
- What is the number of activation records required for search of the trees below?
- What is the tree height?
- What is an upper-bound performance in terms of height?
- What is a lower-bound?
- Can you as easily state a reasonably accurate average performance for an arbitrary search tree? The text discusses in section 12.4
- Which of the following have the better search performance and why? What was the approximate order that data was inserted into each?
(a) 4
/ \
2 7
\ /
3 6(b) 2
\
3
\
4
\
7
/
6
Convert Tail Recursive operation to an Iterative Operation
| Iterative Node Tree_Search (Node x, Item& k) while x ¹
NIL
and
k ¹ x.key |
Recursive Node Tree_Search (Node x, Item& k) if
x = NIL
or k = x.key |
while not base case |
if base case return base
result |
O(h) Performance:
If Tree_Search is called with the root of an n-node tree, then Tree_Search takes O(h) time, where h is the height of the tree.
Question 12.8.1
- Trace below tree using iterative search, the nodes searched for 9. For 8.
- What is the best case for size n?
- What is the worst case for size n?
- What is to be gained by this conversion from recursive to iterative?
- If the tree is full, én/2ù nodes at bottom, can we state a tighter upper-bound performance in terms of n?
Question 12.8.2
Is the following tail-recursive?
void Inorder_Tree_Walk ( Node x)
if x ¹ NIL
Inorder_Tree_Walk ( x.left )
print x.key
Inorder_Tree_Walk ( x.right )
Successor
If all keys are distinct, the successor of node x is the node with the smallest key greater than x.key
Example: The successor of node with key 4 is the node with key 6.
Tree_Successor (Node 4 ) returns Node 6
O(h) Performance: Tree_Successor takes O(h) time, where h is the height of the tree.
Predecessor
If all keys are distinct, the predecessor of node x is the node with with the greatest key less than x.key.
Example: The predecessor of node with key 6 is the node with key 4.
Question 12.9
- What is the successor of 7, 15, 13, 20?
- Trace Tree_Minimum starting at 15. Starting at 7.
- What is the predecessor of 7, 15, 9, 2?
- What does y ¬ x.p do at x=4?
- *Trace Tree_Successor starting at 15, 7, and 13.
- What is the precondition for Tree_Minimum?
- Convert Tree_Minimum to a tail-recursive algorithm.
- Give Tree_Maximum.
- Give a tail-recursive Tree_Successor.
| Node Tree_Successor ( Node x ) if
x.right ¹
NIL |
Node Tree_Minimum ( Node x ) while
x.left ¹
NIL
|
Insertion and Deletion
Insertion and Deletion must maintain the binary-search-tree property.
Insertion 
Tree_Insert ( T, new Node ( left=NIL, right=NIL, parent=NIL, element) )
| void Tree_Insert (Tree T, Node z) -- pre: z ¹ NIL, T ¹ NIL, z.key not in T -- post: z inserted into T maintaining binary-tree-search property variable Node x, y; z.p ¬ y -- parent of z to be y if y = NIL -- inserting into empty tree |
Question 12.10
- Trace inserting 13, 8, 11, 2, 5 into tree at right.
- What happened in the attempt to insert a key that was already in the tree?
- Do insertions occur at internal nodes?
- What is the worst-case number of key comparisons for the tree at right after inserting 13? Give an example.
- What is the best case? In what cases?
- Insert 1, 2, 3, 4, 5, 6, 7, 8, 9 into an empty binary search tree. Give two different trees and the search upper-bound of each.
O(h) Performance:
Tree_Insert takes O(h) time, where h is the height of the tree.
Deletion - Must maintain binary-search-tree property.
Example: (a) Delete a node z with 0 children - 13
| Node Tree_Delete (Tree T, Node z) variable Node x, y; -- initially NIL y ¬ z
-- (a) 0 children if y = y.p.left -- y is left child of parent return y |
Example: (b) Delete a node z with 1 child - 16
| Node Tree_Delete (Tree T, Node z) variable Node x, y; -- initially NIL y ¬ z -- (b) 1 child x ¬ y.right -- on right node 20 if x ¹
NIL -- x is a child if y = y.p.left -- y is left child of parent return y |
Example: (c) Delete a node z with 2 children - 5
Note the y, Node 6, was successor of Node 5.
The 6 was copied to Node 5, original Node 6 returned.
| Node Tree_Delete (Tree T, Node z) -- pre: z ¹ NIL, T ¹ NIL -- post: z deleted from T maintaining binary-tree-search property -- return pointer to node deleted for recycling memory variable Node x, y; -- initially NIL if z.left = NIL or z.right = NIL if y.left ¹
NIL -- locate child node of y if x ¹ NIL --
x is a child (b) or (c) if y.p = NIL -- deleting root |

Question 12.11 - From Figure (a)
- Trace deleting 7.
- Trace deleting 10.
- Trace deleting 15.
O(h) Performance:
Tree_Delete takes O(h) time, where h is the height of the tree,
because it calls Tree_Successor for case (c), and Tree_Successor runs in O(h) time.