Chapter 11 Answers

powered by FreeFind

Modified: 

Specialized Universe of Keys

Question

The load factor a of a direct-addressed data structure is often given as: a = n/m

n = number of keys used, K = { 2, 3, 5, 8 }

m = number of keys in universe, U = { 0, 1, .., 9 }

What is the load (utilization) of the example.    a = n/m = 4/10 = .4

What is the minimum load factor?                    a = n/m = 0/m = 0

What is the maximum load factor?                    Items may not occupy same slot: a = n/m = 10/10= 1.

 

Question

Diagram a data structure for direct-addressing to implement a dictionary of dates translating from historical dates to English representation, for example: 1492 ® fourteen ninety two

T

1490 1491 1492 1493 1494 1495 1496
NIL NIL 319 NIL NIL NIL NIL

Address 319

english key
fourteen ninety two 1492

Which is the key? 1492 satellite data? fourteen ninety two

What does T, direct address table, hold? Addresses to satellite data.

Give a call to insert 1492 ® fourteen ninety two.  

Year x = new Year( 1492, "fourteen ninety two")

Insert( T,  x )

Give a call to search for the translation of 1492 ® fourteen ninety two.

x = Search( T, 1492)

Give a call to delete 1492 ® fourteen ninety two.

Delete( T, x );

Question

Diagram T. See above. What is stored at each index of T? Address of a Year.

Are the keys totally ordered? Yes.

Is the key field of Year redundant? No. It is needed in Delete.

Performance

Question

Describe a procedure to find the maximum key used by a dynamic set S represented by a direct-address table T of length m.

MaximumKey( T )

-- pre: T is not empty

-- post: return maximum key in T

i=m

while i>1 and T[i] º NIL do i ¬ i-1

return i

What is the best and worst case performance? W(1) and O(m)

Describe a procedure to find the maximum element of dynamic set S represented by a direct-address table T of length m.

MaximumElement( T )  -- Assumes > property on elements

-- pre: T is not empty

-- post: return maximum element in T

max = MaximumKey( T )

for i=1 to m do

if T[i] ¹ NIL and T[i] > T[max] then max ¬ i

return T[ max ]

What is the best and worst case performance?  O(m)
 

Question

Give one other situation when direct addressing as implemented does not work. When the key is not an integer. Suggest a solution. Define a mapping of key to 0..m where m is the number of keys.

What is U for the Java program? {0..9999}

What is K for the Java program? {1492, 2006}

What is the maximum load factor? a = n/m = 2/10000 = .0002

What are the minimum storage requirements for the direct-addressing? 10000+2

Hashing

Question

Does knowing the hash provide much useful information about the message? Not for industrial strength hashes, those with a large number of possible hash values. If the message vocabulary was restricted, say to a 2-digit number and the hash algorithm were known, the hash value could be reversed mapped back to the message.

Can one include the hash as part of the hashed message or must it be separate? It must be separate from the hashed message.

Collision resolution by chaining

Insertion: Always add to front of list, worst case time of O(1).

Question - Why O(1)? Hash is O(1), insertion is O(1).

Figure at below shows singly linked lists. If we want to delete elements, it’s better to use doubly linked lists.

Question - Why? Need previous and next pointers for deletion. In doubly linked lists each node has both. Deleting k2 requires changing next pointer of previous node, k5 to k7.

Implementation

Question - What are the keys in one of the buckets above? k2, k5, k7
 

Doubly linked

Question - Use h( k ) = k mod 2

Diagram a data structure for hashing to implement a dictionary of dates translating from historical dates to English words, for example:

1492 ® fourteen ninety two

previous
   NIL
key
1492
data
fourteen ninety two
next
NIL

Which is the key? satellite data? What does T hold? Address of the node.

Construct the hash table for keys: 2005, 1492, 1984, 1776, 2013

 

 


®
previous
   NIL
key
1492
data next
®
¬
previous
key
1984
data next
®
¬
previous
key
1776
data next
 NIL
 

 


®
previous
   NIL
key
2005
data next
®
¬
previous
key
2013
data next
 NIL

Insert a different node with key 2005. Any problems? Although "y: Node_Ptr, y Î T's nodes: x ¹ y, there are two identical keys. Only the first will be found by the search.

 

 


®
previous
   NIL
key
1492
data next
®
¬
previous
key
1984
data next
®
¬
previous
key
1776
data next
 NIL
 

 


®
previous
   NIL
key
2005
data next
®

previous
¬

key
2005
data next
®
¬
previous
key
2013
data next
 NIL

What about inserting the same node twice? Violates precondition: "y: Node_Ptr, y Î T's nodes: x ¹ y

Search for 1492, 2002 and 2005. Any problems? 2002 returns NIL, 2005 returns first key found.

Delete 1492, 2005 and 2002. Any problems? Deleting 2002 violates preconditions: $y: Node_Ptr, y Î T's nodes: x = y

 

 


®

previous
NIL

key
1984
data next
®
¬
previous
key
1776
data next
 NIL
 

 


®
previous
   NIL
key
2005
data next
®
¬
previous
key
2013
data next
 NIL

Performance

Question - What is implied by: a = 1, a < 1, a > 1?

a is the average load. While Simple Uniform Hashing means that each element is equally likely to hash into any of the m slots, the load factor does not provide any information about the actual distribution; all actual elements may hash to the same slot.

If keys are uniformly distributed then a implies the average length of the chains.

Chained-Hash-Delete

Question - What is the problem with singly linked lists?

To locate the previous node of the one to delete, must start at the slot and iterate to node to delete, an O(n) operation versus Q(1) for doubly linked.

Chained-Hash-Search:

Question - Would using a binary search reduce search to Q(lg n)? If representing using an array. Linked lists are preferable, why? For Q(1) insert.

Hash functions

Question Give an example where this might not satisfy simple uniform hashing. Student ID or other numbers that have persistent patterns in certain digits.

Question

What are some of the criteria for key selection?

Would a key based on the year of birth for first-graders be useful? No, nearly all would be the same.

Would a hash function with m = 26 and a hash function that extracted the first vowel of a name satisfy the Simple Uniform Hashing assumption?

For example: h( k ) = (first vowel of k) mod 26

No, there are only 5 vowels (aeiou) so that all names would hash to 5 slots. This is not a hash function but direct addressing.

A hash that extracted the month of birth from the key?

For example: h( k ) = month of birth in k

Yes, even though births are not precisely uniformly distributed.