Chapter 11 - Hash functions
Multiplication Method


Document last modified: 

OVERVIEW

What makes a good hash function?

Ideally, the hash function satisfies the assumption of simple uniform hashing.

In practice, it’s not possible to satisfy this assumption, since we don’t know in advance the probability distribution that keys are drawn from, and the keys may not be drawn independently.

Often use heuristics, based on the domain of the keys, to create a hash function that performs well.

Keys as natural numbers

Example

Division method for h( k )

h(k) = k mod m

Example:

m = 20 and k = 91 ®

            h(91) = 91 mod 20
                     = 11

Advantage: Fast, since requires just one division operation.

Disadvantage: Have to avoid certain values of m

  • Powers of 2 are bad. If m = 2p for integer p, then h(k) is just the least significant p bits of k.

    128 = 27

    CLRS as (67 · 1283)+ (76 · 1282)+ (82 · 1281)+ (83 · 1280) mod 27 = 83 = 010100112

    ABCS as (65 · 1283)+ (66 · 1282)+ (67 · 1281)+ (83 · 1280) mod 27 = 83 = 010100112

Question 11.17 - Are powers of 10 for integer keys bad? Give an example.

  • If k is a character string interpreted in radix 2p (as in CLRS example), then m = 2p − 1 is bad: permuting characters in a string does not change its hash value (Exercise 11.3-3).
     
    • Means  CLRS mod 27-1 = RLCS mod 27-1

      27 = 128 radix

      27-1 = 127 is m

    • CLRS as (67 · 1283)+ (76 · 1282)+ (82 · 1281)+ (83 · 1280) mod 127 = 54
       
    • SRLC as (83 · 1283)+ (82 · 1282)+ (76 · 1281)+ (67 · 1280) mod 127 = 54

Question 11.18 - Show that m=9 is a bad choice for integer keys (i.e. radix 101).

Try keys 123, 321, 231

 

Good choice for m:

A prime not too close to an exact power of 2.

Generally, a prime not too close to the radix.

 

Multiplication method for h( k )

  1. Choose constant A in the range 0 < A < 1.
     
  2. Multiply key k by A.
     
  3. Extract the fractional part of k·A
     
  4. Multiply the fractional part by number of slots, m.
     
  5. Take the floor of the result.

Mathematically

h( k ) = ëm · (k·A mod 1)û

where k·A mod 1 = k·A − ëk·Aû  = fractional part of k·A

Example

m = 8 (implies m =  23, p = 3)

w = 5

k = 21

Question 11.19 - Find h( k ) for:

m = 16 (implies m =  24, p = 4)

w = 8

k = 21

s = 32                            Must have 0 < s < 28; choose s = 32 ® A = s / 28

  1. A = ?
     
  2. k·A = ?
     
  3. k·A mod 1 = ?
     
  4. m · (k·A mod 1) = ?
     
  5. h( k ) = ?                 h(k) = ëm · (k·A mod 1)û  

 

Implementation

Computing the hash key used four operations but can be implemented with fewer, mostly bit operations.

0 < A < 1

Example

m = 8 (implies m =  23, p = 3)

w = 5

k = 21

s = 13

k · s = 21 · 13
  = 273
  = 8 · 25 + 17
  = r1 . r0

r1
r0 

= 8 · 25
= 17 = 100012

Question 11.20 - Find h( k ) for:

m = 4 (implies m = 22, p = 2)

w = 3

k = 12

s = 5                        0 < s < 2w = 23 = 8

k · s = 12 · 5
  = ?
  = ? · 23 + ?
  = r1 . r0

r1
r0 

= ? · 23
= ? = ?2

How to choose A:

Example:

h(k) = ëm * ((k * A) mod 1)û

  • w - word size of the machine is w bits
     
  • 0 < A < 1
    • restrict A to be a fraction of the form s/2w  
    • where 0 < s < 2w 
    • Knuth recommends choosing A ≈ (50.5 - 1) / 2 
       
  • m = 2p - for some integer p
     
  • (k * A) mod 1 = (k * A) -  ëk * Aû
  • k = 123456
  • p = 14
  • m = 2p = 214 = 16384
  • w = 32 bits

Determine:

Then:

Question 11.21 - What is the final bit operation needed to recover 14 bit h(123456) from 32 bit r0?

 

The Multiplication Method Algorithm for h(k)

  1. compute (k · s) - using multiplication resulting in a 2·w bit value of the form: r1 · 2w + r0 
     
  2. compute r1 = (k · s)/232  by shifting bits 32 to the right
     
  3. compute r0 = (k · s) - (r1 * 232 ) by 1st shifting r1 to the left 32 bits, and then by doing a subtraction
     
  4. compute h(k) = the p most significant bits of r0 by shifting r0 w-p (e.g. 32-14) bits to the right or rotating p bits to the left.

Comparison

    Multiplication Method Division Method
  m 1000

1000

  A 0.618033988749895  
       
  key h(key) =   ëm * (k * A) mod 1û h(key) = mod(key, 1000)
sample key  from p. 264: 123456 4 456
  123459 858 459
These keys are a 123496 725 496
one digit modification 123956 21 956
of the sample key: 129456 208 456
123456 193456 383 456
  923456 195 456