HASHING
HASHING
HASHING
• A collision occurs when two values in the set hash to the same
value. 0
• It means we want to add new record r with key k to our file F, 14
1
but memory location address h(k) is already occupied. 21
• Collision: That event when2 hash table elements map into the 2
same slot in the array. 3
• Example: add 41, 34, 7, 18, then 21 4 34
• 21 hashes into the same slot as 41! 5
• 21 should not replace 41 in the hash table; they should both be there
6
7 7
Collision resolution: means for fixing collisions in a hash table. 8 18
9
Solution for Collision
(Collision Resolution)
• There are two general ways to resolve collisions :
h(K) : 5
18
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2,
41 18
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9,
41 18 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7,
41 18 59 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7, 6,
41 18 32 59 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7, 6, 5,
41 18 32 59 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7, 6, 5,
41 18 32 59 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7, 6, 5,
41 18 32 59 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7, 6, 5,
41 18 32 59 31 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7, 6, 5, 8
41 18 32 59 31 22
0 1 2 3 4 5 6 7 8 9 10 11 12
Linear Probing
h(K) = K % 13
h(K) : 5, 2, 9, 7, 6, 5, 8
41 18 32 59 31 22 73
0 1 2 3 4 5 6 7 8 9 10 11 12
Open Addressing: Quadratic probing &
Double Hashing
• Disadvantage of linear probing: Records tend to cluster, i.e. , appear next
to one another, when the load factor > 50%.
0 49
The two technique that minimize the clustering are : 1
2 58
1.Quadratic probing: Suppose the record R with key K has the hash
address H(k) = h. Then instead of searching the locations with address h, 3 79
h+1, h+2, …….. , we search the location with address 4
h, h+1, h+4, h+9, ………….., h+i2. 5
6
2. Double hashing: Here the second hash function H’ is used for resolving a
collision. Suppose a record R with key k has a hash address H(k) = h and 7
H’(k) = h’ ≠ m. Then, we linearly search the locations with address 8 18
h, h+h’, h+2h’, h+3h’, …………, h+nh’ 9 89
26
Separate Chaining
• It involves maintaining 2 tables in memory.
• First of all, as before, there is a table T in memory
which contains the records in F, except that T now has
an additional field LINK which is used so that all record
in T with same hash address h may be linked together
to form a linked list. Second, there is a hash address
table LIST which contain pointers to linked lists in T.
• Suppose a new record R with key k is added to the file
F. We place R in the first available location in the
table T and then add R to the linked list with pointer
LIST[H(k)].
• All keys that map to the same hash value are kept in a
linked list.
27
Difference between Open Addressing and
Chaining
SEPARATE CHAINING OPEN ADDRESSING
Chaining is Less sensitive to the hash function Open addressing requires extra care for to avoid
3.
or load factors. clustering and load factor.
Cache performance of chaining is not good as Open addressing provides better cache performance
5.
keys are stored using linked list. as everything is stored in the same table.
Wastage of Space (Some Parts of hash table in In Open addressing, a slot can be used even if an
6.
chaining are never used). input doesn’t map to it.