DSA Hash Maps
Hash Maps
One type of hash table data structure that typically has a lot of entries is a hash map.
We can quickly search, add, edit, and remove entries when we use a hash map.
Utilizing hash maps, one can locate comprehensive details on a subject.
In the simulation that follows, individuals are kept in a hash map. A person’s name (the Hash Map value) can be displayed once a person’s distinct social security number (the Hash Map key) has been used to look them up.
==== exa mukavu ====
Note: It should be noted that the Hash Map would be more beneficial if each person’s full name, birthdate, address, and possibly other details were linked to the associated social security number. However, this Hash Map simulation is meant to be as basic as possible.
Examining the two preceding pages on hash tables and hash sets will help you better grasp how hash maps operate. It’s also critical to comprehend the definitions of the terms below.
- Entry: Is made up of a key and a value that together make up a key-value pair.
- Key: Distinctive for every Hash Map entry. used to provide a hash code that identifies the bucket of the entry in the hash map. This guarantees that each entry may be found quickly.
- Hash Code: A number that is produced using the key of an entry to identify which bucket the Hash Map entry is associated with.
- Bucket: To store entries, a hash map is made up of numerous buckets, or containers.
- Value: Can be almost any type of data, such as an individual’s name, birthdate, and address. The value may consist of a variety of information types put together.
Finding The Hash Code
A hash function produces a hash code.
The social security number’s numbers—not the dash—are added together using the hash function in the simulation above, and the total of the characters is subjected to a modulo 10 operation (% 10) to produce the hash code, which ranges from 0 to 9.
This indicates that based on the hash code of their social security number, an individual is kept in one of ten buckets that are possible in the Hash Map. When we wish to look up or remove someone from the Hash Map, the same hash code is generated and used.
We can access the appropriate bucket instantly as long as there is only one person using the hash code.
Charlotte’s social security number in the aforementioned simulation is 123-4567. The total of the numbers is 28, and the modulo 10 of that is 8. She therefore fits into bucket 8.
Modulo: A mathematical operation; in most programming languages, represented as % (or m o)d in the field of mathematics). A modulo operation yields the reminder that results from dividing a number by another number. Thus, for instance, 7% 3 will provide us with the reminder 1. (Dividing 7 apples among 3 individuals results in 2 apples for each person, plus 1 extra apple.)
Direct Access in Hash Maps
In order to find Charlotte in the Hash Map, we need to enter the Hash Map key, which is social security number 123-4567. This will provide the hash code 8, as previously mentioned.
This means that instead of looking through other items in the Hash Map, we may go directly to bucket 8 to obtain her name (the Hash Map value).
In cases like this we say that the Hash Map has constant time O(1) for searching, adding, and removing entries, which is really fast compared to using an array or a linked list.
In the worst situation, though, all the individuals are kept in one bucket, and if the one we are looking for is the last one in there, we will have to check their social security number with every other person’s in the bucket before we can locate the one we want.
In such a worst case scenario the Hash Map has time complexity O(n), which is the same time complexity as arrays and linked lists.
It is crucial to have roughly the same number of buckets as Hash Map entries and a hash function that will split the entries equally across the buckets in order to maintain Hash Maps’ speed.
It is a waste of memory to have many more buckets than Hash Map entries, and it is a waste of time to have few buckets than Hash Map entries.
Note: It is feasible for a social security number to have up to 11 digits, meaning that 100 billion distinct social security numbers might be stored. This surpasses the population of any nation and surpasses the total number of people on Earth.
It is therefore a massive waste of space (mainly empty buckets) to use an array where each person’s social security number is the index in the array where this person is stored.
It makes more sense to use a hash map (or a database with comparable features) since the number of buckets can be changed to correspond with the number of users.
Hash Map Implementation
Although Python’s dictionary data type is usually used to create hash maps, we won’t utilize it here to better understand how hash maps operate.
We construct a class called SimpleHashMap in Python to implement a hash map.
The SimpleHashMap class contains methods for the put, get, and delete fundamental hash map operations in addition to a method called hash_function for the hash function. The method __init__ initializes the hash map.
To have a better idea of the Hash Map’s appearance, we also develop the print_map method.
Example
class SimpleHashMap:
def __init__(self, size=100):
self.size = size
self.buckets = [[] for _ in range(size)] # A list of buckets, each is a list (to handle collisions)
def hash_function(self, key):
# Sum only the numerical values of the key, ignoring non-numeric characters
numeric_sum = sum(int(char) for char in key if char.isdigit())
return numeric_sum % 10 # Perform modulo 10 on the sum
def put(self, key, value):
# Add or update a key-value pair
index = self.hash_function(key)
bucket = self.buckets[index]
for i, (k, v) in enumerate(bucket):
if k == key:
bucket[i] = (key, value) # Update existing key
return
bucket.append((key, value)) # Add new key-value pair if not found
def get(self, key):
# Retrieve a value by key
index = self.hash_function(key)
bucket = self.buckets[index]
for k, v in bucket:
if k == key:
return v
return None # Key not found
def remove(self, key):
# Remove a key-value pair
index = self.hash_function(key)
bucket = self.buckets[index]
for i, (k, v) in enumerate(bucket):
if k == key:
del bucket[i] # Remove the key-value pair
return
def print_map(self):
# Print all key-value pairs in the hash map
print("Hash Map Contents:")
for index, bucket in enumerate(self.buckets):
print(f"Bucket {index}: {bucket}")
We can construct the same hash map that appears at the top of this page by using the SimpleHashMap class:
Example
class SimpleHashMap:
def __init__(self, size=100):
self.size = size
self.buckets = [[] for _ in range(size)] # A list of buckets, each is a list (to handle collisions)
def hash_function(self, key):
# Sum only the numerical values of the key, ignoring non-numeric characters
numeric_sum = sum(int(char) for char in key if char.isdigit())
return numeric_sum % 10 # Perform modulo 10 on the sum
def put(self, key, value):
# Add or update a key-value pair
index = self.hash_function(key)
bucket = self.buckets[index]
for i, (k, v) in enumerate(bucket):
if k == key:
bucket[i] = (key, value) # Update existing key
return
bucket.append((key, value)) # Add new key-value pair if not found
def get(self, key):
# Retrieve a value by key
index = self.hash_function(key)
bucket = self.buckets[index]
for k, v in bucket:
if k == key:
return v
return None # Key not found
def remove(self, key):
# Remove a key-value pair
index = self.hash_function(key)
bucket = self.buckets[index]
for i, (k, v) in enumerate(bucket):
if k == key:
del bucket[i] # Remove the key-value pair
return
def print_map(self):
# Print all key-value pairs in the hash map
print("Hash Map Contents:")
for index, bucket in enumerate(self.buckets):
print(f"Bucket {index}: {bucket}")
# Creating the Hash Map from the simulation
hash_map = SimpleHashMap(size=10)
# Adding some entries
hash_map.put("123-4567", "Charlotte")
hash_map.put("123-4568", "Thomas")
hash_map.put("123-4569", "Jens")
hash_map.put("123-4570", "Peter")
hash_map.put("123-4571", "Lisa")
hash_map.put("123-4672", "Adele")
hash_map.put("123-4573", "Michaela")
hash_map.put("123-6574", "Bob")
hash_map.print_map()
# Demonstrating retrieval
print("\nName associated with '123-4570':", hash_map.get("123-4570"))
print("Updating the name for '123-4570' to 'James'")
hash_map.put("123-4570","James")
# Checking if Peter is still there
print("Name associated with '123-4570':", hash_map.get("123-4570"))