HashMaps

Maps that Hash

What is a HashMap?

  • Key-Value Storage: Maps unique keys to specific values
  • Efficient Access: Keys hashed for fast lookups
  • Common Uses: Dictionaries, caches, and associative arrays

Key HashMap Concepts

  • Hash Function: Converts keys to indices
  • Buckets: Array slots where key-value pairs are stored
  • Collision Handling: Manages multiple keys mapping to the same index

Minimal HashMap Requirements

  • Internal Array: Use numpy for fixed-size storage
  • Basic Hashing: Python’s built-in hash() function
  • Collision Handling: Chaining to handle collisions
import numpy as np

# Using numpy for internal storage in HashMap

Setting Up the HashMap Class

  • Attributes:
    • size: Number of slots in the array
    • table: Numpy array of lists to store key-value pairs
class HashMap:
    def __init__(self, size: int = 10):
        self.size: int = size
        self.table: np.ndarray = np.empty(self.size, dtype=object)
        for i in range(self.size):
            # Initialize each slot with an empty list for keys and for values
            self.table[i] = ([],[])

Hash Function

  • Method: _hash(key: str) -> int
  • Uses Python’s hash() and modulo operation to ensure the key maps within array bounds
    def _hash(self, key: str) -> int:
        return hash(key) % self.size

Adding Key-Value Pairs

  • Method: put(key: str, value: int) -> None
  • Hashes the key and appends to the correct bucket, handling collisions by chaining
    def put(self, key: str, value: int) -> None:
        index: int = self._hash(key)
        bucket: list = self.table[index]
        bucket[0].append(key)
        bucket[1].append(value)

Retrieving Values

  • Method: get(key: str) -> int | None
  • Hashes the key and searches the bucket; returns None if key not found
    def get(self, key: str) -> int | None:
        index: int = self._hash(key)
        keys, values = self.table[index]
        if key in keys:
            return values[keys.index(key)]
        return None  # Key not found

Removing Key-Value Pairs

  • Method: remove(key: str) -> None
  • Searches and removes the key-value pair from the bucket if it exists
    def remove(self, key: str) -> None:
        index: int = self._hash(key)
        keys, values = self.table[index]
        if key in keys:
            values.pop(keys.index(key))
            keys.pop(keys.index(key))

Example Usage

  • Create a HashMap instance
  • Add, retrieve, and remove values
h = HashMap()
print(h, h.size, h.table)
h.put(261, "Software Development")
h.put(262, "Web Development")
h.put(276, "Cryptographic Systems")
print(h.get(261), h.get(262))
h.remove(261)
print(h.get(261), h.get(262))

We see:

Software Development Web Development
None Web Development

Performance Considerations

  • Load Factor: Ratio of items to array size; impacts performance
  • Resizing: Option to expand array and rehash items if needed

Future Enhancements

  • Dynamic Resizing: Expand the array size automatically
  • Better Hashing: Improve distribution of keys
  • Enhanced Collision Handling: Options like open addressing

Conclusion

  • We’ve implemented a basic HashMap in Python using numpy
  • Key components: Hashing, storing key-value pairs, and handling collisions
  • Good foundation for further hashmap learning

Live Code

  • Any questions?
# Thanks for attending!