Hashing, Hash Tables and Scatter Tables

1. Hashing

The process of transforming input data (a Key) of any size into a fixed-size value (a Hash). This value is usually a single number.

2. Hash Table

A data structure that stores Key-Value pairs. It uses Hashing to calculate an index into an array, allowing you to find any entry instantly (O(1)).

3. Scatter Table

A specific implementation of a Hash Table that stores all elements directly within the array itself. Elements “scatter” into open slots if their target index is already occupied.

1. Hash Functions

A Hash Function is any function that maps data of arbitrary size to a fixed size. In a Hash Table, the input is your data Key, and the output is the array index where you store it. A simple yet effective function uses the modulo operator (%):

Index = Key % TableSize

Try it below! Enter any numerical Key (e.g., 523) and see how it maps to an index between 0 and 9 in our small TableSize = 10 visualization.

2. Methods & The Collision Problem

The great O(1) dream breaks when two different keys generate the same hash index. This is a Collision. Good Hash Tables are defined by how they resolve this inevitable conflict.

Watch what happens when you attempt to insert keys 7, 17, and 27 into a table of size 5 (meaning H(k) = k % 5). All three resolve to index 2.

Method A: Chaining (Separate Chaining)

Each array bucket holds a linked list. The first collision (17) appends to 7. The second collision (27) appends to 17.

Method B: Open Addressing (Linear Probing)

This is the standard Scatter Table approach. When Key 17 hits index 2, it looks for the next open slot (index 3). When 27 hits index 2, it probes indexes 3, then 4, finding 4 is empty.

3. Advanced: The Scatter Table

A true Scatter Table uses Open Addressing. It maintains O(1) average lookup only as long as the table isn’t full. It scatters data into available spots.

This implements Linear Probing. We define an insertion sequence where keys will inevitably collide. Watch the probing process (the orange highlight) automatically finding the next ‘scattered’ slot if the target index is full.

Ready to insert. Sequential keys will collide at Index 5.

Applications

Hashing and Hash Tables are ubiquitous in software engineering:

Database Indexing: For fast O(1) lookups on primary keys.
Symbol Tables (Compilers): To quickly store variable names and their memory addresses.
Caches: Key-Value caches (like Memcached or browser storage) rely on hash-based storage.
Cryptography (Hashing Only): MD5 or SHA-256 are complex non-modulo functions used to generate unique “fingerprints” for files or passwords.

Quiz: Test Your Hash Table Knowledge

Score: 0 / 10

References & Further Reading

Book Data Structures and Algorithms with Object-Oriented Design Patterns in C++
Preiss, Bruno R. (August 1998). John Wiley & Sons, Inc., United States. ISBN: 0471241342
Lecture Notes Lecture 11: Hashing For Efficient Look-up Tables
National University of Singapore (NUS). Available: https://www.comp.nus.edu.sg/~stevenha/cs1020e/lectures/L11%20-%20Hashing.pdf
Web Resource The Hash Table Library
Utah.edu. Available: https://www.math.utah.edu/~beebe/software/hash/hash.html
Article Hash Function
Wikipedia, The Free Encyclopedia. Available: https://en.wikipedia.org/wiki/Hash_function