Executive Summary

Matrix Hashing is a novel hashing strategy that combines two hash functions and a small 2D array to dramatically reduce collisions before falling back to a sorted “overflow” list. It delivers average O(1) lookup/insertion (with worst-case O(log n) from a tiny secondary list) by placing each key in a matrix cell (row,col) computed by two mod-based hashes. When a collision occurs, the key is placed in a nearby “neighbor” cell or a sorted collision block, minimizing probing. This approach bridges academic rigor and engineering practice: it was validated in an IEEE conference paper and can be built in modern systems with modest memory.

We provide here an in-depth exposition: from the concise idea to detailed algorithms, performance analysis, and code sketches. We discuss real-world use cases (e.g. product catalog lookups, user profiling, fraud checks, telemetry deduplication) and show how Matrix Hashing can be adapted for partial-key or fuzzy queries. We compare it to Cuckoo, Hopscotch, Consistent Hashing, and B-Tree indexing on key attributes. We outline enhancements (adaptive resizing, hardware acceleration, learned-hash hybrids), evaluation metrics (throughput, load factor, hit rate), and deployment advice (sharding, monitoring, fallbacks). Diagrams illustrate the architecture and algorithm flow. This article equips system designers and engineers to understand, implement, and evaluate Matrix Hashing in practice.

Introduction

Hash tables underpin nearly every large-scale data system: from caches and key–value stores to database indexes and in-memory analytics. Engineers treasure hashing for its O(1) lookup, but collisions and clustering can still be performance hurdles. Traditional methods (chaining, linear or quadratic probing, cuckoo hashing, etc.) each have tradeoffs in speed, memory and complexity.

“Matrix Hashing” is a fresh approach introduced by Agrawal et al. (2018) that uses two-dimensional open addressing. In plain terms, we pick two prime moduli p1, p2, make a table of size (p1+1)×p2, and hash each key twice. If the first cell (i1,i2) is free, we insert; if not, we try a neighboring cell and finally a small sorted overflow. The idea is to spread keys over a “matrix” rather than a single array to reduce clustering. The formal analysis shows constant expected time for lookups and inserts, with an extra log n factor only if a rare multi-way collision forces use of the sorted list.

In practice, this means high lookup rates with modest memory. For example, imagine using Matrix Hashing in an online shopping system to map product IDs to inventory records: a 64k-cell matrix can index millions of products, with only a few entries ever colliding into a tiny fallback list. Similarly, it can speed up user lookup in large systems or help deduplicate telemetry events at ingestion (few sensors reporting the same ID). Throughout this article, we’ll mix intuition and technical detail, with code snippets (in Python/Java), diagrams, and citations to related methods. We assume readers have some database and distributed systems background (e.g. familiarity with hash tables, caches and index structures). We’ll explain each concept clearly, but also dive deep: describing APIs, concurrency issues, advanced query variations (prefix, range, fuzzy), and deployment considerations. The goal is an actionable, comprehensive guide: think Hacker Noon narrative meets systems deep-dive.

Core Algorithm

Matrix Hashing uses a small 2D array (matrix) plus a sorted “collision block.” We pick two primes, p1 (rows) and p2 (cols). Each key k yields two indexes:

- i1 = h(k) = k mod p1 (row)

- i2 = h'(k) = k mod p2 (col)

We interpret (i1,i2) in a 0-based matrix of size (p1+1)×p2 (the extra row handles wrap or rehash). If matrix[i1][i2] is empty, we insert k there. If occupied, we probe a “neighbor”: for example, move one row up or down (modulo the row count) to (i1-1, i2) or (i1+1, i2), checking for space. If that’s full too, we repeat or try adjacent columns. Only when both the primary and secondary positions are unavailable do we place the key into a global collision block, typically a small array or tree, keeping its keys in sorted order. This final block provides a last resort with O(log n) search, but it’s rarely needed if the matrix size is chosen well.


This two-hash, matrix-plus-list design is reminiscent of Cuckoo hashing, which also uses two hash functions to give each key two possible locations. The difference is that Cuckoo displaces existing keys (potentially cascading), whereas Matrix Hashing simply uses a second position or a list without “kicking out” others. Similarly, it resembles Hopscotch hashing in keeping keys within a small neighborhood, but Matrix Hashing’s 2D layout is simpler conceptually.


Time/Space Complexity: Average lookup and insert are O(1). In the ideal (and typical) case, only one or two array probes are needed. Only in the pathological case where many keys all collide on the same row/col do we hit the overflow list. Because that list is kept sorted, lookup there is O(log m) where m is small. The original analysis showed “best case time complexity ... O(1) and in the worst case ... we retrieve in O(log n)”. (In effect, like many hashing methods, we trade a bit of extra memory to make the worst case very rare.)

Memory overhead is roughly (p1+1)*p2 array slots plus the occasional overflow entry. One can tune p1,p2 to be proportional to the expected number of keys, keeping load factor comfortably below 1 (e.g. size ≈1.2×N). Because keys only live once in the matrix or list, space is linear. Unlike chained hashing, there’s no per-key pointer overhead for the matrix part, just a tight 2D array.
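As an illustration of this sizing rule (the helper names here are ours, not from the paper), one might pick the two primes programmatically so that (p1+1)×p2 is about 1.2× the expected key count, with a roughly square matrix:

```python
def is_prime(n):
    """Trial-division primality test; fine for one-off table sizing."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def next_prime(n):
    """Smallest prime >= n."""
    while not is_prime(n):
        n += 1
    return n

def choose_primes(expected_keys, headroom=1.2):
    """Pick (p1, p2) so the matrix has ~headroom * expected_keys slots."""
    target = int(expected_keys * headroom)
    side = int(target ** 0.5)          # aim for a roughly square matrix
    p1 = next_prime(side)
    p2 = next_prime(max(2, target // (p1 + 1) + 1))
    return p1, p2

p1, p2 = choose_primes(1_000_000)
assert (p1 + 1) * p2 >= 1_200_000      # capacity meets the 1.2x headroom
```

With headroom 1.2 the steady-state load factor is about 0.83; raise the headroom for a lower load factor and fewer overflow entries.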

Implementation Guide

def insert(matrix, overflow, key):
    # p1, p2 are the primes chosen at table creation; empty cells hold None
    i1 = key % p1
    i2 = key % p2
    if matrix[i1][i2] is None:
        matrix[i1][i2] = key
    else:
        # first-level collision: try a neighbor, e.g. go up one row (wrap-around)
        j = (i1 - 1) % (p1 + 1)
        if matrix[j][i2] is None:
            matrix[j][i2] = key
        else:
            # second-level collision: use sorted overflow
            overflow.insert_sorted(key)

The actual neighbor strategy (up/down/left/right) can be tuned. The paper’s approach was to try one “neighbor” (e.g. the next row) and then if still occupied, go to the block. One could extend to try both up and down, or multiple cells, before fallback.

def find(matrix, overflow, key):
    i1 = key % p1
    i2 = key % p2
    if matrix[i1][i2] == key: return True
    # check neighbor
    j = (i1 - 1) % (p1+1)
    if matrix[j][i2] == key: return True
    # finally check overflow (binary search)
    return overflow.contains(key)

This corresponds to Algorithm 2 in the paper. If you allow multiple neighbors, you’d loop: check (i1-1,i2), (i1+1,i2), (i1,i2-1), etc. In practice one or two checks usually suffice if the table is not near full.
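Such a multi-neighbor lookup can be sketched as follows (the particular offset list here is an illustrative assumption, not the paper’s exact probe sequence):

```python
import bisect

# Hypothetical probe order: up one row, down one row, left one column.
NEIGHBOR_OFFSETS = [(-1, 0), (1, 0), (0, -1)]

def find_multi(matrix, overflow, key, p1, p2):
    """Lookup with a configurable neighbor-offset list before the overflow."""
    rows, cols = p1 + 1, p2
    i1, i2 = key % p1, key % p2
    if matrix[i1][i2] == key:          # primary cell
        return True
    for di, dj in NEIGHBOR_OFFSETS:    # bounded neighbor probes (wrap-around)
        if matrix[(i1 + di) % rows][(i2 + dj) % cols] == key:
            return True
    # Last resort: binary search in the sorted overflow list.
    idx = bisect.bisect_left(overflow, key)
    return idx < len(overflow) and overflow[idx] == key
```

Because the probe sequence is fixed and short, the worst-case number of array accesses stays constant regardless of table occupancy.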

Integration Patterns and Use Cases

Flexible Lookups

Matrix Hashing can be adapted beyond exact keys. As noted in the introduction, prefix, range, and fuzzy queries can be approximated, e.g. by hashing a normalized form of the key, or by pairing the matrix with a small ordered structure for range scans.

Implementation Sketch (Python)

import bisect

class MatrixHash:
    def __init__(self, p1, p2):
        self.p1 = p1
        self.p2 = p2
        self.matrix = [[None] * p2 for _ in range(p1 + 1)]
        self.collision = []  # sorted list for overflow

    def insert(self, key):
        i1 = key % self.p1
        i2 = key % self.p2

        # Primary position
        if self.matrix[i1][i2] is None:
            self.matrix[i1][i2] = key
            return

        # First collision: try neighbor row (wrap-around)
        i1_alt = (i1 - 1) % (self.p1 + 1)
        if self.matrix[i1_alt][i2] is None:
            self.matrix[i1_alt][i2] = key
            return

        # Second collision: insert sorted into overflow (skipping duplicates)
        idx = bisect.bisect_left(self.collision, key)
        if idx == len(self.collision) or self.collision[idx] != key:
            self.collision.insert(idx, key)

    def find(self, key):
        i1 = key % self.p1
        i2 = key % self.p2

        # Check primary position
        if self.matrix[i1][i2] == key:
            return True

        # Check alternate row
        i1_alt = (i1 - 1) % (self.p1 + 1)
        if self.matrix[i1_alt][i2] == key:
            return True

        # Binary search in overflow
        idx = bisect.bisect_left(self.collision, key)
        return idx < len(self.collision) and self.collision[idx] == key

This code demonstrates the core idea. In practice, you’d handle deletion, resizing, and more probing strategies. In Java or a systems language, you would use fixed-size arrays and maybe a tree or skip list for collisions.
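The paper does not specify deletion. One sketch (function and parameter names are ours, assuming the fixed two-probe scheme above) simply clears the slot: because every lookup probes the primary cell, one neighbor, and the overflow list, clearing a slot back to None cannot hide any other key, so no tombstones are needed, unlike linear probing:

```python
import bisect

def delete(mh, key):
    """Remove key from a MatrixHash-like object exposing p1, p2,
    matrix, and a sorted `collision` list (as in the sketch above)."""
    i1, i2 = key % mh.p1, key % mh.p2
    if mh.matrix[i1][i2] == key:           # primary cell
        mh.matrix[i1][i2] = None
        return True
    j = (i1 - 1) % (mh.p1 + 1)             # neighbor row (wrap-around)
    if mh.matrix[j][i2] == key:
        mh.matrix[j][i2] = None
        return True
    idx = bisect.bisect_left(mh.collision, key)   # sorted overflow
    if idx < len(mh.collision) and mh.collision[idx] == key:
        del mh.collision[idx]
        return True
    return False
```

One caveat: after deletions, a naive insert can duplicate a key that still lives in the neighbor cell or overflow, so insert should check membership first (or be used only with distinct keys).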

Real-World Scenarios

As sketched in the introduction, typical deployments include:

- Product catalog lookups: mapping product IDs to inventory records, where a 64k-cell matrix can index millions of products with only a few overflow entries.

- User lookup in large systems, where predictable one-or-two-probe latency matters.

- Fraud checks: fast membership tests against known-bad identifiers.

- Telemetry deduplication at ingestion, discarding events whose IDs are already present.

Extensions and Enhancements

The enhancements outlined in the summary are natural next steps:

- Adaptive resizing: grow p1, p2 to the next larger primes and rehash when the overflow fraction crosses a threshold.

- Hardware acceleration: the two modular hashes and the fixed probe sequence vectorize and prefetch well.

- Learned-hash hybrids: replace one of the mod-based hashes with a learned function fitted to the key distribution.

Performance Tradeoffs and Metrics

To evaluate Matrix Hashing, measure: throughput (ops/sec for insert/find), latency (mean and percentile lookup time), memory overhead, and overflow fraction (fraction of keys in the collision list). A typical experiment would compare Matrix Hashing against alternatives under varying load factors. We would expect the following behavior.

A performance vs. load factor chart (not shown) would likely rise sharply for simple hashing as α→1, stay flat longer for hopscotch/cuckoo, and remain flat for Matrix Hashing until the overflow list grows. One could also chart memory vs. latency: Matrix Hashing trades a small extra array (the 2D matrix) for very predictable latency under high load, whereas chaining trades pointer overhead (more memory) for consistent latency.

For metrics, use:

- Hit rate: fraction of lookups resolved in the matrix directly (exact) vs. in the overflow (fallback).

- Collisions per insert: track how many probes on average, compare to linear/cuckoo.

- Space efficiency: memory used per key.

- Comparisons: e.g. measure number of array accesses (should be ≤2 on average).
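A throwaway experiment along these lines (the helper name and parameters are ours) measures the overflow fraction at a given load factor:

```python
import bisect
import random

def build(keys, p1, p2):
    """Insert keys with the two-probe scheme; return (matrix, overflow)."""
    matrix = [[None] * p2 for _ in range(p1 + 1)]
    overflow = []
    for k in keys:
        i1, i2 = k % p1, k % p2
        if matrix[i1][i2] is None:                 # primary cell
            matrix[i1][i2] = k
        else:
            j = (i1 - 1) % (p1 + 1)                # neighbor row
            if matrix[j][i2] is None:
                matrix[j][i2] = k
            else:
                bisect.insort(overflow, k)         # sorted overflow
    return matrix, overflow

random.seed(0)
p1, p2 = 101, 97                                   # (p1+1)*p2 = 9894 slots
keys = random.sample(range(10**9), 7000)           # load factor ~0.71
matrix, overflow = build(keys, p1, p2)
overflow_fraction = len(overflow) / len(keys)
print(f"overflow fraction at load ~0.71: {overflow_fraction:.3f}")
```

Sweeping the key count while keeping the table fixed reproduces the load-factor curve discussed above; the overflow fraction is the single number most worth alerting on in production.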

Architecture and Flow

graph TD
  Client --> API[API Server]
  API --> Service[Service Layer]
  Service --> MatrixCache[Matrix Hashing Cache]
  MatrixCache --> Database[Persistent Storage]
  MatrixCache --> Monitoring[Metrics/Logs]

Figure: Example deployment architecture. The service layer issues lookups/inserts on the Matrix Hashing cache. Cache hits avoid the database. Monitoring tracks latency and collision rate.
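To make the cache layer concrete, here is a minimal cache-aside lookup sketch (function and parameter names are our own, not from the paper): the service consults the Matrix Hashing index first and only falls through to persistent storage on a miss.

```python
def get_record(key, cache_index, cache_store, db_fetch):
    """Cache-aside lookup.

    cache_index: membership structure with find/insert (e.g. the
                 MatrixHash sketch above); cache_store: dict of
                 key -> record; db_fetch: loader for cache misses.
    """
    if cache_index.find(key):
        return cache_store[key]        # cache hit: no database round-trip
    record = db_fetch(key)             # cache miss: hit persistent storage
    cache_store[key] = record
    cache_index.insert(key)            # future lookups stay in memory
    return record
```

Keeping the membership index separate from the value store lets the hot-path probe stay a tight array access while records live wherever is convenient.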

graph LR
    Key["Key to insert or find"] --> Compute{"Compute (i1,i2) = (h, h')"}
    Compute --> |Empty| Store["Insert in matrix[i1][i2]"]
    Compute --> |Collision| Probe["Check neighbor (i1±1,i2)"]
    Probe --> |Found empty| Store
    Probe --> |Still occupied| Block["Insert into sorted overflow list"]

Figure: Insertion/lookup flow for Matrix Hashing. A key hashes to (i1,i2). If the cell is free, we use it; otherwise we try a neighbor. If that fails, we fall back to the collision block.


The first diagram shows how Matrix Hashing integrates into a service: the cache layer uses it as a fast in-memory index. The second shows the algorithmic steps of a lookup/insert.

Comparison with Other Methods

| Method | Lookup Avg/Worst | Insert Avg/Worst | Memory Overhead | Supports Range Queries | Ideal Use Case |
|---|---|---|---|---|---|
| Matrix Hashing | O(1) / O(log n) (if in overflow) | O(1) / O(log n) | ~N slots + small overflow | Not native (exact-match only) | High-throughput key–value lookups, caching |
| Cuckoo Hashing | O(1) / O(1) | O(1) avg; rehash if cycle | 2N table (two arrays) | No | Constant-time lookup, simple keys |
| Hopscotch | O(1) / O(1) | O(1) amortized | ~1.2N slots + bitmaps | No (exact-match) | High concurrency, multi-threaded maps |
| Consistent Hashing | O(1) (lookup node) | O(1) (node changes) | ~N (with virtual nodes) | No (used for partitioning) | Distributed caching/partitioning |
| B-Tree Index | O(log n) / O(log n) | O(log n) | O(N) + tree pointers | Yes (range queries) | Databases (range scans, sorted data) |



In summary, Matrix Hashing is most similar to open addressing schemes like linear/quadratic probing or cuckoo, but with a twist of a second hash dimension and a fallback chain. It shines when you need extremely fast key lookups with low memory overhead and are willing to sacrifice ordered traversal. It complements rather than replaces B-trees or range indexes.

Deployment Considerations

- Sharding: in a distributed cache, keep one Matrix Hashing table per shard and route keys to shards via consistent hashing.

- Monitoring: track latency percentiles and the overflow fraction; a growing overflow list is the early warning to resize.

- Fallbacks: treat the table as an index in front of persistent storage, so a miss (or a rebuild during resizing) degrades to a database read rather than an error.

- Concurrency: guard the matrix with striped locks or per-cell compare-and-swap; the sorted overflow list needs its own synchronization.


Conclusion and Next Steps

Matrix Hashing offers a sweet spot between theoretical hashing schemes and practical engineering. It guarantees constant-time lookup in almost all cases, with a simple fallback for rare collisions. It bridges research and practice by being easy to implement in code and highly performant in real systems.


To apply Matrix Hashing in your project, start by profiling your workload: estimate your key count and access pattern. Choose p1,p2 to keep load factor around 0.7–0.8. Implement the 2D table with neighbor probing as shown above, and carefully handle threading. Monitor key performance metrics (throughput, latency, overflow rate) and tune as needed.


Next steps: benchmark Matrix Hashing on representative data against your current solution (e.g. std::unordered_map or Redis). Check how it scales with growing data and load. Explore hybrid extensions: maybe combine it with a small B-tree for prefix queries, or plug in a learned hash function. If deploying in distributed caching, integrate with consistent hashing to balance shards.


Matrix Hashing is not a magic bullet, but for the right use cases (large-scale exact lookups, high performance, distributed caches) it can be a powerful tool. By understanding its mechanics and tradeoffs – and comparing to alternatives like cuckoo, hopscotch, and B-trees – you can make informed decisions. We hope this deep dive has given you both the conceptual understanding and practical guidance to experiment with Matrix Hashing in your systems.

