Elastic Hashing: Beating Uniform Probing Without Reordering
Open-addressed hash tables are one of those “so simple they’re scary” data structures: store everything in one array, and if two keys collide, just keep probing until you find an empty slot. They are fast, cache-friendly, and everywhere (databases, language runtimes, networking stacks… you name it).
But at high load factors, open addressing becomes a game of musical chairs. If the table is 1 − δ full (so only a δ-fraction of slots are free), probing can get expensive—especially if you care about the worst insertion/search and not only the average.
A recent paper by Farach-Colton, Krapivin, and Kuszmaul revisits this classic topic and shows something I honestly didn’t expect: even if you never move elements after they are inserted (no reordering), you can still do dramatically better than the “folk wisdom” implied by decades of results.
Two headline results:
(1) A non-greedy scheme with O(1) amortized expected search probes and O(log(1/δ)) worst-case expected probes.
(2) A greedy scheme with worst-case expected probes O(log²(1/δ)), disproving a long-standing conjecture by Yao.
The classic story: uniform probing and the coupon-collector wall
If you learned hashing from Knuth, you likely met uniform probing: each key gets a random permutation of the array as its probe order, and insertion takes the first free slot in that order. At load factor 1 − δ, the amortized expected probe complexity is Θ(log(1/δ)).
This is not an accident. There’s a “coupon collector” flavor here: to fill almost all slots you need to “discover” a lot of distinct empty positions, and the last few free slots become increasingly hard to hit. Yao famously proved that among greedy strategies (always take the first free slot), Θ(log(1/δ)) amortized is optimal—so many people internalized “open addressing without reordering can’t really beat log(1/δ).”
The paper’s first twist is: that intuition is too pessimistic once you allow the insertion algorithm to be non-greedy (still no moving old elements—just smarter decisions at insertion time).
Elastic hashing: probe far, then “snap back”
The first construction (called elastic hashing) answers a simple but deep question: Can we get amortized expected probe complexity better than log(1/δ) without reordering? The answer is yes—down to O(1).
The intuition in one paragraph
Elastic hashing splits the array into progressively smaller “levels” (think: halves, quarters, eighths…). When inserting a key, the algorithm may spend time probing a nearly-full level (to “collect coupons” and learn about occupancy), but if it doesn’t quickly find a good spot, it redirects the key into the next level, which is intentionally kept less full. The key idea is that the algorithm decouples insertion work from search cost: it can afford many probes while deciding, yet still place the key in a position that will be found with very few probes later.
What you get (and why it matters)
- Amortized expected search probes: O(1). Over the whole build-up to load factor 1 − δ, the average key becomes cheap to find.
- Worst-case expected search probes: O(log(1/δ)). Even the “unlucky” insertions (late in the process) are only logarithmic in the inverse slack.
- Worst-case expected insertion time: O(log(1/δ)). The scheme spreads the coupon-collecting effort so no single insertion is forced into a δ−1-style disaster.
There is also a matching lower bound: if you prohibit reordering entirely, you can’t beat Ω(log(1/δ)) for worst-case expected probe complexity in general—so elastic hashing is optimal in that sense.
Funnel hashing: greedy can be better than we thought
The second part of the paper is about the greedy world, where the insertion must take the first free slot in whatever probe sequence you give it. This is the regime where Yao’s earlier results live, and it is also where a notorious conjecture lingered for decades: that worst-case expected probe complexity should be basically Θ(1/δ) for greedy open addressing near full tables.
The paper disproves this with a construction called funnel hashing.
The funnel idea
Instead of one monolithic probe sequence, the table is organized into multiple geometrically shrinking regions,
each further divided into buckets of size about Θ(log(1/δ)). A greedy insertion “attempts” to insert into one
bucket in each region (scan the bucket; if there’s an empty slot, take the first one), and only if it fails does it
move to the next region. If all regions fail, it falls back to a small “special” area designed to behave nicely.
The outcome is a surprisingly clean bound:
- Worst-case expected probes: O(log²(1/δ)). (And the same for insertion time.)
- High-probability worst-case probes: O(log²(1/δ) + log log n). So even the tail is well-controlled with very high probability.
- Amortized expected probes: O(log(1/δ)). Which matches the classic greedy barrier.
Importantly, the authors also provide matching lower bounds showing these “weird-looking” log-squared and log-log terms are not artifacts of the proof: they are essentially the right answers for the model.
So… what changed, conceptually?
If I had to summarize the conceptual upgrade in one sentence:
Stop treating “the probe sequence” as a single linear destiny.
Once you allow structure (levels, buckets) and you carefully manage how full each region gets,
you can distribute the pain of high load factors without paying for it on every lookup.
Elastic hashing shows that “no reordering” is not the same as “stuck with uniform probing performance.” Funnel hashing shows that even within greedy rules, uniform probing isn’t the end of the story for worst-case behavior.
Why should practitioners care?
I don’t expect production hash tables to copy these constructions line-by-line tomorrow. But results like these have a way of changing what we consider “possible,” and that matters:
- High-load design space: If your system must run hot (high occupancy), it’s valuable to know that low average lookup cost and good tail behavior are not inherently incompatible with “no moving elements.”
- Latency tails: The difference between δ−1 and polylog(1/δ) worst-case expectations can be the difference between “rare annoyance” and “SLO violation.”
- Theory-to-systems inspiration: Leveling, bucketing, and “fallback zones” show up in real systems all the time. This paper gives a sharp theoretical lens for why those patterns work.
Closing remarks
Hashing is one of those areas where “it’s already solved” is almost always wrong. This paper is a great example: by slightly reframing what the insertion algorithm is allowed to do (non-greedy choices without reordering), and by engineering structure into the probe process, the authors get optimal bounds that also settle (and overturn!) long-standing beliefs.
Reference: Farach-Colton, Krapivin, Kuszmaul, “Optimal Bounds for Open Addressing Without Reordering,” arXiv:2501.02305v2 (2025).
You can find it on arXiv by searching the identifier 2501.02305.