points by aappleby 7 years ago

A good secondary hash would be the fmix methods from Murmur, or multiply-byteswap-multiply.

However, having seen a lot of terrible user-defined hashes (I do a bit of consulting on an internal Google mailing list), I strongly advise against rolling your own.

flurrything 7 years ago

The secondary hash must primarily be good at mapping the hashed value into a slot: that is, be as cheap as possible (ideally 0 cycles, fib hashing is ~5 cycles) if the primary hash was good while preventing catastrophic performance if the primary hash was bad. All of this without knowing anything about the primary hash.

vidarh 7 years ago

It'd be interesting seeing comparisons there given that the issue here was that none of the widespread implementations he surveyed, including a Google one, even tries to address this, and several of them use methods that are both worse and slower.

Seems like it's something that has gotten little attention.

thomasmg 7 years ago

As a secondary (supplemental) hash, I found the Murmur finalizer methods are ok, but you can get better if you use different constants. See my result in this answer: https://stackoverflow.com/questions/664014/what-integer-hash...

I measured the avalanche effect (the number of output bits that change if a single input bit is changed; should be nearly 16 on average for a 32 bit hash), independence of output bit changes (output bits should not depend on each other), and the probability of a change in each output bit if any input bit is changed.