(Start at the beginning of the series – and all the source can be found in my github repo)
Now that I was used to writing C++11-style constexpr
, I decided to try some compile-time string hashing. It turns out that FNV1 is very easy to express in a move-down-the-string recursive style:
namespace detail { constexpr uint64_t fnv1(uint64_t h, const char* s) { return (*s == 0) ? h : fnv1((h * 1099511628211ull) ^ static_cast<uint64_t>(*s), s+1); } } constexpr uint64_t fnv1(const char* s) { return true ? detail::fnv1(14695981039346656037ull, s) : throw err::fnv1_runtime_error; } |
That wasn’t difficult enough
So far so good. How about some other hash functions? On a previous project I had used MurmurHash, so I decided to give that a go.
(Coincidentally, a blog post has recently surfaced implementing 32-bit Murmur3 using C++14 constexpr
. But I was sticking to C++11 constexpr
.)
Implementing such a big function from cold in C++11 constexpr
was bound to be difficult, so I scaffolded. I used the freely-available runtime implementation for testing, and initially I converted it to C++14 constexpr
, which was fairly easy. After doing that, and with Wikipedia as a pseudocode guide I was ready to start breaking down (or building up) the functionality, C++11-style. At the base of 32-bit Murmur3 is the hash round function which is done for each 4-byte chunk of the key:
constexpr uint32_t murmur3_32_k(uint32_t k) { return (((k * 0xcc9e2d51) << 15) | ((k * 0xcc9e2d51) >> 17)) * 0x1b873593; } constexpr uint32_t murmur3_32_hashround( uint32_t k, uint32_t hash) { return (((hash^k) << 13) | ((hash^k) >> 19)) * 5 + 0xe6546b64; } |
And this can easily be put into a loop, with a helper function to constitute 4-byte chunks (the somewhat strange formulation of word32le
here is because it is used differently in some other code):
constexpr uint32_t word32le(const char* s, int len) { return (len > 0 ? static_cast<uint32_t>(s[0]) : 0) + (len > 1 ? (static_cast<uint32_t>(s[1]) << 8) : 0) + (len > 2 ? (static_cast<uint32_t>(s[2]) << 16) : 0) + (len > 3 ? (static_cast<uint32_t>(s[3]) << 24) : 0); } constexpr uint32_t word32le(const char* s) { return word32le(s, 4); } constexpr uint32_t murmur3_32_loop( const char* key, int len, uint32_t hash) { return len == 0 ? hash : murmur3_32_loop( key + 4, len - 1, murmur3_32_hashround( murmur3_32_k(word32le(key)), hash)); } |
So this is the first part of Murmur3 that will process all the 4-byte chunks of key. What remains is to deal with the trailing bytes (up to 3 of them) and then finalize the hash. To deal with the trailing bytes, I chain the end functions (_end0
through _end3
) and “jump in” at the right place. There is probably a more elegant way to do this, but…
constexpr uint32_t murmur3_32_end0(uint32_t k) { return (((k*0xcc9e2d51) << 15) | ((k*0xcc9e2d51) >> 17)) * 0x1b873593; } constexpr uint32_t murmur3_32_end1(uint32_t k, const char* key) { return murmur3_32_end0( k ^ static_cast<uint32_t>(key[0])); } constexpr uint32_t murmur3_32_end2(uint32_t k, const char* key) { return murmur3_32_end1( k ^ (static_cast<uint32_t>(key[1]) << 8), key); } constexpr uint32_t murmur3_32_end3(uint32_t k, const char* key) { return murmur3_32_end2( k ^ (static_cast<uint32_t>(key[2]) << 16), key); } constexpr uint32_t murmur3_32_end(uint32_t hash, const char* key, int rem) { return rem == 0 ? hash : hash ^ (rem == 3 ? murmur3_32_end3(0, key) : rem == 2 ? murmur3_32_end2(0, key) : murmur3_32_end1(0, key)); } |
Finalizing the hash is a very similar affair, with 3 stages:
constexpr uint32_t murmur3_32_final1(uint32_t hash) { return (hash ^ (hash >> 16)) * 0x85ebca6b; } constexpr uint32_t murmur3_32_final2(uint32_t hash) { return (hash ^ (hash >> 13)) * 0xc2b2ae35; } constexpr uint32_t murmur3_32_final3(uint32_t hash) { return (hash ^ (hash >> 16)); } constexpr uint32_t murmur3_32_final(uint32_t hash, int len) { return murmur3_32_final3( murmur3_32_final2( murmur3_32_final1(hash ^ static_cast<uint32_t>(len)))); } |
And that’s all there is to it: all that remains is to stitch the pieces together and provide the driver function:
constexpr uint32_t murmur3_32_value(const char* key, int len, uint32_t seed) { return murmur3_32_final( murmur3_32_end( murmur3_32_loop(key, len/4, seed), key+(len/4)*4, len&3), len); } constexpr uint32_t murmur3_32(const char *key, uint32_t seed) { return true ? murmur3_32_value(key, strlen(key), seed) : throw err::murmur3_32_runtime_error; } |
Doing strlen
non-naively
One more thing: strlen
crept in there. This is of course a constexpr
version of strlen
. The naive way to do this is to step down the string linearly, as seen in the FNV1 algorithm. With a max recursive depth of 512, I thought this fairly limiting… it might not be uncommon to have 1k string literals that I want to hash?
So the way I implemented strlen
was to measure the string by chunks, constraining a max recursion depth of say 256. Instead of a plain linear recursion, it’s basically recursing for every 256 bytes of string length, and then recursing on the chunk inside that. So the max depth is 256 + 256 = 512, and we can deal with strings close to 64k in size (depending on how deep we are already when calling strlen
). There’s a chance that strlen
might be made constexpr
in the future – I think it’s already implemented as an intrinsic in some compilers. So maybe I can throw away that code someday.
Anyway, now we have a complete implementation of 32-bit Murmur3 in C++11 constexpr
style, and the tests go something like this (and can be checked online):
static_assert(murmur3_32("hello, world", 0) == 345750399, "murmur3 test 1"); static_assert(murmur3_32("hello, world1", 0) == 3714214180, "murmur3 test 2"); static_assert(murmur3_32("hello, world12", 0) == 83041023, "murmur3 test 3"); static_assert(murmur3_32("hello, world123", 0) == 209220029, "murmur3 test 4"); static_assert(murmur3_32("hello, world1234", 0) == 4241062699, "murmur3 test 5"); static_assert(murmur3_32("hello, world", 1) == 1868346089, "murmur3 test 6"); |
OK. That was a lot of code for a blog post. But after the trivial FNV1 and the only slightly harder Murmur3, I wanted more of a challenge. What about MD5? Or SHA-256? That’s for the next post.