C++11 compile-time string hashing – Why is a raven like a writing desk?

(Start at the beginning of the series – and all the source can be found in my github repo)

Now that I was used to writing C++11-style constexpr, I decided to try some compile-time string hashing. It turns out that FNV1 is very easy to express in a move-down-the-string recursive style:

namespace detail
{
  constexpr uint64_t fnv1(uint64_t h, const char* s)
  {
    return (*s == 0) ? h :
      fnv1((h * 1099511628211ull) ^
           static_cast(*s), s+1);
  }
}
constexpr uint64_t fnv1(const char* s)
{
  return true ?
    detail::fnv1(14695981039346656037ull, s) :
    throw err::fnv1_runtime_error;
}

That wasn’t difficult enough

So far so good. How about some other hash functions? On a previous project I had used MurmurHash, so I decided to give that a go.

(Coincidentally, a blog post has recently surfaced implementing 32-bit Murmur3 using C++14 constexpr. But I was sticking to C++11 constexpr.)

Implementing such a big function from cold in C++11 constexpr was bound to be difficult, so I scaffolded. I used the freely-available runtime implementation for testing, and initially I converted it to C++14 constexpr, which was fairly easy. After doing that, and with Wikipedia as a pseudocode guide I was ready to start breaking down (or building up) the functionality, C++11-style. At the base of 32-bit Murmur3 is the hash round function which is done for each 4-byte chunk of the key:

constexpr uint32_t murmur3_32_k(uint32_t k)
{
  return (((k * 0xcc9e2d51) << 15)
          | ((k * 0xcc9e2d51) >> 17)) * 0x1b873593;
}

constexpr uint32_t murmur3_32_hashround(
    uint32_t k, uint32_t hash)
{
  return (((hash^k) << 13)
          | ((hash^k) >> 19)) * 5 + 0xe6546b64;
}

And this can easily be put into a loop, with a helper function to constitute 4-byte chunks (the somewhat strange formulation of word32le here is because it is used differently in some other code):

constexpr uint32_t word32le(const char* s, int len)
{
  return
    (len > 0 ? static_cast(s[0]) : 0)
    + (len > 1 ? (static_cast(s[1]) << 8) : 0)
    + (len > 2 ? (static_cast(s[2]) << 16) : 0)
    + (len > 3 ? (static_cast(s[3]) << 24) : 0);
}

constexpr uint32_t word32le(const char* s)
{
  return word32le(s, 4);
}

constexpr uint32_t murmur3_32_loop(
    const char* key, int len, uint32_t hash)
{
  return len == 0 ? hash :
    murmur3_32_loop(
        key + 4,
        len - 1,
        murmur3_32_hashround(
            murmur3_32_k(word32le(key)), hash));
}

So this is the first part of Murmur3 that will process all the 4-byte chunks of key. What remains is to deal with the trailing bytes (up to 3 of them) and then finalize the hash. To deal with the trailing bytes, I chain the end functions (_end0 through _end3) and "jump in" at the right place. There is probably a more elegant way to do this, but...

constexpr uint32_t murmur3_32_end0(uint32_t k)
{
  return (((k*0xcc9e2d51) << 15)
          | ((k*0xcc9e2d51) >> 17)) * 0x1b873593;
}

constexpr uint32_t murmur3_32_end1(uint32_t k,
                                   const char* key)
{
  return murmur3_32_end0(
      k ^ static_cast(key[0]));
}

constexpr uint32_t murmur3_32_end2(uint32_t k,
                                   const char* key)
{
  return murmur3_32_end1(
      k ^ (static_cast(key[1]) << 8), key);
}

constexpr uint32_t murmur3_32_end3(uint32_t k,
                                   const char* key)
{
  return murmur3_32_end2(
      k ^ (static_cast(key[2]) << 16), key);
}

constexpr uint32_t murmur3_32_end(uint32_t hash,
                                  const char* key, int rem)
{
  return rem == 0 ? hash :
    hash ^ (rem == 3 ? murmur3_32_end3(0, key) :
            rem == 2 ? murmur3_32_end2(0, key) :
            murmur3_32_end1(0, key));
}

Finalizing the hash is a very similar affair, with 3 stages:

constexpr uint32_t murmur3_32_final1(uint32_t hash)
{
  return (hash ^ (hash >> 16)) * 0x85ebca6b;
}
constexpr uint32_t murmur3_32_final2(uint32_t hash)
{
  return (hash ^ (hash >> 13)) * 0xc2b2ae35;
}
constexpr uint32_t murmur3_32_final3(uint32_t hash)
{
  return (hash ^ (hash >> 16));
}

constexpr uint32_t murmur3_32_final(uint32_t hash, int len)
{
  return
    murmur3_32_final3(
        murmur3_32_final2(
            murmur3_32_final1(hash ^ static_cast(len))));
}

And that's all there is to it: all that remains is to stitch the pieces together and provide the driver function:

constexpr uint32_t murmur3_32_value(const char* key, int len,
                                    uint32_t seed)
{
  return murmur3_32_final(
      murmur3_32_end(
          murmur3_32_loop(key, len/4, seed),
          key+(len/4)*4, len&3),
      len);
}

constexpr uint32_t murmur3_32(const char *key, uint32_t seed)
{
  return true ?
    murmur3_32_value(key, strlen(key), seed) :
    throw err::murmur3_32_runtime_error;
}

Doing strlen non-naively

One more thing: strlen crept in there. This is of course a constexpr version of strlen. The naive way to do this is to step down the string linearly, as seen in the FNV1 algorithm. With a max recursive depth of 512, I thought this fairly limiting... it might not be uncommon to have 1k string literals that I want to hash?

So the way I implemented strlen was to measure the string by chunks, constraining a max recursion depth of say 256. Instead of a plain linear recursion, it's basically recursing for every 256 bytes of string length, and then recursing on the chunk inside that. So the max depth is 256 + 256 = 512, and we can deal with strings close to 64k in size (depending on how deep we are already when calling strlen). There's a chance that strlen might be made constexpr in the future - I think it's already implemented as an intrinsic in some compilers. So maybe I can throw away that code someday.

Anyway, now we have a complete implementation of 32-bit Murmur3 in C++11 constexpr style, and the tests go something like this (and can be checked online):

static_assert(murmur3_32("hello, world", 0) == 345750399,
              "murmur3 test 1");
static_assert(murmur3_32("hello, world1", 0) == 3714214180,
              "murmur3 test 2");
static_assert(murmur3_32("hello, world12", 0) == 83041023,
              "murmur3 test 3");
static_assert(murmur3_32("hello, world123", 0) == 209220029,
              "murmur3 test 4");
static_assert(murmur3_32("hello, world1234", 0) == 4241062699,
              "murmur3 test 5");
static_assert(murmur3_32("hello, world", 1) == 1868346089,
              "murmur3 test 6");

OK. That was a lot of code for a blog post. But after the trivial FNV1 and the only slightly harder Murmur3, I wanted more of a challenge. What about MD5? Or SHA-256? That's for the next post.

C++

Comments (3)

Alan Wolfe says:

14 October, 2015 at 1:18 pm

This is neat! It’s neat to see that it’s possible in C++ 11. Have you noticed how this impacts compile time at all?

I hope MS puts a “compile time profiler” in VS soon. This would be a great excuse to get that in to help with constexpr stuff, as well as in general compile times 😛
elbeno says:

14 October, 2015 at 1:29 pm

Some of the math functions do impact compile time, as you’d expect: they work by iterating to convergence, and that can be slowish sometimes. I haven’t tested the string hashing with large strings, but for the ones I tested there is no humanly-noticeable difference to compile time. Hash functions (at least noncryptographic ones) are generally made to be fast, using cheap operations.
Oleg says:

20 October, 2015 at 1:00 am

Is this achievable in C99?
Probably it’s can be done only with C-macros.
Any idea?

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Related Posts

Comments (3)

Leave a Reply