{"id":1284,"date":"2015-10-14T23:17:47","date_gmt":"2015-10-15T06:17:47","guid":{"rendered":"http:\/\/www.elbeno.com\/blog\/?p=1284"},"modified":"2015-10-15T08:51:06","modified_gmt":"2015-10-15T15:51:06","slug":"compile-time-counters-revisited","status":"publish","type":"post","link":"https:\/\/www.elbeno.com\/blog\/?p=1284","title":{"rendered":"Compile-time counters, revisited"},"content":{"rendered":"<p>(Start at the <a href=\"https:\/\/www.elbeno.com\/blog\/?p=1206\">beginning of the series<\/a> &#8211; and all the source can be found in <a href=\"https:\/\/github.com\/elbeno\/constexpr\">my github repo<\/a>)<\/p>\n<p>Some time ago I read <a href=\"http:\/\/b.atch.se\/posts\/constexpr-counter\/\">a blog post<\/a> showing how to make a compile-time counter: a <code>constexpr<\/code> function that would return monotonically increasing integers. When I first read it I didn&#8217;t really take the time to understand it fully, but now that I was on a compile-time computation kick, I decided to grok it fully.<\/p>\n<p>Without getting too far into the nitty-gritty (go read the other blog post if you&#8217;re interested), the technique relies on using template instantiations to &#8220;bring-to-life&#8221; functions that affect future template instantiations. Thus we have a <code>flag<\/code> that declares (but does not yet define) a friend function:<\/p>\n<pre lang=\"cpp\">\r\ntemplate <int N>\r\nstruct flag\r\n{\r\n  friend constexpr int adl_flag (flag<N>);\r\n};\r\n<\/pre>\n<p>And then a recursive <code>reader<\/code> function template that uses ADL and SFINAE to match against the (as-yet-undefined) friend function(s), bottoming out at zero:<\/p>\n<pre lang=\"cpp\">\r\ntemplate <int N, int = adl_flag(flag<N>{})>\r\nconstexpr int reader(int, flag<N>)\r\n{\r\n  return N;\r\n}\r\n\r\ntemplate <int N>\r\nconstexpr int reader(\r\n    float, flag<N>, int R = reader(0, flag<N-1>{}))\r\n{\r\n  return R;\r\n}\r\n\r\nconstexpr int reader(float, flag<0>)\r\n{\r\n  return 0;\r\n}\r\n<\/pre>\n<p>And finally, a <code>writer<\/code> that, when instantiated with the result of calling <code>reader<\/code>, instantiates the friend function, making the next call to <code>reader<\/code> terminate at one level higher:<\/p>\n<pre lang=\"cpp\">\r\ntemplate <int N>\r\nstruct writer\r\n{\r\n  friend constexpr int adl_flag (flag<N>)\r\n  {\r\n    return N;\r\n  }\r\n  static constexpr int value = N;\r\n};\r\n<\/pre>\n<p>This is a curiosity, right? A foible of C++, a fairy tale told by wizened programmers to fresh graduates to simultaneously impress and revolt them, no? Could anything useful be done with this? Well, C++ is full of such tales, and it&#8217;s a short hop from can&#8217;t-look-away-revolting to established feature &#8211; after all, template metaprogramming was discovered pretty much by accident&#8230;<\/p>\n<p>In fact, while at CppCon, I met up with Ansel and Barbara from <a href=\"http:\/\/www.copperspice.com\/\">CopperSpice<\/a>, who are using a very similar technique to do away with the Qt Metaobject Compiler.<\/p>\n<p><strong>Max recursion depth, we meet again<\/strong><\/p>\n<p>My first thought was that this technique suffers at the hands of my old enemy, maximum recursion depth. In this case, maximum template instantiation depth, which despite a standard-recommended 1024, is frequently just 256 &#8211; lower than the recommended <code>constexpr<\/code> recursion depth of 512. So let&#8217;s do something about that.<\/p>\n<p>Well, one quick-and-dirty way to do this is to compute the count in two halves: lower bits and upper bits, and then stick them together. When we reach the max on the lower bits, we&#8217;ll roll over one of the upper bits. So we have two flags representing the high and low, with the low flag also parameterized on the high bits:<\/p>\n<pre lang=\"cpp\">\r\ntemplate <int H, int L>\r\nstruct flag1\r\n{\r\n  friend constexpr int adl_flag1(flag1<H, L>);\r\n};\r\ntemplate <int H>\r\nstruct flag2\r\n{\r\n  friend constexpr int adl_flag2(flag2<H>);\r\n};\r\n<\/pre>\n<p>And two readers: the low bits reader is in a struct to avoid partial specialization of a function, because it&#8217;s effectively parameterized on the high bits as well as the low bits.<\/p>\n<pre lang=\"cpp\">\r\ntemplate <int H>\r\nstruct r1\r\n{\r\n  template <int L, int = adl_flag1(flag1<H, L>{})>\r\n  static constexpr int reader(int, flag1<H, L>)\r\n  {\r\n    return L;\r\n  }\r\n  template <int L>\r\n  static constexpr int reader(\r\n      float, flag1<H, L>, int R = reader(0, flag1<H, L-1>{}))\r\n  {\r\n    return R;\r\n  }\r\n  static constexpr int reader(float, flag1<H, 0>)\r\n  {\r\n    return 0;\r\n  }\r\n};\r\n\r\ntemplate <int H, int = adl_flag2(flag2<H>{})>\r\nconstexpr int reader(int, flag2<H>)\r\n{\r\n  return H;\r\n}\r\ntemplate <int H>\r\nconstexpr int reader(\r\n    float, flag2<H>, int R = reader(0, flag2<H-1>{}))\r\n{\r\n  return R;\r\n}\r\nconstexpr int reader(float, flag2<0>)\r\n{\r\n  return 0;\r\n}\r\n<\/pre>\n<p>The low bits writer looks much the same as before, and the high bits writer is specialized on a <code>bool<\/code> indicating whether or not to instantiate the friend function, which we only do when the low bits roll over:<\/p>\n<pre lang=\"cpp\">\r\ntemplate <int H, bool B>\r\nstruct writehi\r\n{\r\n  friend constexpr int adl_flag2(flag2<H>)\r\n  {\r\n    return H;\r\n  }\r\n  static constexpr int value = H;\r\n};\r\n\r\ntemplate <int H>\r\nstruct writehi<H, false>\r\n{\r\n  static constexpr int value = H;\r\n};\r\n<\/pre>\n<p>The writer can then write both the high and low bits accordingly:<\/p>\n<pre lang=\"cpp\">\r\ntemplate <int H, int L>\r\nstruct writer\r\n{\r\n  static constexpr int hi_value =\r\n    writehi<H+1, L == MAX>::value;\r\n  static constexpr int lo_value =\r\n    writelo<H, (L &#038; BIT_MASK)>::value;\r\n  static constexpr int value = (H << BIT_DEPTH) + L;\r\n};\r\n<\/pre>\n<p>Using this approach we can easily increase the maximum number that we can get out of our counter from 256 to 16k or so, which is enough for one translation unit, for me.<\/p>\n<p><strong>Random acts of compiler abuse<\/strong><\/p>\n<p>Now, a counter is OK, as far as that goes, but something more useful might be nice... how about random numbers? But surely only a madman would try to implement a Mersenne Twister in C++11 <code>constexpr<\/code>-land. (This despite the fact that I did SHA256 string hashing.) No, these days when I think of random numbers, I think of Melissa O'Neill and her excellent <a href=\"http:\/\/www.pcg-random.org\/\">PCG32<\/a>. If you haven't seen <a href=\"https:\/\/www.youtube.com\/watch?v=45Oet5qjlms\">her video<\/a>, go watch it. I'll wait here. PCG32 has some distinct advantages for <code>constexpr<\/code> implementation:<\/p>\n<ul>\n<li>It's easy to implement<\/li>\n<li>It's fast<\/li>\n<li>It's not a lot of code (seriously, ~10 lines)<\/li>\n<li>It's easy to implement<\/li>\n<li>It's understandable<\/li>\n<li>It's easy to implement<\/li>\n<\/ul>\n<p>Here's a simple implementation of the whole 32-bit affair:<\/p>\n<pre lang=\"cpp\">\r\nconstexpr uint64_t pcg32_advance(uint64_t s)\r\n{\r\n  return s * 6364136223846793005ULL\r\n    + (1442695040888963407ULL | 1);\r\n}\r\n\r\nconstexpr uint64_t pcg32_advance(uint64_t s, int n)\r\n{\r\n  return n == 0 ? s : pcg32_advance(pcg32_advance(s), n-1);\r\n}\r\n\r\nconstexpr uint32_t pcg32_xorshift(uint64_t s)\r\n{\r\n  return ((s >> 18u) ^ s) >> 27u;\r\n}\r\n\r\nconstexpr uint32_t pcg32_rot(uint64_t s)\r\n{\r\n  return s >> 59u;\r\n}\r\n\r\nconstexpr uint32_t pcg32_output(uint64_t s)\r\n{\r\n  return (pcg32_xorshift(s) >> pcg32_rot(s))\r\n    | (pcg32_xorshift(s) << ((-pcg32_rot(s)) &#038; 31));\r\n}\r\n<\/pre>\n<p>And now we can use exactly the same pattern as the <code>constexpr<\/code> counter, except the writer, instead of giving us an integer, will give us a random number (and we also plumb through the random seed <code>S<\/code> as a template parameter):<\/p>\n<pre lang=\"cpp\">\r\ntemplate <uint64_t S, int H, int L>\r\nstruct writer\r\n{\r\n  static constexpr int hi_value =\r\n    writehi<S, H+1, L == MAX>::value;\r\n  static constexpr int lo_value =\r\n    writelo<S, H, (L &#038; BIT_MASK)>::value;\r\n  static constexpr uint32_t value =\r\n    pcg32_output(pcg32_advance(S, (H << BIT_DEPTH) + L));\r\n};\r\n<\/pre>\n<p>There is probably a way to improve this by storing the actual PCG-computed value in a template instantiation, rather than simply using the integer to pump the PCG every time as I am doing. But it's a proof-of-concept and works well enough for now. A simple macro will give us a suitable seed for our compile-time RNG by hashing a few things together:<\/p>\n<pre lang=\"cpp\">\r\n#define cx_pcg32 \\\r\n  cx::pcg32<cx::fnv1(__FILE__ __DATE__ __TIME__) + __LINE__>\r\n<\/pre>\n<p>And now we have a compile-time RNG that is seeded differently every compile, every file, every line. There are potentially a lot of template instantiations - maybe using a lot of memory in the compiler - but we can <a href=\"https:\/\/www.elbeno.com\/blog\/?p=1288\">do some useful things<\/a> with this.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(Start at the beginning of the series &#8211; and all the source can be found in my github repo) Some time ago I read a blog post showing how to make a compile-time counter: a constexpr function that would return monotonically increasing integers. When I first read it I didn&#8217;t&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[22],"tags":[],"class_list":["post-1284","post","type-post","status-publish","format-standard","hentry","category-cpp"],"_links":{"self":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1284","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1284"}],"version-history":[{"count":5,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1284\/revisions"}],"predecessor-version":[{"id":1299,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1284\/revisions\/1299"}],"wp:attachment":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1284"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1284"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1284"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}