(Start at the beginning of the series – and all the source can be found in my github repo)
To ease into constexpr programming I decided to tackle some floating-point maths functions. Disclaimer: I’m not a mathematician and this code has not been rigorously tested for numeric stability or convergence in general. I wrote it more for my own learning than for serious mathematical purposes, and anyway, it’s perhaps likely that constexpr
maths functions will be in the standard library sooner or later (and possibly implemented by intrinsics). However, I did subject it to many ad-hoc tests using static_assert
. One of the nice things about compile-time computation: the tests are compile-time too, and if it compiles, one can be reasonably sure it works (at least for the values tested)!
Anyway, I went to cppreference.com’s list of common mathematical functions to see what to tackle. Starting with abs
is trivial:
Starting with abs
is trivial:
template
constexpr T abs(T x)
{
return x >= 0 ? x :
x < 0 ? -x :
throw err::abs_runtime_error;
}
For the sake of clarity, I'm omitting some enable_if
boilerplate here that ensures that the argument has the right type.
Iterative methods FTW
Now, other functions take a little thought. But remembering calculus, many of them are susceptible to iterative methods, so my general plan formed: formulate a recursive calculation which converges to the real answer, and terminate the recursion when the value is within epsilon of the answer.
For example, to compute sqrt
, we can use the well-known Newton-Raphson method, where we start with a guess g0 as an approximation to √x, then:
[latex]g_{n+1} = \frac{(g_n + \frac{x}{g_n})}{2}[/latex]
First, we need a function to determine termination, which I put in a detail
namespace:
namespace detail
{
template
constexpr bool feq(T x, T y)
{
return abs(x - y) <=
std::numeric_limits::epsilon();
}
}
And then sqrt
is straightforward to express (using the value itself as the initial guess), with a driver function handling the domain of the argument and the throw
pattern, and delegating the recursion to a function in the detail
namespace.
namespace detail
{
template
constexpr T sqrt(T x, T guess)
{
return feq(guess, (guess + x/guess)/T{2}) ? guess :
sqrt(x, (guess + x/guess)/T{2});
}
}
template
constexpr T sqrt(T x)
{
return x == 0 ? 0 :
x > 0 ? detail::sqrt(x, x) :
throw err::sqrt_domain_error;
}
This ends up being a useful pattern for many functions. For example, cbrt
is practically identical; only the equational details change. And once we have sqrt
, hypot
is trivial of course.
Next I tackled trigonometric functions. Remembering Taylor series and turning to that Oracle of All Things Maths, Wolfram Alpha, we can find the Taylor series expansions:
[latex]sin(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{1+2k}}{(1+2k)!}[/latex]
and
[latex]cos(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!}[/latex]
These are basically the same formula, and with some massaging, they can share a common recursive function with slightly different initial constants:
namespace detail
{
template
constexpr T trig_series(T x, T sum, T n, int i, int s, T t)
{
return feq(sum, sum + t*s/n) ?
sum :
trig_series(x, sum + t*s/n, n*i*(i+1), i+2, -s, t*x*x);
}
}
template
constexpr T sin(T x)
{
return true ?
detail::trig_series(x, x, T{6}, 4, -1, x*x*x) :
throw err::sin_runtime_error;
}
template
constexpr T cos(T x)
{
return true ?
detail::trig_series(x, T{1}, T{2}, 3, -1, x*x) :
throw err::cos_runtime_error;
}
Once we have sin
and cos
, tan
is trivial of course. And exp
has almost the same series as sin
and cos
- easier, actually:
[latex]e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}[/latex]
Which yields in code:
namespace detail
{
template
constexpr T exp(T x, T sum, T n, int i, T t)
{
return feq(sum, sum + t/n) ?
sum :
exp(x, sum + t/n, n * i, i+1, t * x);
}
}
template
constexpr T exp(T x)
{
return true ? detail::exp(x, T{1}, T{1}, 2, x) :
throw err::exp_runtime_error;
}
No peeking at the floating-point details
Now, how about floor
, ceil
, trunc
and round
? They are all variations on a theme: solving one basically means solving all, with only slight differences for things like negative numbers. I thought about this for a half hour - there wasn't an obvious solution to me at first. We can't trivially cast to an integral type, because there is no integral type big enough to hold a floating point value in general. And in the name of strict portability, we can't be sure that the floating point format is IEEE 754 standard (although I think I am prepared to accept this as a limitation). So even if we could fiddle with the representation, portability goes out the window. Finally, constexpr
(including C++14 constexpr
) forbids reinterpret_cast
and other tricks like access through union
members.
So how to compute floor
? After a little thought I hit upon an approach. In fact, I am relying on the IEEE 754 standard in a small way - the fact that any integer power of 2 is exactly representable as floating point. With that I formed the plan:
- Start with a guess of zero.
- Start with an increment: 2(
std::numeric_limits
- 1).::max_exponent - If (guess + increment) is larger than x, halve the increment.
- Otherwise, add the increment to the guess, and that's the new guess.
- Stop when the increment is less than 2.
This gives a binary-search like approach to finding the correct value for floor
. And the code is easy to write. First we need a function to raise a floating point value to an integer power, in the standard O(log n) way:
namespace detail
{
template
constexpr T ipow(T x, int n)
{
return (n == 0) ? T{1} :
n == 1 ? x :
n > 1 ? ((n & 1) ? x * ipow(x, n-1) :
ipow(x, n/2) * ipow(x, n/2)) :
T{1} / ipow(x, -n);
}
}
Then the actual floor
function, which uses ceil
for the case of negative numbers:
namespace detail
{
template
constexpr T floor(T x, T guess, T inc)
{
return guess + inc <= x ? floor(x, guess + inc, inc) :
inc <= T{1} ? guess : floor(x, guess, inc/T{2});
}
}
constexpr float floor(float x)
{
return x < 0 ? -ceil(-x) :
x >= 0 ? detail::floor(
x, 0.0f,
detail::ipow(2.0f,
std::numeric_limits::max_exponent-1)) :
throw err::floor_runtime_error;
}
The other functions ceil
, trunc
and round
are very similar.
That's some deep recursion
Now, there is a snag with this plan. It works well enough for float
, but when we try it with double
, it falls down. Appendix B of the C++ standard (Implementation quantities) recommends that compilers offer a minimum recursive depth of 512 for constexpr
function invocations. And on my machine, std::numeric_limits<float>::max_exponent
is 128. But for double
, it's 1024. And for long double
, it's 16384. Since by halving the increment each time, we're basically doing a linear search on the exponent, 512 recursions isn't enough. I'm prepared to go with what the standard recommends in the sense that I don't mind massaging switches for compilers that don't follow the recommendations, but I'd rather not mess around with compiler switches for those that do. So how can I get around this?
Well, I'm using a binary search to cut down the increment. What if I increase the fanout so it's not a binary search, but a trinary search or more? How big of a fanout do I need? What I need is:
max_recursion_depth > max_exponent/x, where 2x is the fanout
So to deal with double
, x = 3 and we need a fanout of 8. An octary search? Well, it's easy enough to code, even if it is a nested-ternary-operator-from-hell:
template
constexpr T floor8(T x, T guess, T inc)
{
return
inc < T{8} ? floor(x, guess, inc) :
guess + inc <= x ? floor8(x, guess + inc, inc) :
guess + (inc/T{8})*T{7} <= x ? floor8(x, guess + (inc/T{8})*T{7}, inc/T{8}) :
guess + (inc/T{8})*T{6} <= x ? floor8(x, guess + (inc/T{8})*T{6}, inc/T{8}) :
guess + (inc/T{8})*T{5} <= x ? floor8(x, guess + (inc/T{8})*T{5}, inc/T{8}) :
guess + (inc/T{8})*T{4} <= x ? floor8(x, guess + (inc/T{8})*T{4}, inc/T{8}) :
guess + (inc/T{8})*T{3} <= x ? floor8(x, guess + (inc/T{8})*T{3}, inc/T{8}) :
guess + (inc/T{8})*T{2} <= x ? floor8(x, guess + (inc/T{8})*T{2}, inc/T{8}) :
guess + inc/T{8} <= x ? floor8(x, guess + inc/T{8}, inc/T{8}) :
floor8(x, guess, inc/T{8});
}
(Apologies for the width, but I really can't make it much more readable.) For the base case, I revert to the regular binary search implementation of floor
. What about for long double
? Well, since max_exponent is 16384, x = 33 and we need a fanout of 233. That's not happening! For long double
, I have no choice but to revert to the C++14 constexpr
rules:
constexpr long double floor(long double x)
{
if (x < 0.0) return -ceil(-x);
long double inc = detail::ipow(
2.0l,
std::numeric_limits::max_exponent - 1);
long double guess = 0.0l;
for (;;)
{
while (guess + inc > x)
{
inc /= 2.0l;
if (inc < 1.0l)
return guess;
}
guess += inc;
}
throw err::floor_runtime_error;
}
And that's enough of such stuff for one blog post. Next time, I'll go through the rest of the maths functions I implemented, with help from Leonhard Euler and Wikipedia.