Archive for the ‘Programming’ Category

Exercising Ranges (part 8)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Extras from the Haskell Prelude

Having implemented iterate, there were a couple of other useful things from the Haskell Prelude that I wanted to have available with ranges. First, there’s cycle. It simply takes a finite range and repeats it ad infinitum. The functionality of the cursor is fairly simple, and uses patterns we’ve already seen, so here it is in its entirety:

template <bool IsConst>
struct cursor
{
private:
  template <typename T>
  using constify_if = meta::apply<meta::add_const_if_c<IsConst>, T>;
  using cycle_view_t = constify_if<cycle_view>;
  cycle_view_t *rng_;
  range_iterator_t<constify_if<Rng>> it_;
 
public:
  using single_pass = std::false_type;
  cursor() = default;
  cursor(cycle_view_t &rng)
    : rng_{&rng}
    , it_{begin(rng.r_)}
  {}
  constexpr bool done() const
  {
    return false;
  }
  auto current() const
  RANGES_DECLTYPE_AUTO_RETURN_NOEXCEPT
  (
    *it_
  )
  void next()
  {
    if (++it_ == end(rng_->r_))
      it_ = begin(rng_->r_);
  }
  CONCEPT_REQUIRES((bool) BidirectionalRange<Rng>())
  void prev()
  {
    if (it_ == begin(rng_->r_))
      it_ = end(rng_->r_);
    --it_;
  }
  bool equal(cursor const &that) const
  {
    return it_ == that.it_;
  }
};

Like all cursors, it implements current(), next() and equal(). Additionally, if the range passed in is bidirectional, so is the cursor, using the CONCEPT_REQUIRES macro to conditionally enable prev(). There is no sentinel type needed, and the cursor implements done() in a way that makes the range infinite.
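An eager way to see what the cursor's wrap-around in next() produces is a sketch like this (take_cycle is a hypothetical helper for illustration, not part of the library):

```cpp
#include <cstddef>
#include <vector>

// Eagerly take n elements from a cycled finite range, mirroring the
// wrap-around in next(): when the iterator reaches end, go back to begin.
std::vector<int> take_cycle(const std::vector<int>& r, std::size_t n) {
  std::vector<int> out;
  out.reserve(n);
  for (std::size_t i = 0; i < n; ++i)
    out.push_back(r[i % r.size()]);
  return out;
}
```

For example, taking seven elements from the cycle of {1, 2, 3} gives {1, 2, 3, 1, 2, 3, 1}.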

In Haskell, cycle is occasionally useful, although it is often used with zip and filter to achieve the same effect that stride – a function already provided by the range library – offers. (I think this combo use of cycle, zip and filter was hinted at briefly in Eric’s ranges keynote when discussing the motivation that led to stride.) But it’s sufficiently easy to implement and makes a nice example.

One more from the Prelude

A slightly more useful function from the Prelude is scan (in Haskell, scanl and scanr). Despite its somewhat obscure name, it’s easy to understand: it’s a fold that returns all the partial results in a sequence. For example:

Prelude> scanl (+) 0 [1,2,3,4]
[0,1,3,6,10]

In C++, I’m not sure what to call this… perhaps accumulate_partials? I left it as scan, for want of a better name. (Edit: It’s partial_sum and it exists in the STL and the ranges library. So I’ll let this section just stand as a further example.)
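Since scanl (+) 0 is exactly partial sums, the Haskell example above maps directly onto std::partial_sum once the initial value is prepended (a minimal eager sketch, not the range version):

```cpp
#include <numeric>
#include <vector>

// scanl (+) 0 xs in Haskell == partial sums of (0 : xs)
std::vector<int> scan_plus(std::vector<int> xs) {
  xs.insert(xs.begin(), 0);  // prepend the initial value
  std::partial_sum(xs.begin(), xs.end(), xs.begin());
  return xs;
}
```

So scan_plus({1, 2, 3, 4}) yields {0, 1, 3, 6, 10}, matching the Haskell output.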

One immediate thing to notice about scan is that the range it produces is one element longer than the input range. That’s easily done:

namespace detail
{
  template<typename C>
  using scan_cardinality =
    std::integral_constant<cardinality,
      C::value == infinite ?
        infinite :
        C::value == unknown ?
          unknown :
          C::value >= 0 ?
            static_cast<ranges::cardinality>(C::value + 1) :
            finite>;
} // namespace detail

In general, the action of the cursor is going to be the same as we saw with iterate: keep a cached value that is updated with a call to next(), and return it with a call to current().

A nice addition to accumulate

Eric made a nice addition to the range version of accumulate which I also used for scan: in addition to the binary operation and the initial value, there is the option of a projection function which will transform the input values before passing them to the binary operation. So when next() updates the cached value, that code looks like this:

void update_value()
{
  val_ = rng_->op_(val_, rng_->proj_(*it_));
}

It’s like having transform built in instead of having to do it in a separate step: a nice convenience. And that’s all there is to scan.
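The effect of the projection can be pinned down with an eager fold (fold_with_proj is a hypothetical name for illustration, not the library's API):

```cpp
#include <vector>

// Eager fold that projects each element before feeding it to the
// binary operation, mirroring what update_value() does per step.
template <typename Rng, typename T, typename Op, typename Proj>
T fold_with_proj(const Rng& r, T init, Op op, Proj proj) {
  for (const auto& e : r)
    init = op(init, proj(e));
  return init;
}
```

For example, summing squares without a separate transform step: fold_with_proj(v, 0, plus, square).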

Conclusions

This has been a fun series of experiments with ranges, and I’ve learned a lot. I’ve made mistakes along the way, and I’m sure there are still parts of my code that reflect extant misunderstandings and general lack of polish. But as an experiment, this has been a success. The code works. As a user of the ranges library with some functional programming experience, it wasn’t too hard to put together solutions. And the Concept messages really helped, along with a knowledge transfer of iterator categories from the STL.

As an extender of the library adding my own views, it took a little effort to establish correct assumptions and understandings, but Eric’s code is clear and offers good patterns to build from. I’m sure a little documentation would have steered me more quickly out of problem areas that I encountered. There are some things that are still problematic and I’m not yet sure how to tackle: chiefly involving recursion. And at the moment, Concept messages don’t always help – I think digging into compiler diagnostics is going to be a fact of life as a range implementer.

C++ will never be as syntactically nice as Haskell is, but ranges offer our best chance yet at high-performance, composable, declarative, correct-by-construction code.

Exercising Ranges (part 7)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Calculus operations

It’s relatively simple to do differentiation and integration of power series, since they follow the ordinary polynomial rules. No trigonometric identities, logarithms or other tricky integrals here. Just the plain rule for positive powers of x. For differentiation, we have:

\frac{d}{dx}kx^{n} = knx^{n-1}

Which is straightforwardly expressed with ranges: we simply take the tail of the power series and pairwise multiply with n, the power:

template <typename Rng>
auto differentiate(Rng&& r)
{
  return ranges::view::zip_with(
      std::multiplies<>(),
      ranges::view::iota(1),
      ranges::view::tail(std::forward<Rng>(r)));
}

Integration is almost as easy. The fly in the ointment is that up until now, we have been working entirely with integers, or whatever underlying arithmetic type is present in the range. Here is where Haskell wins with its built-in rational numbers and polymorphic fromIntegral for conversion from integral types. Integration brings division into the picture:

\int_{}{} kx^n dx = \frac{kx^{n+1}}{n+1}

Which means we need to turn a range of ints into a range of floats or doubles. Or use a rational number library. But anyway, sweeping these concerns under the rug, the actual code for integration is as easy as that for differentiation:

template <typename Rng>
auto integrate(Rng&& r)
{
  return ranges::view::concat(
      ranges::view::single(0),
      ranges::view::zip_with(
          [] (auto x, auto y) { return (float)x / (float)y; },
          std::forward<Rng>(r),
          ranges::view::iota(1)));
}

We prepend a zero as the constant of integration.
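Again, an eager sketch on a coefficient vector makes the rule concrete (illustrative only):

```cpp
#include <cstddef>
#include <vector>

// Integral of (c0 + c1*x + ...) is (0 + c0*x + (c1/2)*x^2 + ...),
// with 0 prepended as the constant of integration.
std::vector<float> integrate_eager(const std::vector<int>& c) {
  std::vector<float> out{0.0f};  // constant of integration
  for (std::size_t n = 0; n < c.size(); ++n)
    out.push_back(static_cast<float>(c[n]) / static_cast<float>(n + 1));
  return out;
}
```

Integrating {2, 6} (that is, 2 + 6x) gives {0, 2, 3}, recovering 2x + 3x^2.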

Calculus: in real life, more difficult than arithmetic, but with ranges, considerably easier. That was a welcome break from extreme mental effort. In the next part, I round out my range experiments for now, with a look at a couple more functions from the Haskell Prelude.

Exercising Ranges (part 6)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Multiplying power series

Now we come to something that took considerable thought: how to multiply ranges. Doug McIlroy’s Haskell version of range multiplication is succinct, lazily evaluated and recursive.

(f:ft) * gs@(g:gt) = f*g : ft*gs + series(f)*gt

Which is to say, mathematically, that if we have:

F(x) = f + xF'(x)
G(x) = g + xG'(x)

then the product is given by:

F(x)G(x) = fg + xF'(g + xG') + xfG'
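Expanding the product term by term confirms this:

```latex
F(x)G(x) = (f + xF')(g + xG')
         = fg + xF'g + x^2F'G' + xfG'
         = fg + xF'(g + xG') + xfG'
```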

In C++, this function can be naively written:

template <typename R1, typename R2>
auto mult(R1&& r1, R2&& r2)
{
  auto h1 = *ranges::begin(std::forward<R1>(r1));
  auto h2 = *ranges::begin(std::forward<R2>(r2));
  return view::concat(
      view::single(h1 * h2),
      add(mult(view::tail(r1), r2),
          view::transform(
              view::tail(r2),
          [h1] (auto x) { return x * h1; } )));
}

If this could be compiled, it might even work. I think the lazy evaluation of the resulting range together with the monoidal_zip behaviour of add would terminate what looks like infinite recursion. But this doesn’t compile, because recursion isn’t only a runtime concern here: the compile-time recursion that results from the compiler trying to infer the return type never terminates. Clang initially complained about exceeding the max template depth (default 256) and when I upped this to 1024, clang laboured over it for an impressive 50 seconds before crashing.

Back to pencil and paper

So I needed to think harder. It seemed that I would need to write a view representing multiplication of two series, and I started playing with polynomials – for the moment, assuming they were of the same degree. After a while I noticed something useful:

F = f_0 : f_1 : f_2 : ...
G = g_0 : g_1 : g_2 : ...
FG = f_0g_0 : f_0g_1 + f_1g_0 : f_0g_2 + f_1g_1 + f_2g_0 : ...

Each successive term is the sum of the terms for which multiplication produces the appropriate power of x. And the computation of any given term is:

f_0g_n + f_1g_{n-1} + f_2g_{n-2} + ... + f_ng_0

This is inner_product of a range with a reversed range! And we know we can reverse the range as long as it has a bidirectional iterator, because it’s just what we’ve already counted up to from the beginning – we already have the end iterator in hand. Next I had to think about how to deal with the cardinality of the product of two polynomials: it’s not enough to iterate to the end of both, even if they’re of the same degree, because there are higher powers of x that get generated in the “tail” of the product. The cardinality is actually the sum of the input cardinalities, minus one.

template<typename C1, typename C2>
using series_mult_cardinality =
  std::integral_constant<cardinality,
    C1::value == infinite || C2::value == infinite ?
      infinite :
      C1::value == unknown || C2::value == unknown ?
        unknown :
        C1::value >= 0 && C2::value >= 0 ?
          static_cast<ranges::cardinality>(C1::value + C2::value - 1) :
          finite>;

A trick of the tail

And once we reach the end of both range iterators we are multiplying, we need to go an extra distance equal to the length we’ve just covered, to obtain the higher power coefficients. I do this with a length_ variable that tracks the distance advanced, and a tail_ variable that increments after the ranges are exhausted until it reaches length_, at which point we are at the end. The end sentinel comparison looks like this:

bool equal(cursor<IsConst> const &pos) const
{
  return detail::equal_to(pos.it1_, end1_)
    && detail::equal_to(pos.it2_, end2_)
    && pos.tail_ == pos.length_;
}

The only remaining thing is to deal with different degree polynomials, for which I use the same approach as monoidal_zip, keeping a diff_ variable and doing some bookkeeping. Computing the current value for the iterator becomes the inner product of one range leading up to the current position with the reverse of its counterpart in the other range (with appropriate adjustments to deal with the tail and the different degrees of the ranges):

auto compute_current() const
{
  auto r1 =  make_range(
      begin(rng_->r1_) + tail_ + (diff_ > 0 ? diff_ : 0),
      it1_);
  auto r2 = view::reverse(
      make_range(
          begin(rng_->r2_) + tail_ + (diff_ < 0 ? -diff_ : 0),
          it2_));
  return inner_product(r1, r2, 0);
}

So view::series_mult works. It isn’t perfect, because it imposes a bidirectional constraint on the arguments. It is O(n²), which can’t be avoided (we must ultimately multiply every coefficient by every other coefficient). To avoid the bidirectional constraint and just traverse the sequences once, I would need to go back to the recursive definition and reformulate it, somehow avoiding the recursive type, and I can think of a few possible ways to do that:

  • Explicitly write a recursive type function, which includes a base case.
  • Type-erase the multiplied range somehow (this would involve throwing away cardinality and maybe some other things, so it might not lead to a good outcome).
  • Break the type recursion with a technique akin to recursive_variant.
  • Somehow collapse the type of the range, how I’m not sure… the only real way to do this I think is by copying to a container and making a fresh range from it.
  • Write my own system for lazy evaluation… which obviates ranges, really.

None of these are easy (and some have undesirable consequences). I continue to think on the matter, because Doug McIlroy’s Haskell formulations are recursive and beautiful, and multiplication is the easiest of the recursive definitions. Things get even more tricky with division and composition…

Tackling exponentiation

As a footnote to multiplication, I tried to define exponentiation. (I recently read the excellent From Mathematics to Generic Programming, and Chapter 7 is very instructive in the process of generalizing exponentiation from the basis of multiplication.) But I ran into basically the same issue: writing recursive functions to produce ranges is tricky because of the recursive type expansion. Maybe in this case, the problem is defined enough that I can write a recursive type function counterpart to avoid the compiler’s non-terminating type inference.

For the next part of this series, I’ll look at an easier topic: differentiation and integration of power series.

Exercising Ranges (part 5)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Generalising iota

To pretty-print a reversible polynomial, we need to provide for reversibility of iota. So what is iota? It makes a range by repeatedly incrementing a value. What if instead of increment, we use an arbitrary function? We could do this with generate_n:

template <typename Rng>
std::string to_string_reverse(Rng&& r)
{
  auto s = ranges::size(r);
  auto powers = ranges::view::generate_n(
      [s] () mutable { return --s; }, s);
  return detail::to_string(
      ranges::view::zip(ranges::view::reverse(std::forward<Rng>(r)), 
                        powers));
}

But this isn’t quite as clean as it could be: the lambda mutates state. An alternative is found in the Haskell Prelude:

iterate :: (a -> a) -> a -> [a]

iterate applies a function repeatedly to an argument, feeding the result of the last iteration to the input of the next one. It’s very similar to generate, and we can write both iterate and iterate_n. The view takes a function and an initial value, and produces the sequence:

 x, f(x), f(f(x)), f(f(f(x))), ...

Here’s the C++ for the skeleton of iterate_view:

template<typename G, typename T>
struct iterate_view
  : view_facade<iterate_view<G, T>, infinite>
{
private:
  friend struct range_access;
  using result_t = concepts::Function::result_t<G, T>;
  semiregular_t<G> gen_;
  semiregular_t<result_t> val_;
  ...
  explicit iterate_view(G g, T&& t)
    : gen_(std::move(g)), val_(std::move(t))
  {}
  ...
};

The cursor is easily written just like generate’s cursor: next() caches a new value, and current() returns it. Like generate, the view can only be traversed once, so the cached value is kept in the view itself rather than the iterator. And there is no end cursor.

struct cursor
{
  iterate_view *view_;
  ...
  result_t current() const
  {
    return view_->val_;
  }
  void next()
  {
    view_->next();
  }
};
void next()
{
  val_ = gen_(val_);
}

For iterate_n, the approach is almost identical, except that next() decrements a count, and size() is also provided. Using iterate_n, we can generate a decreasing sequence and use it in the reverse print function for a polynomial, which we know is a SizedRange.

template <typename Rng>
std::string to_string_reverse(Rng&& r)
{
  auto s = ranges::size(r);
  auto powers = ranges::view::iterate_n(
      [] (auto i) { return i - 1; }, s-1, s);
  return detail::to_string(
      ranges::view::zip(ranges::view::reverse(std::forward<Rng>(r)), 
                        powers));
}

As for iterate itself, it’s potentially useful in lots more places. The obvious use cases (and test cases) are mathematical in nature, for instance my test case using the Collatz conjecture (aka Ulam conjecture, 3n+1 conjecture) sequence starting with 27:

int collatz(int n)
{
  if (n & 1) return 3*n+1;
  return n>>1;
}
 
DEF_TEST(Collatz27, Iterate)
{
  auto m = view::iterate(collatz, 27)
    | view::take_while([] (int n) { return n != 1; });
  EXPECT(ranges::find(m, 9232) != ranges::end(m));
  return true;
}

But there are also plenty of applications in other code, where we are updating the state of something based on its previous state: iterate separates the loop code and allows us to write a clean function that simply produces a new state from an old state. This is useful for games and simulations of all kinds. So iterate is a useful addition to the range views.

Now that we can pretty-print polynomials properly, in the next part I’m going to take on the trickier subject of how to multiply them.

Exercising Ranges (part 4)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

On reversibility

It occurs to me that I’ve been touting reversibility as a good thing, both implicitly and explicitly (for polynomials), and the reason for that is not just that I might want to print polynomials in natural decreasing-power-of-x order.

The first reason reversibility is important to me: coming from the STL, we are used to containers being iterable. If a container supports bidirectional iterators, it’s necessarily reversible, since all containers are finite and one can obtain the end iterator. The same is not true of ranges – because of the generalization of end as a sentinel condition, ranges can be infinitely long. An infinitely long range has no end, so there’s no way to reverse it, because there’s nowhere to start. It can still support bidirectional iteration, but without reversibility that’s of limited use in most cases. So in my mind, reversibility is a better concept for ranges than simply the availability of bidirectional iteration. (There may be occasional uses for bidirectional iteration without the existence of end – Turing machine simulation springs to mind – but I think reversibility is the more common desire.)

The second reason might be called Postel’s Law of iterator categories: if you’re writing an algorithm (or range view), accept as weak an iterator category as you can, but provide as strong an iterator category as you can. Many views can work with input iterators. Some require forward iterators. Few require stronger guarantees. (They might work more efficiently if random access iteration is available, but this isn’t the same as saying they require that.) And on the flip side, a view should almost never degrade the iterator category that is passed in to it. Polynomials should be reversible, so views on them should be reversible.

In this vein, I tried to make monoidal_zip require only InputRange, but support reversing if the underlying ranges support it.

Pretty-printing polynomials

With that in mind, I embarked on pretty-printing polynomials and infinite series, wanting to print them both in infinite series form (increasing powers-of-x) and “more natural” polynomial form (decreasing powers-of-x).

We have a sequence of coefficients, and each corresponds to a power of x, so I start out by zipping the polynomial with an increasing sequence of integers representing the powers.

I first thought of view::transform and view::intersperse. I coded up a solution using view::transform (view::intersperse turned out to be problematic because the thing being interspersed – the plus or minus sign – depends on the sign of the coefficient).

Here’s a helper function for printing the power of x:

std::string x_to_power(int n)
{
  if (n == 0) return {};
  if (n == 1) return {"x"};
  return {"x^" + std::to_string(n)};
}

And here’s the prettyprint function (that’s what I originally called it) using view::transform:

template <typename Rng>
void prettyprint(Rng&& r)
{
  bool start = true;
 
  auto m = view::zip(std::forward<Rng>(r),
                     view::iota(0))
    | view::transform(
        [&start] (const std::pair<int, int>& p) -> std::string {
          std::string s;
          if (p.first != 0) {
            if (!start)
              s = p.first > 0 ? " + " : " - ";
            s += std::to_string(p.first) + x_to_power(p.second);
            start = false;
          }
          return s;
        });
 
  ranges::for_each(m, [](const std::string& s){ std::cout << s; });
  std::cout << std::endl;
}

It has several bugs as-is, and I wasn’t happy with that start boolean hanging out there. And moreover, this seemed like a very procedural way to work and was annoying the functional programming part of my brain. What I actually wanted was not a function to output a sequence, but a fold to convert a sequence into a string, or to put it in C++ language, accumulate. The fold is independent of whether or not the sequence is reversed, so I factored out the “driver function” in anticipation of creating a reversing version of it.

template <typename Rng>
std::string to_string(Rng&& r)
{
  return detail::to_string(
      ranges::view::zip(std::forward<Rng>(r),
                        ranges::view::iota(0)));
}

And the fold looks like this:

template <typename Rng>
std::string to_string(Rng&& r)
{
  return ranges::accumulate(
      std::forward<Rng>(r),
      std::string(),
      [] (std::string s, const auto& p) {
        // if the coefficient is non-zero
        if (p.first != 0)
        {
          // treat the first one printed specially
          if (s.empty())
            s += p.first > 0 ? "" : "-";
          else
            s += p.first > 0 ? " + " : " - ";
 
          auto coeff = p.first > 0 ? p.first : -p.first;
          // don't print coefficient of 1 (unless it's the constant)
          if (coeff != 1 || p.second == 0)
          {
            s += std::to_string(coeff);
          }
          s += detail::x_to_power(p.second);
        }
        return s;
      });
}

The complexity that creeps in here is, I think, the intrinsic complexity of printing a polynomial rather than complexity introduced by the wrong solution (start has disappeared, replaced with logic that deals with the first printed term). So this accumulate-based approach works with the power series approach of printing in increasing powers of x; but how do we reverse that zip? How can we reverse iota?
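As a sanity check, the fold can be rendered as an eager loop over a coefficient vector (poly_to_string is a hypothetical standalone stand-in for the range pipeline, reusing the same term-by-term logic):

```cpp
#include <string>
#include <vector>

std::string x_to_power(int n) {
  if (n == 0) return {};
  if (n == 1) return {"x"};
  return {"x^" + std::to_string(n)};
}

// Eager equivalent of zipping coefficients with powers and folding to a string.
std::string poly_to_string(const std::vector<int>& coeffs) {
  std::string s;
  for (int power = 0; power < static_cast<int>(coeffs.size()); ++power) {
    int c = coeffs[power];
    if (c == 0) continue;  // skip zero coefficients
    // treat the first printed term specially
    if (s.empty())
      s += c > 0 ? "" : "-";
    else
      s += c > 0 ? " + " : " - ";
    int abs_c = c > 0 ? c : -c;
    // don't print a coefficient of 1 (unless it's the constant term)
    if (abs_c != 1 || power == 0)
      s += std::to_string(abs_c);
    s += x_to_power(power);
  }
  return s;
}
```

So {1, 2, 3} prints as "1 + 2x + 3x^2", and {-1, 0, 1} as "-1 + x^2".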

To answer that, I needed to go back to the Haskell Prelude, in the next part.

Exercising Ranges (part 3)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

So, I was going to implement monoidal_zip, and to do that, I would clone zip_with.hpp. So I did that. Eric’s a great library author, and the ranges code is pretty easy to read. For the most part I found it concise, well-expressed and clear. The parts I didn’t immediately understand were owing to my own (temporary) ignorance and not a result of murky code.

Since zip_with takes N ranges and an n-ary function, it’s written with variadic templates and folding operations over parameter packs. A monoid by definition only uses a binary operation, so I immediately simplified that. Then I worked through each part of the file, adapting it to my use case.

Range cardinality

The first things in zip_with.hpp are a bunch of constexpr functions in a detail namespace, and then a cardinality calculation. Range cardinality turns out to be an enumeration with three members – infinite (-3), unknown (-2), and finite (-1) – and then a value of zero or above is a known cardinality such as might be obtained at compile-time from an array.

Aside: I didn’t think about/realise until much later that the cardinality of a vector isn’t known at compile time. Perhaps it could be in some cases – a shortcoming of the STL re constexpr?

In the case of known cardinalities, zip_with uses the minimum of the N ranges; I needed the maximum. Simple, if verbose, to deal with the other possible enum values.

namespace detail
{
  template<typename C1, typename C2>
  using monoidal_zip_cardinality =
    std::integral_constant<cardinality,
      C1::value == infinite || C2::value == infinite ?
        infinite :
        C1::value == unknown || C2::value == unknown ?
          unknown :
          C1::value >= 0 && C2::value >= 0 ?
            max_(C1::value, C2::value) :
            finite>;
} // namespace detail

(The function max_ is one of Eric’s existing constexpr functions in zip_with.hpp.)

Cursors and sentinels

The main work in writing any view is, of course, defining the cursor and the sentinel (internal representations of iterators: the view provides begin_cursor and end_cursor functions – in const and mutable forms – which return a cursor and a sentinel respectively).

There are a few simplifying implementation patterns I found in several views that I adopted. Commonly the cursor and the sentinel are templated on a bool representing constness. And it’s also common for the cursor to need a pointer back to the range it’s iterating over, and that pointer needs the right constness, so we end up with code like:

template<typename Fun, typename R1, typename R2>
struct iter_monoidal_zip_view
  : view_facade<iter_monoidal_zip_view<Fun, R1, R2>,
                detail::monoidal_zip_cardinality<range_cardinality<R1>,
                                                 range_cardinality<R2>>::value>
{
  ...
  template <bool IsConst>
  struct cursor
  {
    ...
  private:
    friend struct sentinel<IsConst>;
    // add a const if I am const
    template <typename T>
    using constify_if = meta::apply<meta::add_const_if_c<IsConst>, T>;
    // pointer to my view
    using monoidal_zip_view_t = constify_if<iter_monoidal_zip_view>;
    monoidal_zip_view_t *rng_;
    // iterators I use
    range_iterator_t<constify_if<R1>> it1_;
    range_iterator_t<constify_if<R2>> it2_;
    ...
  };
  ...
};

Satisfying the Concepts

I wanted a reversible range. That meant I needed my cursor to be bidirectional. Every cursor must implement current, next and equal, and of course a bidirectional cursor must implement prev:

auto current() const;
void next();
void prev();
bool equal(cursor const& that) const;

Every sentinel implements comparison with a cursor:

bool equal(cursor<IsConst> const &pos) const;

And the range itself implements begin_cursor and end_cursor, with end_cursor returning either a cursor or a sentinel depending on whether the underlying ranges are bounded:

  using are_bounded_t = meta::and_c<(bool) BoundedRange<R1>(),
                                    (bool) BoundedRange<R2>()>;
 
  cursor<false> begin_cursor()
  {
    return {*this, fun_, begin_tag{}};
  }
  meta::if_<are_bounded_t, cursor<false>, sentinel<false>>
  end_cursor()
  {
    return {*this, fun_, end_tag{}};
  }

(I omitted the const versions of these functions which use the CONCEPT_REQUIRES macro as a nice wrapper for enable_if style functionality.)

Nice conventions

In order that views work on containers-as-ranges, various views simply use view::all to turn whatever container is passed in into a range. If actual ranges are passed in, view::all just forwards them, unaffected. This is very handy.

There is a convention in the ranges code that deals with reporting concept violations to the user: separate out the concept as a template alias, eg:

template<typename Fun, typename R1, typename R2>
using Concept = meta::and_<
  InputRange<R1>, InputRange<R2>,
  Callable<Fun, range_reference_t<R1> &&, range_reference_t<R2> &&>>;

And write the required function with an overloaded version which uses the CONCEPT_REQUIRES_ macro in an inverted sense from the “normal” version:

// "normal" version
template<typename R1, typename R2, typename Fun,
         CONCEPT_REQUIRES_(Concept<Fun, R1, R2>())>
monoidal_zip_view<Fun, all_t<R1>, all_t<R2>> operator()(
    Fun fun, R1 && r1, R2 && r2) const
{ ... }
 
// "concept failed" version
template<typename Fun, typename R1, typename R2,
         CONCEPT_REQUIRES_(!Concept<Fun, R1, R2>())>
void operator()(Fun, R1 &&, R2 &&) const
{ CONCEPT_ASSERT_MESSAGE(...); }

The actual work

So, how does the cursor actually work? The idea is fairly simple. Keep an iterator to each range that the view is considering. Usually they advance together. When we run out of one range, keep track of the difference between the iterators, to enable reverse iteration. The rest is bookkeeping. (Error-prone, exacting bookkeeping: I didn’t get reverse iteration working properly until I gave the whole thing a proper workout with unit tests. Off-by-one errors, one of the three hardest things in computer science.)

The monoidal_zip view’s begin is when both iterators are at their respective begins, and the corresponding thing is true for end. You can find the implementation on github – it’s too much for an already long post.

Writing current is easy, we just need to check the iterators to do the monoid identity thing if needed:

auto current() const
RANGES_DECLTYPE_AUTO_RETURN_NOEXCEPT
(
  diff_ == 0 ?
    fun_(it1_, it2_) :
    diff_ > 0 ? *it1_ : *it2_
)

And that’s it. Now we have monoidal_zip, which fulfils the needs of adding power series (and also subtracting them). It works on anything that’s a monoid, not just addition. And it’s reversible. In implementing this, I discovered that the result of reversing zip_with may be surprising: because the ranges are lazily traversed, effectively it’s as if the ranges were reversed inside the zip. If the ranges are different lengths, that may not be what you want, but if you think about it, it sort of has to be that way?

After that lengthy workout, in the next instalment something easier: pretty-printing power series.

Exercising Ranges (part 2)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

First steps with power series

A power series (or polynomial, to give it a more familiar term in the case where it’s finite) is represented simply as a sequence of coefficients of increasing powers of x. For example, to represent:

1 + 2x + 3x^2

the code is:

std::vector<int> v = {1, 2, 3};

The simplest thing we can do to this is to negate it, which in range terms is a transform:

template <typename Rng>
auto negate(Rng&& r)
{
  return ranges::view::transform(std::forward<Rng>(r),
                                 [] (auto i) { return -i; });
}

Adding polynomials

Things get more interesting when we want to add two polynomials, because they might be of differing degree. If we try to use zip_with, things don’t work out as we want:

template <typename R1, typename R2>
auto add(R1&& r1, R2&& r2)
{
  return ranges::view::zip_with(std::plus<>(),
                                std::forward<R1>(r1),
                                std::forward<R2>(r2));
}
...
std::vector<int> v1 = {1, 2, 3};
std::vector<int> v2 = {1, 2, 3, 4, 5};
auto sum = add(v1, v2);
// we want sum to be {2, 4, 6, 4, 5}
// it's actually {2, 4, 6}

The problem is that zip_with (and by extension zip) only zips as far as both ranges go, and discards the “tail” of the longer range, if any. This is mostly What You Want™, but not here. What we want is for the tail to be included.

So I thought about this for a while. The first thing I thought of was appending an infinite sequence of zeroes to the shorter range:

auto summable_range = view::concat(shorter_range, view::repeat(0));

I coded this up to try it out, and in doing so learned some things about range cardinality (more on that to come), but there is a serious limitation here – at least one that was hard for me to immediately circumvent – which is that the view we return isn’t reversible. It really seems like, given two polynomials I can reverse, I ought to be able to reverse their sum. For one thing, if we’re dealing with a polynomial, it’s more naturally written as decreasing powers of x rather than increasing:

3x^2 + 2x + 1 is more natural than 1 + 2x + 3x^2

Functional spider senses

So, I thought some more – what I needed was a variety of zip that, when one range ran out, started just returning elements from the longer range. At this point, the functional programmer part of my brain woke up. It saw addition. It saw zeroes. And it whispered to me, “think monoids”. What I wanted was a generalization of polynomial addition, a “monoidal zip” that would take a binary operation and extend to the end of both ranges, using the identity element when the shorter range ran out.

In fact, since combining with the identity, by definition, leaves the other argument unchanged, there is no need to supply the identity to this function at all – a realization I didn’t come to until later, after I’d had time to step back. For now, C++ brain took over again.

template <typename R1, typename R2>
auto add(R1&& r1, R2&& r2)
{
  return ranges::view::monoidal_zip(std::plus<>(),
                                    std::forward<R1>(r1),
                                    std::forward<R2>(r2));
}
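The monoidal_zip view itself gets implemented in the next part; as a behavioral sketch (eager, over vectors, and not the view code), the idea is:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Eager model of a "monoidal zip": combine elementwise with op, then copy
// the tail of the longer input verbatim - which is exactly what combining
// it with the monoid's identity element would produce. Argument order is
// preserved, so non-commutative ops also behave sensibly.
template <typename Op>
std::vector<int> monoidal_zip(Op op,
                              std::vector<int> const& a,
                              std::vector<int> const& b)
{
  auto const& longer = a.size() <= b.size() ? b : a;
  std::size_t const n = std::min(a.size(), b.size());
  std::vector<int> result;
  result.reserve(longer.size());
  for (std::size_t i = 0; i != n; ++i)
    result.push_back(op(a[i], b[i]));
  result.insert(result.end(), longer.begin() + n, longer.end());
  return result;
}
```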

I delved into zip_with.hpp and this is where my real learnings began, in the next part.

Exercising Ranges (part 1)

Wednesday, July 1st, 2015

The idea

For some time since attending C++Now, I have been meaning to get around to playing with Eric Niebler’s range library. Eric presented ranges in a worked example as one of the keynotes at C++Now – a prior version of the talk that he gave at the Northwest C++ Users Group is currently available on Youtube, and the C++Now keynote should be up soon (under BoostCon).

Listening to Eric’s talk, I was immediately struck by how using ranges is quite functional in nature – composable, declarative, “wholemeal programming” as it’s sometimes called. And I also noticed Eric’s collection of range views and actions, many of which are of course familiar from the STL, but several more of which are clearly inspired by the Haskell Prelude. (And there are useful things in the Prelude that aren’t yet in ranges, for no particular reason other than nobody’s done the work yet.)

Then, the week after C++Now I attended LambdaConf in Boulder, and one talk in particular gave me some motivating examples to test out with ranges. So, a couple of weeks ago, experimenting with ranges finally made it to the top of my to-do list.

Starting out

Would-be range users are cautioned:

This code is fairly stable, well-tested, and suitable for casual use, although currently lacking documentation. No promise is made about support or long-term stability. This code will evolve without regard to backwards compatibility.

OK, just so we know what we’re getting into. Also, the range documentation is described as “woefully incomplete”. Most of it is stubs generated by doxygen, and what explanatory prose there is has become somewhat dated – for instance, some template classes were recently renamed. But having seen the talk, I knew some range fundamentals, so I took the approach of reading code instead of reading documentation. Whether that served me well or steered me wrong remains to be seen, but I got along OK.

Fundamentals

What I wanted to experiment with were views: that is, lightweight, non-owning constructs that (typically) present a way of iterating a range (or ranges). Existing examples are things like take, zip, concat.

From the talk, I knew some fundamentals of ranges and their iterator model. The iterator model used in ranges is a relaxed version of the STL’s begin/end paradigm, particularly with respect to end. Instead of using a pair of iterators, ranges use an iterator/sentinel pair: begin is an iterator and end is a sentinel. A sentinel can be compared to an iterator; indeed, it may itself be an iterator. But it also allows the end of a range to be marked with a sentinel value, or expressed as a condition that terminates the range. So range views can express ideas like generate_n, take_while, or infinite ranges.
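As a sketch of the idea (my own illustration, not range-v3’s actual types), a sentinel for null-terminated strings expresses the end as a condition rather than a position:

```cpp
#include <cassert>
#include <cstddef>

// The end of a C string isn't a position known up front; it's the
// condition "the current character is '\0'", captured as a sentinel type
// that is comparable against char const*.
struct null_sentinel {};

inline bool operator==(char const* p, null_sentinel) { return *p == '\0'; }
inline bool operator!=(char const* p, null_sentinel s) { return !(p == s); }

// An algorithm written against an iterator/sentinel pair instead of a
// pair of iterators.
inline std::size_t length(char const* it, null_sentinel last)
{
  std::size_t n = 0;
  for (; it != last; ++it)
    ++n;
  return n;
}
```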

Concepts

Eric has done impressive work with Concepts, mentioned in his ranges talk almost as an aside, but representing a significant and important addition to the ranges library. It means two things: first, a user of ranges can get a nice compiler diagnostic that hopefully steers her quickly to a fix, rather than having to pore over pages of template errors. But second, and as importantly, many of the ideas behind ranges are explicit in the code, not just documented or implicitly understood. And that was a big help to me as a reader of the library.

In particular, Eric has adopted and extended the STL’s idea of the different iterator categories, applying representative concepts to both iterators and ranges. In my opinion, a working knowledge of these concepts is essential for the would-be range hacker. As a new range user, I was familiar with Input (and Output), Forward, Bidirectional, RandomAccess as applied to iterators and ranges. Then I had to gain familiarity with a few new concepts: BoundedRange and SizedRange being two of them. It took me a while to integrate the various strengths of range capabilities with the old STL iterator categories model.

Feeling like a novice again

For instance, one thing that came up frequently was that I would want to use view::reverse on a range. In general, it might not be obvious which ranges can support this and I found that I had to think a bit and re-examine some assumptions. The fact that a range can be iterated bidirectionally is necessary but not sufficient for reversing it – you also have to be able to find the end. I’m not sure if Reversible exists as a Concept, but it might be a worthy addition.

It has been my experience that the nice Concept error messages aren’t always triggered. This isn’t to say they aren’t working, just that they aren’t sufficient to cover all cases nicely, perhaps especially as a library writer rather than as a user. Several times I scratched my head over a compile problem, and needed to experiment with things a bit to discover exactly what I had done wrong or had accidentally omitted. It’s been a while in my career since I had to noodle over a compiler error for that long!

Inspiration and motivation

At LambdaConf, I saw an excellent talk by Doug McIlroy about modelling power series in Haskell. In a dozen lines of code, he’s able to express arithmetic manipulation of infinite power series, as well as differentiation, integration, and functional composition. And in perhaps another dozen (just to deal with termination conditions) the code deals with polynomials (i.e. finite power series) as well. This is beautiful code, and with the power of ranges I wanted to see what I could do to get the same kind of declarative, natural style in C++.

All the code for my experiments is on github, and in the next part, I’ll cover my initial implementation steps, getting my hands dirty with ranges.

“The Lambda Trick”

Monday, May 18th, 2015

I just got back from C++Now, an excellent conference where C++ template metaprogramming experts abound. A phrase I overheard often was “the lambda trick”. It’s a trick for speeding up compiles when templates get deep.

Every C++ programmer knows that deep templates can slow down compilation. Most assume that this is because of excessive code generation (“code bloat”). But it turns out, I’m told, this isn’t actually the primary cause. As templates get deeper, type names get longer and longer, and it’s the compiler manipulating (hashing/mangling) these internally that is a major cause of slow compilation.

But when a compiler generates a closure type for a lambda, the name of that type doesn’t depend on its arguments. So it’s short, and it effectively truncates the long type name that was building up because of templates. Thus the lambda trick was born. In a post to the Boost developers’ mailing list on June 6th 2014, Louis Dionne shared this technique that he’d discovered while programming the Hana library, a functional library implemented with heavy template metaprogramming.

Let’s look at some code by way of explanation. The effect is most easily demonstrated with large tuples.

#include <iostream>
#include <tuple>
using namespace std;
 
int main(void)
{
  auto len = tuple_size<decltype(
      make_tuple( 1,  2,  3,  4,  5,  6,  7,  8,  9, 10,
                 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
                 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
                 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
                 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
                 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
                 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
                 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
                 91, 92, 93, 94, 95, 96, 97, 98, 99, 100))>::value;
  cout << len << endl;
}

This is a very large tuple, resulting in a template type with a long name, and in fact I need to give GCC an argument to increase the template depth it can deal with. When I compile it (with GCC 4.9.2) this is the result:

$ time g++ -std=c++1y -O2 -ftemplate-depth=2048 -c main.cpp -o tmp.o
 
real	0m1.616s
user	0m1.500s
sys	0m0.108s

1.6 seconds to compile that. And it produces the following assembly, optimized as you’d expect:

0000000000000000 <main>:
   0:	48 83 ec 08          	sub    $0x8,%rsp
   4:	be 64 00 00 00       	mov    $0x64,%esi
   9:	bf 00 00 00 00       	mov    $0x0,%edi
   e:	e8 00 00 00 00       	callq  13 <main+0x13>
   . . .

The constant 100 (0x64) is clearly visible there. And as expected, len is a compile-time constant. I can use it in template expressions or as an array size, for instance.

Now here is the same code with the “lambda trick” applied.

int main(void)
{
  auto list = [] (auto... xs) {
    return [=] (auto access) { return access(xs...); };
  };
 
  auto length = [] (auto xs) {
    return xs([] (auto... z) { return sizeof...(z); });
  };
 
  auto len = length(list( 1,  2,  3,  4,  5,  6,  7,  8,  9, 10,
                         11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
                         21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
                         31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
                         41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                         51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
                         61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
                         71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
                         81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
                         91, 92, 93, 94, 95, 96, 97, 98, 99, 100));
  cout << len << endl;
}

The idea here is that list is a lambda that hides the long template type. It’s a generic lambda (C++14), so its closure type’s operator() is a variadic function template. We can still pass in a function to operate on the pack – in this case length – and once again, len is known at compile time (tuple_size has been replaced with sizeof...). Compiling with -O2 gives identical code (although the lambda returned by list looks like it captures, everything gets computed at compile time), but look at the time taken:

$ time g++ -std=c++1y -g -O2 -c main.cpp -o tmp.o
 
real	0m0.325s
user	0m0.296s
sys	0m0.024s

It’s 5x faster! The lambda hid the template, truncating the internal type name, so the compiler didn’t have to deal with manipulating such long strings. That’s “the lambda trick” in a nutshell.

“In-place” might be less in-place than you think

Tuesday, May 5th, 2015

The intuitive view of algorithms that work in-place is that (it sounds obvious) they don’t use any extra space. Canonically in C/C++ we think of something like reversing an array, or that interview staple, removing spaces from an ASCII string, which we might write as:

#include <ctype.h>

/* Compacts s in place, dropping whitespace; returns the new length. */
int remove_spaces(char *s)
{
  char *readptr = s;
  char *writeptr = s;
  while (*readptr)
  {
    /* cast to unsigned char: isspace on a negative char is undefined */
    if (!isspace((unsigned char)*readptr))
    {
      *writeptr++ = *readptr++;
    }
    else
    {
      ++readptr;
    }
  }
  *writeptr = 0;
  return writeptr - s;
}

But “in-place” has a technical definition that is actually more relaxed than using no extra space, because this intuitive sense is too limiting under a formal analysis. As Wikipedia says:

In computational complexity theory, in-place algorithms include all algorithms with O(1) space complexity, the class DSPACE(1). This class is very limited; it equals the regular languages.[1] In fact, it does not even include any of the examples listed above.

For this reason, we also consider algorithms in L, the class of problems requiring O(log n) additional space, to be in-place.

And in the recent book From Mathematics to Generic Programming by Alexander Stepanov (original author of the STL) and Daniel Rose, page 215 offers the definition:

An algorithm is in-place (also called polylog space) if for an input of length n it uses O((log n)k) additional space, where k is a constant.

(I highly recommend the book, by the way.) If you’re only used to thinking about “in-place” intuitively, you could probably be persuaded that using constant extra space is admissible – after all, one probably has to use a few stack variables – but the idea that an algorithm is entitled to call itself “in-place” when it uses logarithmic extra space might be surprising. Nevertheless, that’s the technical definition: an author may call an algorithm “in-place” even though it uses – and must allocate – O(log n) extra space.
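To make the distinction concrete, here is a sketch of my own (not an example from the book): quicksort permutes its input within the array, but its recursion stack still costs space. Always recursing into the smaller partition and looping on the larger bounds the stack at O(log n) frames, so quicksort is in-place under the polylog-space definition, though not under the intuitive “no extra space” one.

```cpp
#include <algorithm>
#include <cassert>

// In-place in the polylog sense: rearranges elements within [first, last)
// while using O(log n) stack frames.
void quicksort(int* first, int* last)
{
  while (last - first > 1)
  {
    int const pivot = *(first + (last - first) / 2);
    // Three-way partition: [first, mid1) < pivot, [mid1, mid2) == pivot,
    // [mid2, last) > pivot. The equal block is never empty, guaranteeing
    // progress.
    int* mid1 = std::partition(first, last,
                               [=] (int x) { return x < pivot; });
    int* mid2 = std::partition(mid1, last,
                               [=] (int x) { return !(pivot < x); });
    // Recurse into the smaller side; loop on the larger. This is what
    // caps the recursion depth at O(log n).
    if (mid1 - first < last - mid2)
    {
      quicksort(first, mid1);
      first = mid2;
    }
    else
    {
      quicksort(mid2, last);
      last = mid1;
    }
  }
}
```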