Archive for the ‘Programming’ Category

10 Non-C++ Book Recommendations for C++ Programmers

Friday, January 19th, 2018

So you’ve learned enough C++ to function. You’ve read a bunch of the usual recommendations: Meyers, Stroustrup, maybe even Alexandrescu and Stepanov. You know enough to recommend Lippman et al. to newbies rather than the other “C++ Primer.”

The internet has lots of C++-related book recommendations to make — for example, you should absolutely read all the authors listed above — at whatever C++ developmental stage is appropriate for you. But since you can already find so many C++-specific book lists out there, I’d like to recommend some books perhaps a bit further off the beaten track that should nevertheless be interesting and help you become a better programmer in C++ or any other language. C++ is a multiparadigm language, and there is a lot outside of the usual C++ ecosystem that is valuable to learn.

These recommendations fall broadly in two categories: books about specific areas of programming, mathematics, or computer science; and books about the history of the subject and industry. There is some overlap between these categories, and I believe that all of my recommendations will either give you extra tools to use or at the very least add context to your endeavours. I can earnestly say that I have read every single one of the books I’m recommending here, although some are now available in editions that have evolved since I last read them.

In no particular order:

Dealers of Lightning: Xerox PARC and the Dawn of the Computer Age – Michael A. Hiltzik

I’m kicking off the list with a historical recommendation. As programmers, we tend to exalt the new and pay little attention to the old, but it pays to study the history of the field — nihil sub sole novum, after all. Dealers of Lightning follows the history of Xerox PARC, the place where luminaries like Alan Kay and Butler Lampson pioneered a few small things you’ve probably heard of, like laser printers, personal computing, ethernet, and GUIs. If your only experience with “object-oriented programming” is the way it’s done in C++, well… yeah. You should study history.

A Book of Abstract Algebra – Charles C. Pinter

Perhaps, like me, you have a mathematical background that ends somewhere around high school or your first year of university. Maybe you’ve heard of things like monoids, rings, or groups being used in the context of programming and wondered why smart people like Alexander Stepanov seem to talk about them so much. If you want to fill in the gaps in your knowledge, this is the book for you. One slight word of warning: it is dense, and it is mathematical. The good news is that it is also very accessible. Most of us without university-level mathematics under our belts haven’t studied abstract algebra at all, but it doesn’t really require anything beyond a comprehension of junior high mathematics to get started. When I was a teenager, calculus was hard. As an adult, I’ve been able to expand and enrich my experiences in the physical world, allowing me to make more of the mental connections needed for calculus to be accessible for me. So it is with abstract algebra — it is to programming what calculus is to modelling the real world.

Digital Typography – Donald E. Knuth

Ah, Knuth! Arguably the world’s most famous living computer scientist, Knuth is a programming household name thanks to The Art of Computer Programming, the complete volumes of which sit unread on many bookshelves. Digital Typography is an alternative offering composed of a series of essays relating various aspects of the development of TeX and Metafont. What comes across the most strongly for me while reading this is the astonishing attention to detail that Knuth brings to his works. Once you’re finished reading it, odds are you’ll have read one more Knuth book than most programmers!

The Garbage Collection Handbook: The Art of Automatic Memory Management – Richard Jones, Antony Hosking, Eliot Moss

This is the more modern descendant of Garbage Collection by Richard Jones and Rafael Lins. C++ has an aversion to garbage collection at the language level, but you might well find yourself implementing garbage collection or using a framework that has GC at some point in your career. After all, smart pointers are a form of garbage collection… one that doesn't collect reference cycles and can incur a potentially unbounded cost on a free. At any rate, this book is the bible of GC algorithms. It's really interesting and useful for anyone concerned with handling memory — i.e., everyone who writes C++.

Purely Functional Data Structures – Chris Okasaki

Functional programming seems to be all the rage these days. Pick a random video from a random C++ conference and there’s an even chance it mentions some FP influence. This seminal book is almost 20 years old and still relevant as we C++-programmers tentatively explore the world of persistent data structures and immutability. It is the book form of Okasaki’s PhD thesis, so it’s quite rigorous, particularly with respect to cost and complexity analysis. It’s also a great read if you want to understand data structures in functional languages and come up with some ideas for your own implementations. The original PhD thesis is also available for free.

Calendrical Calculations – Edward M. Reingold & Nachum Dershowitz

This is a niche topic — and therefore an expensive outlay if you can’t apply it — but for those who are interested in date calculations, it’s both comprehensive and fascinating. The code given in the book is in Lisp and is not licensed for commercial use, but the authors also point out that their aim is simply to communicate the ideas and give you a starting point for your own implementation in your language of choice. If you’ve done any kind of CS or programming course, chances are you’ve written a function to determine leap years, so unlike Microsoft Excel, you know that 1900 was not a leap year. This book takes things so much further; when I say that it is comprehensive, I mean astonishingly so. Gregorian and Julian calendars are just the tip of the iceberg here, and the book covers almost everything else you could think of. Jewish and Chinese calendars? Of course. How about Mayan, or French Revolutionary, or a dozen other rules-based or astronomical calendars? If this whets your appetite, you may want to wait for the Ultimate Edition, currently due to be released at the end of March 2018. I wonder if Howard Hinnant is working his way through all these calendars while building his date library?

About Face: The Essentials of Interaction Design – Alan Cooper et al.

This is now up to its 4th edition; I read the 2nd. If you work on an application that has a user interface — and they all do — you should read this. If you work with designers, this will help you understand their processes and language. This book strikes the right balance between breadth and depth and covers a lot of ground. I found the chapter “Rethinking Files and Save” particularly thought-provoking.

Hacker’s Delight – Henry S. Warren

After a fairly high-level recommendation, this one gets straight to the really low-level fun stuff. This is the bit-twiddler's almanac, a spiritual descendant of HAKMEM and a trove of two's complement recreations. If you've ever been asked in an interview how to count the number of set bits in a word, this is the book you want to reference. It has everything from popcount and power-of-2 boundaries to space-filling curves, error correction, and prime number generation. You should probably also look up HAKMEM (AI Memo 239), the original collection of bit manipulation hacks.

Pearls of Functional Algorithm Design – Richard Bird

You’ve probably seen the famous “C++ Seasoning” talk by Sean Parent — if you haven’t, please go watch it now. Perhaps you were motivated to read Stepanov or study algorithms as a result. The Stepanovian C++-centric view of algorithms is very much concerned with counting operations, performing manual strength reductions, and identifying intermediate calculations. This book offers an alternative way to design beautiful algorithms in a functional style using a comparatively top-down approach to express algorithmic ideas while making them fast through clean decomposition and subsequent fusion. I think it’s important to study and apply both camps of algorithm design. It is difficult to write beautiful code if you just “count the swaps” without an awareness of the mathematics behind it, but it is equally difficult to make code fast if you simply “express the mathematics” without any awareness of the operations.

Taking Sudoku Seriously: The Math Behind the World’s Most Popular Pencil Puzzle – Jason Rosenhouse & Laura Taalman

Like the first recommendation, this isn’t a book that will be immediately applicable to your day job, unless perhaps you’re writing a sudoku mobile app — in which case, buy and read this immediately. But I’m recommending it anyway because I love recreational mathematics, and this book is fun and accessible without dumbing down the material. If you’ve played sudoku and ever thought about solving — or better, generating — the puzzle programmatically, odds are you’ll enjoy this book. Group theory, graph theory, and sudoku variations of every kind abound. As a bonus, it’s dedicated to Martin Gardner.

Any recommendation list is going to be wildly incomplete and subjective, and, of course, there are a lot of books I had to leave out from this one, but perhaps a Part Two will follow at some point in the future. Feel free to chime in if you agree, disagree, or have your own favourites.

An algorithmic sketch: inplace_merge

Sunday, March 13th, 2016

One of the things I like to do in my spare time is study the STL algorithms. It is easy to take them for granted and easy, perhaps, to imagine that they are mostly trivial. And some are: I would think that any decent interview candidate ought to be able to write find correctly. (Although even such a trivial-looking algorithm is based on a thorough understanding of iterator categories.)

Uncovering the overlooked

But there are some algorithms that are non-trivial, and some are important building blocks. Take inplace_merge. For brevity, let’s consider the version that just uses operator< rather than being parametrized on the comparison. The one easily generalizes to the other in a way that is not important to the algorithm itself.

template <typename BidirectionalIterator>
void inplace_merge(BidirectionalIterator first,
                   BidirectionalIterator middle,
                   BidirectionalIterator last);

It merges two consecutive sorted ranges into one sorted range. That is, if we have an input like this:

x_0 x_1 x_2 ... x_n y_0 y_1 y_2 ... y_m

Where ∀ i < n, x_i ≤ x_{i+1} and ∀ j < m, y_j ≤ y_{j+1}

We get this result occupying the same space:

r_0 r_1 r_2 ... r_{n+m}

Where ∀ i < n+m, r_i ≤ r_{i+1} and the new range is a permutation of the original ranges. The standard also states a few additional constraints:

  • inplace_merge is stable - that is, the relative order of equivalent elements is preserved
  • it uses a BidirectionalIterator which shall be ValueSwappable and whose dereferent (is that a word?) shall be MoveConstructible and MoveAssignable
  • when enough additional memory is available, exactly (last - first) - 1 comparisons are performed. Otherwise an algorithm with O(N log N) complexity (where N = last - first) may be used

Avenues of enquiry

Leaving aside the possible surprise of discovering that an STL algorithm may allocate memory, some thoughts spring to mind immediately:

  • Why does inplace_merge need a BidirectionalIterator?
  • How much memory is required to achieve O(n) performance? Is a constant amount enough?

And to a lesser extent perhaps:

  • Why are merge and inplace_merge not named the same way as other algorithms, where the more normal nomenclature might be merge_copy and merge?
  • What is it with the algorists' weasel-word "in-place"?

First thoughts about the algorithm

It seems that an O(n log n) algorithm should be possible on average, because in the general case, simply sorting the entire range produces the desired output. The sort has to be stable, though, which means something like merge sort, and that leads us down a recursive rabbit hole. Hm.

At any rate, it's easy to see how to achieve inplace_merge with no extra memory needed by walking iterators through the ranges:

template <typename ForwardIt>
void naive_inplace_merge(
    ForwardIt first, ForwardIt middle, ForwardIt last)
{
  while (first != middle && middle != last) {
    if (*middle < *first) {
      std::iter_swap(middle, first);
      auto i = middle;
      std::rotate(++first, i, ++middle);
    } else {
      ++first;
    }
  }
}

After swapping (say) x_0 and y_0, the ranges look like this:

y_0 x_1 x_2 ... x_n x_0 y_1 y_2 ... y_m

And the call to rotate fixes up x_1 ... x_n x_0 to be ordered again. From there we proceed as before on the ranges x_0 ... x_n and y_1 ... y_m.

This algorithm actually conforms to the standard! It has O(n) comparisons, uses no extra memory, and has the advantage that it works on ForwardIterator! But unfortunately, it's O(n²) overall in operations, because of course, rotate is O(n). So how can we do better?

Using a temporary buffer

If we have a temporary buffer available that is equal in size to the smaller of the two ranges, we can move the smaller range to it, move the other range up if necessary, and perform a "normal" merge of the two ranges into the original space:

template <typename BidirIt>
void naive_inplace_merge2(
    BidirIt first, BidirIt middle, BidirIt last)
{
  using T = typename std::iterator_traits<BidirIt>::value_type;

  auto d1 = std::distance(first, middle);
  auto d2 = std::distance(middle, last);
  auto n = std::min(d1, d2);
  auto tmp = std::make_unique<char[]>(n * sizeof(T));
  T* begint = reinterpret_cast<T*>(tmp.get());
  T* endt = begint + n;

  if (d1 <= d2) {
    std::move(first, middle, begint);
    std::merge(begint, endt, middle, last, first);
  } else {
    std::move(middle, last, begint);
    auto i = std::move_backward(first, middle, last);
    std::merge(i, last, begint, endt, first);
  }
}

This is essentially the algorithm used by STL implementations if buffer space is available. And this is the reason why inplace_merge requires BidirectionalIterator: because move_backward does.

(This isn't quite optimal: the std::move_backward can be mitigated with reverse iterators and predicate negation, but the BidirectionalIterator requirement remains. Also, strictly speaking, std::merge is undefined behaviour here because one of the input ranges overlaps the output range, but we know the equivalent loop is algorithmically safe.)

Provisioning of the temporary buffer is also a little involved because we don't know that elements in the range are default constructible (and perhaps we wouldn't want to default-construct our temporaries anyway). So to deal correctly with non-trivial types here, std::move should actually be a loop move-constructing values. And when std::inplace_merge is used as a building block for e.g. std::stable_sort, it would also be nice to minimize buffer allocation rather than having an allocation per call. Go look at your favourite STL implementation for more details.

Thinking further

The literature contains more advanced algorithms for merging if a suitably-sized buffer is not available: the basis for the STL's choice is covered in Elements of Programming chapter 11, and in general the work of Dudzinski & Dydek and of Kim & Kutzner seems to be cited a lot.

But I knew nothing of this research before tackling the problem and attempting to solve it using just ForwardIterator.

I spent a couple of evenings playing with how to do inplace_merge. I covered a dozen or more A4 sheets of squared paper with diagrams of algorithms evolving. I highly recommend this approach! After a few hours of drawing and hacking I had a really good idea of the shape of things. Property-based testing came in very handy for breaking my attempts, and eventually led me to believe that a general solution on the lines I was pursuing would either involve keeping track of extra iterators or equivalently require extra space. Keeping track of iterators seemed a messy approach, so an extra space approach is warranted.

How much extra space? Consider the "worst case":

x_0 x_1 x_2 ... x_n y_0 y_1 y_2 ... y_m

Assume for the moment that m ≤ n. When y_m < x_0, we need extra space to hold all of x_0 ... x_m. If n ≤ m then we will need extra space for x_0 ... x_n to likewise move them out of the way. Either way, the number of units of extra space we need is min(n, m).

As we move elements of x into temporary storage, we can see that in general at each stage of the algorithm we will have a situation something like this (using Z to mean a moved-from value):

... x_i ... x_n Z_0 ... Z_{j-1} y_j ... y_m

With some values of x moved into temporary storage:

x_{i-t} ... x_{i-1}

The temporary storage here is a queue: we always push on to the end and pop from the beginning, since the elements in it start, and remain, ordered. Since we know an upper bound on the number of things in the queue at any one time, it can be a ring buffer (recently proposed) over a regular array of values.

Sketching the start

From this, we can start sketching out an algorithm:

  1. Allocate a buffer of size min(m, n) - call it tmp
  2. We'll walk the iterators along the x (first) and y (middle) ranges
  3. The output iterator o will start at first
  4. The next x to consider will either be in-place in the x range, or (if tmp is not empty) in tmp - call it xlow
  5. If *y < *xlow move *x to tmp, move *y to o, inc y
  6. Otherwise, if *xlow is in tmp, move *x to tmp and *xlow from tmp to o
  7. inc o, inc x
  8. if y < last and o < middle, goto 4
  9. deal with the end(s) of the ranges

Dealing with the end

This gets us as far as exhausting the smaller range: after this, we will be in one of two situations.

Situation 1. If we have exhausted the y range, things look like this:

... x_i ... x_n Z_0 ... Z_m

With values of x in temporary storage:

x_{i-t} ... x_{i-1}

To fix up this situation, we can repeatedly swap the tmp range with the equivalent x range until we reach middle (i.e. Z_0), and then simply move the remaining tmp values into place.

I originally wrote a loop repeatedly swapping the values in tmp right up to the end; but I realised that would involve swapping a moved-from object, which would be wrong (it might work… until it doesn’t). Moved-from objects should either be destroyed or made whole (assigned to); nothing else.

Situation 2. The other possibility is that we have exhausted the x range, in which case things look like this:

... Z_0 ... Z_{n-i} y_j ... y_m

With values of x in temporary storage:

x_i ... x_n

To fix up this situation, we can just do a regular merge on the remaining y range and tmp, outputting starting at middle (i.e. Z_0). (With the same proviso as before about undefined behaviour with overlapping ranges.) We know that it will be safe to do a loop equivalent to merge, because we have exactly the required amount of space before y_j to fit x_i ... x_n. This is the same as the STL's normal buffered merge strategy.

Final thoughts

I tackled this exercise from scratch, knowing nothing about actual implementations of inplace_merge. This algorithm does some extra housekeeping under the hood, but:

  • it does the minimum number of comparisons
  • each element is moved at most twice: into tmp and out again
  • it needs only ForwardIterator

Optimization and benchmarking under differing conditions of range size, comparison and move cost is left as an exercise to the reader…

I cannot recommend Elements of Programming enough. I am partway through reading it; after this exercise I skipped to chapter 11 to see what it said. Every time I dive into the STL algorithms, I am re-impressed by the genius of Alex Stepanov: Paul McJones’ recent talk The Concept of Concept explains this well, in particular the key role of concepts in the STL in service of algorithmic purity. Alex knew about concepts from the start: it’s taken C++ over 2 decades to catch up.

After doing this work, I discovered a recent proposal that calls for weakening the iterator categories of inplace_merge and related algorithms.

An implementation of this algorithm is on github. It’s just a sketch, written for clarity rather than optimality. This has been a fun exercise.

Exercising Ranges (part 8)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Extras from the Haskell Prelude

Having implemented iterate, there were a couple of other useful things from the Haskell Prelude that I wanted to have available with ranges. First, there’s cycle. It simply takes a finite range and repeats it ad infinitum. The functionality of the cursor is fairly simple, and uses patterns we’ve already seen, so here it is in its entirety:

template <bool IsConst>
struct cursor
{
  template <typename T>
  using constify_if = meta::apply<meta::add_const_if_c<IsConst>, T>;
  using cycle_view_t = constify_if<cycle_view>;
  cycle_view_t *rng_;
  range_iterator_t<constify_if<Rng>> it_;
  using single_pass = std::false_type;

  cursor() = default;
  cursor(cycle_view_t &rng)
    : rng_{&rng}
    , it_{begin(rng.r_)}
  {}
  constexpr bool done() const
  {
    return false;
  }
  auto current() const
  {
    return *it_;
  }
  void next()
  {
    if (++it_ == end(rng_->r_))
      it_ = begin(rng_->r_);
  }
  CONCEPT_REQUIRES((bool) BidirectionalRange<Rng>())
  void prev()
  {
    if (it_ == begin(rng_->r_))
      it_ = end(rng_->r_);
    --it_;
  }
  bool equal(cursor const &that) const
  {
    return it_ == that.it_;
  }
};

Like all cursors, it implements current(), next() and equal(). Additionally, if the range passed in is bidirectional, so is the cursor, using the CONCEPT_REQUIRES macro to conditionally enable prev(). There is no sentinel type needed, and the cursor implements done() in a way that makes the range infinite.

In Haskell, cycle is occasionally useful, although it is often used with zip and filter to achieve the same effect that stride – a function already provided by the range library – offers. (I think this combo use of cycle, zip and filter was hinted at briefly in Eric’s ranges keynote when discussing the motivation that led to stride.) But it’s sufficiently easy to implement and makes a nice example.

One more from the Prelude

A slightly more useful function from the Prelude is scan (in Haskell, scanl and scanr). Despite its somewhat obscure name, it’s easy to understand: it’s a fold that returns all the partial results in a sequence. For example:

Prelude> scanl (+) 0 [1,2,3,4]
[0,1,3,6,10]

In C++, I’m not sure what to call this… perhaps accumulate_partials? I left it as scan, for want of a better name. (Edit: It’s partial_sum and it exists in the STL and the ranges library. So I’ll let this section just stand as a further example.)

One immediate thing to notice about scan is that the range it produces is one element longer than the input range. That’s easily done:

namespace detail
{
  template<typename C>
  using scan_cardinality = std::integral_constant<
      ranges::cardinality,
      C::value == infinite ?
        infinite :
        C::value == unknown ?
          unknown :
          C::value >= 0 ?
            static_cast<ranges::cardinality>(C::value + 1) :
            finite>;
} // namespace detail

In general, the action of the cursor is going to be the same as we saw with iterate: keep a cached value that is updated with a call to next(), and return it with a call to current().

A nice addition to accumulate

Eric made a nice addition to the range version of accumulate which I also used for scan: in addition to the binary operation and the initial value, there is the option of a projection function which will transform the input values before passing them to the binary operation. So when next() updates the cached value, that code looks like this:

void update_value()
{
  val_ = rng_->op_(val_, rng_->proj_(*it_));
}

It’s like having transform built in instead of having to do it in a separate step: a nice convenience. And that’s all there is to scan.


This has been a fun series of experiments with ranges, and I’ve learned a lot. I’ve made mistakes along the way, and I’m sure there are still parts of my code that reflect extant misunderstandings and general lack of polish. But as an experiment, this has been a success. The code works. As a user of the ranges library with some functional programming experience, it wasn’t too hard to put together solutions. And the Concept messages really helped, along with a knowledge transfer of iterator categories from the STL.

As an extender of the library adding my own views, it took a little effort to establish correct assumptions and understandings, but Eric’s code is clear and offers good patterns to build from. I’m sure a little documentation would have steered me more quickly out of problem areas that I encountered. There are some things that are still problematic and I’m not yet sure how to tackle: chiefly involving recursion. And at the moment, Concept messages don’t always help – I think digging into compiler diagnostics is going to be a fact of life as a range implementer.

C++ will never be as syntactically nice as Haskell is, but ranges offer our best chance yet at high-performance, composable, declarative, correct-by-construction code.

Exercising Ranges (part 7)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Calculus operations

It’s relatively simple to do differentiation and integration of power series, since they involve just regular polynomial-style term-by-term rules. No trigonometric identities, logarithms or other tricky integrals here; just the plain rule for positive powers of x. For differentiation, we have:

\frac{d}{dx}kx^{n} = knx^{n-1}

Which is straightforwardly expressed with ranges: we simply take the tail of the power series and pairwise multiply with n, the power:

template <typename Rng>
auto differentiate(Rng&& r)
{
  return ranges::view::zip_with(
      [] (auto x, auto n) { return x * n; },
      ranges::view::tail(std::forward<Rng>(r)),
      ranges::view::iota(1));
}
Integration is almost as easy. The fly in the ointment is that up until now, we have been working entirely with integers, or whatever underlying arithmetic type is present in the range. Here is where Haskell wins with its built-in rational numbers and polymorphic fromIntegral for conversion from integral types. Integration brings division into the picture:

\int_{}{} kx^n dx = \frac{kx^{n+1}}{n+1}

Which means we need to turn a range of ints into a range of floats or doubles. Or use a rational number library. But anyway, sweeping these concerns under the rug, the actual code for integration is as easy as that for differentiation:

template <typename Rng>
auto integrate(Rng&& r)
{
  return ranges::view::concat(
      ranges::view::single(0.0f),
      ranges::view::zip_with(
          [] (auto x, auto y) { return (float)x / (float)y; },
          std::forward<Rng>(r),
          ranges::view::iota(1)));
}
We prepend a zero as the constant of integration.

Calculus: in real life, more difficult than arithmetic, but with ranges, considerably easier. That was a welcome break from extreme mental effort. In the next part, I round out my range experiments for now, with a look at a couple more functions from the Haskell Prelude.

Exercising Ranges (part 6)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Multiplying power series

Now we come to something that took considerable thought: how to multiply ranges. Doug McIlroy’s Haskell version of range multiplication is succinct, lazily evaluated and recursive.

(f:ft) * gs@(g:gt) = f*g : ft*gs + series(f)*gt

Which is to say, mathematically, that if we have:

F(x) = f + xF'(x)
G(x) = g + xG'(x)

then the product is given by:

F(x)G(x) = fg + xF'(g + xG') + xfG'

In C++, this function can be naively written:

template <typename R1, typename R2>
auto mult(R1&& r1, R2&& r2)
{
  auto h1 = *ranges::begin(std::forward<R1>(r1));
  auto h2 = *ranges::begin(std::forward<R2>(r2));
  return view::concat(
      view::single(h1 * h2),
      add(mult(view::tail(r1), r2),
          view::transform(
              view::tail(r2),
              [] (auto x) { return x * h1; } )));
}

If this could be compiled, it might even work. I think the lazy evaluation of the resulting range together with the monoidal_zip behaviour of add would terminate what looks like infinite recursion. But this doesn’t compile, because we don’t just have to deal with recursion at runtime: there is no termination of the recursion at compile-time resulting from the compiler trying to do type inference on the return value. Clang initially complained about exceeding the max template depth (default 256) and when I upped this to 1024, clang laboured over it for an impressive 50 seconds before crashing.

Back to pencil and paper

So I needed to think harder. It seemed that I would need to write a view representing multiplication of two series, and I started playing with polynomials – for the moment, assuming they were of the same degree. After a while I noticed something useful:

F = f_0 : f_1 : f_2 : ...
G = g_0 : g_1 : g_2 : ...
FG = f_0g_0 : f_0g_1 + f_1g_0 : f_0g_2 + f_1g_1 + f_2g_0 : ...

Each successive term is the sum of the terms for which multiplication produces the appropriate power of x. And the computation of any given term is:

f_0g_n + f_1g_{n-1} + f_2g_{n-2} + ... + f_ng_0

This is inner_product of a range with a reversed range! And we know we can reverse the range as long as it has a bidirectional iterator, because it’s just what we’ve already counted up to from the beginning – we already have the end iterator in hand. Next I had to think about how to deal with the cardinality of the product of two polynomials: it’s not enough to iterate to the end of both, even if they’re of the same degree, because there are higher powers of x that get generated in the “tail” of the product. The cardinality is actually the sum of the input cardinalities, minus one.

template<typename C1, typename C2>
using series_mult_cardinality = std::integral_constant<
    ranges::cardinality,
    C1::value == infinite || C2::value == infinite ?
      infinite :
      C1::value == unknown || C2::value == unknown ?
        unknown :
        C1::value >= 0 && C2::value >= 0 ?
          static_cast<ranges::cardinality>(C1::value + C2::value - 1) :
          finite>;

A trick of the tail

And once we reach the end of both range iterators we are multiplying, we need to go an extra distance equal to the length we’ve just covered, to obtain the higher power coefficients. I do this with a length_ variable that tracks the distance advanced, and a tail_ variable that increments after the ranges are exhausted until it reaches length_, at which point we are at the end. The end sentinel comparison looks like this:

bool equal(cursor<IsConst> const &pos) const
{
  return detail::equal_to(pos.it1_, end1_)
    && detail::equal_to(pos.it2_, end2_)
    && pos.tail_ == pos.length_;
}

The only remaining thing is to deal with different degree polynomials, for which I use the same approach as monoidal_zip, keeping a diff_ variable and doing some bookkeeping. Computing the current value for the iterator becomes the inner product of one range leading up to the current position with the reverse of its counterpart in the other range (with appropriate adjustments to deal with the tail and the different degrees of the ranges):

auto compute_current() const
{
  auto r1 = make_range(
      begin(rng_->r1_) + tail_ + (diff_ > 0 ? diff_ : 0),
      it1_);
  auto r2 = view::reverse(make_range(
      begin(rng_->r2_) + tail_ + (diff_ < 0 ? -diff_ : 0),
      it2_));
  return inner_product(r1, r2, 0);
}

So view::series_mult works. It isn’t perfect, because it imposes a bidirectional constraint on the arguments. It is O(n²), which can’t be avoided (we must ultimately multiply every coefficient by every other coefficient). To avoid the bidirectional constraint and just traverse the sequences once, I would need to go back to the recursive definition and reformulate it, somehow avoiding the recursive type. I can think of a few possible ways to do that:

  • Explicitly write a recursive type function, which includes a base case.
  • Type-erase the multiplied range somehow (this would involve throwing away cardinality and maybe some other things, so it might not lead to a good outcome).
  • Break the type recursion with a technique akin to recursive_variant.
  • Somehow collapse the type of the range, how I’m not sure… the only real way to do this I think is by copying to a container and making a fresh range from it.
  • Write my own system for lazy evaluation… which obviates ranges, really.

None of these are easy (and some have undesirable consequences). I continue to think on the matter, because Doug McIlroy’s Haskell formulations are recursive and beautiful, and multiplication is the easiest of the recursive definitions. Things get even more tricky with division and composition…

Tackling exponentiation

As a footnote to multiplication, I tried to define exponentiation. (I recently read the excellent From Mathematics to Generic Programming, and Chapter 7 is very instructive in the process of generalizing exponentiation from the basis of multiplication.) But I ran into basically the same issue: writing recursive functions to produce ranges is tricky because of the recursive type expansion. Maybe in this case the problem is well-defined enough that I can write a recursive type function counterpart to head off the compiler’s non-terminating type recursion.

For the next part of this series, I’ll look at an easier topic: differentiation and integration of power series.

Exercising Ranges (part 5)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

Generalising iota

To pretty-print a reversible polynomial, we need to provide for reversibility of iota. So what is iota? It makes a range by repeatedly incrementing a value. What if instead of increment, we use an arbitrary function? We could do this with generate_n:

template <typename Rng>
std::string to_string_reverse(Rng&& r)
{
  auto s = ranges::size(r);
  auto powers = ranges::view::generate_n(
      [s] () mutable { return --s; }, s);
  return detail::to_string(
      ranges::view::zip(ranges::view::reverse(std::forward<Rng>(r)), powers));
}

But this isn’t quite as clean as it could be: the lambda mutates state. An alternative is found in the Haskell Prelude:

iterate :: (a -> a) -> a -> [a]

iterate applies a function repeatedly to an argument, feeding the result of the last iteration to the input of the next one. It’s very similar to generate, and we can write both iterate and iterate_n. The view takes a function and an initial value, and produces the sequence:

 x, f(x), f(f(x)), f(f(f(x))), ...

Here’s the C++ for the skeleton of iterate_view:

template<typename G, typename T>
struct iterate_view
  : view_facade<iterate_view<G, T>, infinite>
{
  friend struct range_access;
  using result_t = concepts::Function::result_t<G, T>;
  semiregular_t<G> gen_;
  semiregular_t<result_t> val_;
  explicit iterate_view(G g, T&& t)
    : gen_(std::move(g)), val_(std::move(t))
  {}
  ...
};

The cursor is easily written just like generate‘s cursor: next() caches a new value, and current() returns it. Like generate, the view can only be traversed once, so the cached value is kept in the view itself rather than the iterator. And there is no end cursor.

struct cursor
{
  iterate_view *view_;
  result_t current() const
  { return view_->val_; }
  void next()
  { view_->next(); }
};
void next()
{ val_ = gen_(val_); }

For iterate_n, the approach is almost identical, except that next() decrements a count, and size() is also provided. Using iterate_n, we can generate a decreasing sequence and use it in the reverse print function for a polynomial, which we know is a SizedRange.

template <typename Rng>
std::string to_string_reverse(Rng&& r)
{
  auto s = ranges::size(r);
  auto powers = ranges::view::iterate_n(
      [] (auto i) { return i - 1; }, s-1, s);
  return detail::to_string(
      ranges::view::zip(ranges::view::reverse(std::forward<Rng>(r)), powers));
}

As for iterate itself, it’s potentially useful in lots more places. The obvious use cases (and test cases) are mathematical in nature, for instance my test case using the Collatz conjecture (aka Ulam conjecture, 3n+1 conjecture) sequence starting with 27:

int collatz(int n)
{
  if (n & 1) return 3*n + 1;
  return n >> 1;
}

DEF_TEST(Collatz27, Iterate)
{
  auto m = view::iterate(collatz, 27)
    | view::take_while([] (int n) { return n != 1; });
  EXPECT(ranges::find(m, 9232) != ranges::end(m));
  return true;
}

But there are also plenty of applications in other code, where we are updating the state of something based on its previous state: iterate separates the loop code and allows us to write a clean function that simply produces a new state from an old state. This is useful for games and simulations of all kinds. So iterate is a useful addition to the range views.

Now we can pretty-print polynomials properly. In the next part, I’m going to take on the trickier subject of how to multiply them.

Exercising Ranges (part 4)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

On reversibility

It occurs to me that I’ve been touting reversibility as a good thing, both implicitly and explicitly (for polynomials), and the reason for that is not just that I might want to print polynomials in natural decreasing-power-of-x order.

The first reason reversibility is important to me: coming from the STL, we are used to containers being iterable. If a container supports bidirectional iterators, it’s necessarily reversible, since all containers are finite and one can obtain the end iterator. The same is not true of ranges – because of the generalization of end as a sentinel condition, ranges can be infinitely long. An infinitely long range has no end, so there’s no way to reverse it, because there’s nowhere to start. It can still support bidirectional iteration, but without reversibility that’s of limited use in most cases. So in my mind, reversibility is a better concept for ranges than simply the availability of bidirectional iteration. (There may be occasional uses for bidirectional iteration without the existence of end – Turing machine simulation springs to mind – but I think reversibility is the more common desire.)

The second reason might be called Postel’s Law of iterator categories: if you’re writing an algorithm (or range view), accept as weak an iterator category as you can, but provide as strong an iterator category as you can. Many views can work with input iterators. Some require forward iterators. Few require stronger guarantees. (They might work more efficiently if random access iteration is available, but this isn’t the same as saying they require that.) And on the flip side, a view should almost never degrade the iterator category that is passed in to it. Polynomials should be reversible, so views on them should be reversible.

In this vein, I tried to make monoidal_zip require only InputRange, but support reversing if the underlying ranges support it.

Pretty-printing polynomials

With that in mind, I embarked on pretty-printing polynomials and infinite series, wanting to print them both in infinite series form (increasing powers-of-x) and “more natural” polynomial form (decreasing powers-of-x).

We have a sequence of coefficients, and each corresponds to a power of x, so I start out by zipping the polynomial with an increasing sequence of integers representing the powers.

I first thought of view::transform and view::intersperse. I coded up a solution using view::transform (view::intersperse turned out to be problematic because the thing being interspersed – the plus or minus sign – depends on the sign of the coefficient).

Here’s a helper function for printing the power of x:

std::string x_to_power(int n)
{
  if (n == 0) return {};
  if (n == 1) return {"x"};
  return {"x^" + std::to_string(n)};
}

And here’s the prettyprint function (that’s what I originally called it) using view::transform:

template <typename Rng>
void prettyprint(Rng&& r)
{
  bool start = true;
  auto m = view::zip(std::forward<Rng>(r), view::ints(0))
    | view::transform(
        [&start] (const std::pair<int, int>& p) -> std::string {
          std::string s;
          if (p.first != 0) {
            if (!start)
              s = p.first > 0 ? " + " : " - ";
            s += std::to_string(p.first) + x_to_power(p.second);
            start = false;
          }
          return s;
        });
  ranges::for_each(m, [](const string& s){ cout << s; });
  cout << endl;
}

It has several bugs as-is, and I wasn’t happy with that start boolean hanging out there. Moreover, this seemed like a very procedural way to work and was annoying the functional programming part of my brain. What I actually wanted was not a function to output a sequence, but a fold to convert a sequence into a string – or to put it in C++ terms, accumulate. The fold is independent of whether or not the sequence is reversed, so I factored out the “driver function” in anticipation of creating a reversing version of it.

template <typename Rng>
std::string to_string(Rng&& r)
{
  return detail::to_string(
      ranges::view::zip(std::forward<Rng>(r), ranges::view::ints(0)));
}

And the fold looks like this:

template <typename Rng>
std::string to_string(Rng&& r)
{
  return ranges::accumulate(
      std::forward<Rng>(r), std::string{},
      [] (std::string s, const auto& p) {
        // if the coefficient is non-zero
        if (p.first != 0) {
          // treat the first one printed specially
          if (s.empty())
            s += p.first > 0 ? "" : "-";
          else
            s += p.first > 0 ? " + " : " - ";
          auto coeff = p.first > 0 ? p.first : -p.first;
          // don't print coefficient of 1 (unless it's the constant)
          if (coeff != 1 || p.second == 0)
            s += std::to_string(coeff);
          s += detail::x_to_power(p.second);
        }
        return s;
      });
}

The complexity that creeps in here is, I think, the intrinsic complexity of printing a polynomial rather than complexity introduced by the wrong solution (start has disappeared, replaced with logic that deals with the first printed term). So this accumulate-based approach works with the power series approach of printing in increasing powers of x; but how do we reverse that zip? How can we reverse iota?

To answer that, I needed to go back to the Haskell Prelude, in the next part.

Exercising Ranges (part 3)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

So, I was going to implement monoidal_zip, and to do that, I would clone zip_with.hpp. So I did that. Eric’s a great library author, and the ranges code is pretty easy to read. For the most part I found it concise, well-expressed and clear. The parts I didn’t immediately understand were owing to my own (temporary) ignorance and not a result of murky code.

Since zip_with takes N ranges and an n-ary function, it’s written with variadic templates and folding operations over parameter packs. A monoid by definition only uses a binary operation, so I immediately simplified that. Then I worked through each part of the file, adapting it to my use case.

Range cardinality

The first things in zip_with.hpp are a bunch of constexpr functions in a detail namespace, and then a cardinality calculation. Range cardinality turns out to be an enumeration with three members – infinite (-3), unknown (-2), and finite (-1) – and then a value of zero or above is a known cardinality such as might be obtained at compile-time from an array.

Aside: I didn’t think about/realise until much later that the cardinality of a vector isn’t known at compile time. Perhaps it could be in some cases – a shortcoming of the STL re constexpr?

In the case of known cardinalities, zip_with uses the minimum of the N ranges; I needed the maximum. It’s simple, if verbose, to deal with the other possible enum values.

namespace detail
{
  template<typename C1, typename C2>
  using monoidal_zip_cardinality = std::integral_constant<cardinality,
      C1::value == infinite || C2::value == infinite ?
        infinite :
        C1::value == unknown || C2::value == unknown ?
          unknown :
          C1::value >= 0 && C2::value >= 0 ?
            max_(C1::value, C2::value) :
            finite>;
} // namespace detail

(The function max_ is one of Eric’s existing constexpr functions in zip_with.hpp.)

Cursors and sentinels

The main work in writing any view is of course, defining the cursor and the sentinel (internal representations of iterators: the view provides begin_cursor and end_cursor functions – in const and mutable forms – which return a cursor and a sentinel respectively).

There are a few simplifying implementation patterns I found in several views that I adopted. Commonly the cursor and the sentinel are templated on a bool representing constness. And it’s also common for the cursor to need a pointer back to the range it’s iterating over, and that pointer needs the right constness, so we end up with code like:

template<typename Fun, typename R1, typename R2>
struct iter_monoidal_zip_view
  : view_facade<iter_monoidal_zip_view<Fun, R1, R2>,
                /* cardinality computed via detail::monoidal_zip_cardinality */>
{
  template <bool IsConst>
  struct cursor
  {
    friend struct sentinel<IsConst>;
    // add a const if I am const
    template <typename T>
    using constify_if = meta::apply<meta::add_const_if_c<IsConst>, T>;
    // pointer to my view
    using monoidal_zip_view_t = constify_if<iter_monoidal_zip_view>;
    monoidal_zip_view_t *rng_;
    // iterators I use
    range_iterator_t<constify_if<R1>> it1_;
    range_iterator_t<constify_if<R2>> it2_;
    ...
  };
  ...
};

Satisfying the Concepts

I wanted a reversible range. That meant I needed my cursor to be bidirectional. Every cursor must implement current, next and equal, and of course a bidirectional cursor must implement prev:

auto current() const;
void next();
void prev();
bool equal(cursor const& that) const;

Every sentinel implements comparison with a cursor:

bool equal(cursor<IsConst> const &pos) const;

And the range itself implements begin_cursor and end_cursor, with end_cursor returning either a cursor or a sentinel depending on whether the underlying ranges are bounded:

  using are_bounded_t = meta::and_c<(bool) BoundedRange<R1>(),
                                    (bool) BoundedRange<R2>()>;
  cursor<false> begin_cursor()
  {
    return {*this, fun_, begin_tag{}};
  }
  meta::if_<are_bounded_t, cursor<false>, sentinel<false>> end_cursor()
  {
    return {*this, fun_, end_tag{}};
  }

(I omitted the const versions of these functions which use the CONCEPT_REQUIRES macro as a nice wrapper for enable_if style functionality.)

Nice conventions

In order that views work on containers-as-ranges, various views simply use view::all to turn whatever container is passed in into a range. If actual ranges are passed in, view::all just forwards them, unaffected. This is very handy.

There is a convention in the ranges code that deals with reporting concept violations to the user: separate out the concept as a template alias, e.g.:

template<typename Fun, typename R1, typename R2>
using Concept = meta::and_<
  InputRange<R1>, InputRange<R2>,
  Callable<Fun, range_reference_t<R1> &&, range_reference_t<R2> &&>>;

And write the required function with an overloaded version which uses the CONCEPT_REQUIRES_ macro in an inverted sense from the “normal” version:

// "normal" version
template<typename R1, typename R2, typename Fun,
         CONCEPT_REQUIRES_(Concept<Fun, R1, R2>())>
monoidal_zip_view<Fun, all_t<R1>, all_t<R2>> operator()(
    Fun fun, R1 && r1, R2 && r2) const
{ ... }
// "concept failed" version
template<typename Fun, typename R1, typename R2,
         CONCEPT_REQUIRES_(!Concept<Fun, R1, R2>())>
void operator()(Fun, R1 &&, R2 &&) const
{ ... } // body contains static_asserts that explain which requirement failed

The actual work

So, how does the cursor actually work? The idea is fairly simple. Keep an iterator to each range that the view is considering. Usually they advance together. When we run out of one range, keep track of the difference between the iterators, to enable reverse iteration. The rest is bookkeeping. (Error-prone, exacting bookkeeping: I didn’t get reverse iteration working properly until I gave the whole thing a proper workout with unit tests. Off-by-one errors, one of the three hardest things in computer science.)

The monoidal_zip view’s begin is when both iterators are at their respective begins, and the corresponding thing is true for end. You can find the implementation on github – it’s too much for an already long post.

Writing current is easy, we just need to check the iterators to do the monoid identity thing if needed:

auto current() const
{
  return diff_ == 0 ?
    fun_(it1_, it2_) :
    diff_ > 0 ? *it1_ : *it2_;
}

And that’s it. Now we have monoidal_zip, which fulfils the needs of adding power series (and also subtracting them). It works on anything that’s a monoid, not just addition. And it’s reversible. In implementing this, I discovered that the result of reversing zip_with may be surprising: because the ranges are lazily traversed, effectively it’s as if the ranges were reversed inside the zip. If the ranges are different lengths, that may not be what you want, but if you think about it, it sort of has to be that way?

After that lengthy workout, in the next instalment something easier: pretty-printing power series.

Exercising Ranges (part 2)

Wednesday, July 1st, 2015

(Start at the beginning of the series if you want more context.)

First steps with power series

A power series (or polynomial, to give it a more familiar term in the case where it’s finite) is represented simply as a sequence of coefficients of increasing powers of x. This is simple – to represent:

1 + 2x + 3x^2

the code is:

std::vector<int> v = {1, 2, 3};

The simplest thing we can do to this is to negate it, which in range terms is a transform:

template <typename Rng>
auto negate(Rng&& r)
{
  return ranges::view::transform(std::forward<Rng>(r),
                                 [] (auto i) { return -i; });
}

Adding polynomials

Things get more interesting when we want to add two polynomials, because they might be of differing degree. If we try to use zip_with, things don’t work out as we want:

template <typename R1, typename R2>
auto add(R1&& r1, R2&& r2)
{
  return ranges::view::zip_with(std::plus<>(),
                                std::forward<R1>(r1),
                                std::forward<R2>(r2));
}

std::vector<int> v1 = {1, 2, 3};
std::vector<int> v2 = {1, 2, 3, 4, 5};
auto sum = add(v1, v2);
// we want sum to be {2, 4, 6, 4, 5}
// it's actually {2, 4, 6}

The problem is that zip_with (and by extension zip) only zips as far as both ranges go, and discards the “tail” of the longer range, if any. This is mostly What You Want™, but not here. What we want is for the tail to be included.

So I thought about this for a while. The first thing I thought of was appending an infinite sequence of zeroes to the shorter range:

auto summable_range = view::concat(shorter_range, view::repeat(0));

I coded this up to try it out, and in doing so learned some things about range cardinality (more on that to come), but there is a serious limitation here – at least one that was hard for me to immediately circumvent – which is that the view we return isn’t reversible. It really seems like, given two polynomials I can reverse, I ought to be able to reverse their sum. For one thing, if we’re dealing with a polynomial, it’s more naturally written as decreasing powers of x rather than increasing:

3x^2 + 2x + 1 is more natural than 1 + 2x + 3x^2

Functional spider senses

So, I thought some more – what I needed was a variety of zip that, when one range ran out, started just returning elements from the longer range. At this point, the functional programmer part of my brain woke up. It saw addition. It saw zeroes. And it whispered to me, “think monoids”. What I wanted was a generalization of polynomial addition, a “monoidal zip” that would take a binary operation and extend to the end of both ranges, using the identity element when the shorter range ran out.

In fact, since using the identity is, by definition, unchanging to the other argument, there is no need to supply the identity to this function – a realization that I didn’t come to until later after I’d had time to step back. For now, C++ brain took over again.

template <typename R1, typename R2>
auto add(R1&& r1, R2&& r2)
{
  return ranges::view::monoidal_zip(std::plus<>(),
                                    std::forward<R1>(r1),
                                    std::forward<R2>(r2));
}

I delved into zip_with.hpp, and this is where the real learning began, in the next part.

Exercising Ranges (part 1)

Wednesday, July 1st, 2015

The idea

For some time since attending C++Now, I have been meaning to get around to playing with Eric Niebler’s range library. Eric presented ranges in a worked example as one of the keynotes at C++Now – a prior version of the talk that he gave at the Northwest C++ Users Group is currently available on YouTube, and the C++Now keynote should be up soon (under BoostCon).

Listening to Eric’s talk, I was immediately struck by how using ranges is quite functional in nature – composable, declarative, “wholemeal programming” as it’s sometimes called. And I also noticed Eric’s collection of range views and actions, many of which are of course familiar from the STL, but several more of which are clearly inspired by the Haskell Prelude. (And there are useful things in the Prelude that aren’t yet in ranges, for no particular reason other than nobody’s done the work yet.)

Then, the week after C++Now I attended LambdaConf in Boulder, and one talk in particular gave me some motivating examples to test out with ranges. So, a couple of weeks ago, experimenting with ranges finally made it to the top of my to-do list.

Starting out

Would-be range users are cautioned:

This code is fairly stable, well-tested, and suitable for casual use, although currently lacking documentation. No promise is made about support or long-term stability. This code will evolve without regard to backwards compatibility.

OK, just so we know what we’re getting into. Also, the range documentation is described as “woefully incomplete”. Most of it is stubs generated by doxygen. What is there in terms of explanatory prose is also somewhat dated at this point – for instance, recently some template classes were renamed – but having seen the talk, I knew some range fundamentals, so I took the approach of reading code instead of reading documentation. Whether that served me well or steered me wrong remains to be seen, but I got along OK.


What I wanted to experiment with were views: that is, lightweight, non-owning constructs that (typically) present a way of iterating a range (or ranges). Existing examples are things like take, zip, concat.

From the talk, I knew some fundamentals of ranges and their iterator model. The iterator model used in ranges is a relaxed version of the STL’s begin/end paradigm, particularly with respect to end. Instead of using a pair of iterators, ranges use the idea of an iterator/sentinel pair, so begin is an iterator and end is a sentinel. A sentinel can be compared to an iterator, indeed, it may be an iterator. But it also allows the end of a range to be marked with a sentinel value, or it can express a condition that terminates the range. So range views can express ideas like generate_n, take_while, or infinite ranges.


Eric has done impressive work with Concepts, mentioned in his ranges talk almost as an aside, but representing a significant and important addition to the ranges library. It means two things: first, a user of ranges can get a nice compiler diagnostic that hopefully steers her quickly to a fix, rather than having to pore over pages of template errors. But second, and as importantly, many of the ideas behind ranges are explicit in the code, not just documented or implicitly understood. And that was a big help to me as a reader of the library.

In particular, Eric has adopted and extended the STL’s idea of the different iterator categories, applying representative concepts to both iterators and ranges. In my opinion, a working knowledge of these concepts is essential for the would-be range hacker. As a new range user, I was familiar with Input (and Output), Forward, Bidirectional, RandomAccess as applied to iterators and ranges. Then I had to gain familiarity with a few new concepts: BoundedRange and SizedRange being two of them. It took me a while to integrate the various strengths of range capabilities with the old STL iterator categories model.

Feeling like a novice again

For instance, one thing that came up frequently was that I would want to use view::reverse on a range. In general, it might not be obvious which ranges can support this and I found that I had to think a bit and re-examine some assumptions. The fact that a range can be iterated bidirectionally is necessary but not sufficient for reversing it – you also have to be able to find the end. I’m not sure if Reversible exists as a Concept, but it might be a worthy addition.

It has been my experience that the nice Concept error messages aren’t always triggered. This isn’t to say they aren’t working, just that they aren’t sufficient to cover all cases nicely, perhaps especially as a library writer rather than as a user. Several times I scratched my head over a compile problem, and needed to experiment with things a bit to discover exactly what I had done wrong or had accidentally omitted. It’s been a while in my career since I had to noodle over a compiler error for that long!

Inspiration and motivation

At LambdaConf, I saw an excellent talk by Doug McIlroy about modelling power series in Haskell. In a dozen lines of code, he’s able to express arithmetic manipulation of infinite power series, as well as differentiation, integration, and functional composition. And in perhaps another dozen (just to deal with termination conditions) the code deals with polynomials (i.e. finite power series) as well. This is beautiful code, and with the power of ranges I wanted to see what I could do to get the same kind of declarative, natural style in C++.

All the code for my experiments is on github, and in the next part, I’ll cover my initial implementation steps, getting my hands dirty with ranges.