Archive for the ‘Programming’ Category

C++ Guru Question – followup

Tuesday, August 12th, 2014

(following on from C++ Guru Question)

There are a few reasons why the code before didn’t work: mainly

a) C++ template argument deduction works one-way with a list of candidates, it’s not H-M type inference.
b) A C++ lambda is a thing with some internal type, not a std::function (although it can be assigned to one, that doesn’t matter for template type deduction).

Anyway, a reasonable solution to this problem is to simplify the template argument matching to the function type, and use a simple function_traits template to take apart the function argument and return types.

template <typename T>
struct function_traits
  : public function_traits<decltype(&T::operator())>
template <typename R, typename A>
struct function_traits<R(A)>
  typedef R returnType;
  typedef A argType;
template <typename C, typename R, typename A>
struct function_traits<R(C::*)(A) const>
  : public function_traits<R(A)>
template <typename T>
struct Foo
  T m_t;
template <typename F>
typename function_traits<F>::returnType operator/=(
    Foo<typename function_traits<F>::argType> foo, const F& fn)
  return fn(foo.m_t);
void mystery()
  auto foo = Foo<int>{1};
  // this works...
  function<Foo<int>(int)> f1 = [] (int i) { return Foo<int>{i}; };
  auto bar1 = foo /= f1;
  // this does too!
  auto f2 = [] (int i) { return Foo<int>{i}; };
  auto bar2 = foo /= f2;

C++ Guru Question

Monday, August 11th, 2014

Wondering about this… template argument deduction succeeds for the explicitly-typed variable, fails in the auto case. (Also, it succeeds either way for an equivalently-typed unary operator template).

template <typename T>
struct Foo
  T m_t;
template <typename T, typename U>
Foo<U> operator/=(Foo<T> foo, function<Foo<U>(T)> fn)
  return fn(foo.m_t);
void mystery()
  auto foo = Foo<int>{1};
  // this works...
  function<Foo<int>(int)> f1 = [] (int i) { return Foo<int>{i}; };
  auto bar1 = foo /= f1;
  // this doesn't
  auto f2 = [] (int i) { return Foo<int>{i}; };
  auto bar2 = foo /= f2;
  // clang++ 3.3-16ubuntu1 says:
  // file:line:col: error: no viable overloaded '/='
  // auto bar2 = foo /= f2;
  //             ~~~ ^  ~~
  // file:line:col: note: candidate template ignored: could not match
  // 'function<Foo<type-parameter-0-1> (type-parameter-0-0)>' against
  // '<lambda at file:line:col>'
  // Foo<U> operator/=(Foo<T> foo, std::function<Foo<U>(T)> fn)
  //        ^

(followup to this)

An Interesting C++ Technique

Monday, August 26th, 2013

I recently discovered a C++ technique I haven’t seen recognized before in any books, articles, or mentioned online anywhere (search terms are difficult perhaps). For want of a better name I call it Structural Capture. Consider the following code:

#include <functional>
#include <iostream>
using namespace std;
struct Foo
  template <typename T>
  Foo(const T& t)
    m_outputFunc = [=] (ostream& s) -> ostream& { return s << t; };
  function<ostream& (ostream&)> m_outputFunc;
ostream& operator<<(ostream& s, const Foo& f)
  return f.m_outputFunc(s);
int main(int argc, char *argv[])
  Foo f(1);
  cout << f;
  return 0;

Foo is a regular struct(/class), but its constructor is a method template which wraps its argument in a lambda. When you construct a Foo, the you get the benefit of structural typing that templates bring: as long as the template argument implements a stream output operator, it’ll compile.

But through the use of a lambda, you also get the benefit of runtime polymorphism, and because Foo is a concrete type, you can put it in a data structure (like a container) without the template argument leaking out.

This seems like it could be quite useful in various scenarios. Sure, there’s a memory cost, and you could look at it as replacing a (per-type) vtable with per-instance member variables (ie. lambda functions and their associated captures). But as a way of mixing the benefits of compile-time and runtime polymorphism I think it’s pretty neat.

RIP drm

Thursday, October 13th, 2011

This week, Dennis M. Ritchie, co-creator of C and Unix, died. His death has not made the front page like Steve Jobs’ did. But although the non-tech world had never heard of him, he was more important than Jobs.

I can still remember buying my copy of K&R, that slim volume that was my reference manual and introduction to C. Shortly before term in my second year, my Dad drove me up to Cambridge, and after unloading my stuff and having lunch, we went into Heffers on King’s Parade. On the upper tier, about two-thirds of the way toward the back on the right, was the Computer Science section. My Dad bought for me a copy of K&R and also a copy of Modula-3. K&R cost £25.

Modula-3 I seldom used after graduating, but C became a permanent part of my life. I spent my university summers compiling Linux kernels and fiddling with programs, and then I got a job in the games industry, and the rest is history. Until only a few years ago, I still used my copy of K&R now and then, mostly for reference (p53, operator precedence; p122, complicated declarations, and of course the appendix for remembering the order of arguments to fread() et al). At my last company, at one point it was issued to every engineer as a matter of course. It has now been retired only because of “proper” C++ I/O finally being acceptable in games.

As for the rest, Wired says it better than I can. Thank you, Dennis.

How a Bug Made Me a Better Programmer

Sunday, March 1st, 2009

This is the tale of a bug. A challenging bug I encountered some time ago, and which changed my career for the better.

It was late 2004. I was lead multiplayer engineer on Goldeneye: Rogue Agent, a title which could charitably be described as mediocre, but I like to think the networked multiplayer game was a bit of fun – we had fun with it, anyway.

We’d been working 16-hour-ish days, 6.5 days a week for oh, about four months. These were the bad old days immortalized by ea_spouse, but that’s another story. Suffice to say that I was probably running up a sleep deficit, with only amusing online videos and Fire Emblem on the SP keeping me sane in between bug fixing.

We’d been having fun every evening playing 8-player network games for a while, and overall, the network game was pretty stable. The whole project was down to somewhere around a hundred outstanding bugs, which sounds like a lot, but the engineering team was over 70 engineers at its peak. There was one particular bug that was vexing me, however. The network game went in rounds, cycling maps. Every so often, a machine would fail to get into the next round after loading a map. It would just hang on the loading screen, having completed loading, and never having got the message to start playing.

I’m betting that most network game programmers (and indeed most network programmers) would immediately recognise the type of bug – a classic race condition. The QA dept reported this bug originally, and it happened about 1% of the time. If you were playing an 8-player game, you’d see it during a few hours of play. This was a real showstopper – no way out but to reboot the console, too frequent to ship, but far too rare to get much of a handle on.

So the first thing to do was to glean what I could. Since it wasn’t actually crashed at the point of the hang (still executing the game loop, just doing nothing), I could attach a debugger to a QA machine when the bug occurred. However, this didn’t really help much per se – a crash with a call stack showing the exact location would have been preferable. However, the debugger did allow me to slightly narrow down what was happening through judicious breakpointery.

Debugging network games is notoriously difficult, and with good reason. Although I could see the state of the problem machine, there was no context left to evaluate the problem. The rest of the machines had left it behind by this point, and I had to try to piece together the trail of events leading up to the bad state, after the bad state had already been triggered. So I fell back on the old school technique of adding copious amounts of logging, aka “printf debugging”.

So over the course of several days and evenings, my cubemates and I continued to play game after game, and each time we saw this problem, I’d stop and painstakingly go through the logs. I’d figure out a little bit more of what was happening, narrow down the error to a smaller section of code, add more logging, rebuild and redeploy my new version to the team. The turnaround time was probably a little under an hour. Of course, the first thing I did also was to make the game timeout after a few seconds, so that we could set up an 8-player game and leave it to map-cycle without actually wasting problem-hunting time actually playing the game.

In between times, I read the code. And re-read the code. And re-read it, and stepped through it, trying to think about and follow each possible path of execution. By this time, experiment had narrowed the problem down to a few functions, and eventually, while reading the code, I realised what the problem was. The evidence from the logs and the days of trying to reproduce the issue confirmed the diagnosis. It was, of course, a race condition. And as is so often the case with programming, once characterised, the bug was fairly easy to fix. I put a fix in place, commented the change, tested it, and committed it to the codebase. We did not see any more occurrences of the bug.

That was Thursday. On Tuesday I got the bug returned by QA. At first I was incredulous – how could it be extant after 5 days of silence, seemingly fixed? But by this stage of my programming career, I had developed some humility about my own code, and respect for QA. I mean, the guys and girls who test stuff don’t actually lie or make stuff up. If the bug was still there, it was still there, so I went back to it, after checking the reported version.

The problem was, as I saw it, that there was no way for it to happen any more. By this point, this was not hubris or denial on my part either. I had been through the code literally hundreds of times. For a week, this code was the last thing I thought about before falling asleep at night, and the first thing I pondered when waking up. I’m sure I had dreamt it once or twice. I could close my eyes and see the code in my head as clearly as I could read it on the screen. There was just no way the bug could still be there.

And yet it was there.

So, I had to find a new approach. As odd as it may seem, I had only so far looked at the bug in C++. After all, it was not a particularly gnarly bug as far as these things normally went – no heap or stack corruption, no optimised build vs debug variance, no weird-PS2-DMA-ness, no occurrence-on-real-media-only type of thing. But now it had got significantly harder, so I decided to take a look at the assembly code generated by the compiler. This was actually easy to get started with: I didn’t need to reproduce the bug; all I had to do was set a breakpoint in the function, run to that, and look at the assembly in the debugger.

Now I wouldn’t call myself an assembly language expert. I’ve never written assembler in anger, but I can get by reading it, and I know the basics to get around – how the compiler sets up and returns from function calls, how to find this pointers and recognise virtual tables, etc. And all of this stuff is easier on a RISC CPU than on say, Intel. So I fired up the PS2 devkit.

What I found was very puzzling. I looked at the assembly language for the function in question. I looked at the C++ source code. And what I saw on one side was not what I saw on the other side. They matched up to a point. But my fix was nowhere to be found in the machine code. It should have been fairly obvious to spot – the code had a few handy signposts like arithmetic instructions and constants which were transparent in the assembler. But the fix was just not there.

I called over a couple of other senior engineers to get a second opinion. Was I seeing things? They agreed that I wasn’t. Very strange, but now we were quite close to finally, properly, solving this bug. All we had to do was figure out why the compiler was ignoring the C++ source line that constituted the fix.

So we looked at the source control logs. What we found was that the file in question had, some time ago, had its line endings changed. What should have been CRLF was merely LF. Ordinarily this would be fine, but when I’d made the fix, I’d commented it. The fix was one line. The comment was one line, above the fix. Evidently the compiler had seen both lines as one. And the whole thing was therefore a comment, to be discarded.

Once found, of course, the fix was trivial. Fixing the file line endings allowed my original fix to work, and the bug was finally dead.

Since then, I regularly drop to assembler in the debugger. With assembly language, you know what’s going on. It’s made me a better programmer, and a better bug fixer. All those gnarly bugs I mentioned earlier (eg. ones that only occur in release builds) don’t seem so gnarly any more when you’re used to looking at assembler.

I wrote my first Python program

Tuesday, January 20th, 2009

Last Friday. Knowing almost no Python at noon, by 5pm I had some code to munge XML and do something useful for my current project.

So it’s not bad. It’s good for productivity. Mostly because:

  • It has useful libraries.

  • Bread-and-butter data structures are built in, i.e. lists and dictionaries.
  • I don’t care too much about performance for small tasks.

I can deal with significant whitespace, and a few syntactic things trip me up (like colons after if statements), but that’s small beans.

There are a few things on the minus side:

  • All that stuff with self just seems like ugly boilerplate extra typing.

  • Philosophically, I don’t like a BDFL unilaterally breaking backwards compatibility and removing FP features.

    Overall, Python is just fine, but underwhelming. It doesn’t have a monopoly on the good features, which makes it really just another learn-on-demand scripting language.

Functional Rainbow

Monday, December 29th, 2008

Functional Rainbow

(Part of) what I do all day

Friday, September 12th, 2008

Lately I’ve been asked what I do at work. What I really do. Well, if Great Biggary can frequently expound on his latest Maya scripts and python excursions, here goes with my explanations…

Most of what I do all day is read code. Just like an author or poet will read many many more books and poems than they write, so programmers read much more code than they write. At least, the ones that want to be any good do.

So I read and read and read, and hopefully understand, before I finally write a little bit which pokes the program in just that way and achieves the right result. Changing or writing as little code as possible means introducing as few bugs as possible.

I’m working with a few million lines of code, most of it C++. Today was a day when I didn’t quite get it to do what I wanted, but had to settle for second best. I’m writing some code to analyse how much data we load off disk every frame. (Note for my non-technical readers: a frame – in this case – is 1/30 of a second or 33ms. Quite a long time to an Xbox 360, but not long to you and me. It’s like… now to now. Nowtonow. N-now. You get the idea.)

Where was I? Oh yes. Well, while the game is running, we’re often loading things from the disk, which is why I’m writing this code. During development, we run off hard disk, which is really fast, but when we ship the final game, we need to run off DVD, which is really slow by comparison, which is why it’s important for us to know exactly how much data we’re asking it to read over time. Otherwise bad things happen like the player coming around a corner to discover that the ground ahead has temporarily disappeared. You might have seen something like this in video games from time to time, especially when discs get dirty.

There are basically a few categories of things we load (or “stream”) while the game is playing:

  • Sound effects and music
  • Textures (i.e. graphics, to explain further is outside the scope of this post, as they say)
  • Videos (when they are playing)
  • Extra parts of levels (“the world”, which are triggered to load at certain points as the player moves along)

So, imagine I’ve written a spiffy little display which shows how much data we loaded in each category over the last 2 seconds (or 60 frames, if you were paying attention earlier). All I need to do is actually feed the data into it, which means intercepting the part of the code where we actually read from the disc, and keeping a running total of the number of bytes we read, in each category.

This was quite easy to do for textures and levels, which go through one mechanism for loading. No problem there. But it was a little more tricky to do for the audio and video data. Especially the video. Here I need to get a bit technical to explain why.

To hook up the texture and level data to the right category in the display was fairly easy – at one point in time when a disc load is scheduled, I save a little piece of context along with the load request so that later, when the disc load happens, the saved context tells me what category to add to. The code that knows which category to add to is just one step away from the code that actually does the loading, so this is easy to keep track of.

But it’s a large codebase, and the code that reads data off the disc for the audio and video data is different from the code that reads data for the other categories. In fact, it turns out that the video code especially is much more complex. In this case, the code that actually does the loading is many steps away from the code that knows the context of the data. This is because the video and audio code is pretty aggressively multi-threaded.

Explanatory aside for the non-techies: a thread is a single set of instructions the computer is executing, one after another. In the real olden days, one thread was all you got. Then some bright spark (no, really) noticed that a lot of the time the processor was waiting for other stuff to happen (like reading from a disk, say) and the concept of multi-threading was born. If one lot of instructions (e.g. one program) was waiting for something to happen and unable to proceed, well, we could let the computer work at doing something else for a while until it could resume the first task. Heck, we could even interrupt the computer after it had been doing one thing for a while and let it do another thing for a while, and if we did this fast enough, it would look like the computer was doing two things at once!

This concept is now the basis of all modern operating systems, even those written by Microsoft. This is why you can have your email open and browse the web at the same time. And nowadays we don’t just have the illusion of computers (and consoles) doing multiple things at once; we are starting to get machines with more than one processor in them, so that they really can do multiple things in parallel.

So back to the video problem. It’s this: the code that counts the bytes that are loaded is in one thread. The video player control is in another. The code handling the audio is another. The code that actually reads the data off disc is yet another. And (this is the real kicker) the code that decodes the video and actually triggers more data to be read off disc when it’s ready for more? Well that code is on several threads, and they are changed around while the program is running according to which processors have spare time.

So to get a 100% accurate attribution of bytes read from disc to audio and video categories would require passing that bit of context through many handoff points, writing a lot of code, and recompiling many libraries, and therefore also maintaining those changes to the libraries, which are normally handled by other people in other parts of the company. (A library – being a chunk of code that is self-contained and that you can use generally without detailed knowledge of its innards – is primarily a mechanism for reducing complexity. Audio and video code is typical of the sort of thing that gets put into libraries. So changing the insides of a library, while technically as possible as changing any code is, is preferably to be avoided.)

So my solution? I put in a piece of code to flag when a video was playing, and a piece of code to intercept the actual reading the data off disc part that services the audio and video code. A piece of code at the very top and bottom layer of the cake, as it were. Videos play rarely, so when a video is not playing, I add up all the bytes and simply put them in the audio category. When a video is playing, I put them all in the video category (even though at that point it includes some audio data). The result is not perfect, since video and audio data can’t be separated, but since videos play rarely, it works well enough for my purposes.

Sometimes, even in programming, one needs to abandon the perfectly correct in favour of the good-enough, pragmatic solution.

Why Reading Books Matters to a Programmer

Tuesday, April 29th, 2008

The programming book market these days is small. Nothing like what it was eight years ago or so. And apparently, programmers don’t read books (any more).

It’s mostly true. But of course, there are still books worth reading. I’m going to take as read the easy arguments: let’s assume that you’ve already pared away all the “learn such-and-such in 24 hours”, anything with “for dummies” or “for teens” or equivalent in the title, and any book that weighs more than your head.

What you are left with is two kinds of books: timeless or beautiful texts that everyone agrees will still be worth reading in 20 years (and in many cases, already have passed this test), and the tougher category to decide on: the middling sort of books, probably specific to certain technologies and/or languages. In many cases they may be very good books, may even have been defining texts on first release, but either technology has moved on, or there is so much great information out there on the web that it makes spending $50+ on a book a proposition to think about.

The answer is: you should absolutely use all the information on the web, pulling it in as you “page fault”, and googling solutions as you go, in a practical way. AND you should absolutely read these books.

Reading and doing are not mutually exclusive ways of learning. They are complementary. And reading is more useful than many pragmatist programmers realise. Sure, you’re not going to learn practical things like debugging or how to wrangle libraries and packages just by reading. But you’re not going to get much wider context just by doing, either.

One of the first programming (culture) books I bought was the New Hacker’s Dictionary. It says this about reading:

Although high general intelligence is common among hackers, it is not the sine qua non one might expect. Another trait is probably even more important: the ability to mentally absorb, retain, and reference large amounts of ‘meaningless’ detail, trusting to later experience to give it context and meaning. A person of merely average analytical intelligence who has this trait can become an effective hacker, but a creative genius who lacks it will swiftly find himself outdistanced by people who routinely upload the contents of thick reference manuals into their brains.

The crucial part here is that reading gives a wider context to doing. This is why programmers should read, and reading and retaining (or at least being able, on second encounter, to remember and put in context) technical information is an important skill to have.

So read those books. Read code and blogs too. It may sound daunting (“How can I possibly remember everything?”) but the more you read and remember, the easier it will get, because brains are great at putting things in context and linking to things that they already know. And then when you get to doing, you’ll see the bigger picture, and perhaps you’ll also remember about a looming pitfall you read about, but which the hastily-constructed web tutorial you’re currently perusing glosses over.

Curve and Vector

Sunday, March 30th, 2008

Curve (com.elbeno.curve) is my common lisp package for doing cool things with two-dimensional curves. In particular, modulating cubic Bézier curves and splines, but also approximating arbitrary elliptical and circular arc segments with cubic Bézier curves.

It depends on Vector (com.elbeno.vector), a cobbled together set of functionality for representing and manipulating points on the 2D Cartesian plane. It also depends on Zach Beane’s excellent Vecto, for ease of drawing. Both Curve and Vector are available under the GPLv3 license.

Download Curve
Download Vector

Currently both Curve and Vector are at v0.1.