Happy Solstice

June 22nd, 2009

…and Father’s day, if applicable. I’m actually over a day late now… I spent a pleasant day with the family, and trying to keep a cold at bay.

Things You Didn’t Know Had Another Name

June 14th, 2009

Under the general heading of “Oh, so that’s what it is! I never realised,” picture this.

It’s late 70s/early 80s Britain. You can’t sleep, so get up at some infeasibly early time, grab your morning stimulant of choice and settle down in the lounge to see what’s on TV. Not much, in this pre-dawn, pre-breakfast-TV scenario. Pages from Ceefax of course, and flipping over to BBC2 you hear the familiar strains of…

Leonard Salzedo’s Divertimento!

(Explanation for Americans)

Ow

June 7th, 2009

While the mini-Elbeno and I were horsing around today, he managed to get me in the eye with a fingernail. Luckily our next door neighbour is an optometrist, and she took me down to her office tonight to check it out. So I have a scratched cornea, not too bad - although it is painful. It should heal in a couple of days and I have some antibiotic drops. (O.S. in case you’re wondering, and it’s not affecting my vision.)

I’m Jaunty

May 16th, 2009

A pretty smooth upgrade -

  • Freeciv(-client-gtk) seemed to barf, so I just uninstalled it.

  • The new Amarok is nice, even if the jaunty version isn’t totally up-to-the-minute. I did have to install phonon-backend-xine though.
  • LLVM is v2.5, hurrah! GHC is still v6.8.2, boo.

Fixed gate

May 14th, 2009

Today we had a new automatic gate opener/closer installed, so our driveway gate will now stay closed as a rule. This also means that we can let mini-Elbeno out into the yard or the driveway to play or trundle about on his trike while we are nearby, without having to constantly shepherd him or prevent him from going out into the street.

Battle of the Bands

May 8th, 2009

The buildup started 5-6 weeks ago with band formations and rumours of entries. Then came the posters stuck up across the campus advertising the various bands. Finally, at 5pm today it all kicked off - the second annual Blizzard Battle of the Bands.

When I worked at EA, we had the odd band contest - usually held in the cafe at lunchtime, with a handful of bands and the more keen ones dressing up. Blizzard’s extravaganza was simply on a different scale.

The whole company (and their families) kicked back to enjoy 2 hours of rock on a balmy evening, with free beer/soda/water, kids running around and everyone just having a great time. Eight bands took their turns up on stage - and what a show it was. Multi-kilowatt speaker systems, stage/disco lighting, half a dozen big-screen TVs, and the whole thing filmed by DirecTV.

The bands themselves left nothing spared in their quest for glory. Costumes, makeup, throwing t-shirts and tchotchkes out to the crowd, pyrotechnics, streamers and confetti, the works. And most of them played a mean game of Guitar Hero too! We all rocked out with Cute Overlord, Max DPS, Zergling Rush, THAC0, Megadeth Knight, Yasu and the Werewolves, Super Shredder Super Show, and Hair Trigger Reloaded. Who won? You’ll have to wait for the video to surface at Blizzcon…

Book Festival

April 26th, 2009

This weekend we all went to the LA Times Festival of Books at UCLA. The number of stalls was noticeably smaller this year: a whole row of children’s publishers was gone. We saw the following panels:

  • Real Science (K. C. Cole moderating, with Carl Zimmer, Leonard Susskind and Avery Gilbert).
    A good panel discussion with 3 different fields represented: the evolutionary biology of E. coli, cosmology, and the science of smells. Each panellist was engaging in his own way. I picked up The Black Hole War and What the Nose Knows.

  • Science Fiction: The Grand Masters (Scott Timberg moderating, with Joe Haldeman, Harry Harrison and Robert Silverberg).
    This discussion was a disappointment. There were some interesting points but for the most part it could have been retitled “Science Fiction: The Grumpy Old Codgers”. The moderator wasn’t very good either, dragging out such dreadfully old chestnuts as “Is SF really about the future, or about the present?”

  • T.C. Boyle with an Introduction by Georges Borchardt.
    Mrs Elbeno attended this one, and according to her it was very good. I took care of the mini-Elbeno at the Target Children’s Stage for most of the time, and we watched Choo Choo Soul. Then he ran around exploring the university staircases and such for an hour while Mrs Elbeno waited to meet TC and get her books (Tortilla Curtain and The Women) signed.

  • Michael J. Fox in Conversation with Mary McNamara.
    I didn’t get to see much of this because the mini-Elbeno wasn’t able to sit still quietly, so I headed out with him after about 10 minutes. Which was actually more like 5 minutes because someone held up the proceedings by shouting for a doctor like it was a medical emergency, only to recover and be apparently all right. Anyway, Mrs Elbeno reported that Michael J. Fox was good; Mary McNamara not so much. And the questions from the audience were of the usual poor standard. Given that I had left early, I went straight down to the post-interview signing zone, but all the presigned books were already sold out, and I guessed that MJF wasn’t about to do a big signing after the interview. So I picked up an unsigned one from the UCLA store.

I also snagged a few other books from the UCLA store which was having a festival-long 20% off sale: Gaming: Essays on Algorithmic Culture, Handbook of Computer Game Studies and HCI Remixed.

Re April 1st

April 1st, 2009

Dear Internet,

I refer you to my thoughts of 2 years ago about April 1st.

yours grumpily,
Elbeno

How a Bug Made Me a Better Programmer

March 1st, 2009

This is the tale of a bug. A challenging bug I encountered some time ago, and which changed my career for the better.

It was late 2004. I was lead multiplayer engineer on Goldeneye: Rogue Agent, a title which could charitably be described as mediocre, but I like to think the networked multiplayer game was a bit of fun - we had fun with it, anyway.

We’d been working 16-hour-ish days, 6.5 days a week for oh, about four months. These were the bad old days immortalized by ea_spouse, but that’s another story. Suffice to say that I was probably running up a sleep deficit, with only amusing online videos and Fire Emblem on the SP keeping me sane in between bug fixing.

We’d been having fun every evening playing 8-player network games for a while, and overall, the network game was pretty stable. The whole project was down to somewhere around a hundred outstanding bugs, which sounds like a lot, but the engineering team was over 70 engineers at its peak. There was one particular bug that was vexing me, however. The network game went in rounds, cycling maps. Every so often, a machine would fail to get into the next round after loading a map. It would just hang on the loading screen, having completed loading, and never having got the message to start playing.

I’m betting that most network game programmers (and indeed most network programmers) would immediately recognise the type of bug - a classic race condition. The QA dept reported this bug originally, and it happened about 1% of the time. If you were playing an 8-player game, you’d see it during a few hours of play. This was a real showstopper - no way out but to reboot the console, too frequent to ship, but far too rare to get much of a handle on.

So the first thing to do was to glean what I could. Since it wasn’t actually crashed at the point of the hang (still executing the game loop, just doing nothing), I could attach a debugger to a QA machine when the bug occurred. However, this didn’t really help much per se - a crash with a call stack showing the exact location would have been preferable. However, the debugger did allow me to slightly narrow down what was happening through judicious breakpointery.

Debugging network games is notoriously difficult, and with good reason. Although I could see the state of the problem machine, there was no context left to evaluate the problem. The rest of the machines had left it behind by this point, and I had to try to piece together the trail of events leading up to the bad state, after the bad state had already been triggered. So I fell back on the old school technique of adding copious amounts of logging, aka “printf debugging”.

So over the course of several days and evenings, my cubemates and I continued to play game after game, and each time we saw this problem, I’d stop and painstakingly go through the logs. I’d figure out a little bit more of what was happening, narrow down the error to a smaller section of code, add more logging, rebuild and redeploy my new version to the team. The turnaround time was probably a little under an hour. Of course, the first thing I did also was to make the game timeout after a few seconds, so that we could set up an 8-player game and leave it to map-cycle without actually wasting problem-hunting time actually playing the game.

In between times, I read the code. And re-read the code. And re-read it, and stepped through it, trying to think about and follow each possible path of execution. By this time, experiment had narrowed the problem down to a few functions, and eventually, while reading the code, I realised what the problem was. The evidence from the logs and the days of trying to reproduce the issue confirmed the diagnosis. It was, of course, a race condition. And as is so often the case with programming, once characterised, the bug was fairly easy to fix. I put a fix in place, commented the change, tested it, and committed it to the codebase. We did not see any more occurrences of the bug.

That was Thursday. On Tuesday I got the bug returned by QA. At first I was incredulous - how could it be extant after 5 days of silence, seemingly fixed? But by this stage of my programming career, I had developed some humility about my own code, and respect for QA. I mean, the guys and girls who test stuff don’t actually lie or make stuff up. If the bug was still there, it was still there, so I went back to it, after checking the reported version.

The problem was, as I saw it, that there was no way for it to happen any more. By this point, this was not hubris or denial on my part either. I had been through the code literally hundreds of times. For a week, this code was the last thing I thought about before falling asleep at night, and the first thing I pondered when waking up. I’m sure I had dreamt it once or twice. I could close my eyes and see the code in my head as clearly as I could read it on the screen. There was just no way the bug could still be there.

And yet it was there.

So, I had to find a new approach. As odd as it may seem, I had only so far looked at the bug in C++. After all, it was not a particularly gnarly bug as far as these things normally went - no heap or stack corruption, no optimised build vs debug variance, no weird-PS2-DMA-ness, no occurrence-on-real-media-only type of thing. But now it had got significantly harder, so I decided to take a look at the assembly code generated by the compiler. This was actually easy to get started with: I didn’t need to reproduce the bug; all I had to do was set a breakpoint in the function, run to that, and look at the assembly in the debugger.

Now I wouldn’t call myself an assembly language expert. I’ve never written assembler in anger, but I can get by reading it, and I know the basics to get around - how the compiler sets up and returns from function calls, how to find this pointers and recognise virtual tables, etc. And all of this stuff is easier on a RISC CPU than on say, Intel. So I fired up the PS2 devkit.

What I found was very puzzling. I looked at the assembly language for the function in question. I looked at the C++ source code. And what I saw on one side was not what I saw on the other side. They matched up to a point. But my fix was nowhere to be found in the machine code. It should have been fairly obvious to spot - the code had a few handy signposts like arithmetic instructions and constants which were transparent in the assembler. But the fix was just not there.

I called over a couple of other senior engineers to get a second opinion. Was I seeing things? They agreed that I wasn’t. Very strange, but now we were quite close to finally, properly, solving this bug. All we had to do was figure out why the compiler was ignoring the C++ source line that constituted the fix.

So we looked at the source control logs. What we found was that the file in question had, some time ago, had its line endings changed. What should have been CRLF was merely LF. Ordinarily this would be fine, but when I’d made the fix, I’d commented it. The fix was one line. The comment was one line, above the fix. Evidently the compiler had seen both lines as one. And the whole thing was therefore a comment, to be discarded.

Once found, of course, the fix was trivial. Fixing the file line endings allowed my original fix to work, and the bug was finally dead.

Since then, I regularly drop to assembler in the debugger. With assembly language, you know what’s going on. It’s made me a better programmer, and a better bug fixer. All those gnarly bugs I mentioned earlier (eg. ones that only occur in release builds) don’t seem so gnarly any more when you’re used to looking at assembler.

I wrote my first Python program

January 20th, 2009

Last Friday. Knowing almost no Python at noon, by 5pm I had some code to munge XML and do something useful for my current project.

So it’s not bad. It’s good for productivity. Mostly because:

  • It has useful libraries.

  • Bread-and-butter data structures are built in, i.e. lists and dictionaries.
  • I don’t care too much about performance for small tasks.

I can deal with significant whitespace, and a few syntactic things trip me up (like colons after if statements), but that’s small beans.

There are a few things on the minus side:

  • All that stuff with self just seems like ugly boilerplate extra typing.

  • Philosophically, I don’t like a BDFL unilaterally breaking backwards compatibility and removing FP features.

    Overall, Python is just fine, but underwhelming. It doesn’t have a monopoly on the good features, which makes it really just another learn-on-demand scripting language.