Doxygen and C++: <i>not</i> a marriage made in heaven

I briefly interrupt your scheduled browsing…

Doxygen is a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors) and to some extent PHP, C#, and D. So it says on the website, and so it is. It’s also a Good Thing, most people would assume.

Doxygen has its roots in Javadoc, a documentation system for Java. I’m sure auto-generation of documentation is great for Java. But you see, Java needs it. In Java, you write functions right in the class definition. No header file prototypes to browse. And so a documentation system is necessary to collect everything into one readable place. Imagine trying to browse Java without docs – you’d be picking through 10 pages of code to find the method signatures. The same applies to C#.

Doxygen seems to have taken the C++ world (at least the corner of it which I inhabit) by storm too. For a couple of years now I’ve heard it touted as some kind of panacea for finding one’s way around APIs painlessly. And actually, auto-generated documentation (whether from Doxygen or some other tool) can be very handy. But this kind of documentation can be poor too.

In C and C++, we have header files! And here’s the thing: I see far too much auto-generated (Doxygen) documentation that tells me precisely nothing I couldn’t have learned from looking at a header file containing sensible function names and parameters. It is not helpful to read documentation (to make up a hypothetical example) that says:

DBConnectionHandle OpenDatabaseConnection(const char *dataSource);
    Opens a database connection.
    Parameters:
        dataSource – String specifying the data source.
    Returns:
        Handle to the new connection.

Point 1: Poor Doxygen documentation is rife.

In fact it’s worse than “not helpful”: it’s actually harmful. Feeding Doxygen has polluted the header file with probably 10 lines of superfluous comments and made the actual code harder to read. It would have been better simply to choose decent function and argument names, and add one or two lines of comment if necessary. Header files are the fastest available documentation for a function: the programmer’s hands are on the keyboard, and the function declaration is only a source-browser hotkey away.

Point 2: Doxygen makes plain header files harder to read.
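
To make that concrete, here is a rough sketch of the same hypothetical declaration written both ways. Everything in it is made up for illustration: the layout of DBConnectionHandle, the two namespaces (there only so both versions can sit in one file), and the failure convention mentioned in the second comment are my assumptions, not any real API. The comment commands themselves (@brief, @param, @return) are standard Doxygen.

```cpp
// Hypothetical handle type, invented for this example.
struct DBConnectionHandle {
    int id;  // assumed convention: 0 means "no connection"
};

namespace doxygen_style {

/**
 * @brief Opens a database connection.
 *
 * @param dataSource String specifying the data source.
 * @return Handle to the new connection.
 */
DBConnectionHandle OpenDatabaseConnection(const char *dataSource);

}  // namespace doxygen_style

namespace plain_style {

// Hypothetical detail the signature cannot express:
// returns a handle with id == 0 if the connection fails.
DBConnectionHandle OpenDatabaseConnection(const char *dataSource);

}  // namespace plain_style
```

The first version spends seven lines restating what the name, parameter, and return type already say. The second spends two lines on the one thing a reader could not have guessed.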

There is another problem, too. Because much more typing goes into Doxygen comments, it encourages laziness – copy and paste, failure to keep comments up to date. And we all know that inaccurate comments are worse than no comments.

Point 3: Auto-generated documentation is more likely to be stale (or plain wrong) than normal commenting.

One more thing: auto-generated documentation has a limited audience. Normally, an API or class can have several intended audiences: the maintainer, the client API user, and the client API extender. Or to put it another way, someone interested in internals, someone interested in usage only, and someone interested in deriving classes and overriding methods. Now the fact is that C++ doesn’t do a very good job of servicing these roles anyway – but at least the class author can go some way and tailor parts of the class for one audience or another, for example by choosing whether to group members by functionality or by visibility. Doxygen can’t easily make this distinction.

Point 4: Doxygen only wears one hat and only serves one audience well.
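
As a rough sketch of what tailoring a class for its audiences can look like in a plain header, here is a made-up Shape class laid out by reader. The class, its members, and the Square subclass are all hypothetical, invented purely to show the layout; the section comments are aimed at someone reading the header itself.

```cpp
// Hypothetical class, invented for illustration only.
class Shape {
public:
    // --- For callers: the whole usage surface ---
    virtual ~Shape() = default;
    double Area() const {
        ++areaCalls_;
        return ComputeArea();
    }

protected:
    // --- For extenders: override this when deriving ---
    virtual double ComputeArea() const = 0;

private:
    // --- For maintainers only: internal bookkeeping ---
    mutable unsigned areaCalls_ = 0;  // how often Area() has been called
};

// What an extender writes, using only the "extenders" section above.
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}

protected:
    double ComputeArea() const override { return side_ * side_; }

private:
    double side_;
};
```

A caller can stop reading after the first section, an extender after the second, and only the maintainer needs the third. That ordering is a decision the author makes by hand.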

Doxygen does do some useful things. Showing a tree of include files is a good aid to optimising compile times, for instance. But as documentation for C++ and C APIs and classes, I think we’re better off writing clear definitions, and if necessary, leaving the documentation for a human to write. I’ve used good auto-generated documentation, and I’ve used bad. I’ve used Doxygen to document my code, and I’ve written plain documentation for my code. And I prefer a well-written header file or a well-written API usage guide over an auto-generated API reference any day of the week.
