Not in fact any relation to the famous large Greek meal of the same name.

Wednesday 15 April 2009

Debugging The Invisible

The Empeg codebase, and the Empeg coding style for embedded C++, don’t use C++ exceptions. The original reason was that ARM support in GCC was still a bit sketchy when Empeg started, and exceptions plain didn’t work.

But I think that even with a toolchain in which C++ exceptions work properly, they should still be avoided, and not just in embedded systems either. Here’s why.

Exception safety

You need to write your code in exception-safe style: a style which neither the compiler, nor other tools, can enforce. Some would say that “write exception-safe code” is just another way of saying “don’t write buggy code”, or “don’t be a crap programmer”, but many other ways of saying that — “write const-correct code”, “don’t leak memory” — can be automatically checked, and thus a unit-test suite (perhaps with Valgrind) can give you great confidence that your code fulfils those criteria, and remains in fulfilment under subsequent maintenance. You can’t easily unit-test for exception safety.

Another consideration is that many, perhaps most, C++ developers used to be C developers. Most of what these developers need to learn when switching to C++ is completely new. But exception safety is different: it changes the way they’d write perfectly reasonable straight-line C code, so it’s always going to be trickier for everyone.
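To make that concrete, here is a minimal sketch of the same job written in straight-line C style and in exception-safe style. Everything in it is hypothetical: acquire_handle() and release_handle() stand in for any paired resource API (malloc/free, fopen/fclose), and do_work() stands in for any callee that might throw.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource API: it only counts, so a "leak" is observable.
static int g_open_handles = 0;
int  acquire_handle()    { return ++g_open_handles; }
void release_handle(int) { --g_open_handles; }

// Hypothetical callee that reports failure by throwing.
void do_work(int handle, bool fail) {
    (void)handle;
    if (fail) throw std::runtime_error("do_work failed");
}

// Straight-line style, perfectly reasonable in C: acquire, use, release.
// If do_work() throws, release_handle() is skipped and the handle leaks.
void straight_line(bool fail) {
    int h = acquire_handle();
    do_work(h, fail);
    release_handle(h);
}

// Exception-safe style: the release lives in a destructor, so it runs on
// both the normal path and the unwinding path.
class HandleGuard {
public:
    HandleGuard() : h_(acquire_handle()) {}
    ~HandleGuard() { release_handle(h_); }
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;
    int get() const { return h_; }
private:
    int h_;
};

void exception_safe(bool fail) {
    HandleGuard guard;
    do_work(guard.get(), fail);
}

// Calls f(true), swallows the exception, and returns how many handles
// were left open by the call.
template <typename F>
int leaked_by(F f) {
    int before = g_open_handles;
    try { f(true); } catch (const std::runtime_error&) {}
    return g_open_handles - before;
}
```

Here leaked_by(straight_line) reports one leaked handle and leaked_by(exception_safe) reports none. Neither function draws a compiler warning, and a test suite that only exercises the non-throwing path passes for both.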

Scott Meyers didn’t even bother writing exception-safe code until the third edition of Effective C++ (search for “dfc”).

Invisible control flow

The Empeg C++ coding style, like others, forbade non-const reference parameters. The reason is that the ban forces by-reference objects which may be modified to be passed via pointers instead. This, in turn, forces call sites to use “&” when a function may modify the value, and no “&” when it can’t. That puts essential information about the data flow at the call site, not just at the declaration; and, when debugging, it’s usually the call site and not the declaration that you’re looking at. So banning non-const references makes the data flow easier to understand by making it visible.
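A small sketch of the difference at the call site. The function names are made up for illustration and are not from the Empeg codebase.

```cpp
#include <cassert>

// Forbidden under this style: nothing at the call site hints that the
// argument may be modified.
void clamp_ref(int& value) { if (value < 0) value = 0; }

// Allowed: a modifiable argument is a pointer, so callers must write '&'.
void clamp_ptr(int* value) { if (*value < 0) *value = 0; }

// Read-only parameters stay as values or const references.
bool is_negative(const int& value) { return value < 0; }

int demo() {
    int a = -5;
    int b = -5;

    bool was_negative = is_negative(a);  // no '&': a cannot change
    clamp_ref(a);    // reads just like the line above, yet a changes
    clamp_ptr(&b);   // the '&' flags that b may change

    assert(was_negative);
    return a + b;    // both were clamped to 0
}
```

Reading demo() without the declarations to hand, only the clamp_ptr(&b) line tells you where to look when b turns out to have the wrong value.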

Exceptions are, in a way, even worse than non-const references: they bestow on your code unasked, not just data flow you can’t see, but control flow you can’t see. Consider, in particular, a function that doesn’t contain a try or a throw, but calls functions which do (or may do). There’s a second control flow through all that code, which you didn’t write — the one where something below it in the call-chain throws, and something above it catches — but which has a crucial effect on whether your code is buggy or not.
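As an illustration, here is a sketch of such a function. The Account type, transfer() and the audit callback are all invented for the example. transfer() contains no try and no throw, but it has two exits: the one at the closing brace, and the one in the middle where audit() may throw.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

struct Account { int balance; };

// No try, no throw, and yet two ways out of this function.
void transfer(Account* from, Account* to, int amount,
              const std::function<void(int)>& audit) {
    from->balance -= amount;
    audit(amount);           // hidden exit: if this throws, the line below
    to->balance += amount;   // never runs and the money has vanished
}

// Something "above" catches what something "below" threw; transfer(),
// in the middle, never gets a say.
int total_after_failed_transfer() {
    Account a{100};
    Account b{0};
    try {
        transfer(&a, &b, 30,
                 [](int) { throw std::runtime_error("audit log full"); });
    } catch (const std::runtime_error&) {
        // Handled here, far from the code whose invariant was broken.
    }
    return a.balance + b.balance;
}
```

The two accounts start with 100 between them, and total_after_failed_transfer() returns 70. Nothing in the text of transfer() is wrong if you read it as straight-line code; the bug exists only on the path you can’t see.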

And again, you can’t unit-test for it: even if you use gcov to test that your tests cover the normal control flow of your code, it can’t tell you whether you’ve also exercised all the invisible, secondary control flows. (In theory, a much smarter gcov-like tool could, though.)

Joel agrees.

Speed, size, exceptions: pick two

In D&E (The Design and Evolution of C++), Stroustrup claims that a design goal for C++ exceptions was “No added cost (in time or space) to code that does not throw an exception” (16.2), but later on, in 16.9.1, he admits that people have figured out how to avoid cost in time in the no-exception case, but not cost in space.

A medium-sized C++ program I have here (amd64-linux, GCC 4.3.2) has a text section of 548K and exception-handling tables of 127K. But even that 23% increase isn’t the whole story, as is revealed by disassembly: that try-less, throw-less, but non-leaf function — if it has local variables with destructors — also gains snippets of unwinding code for each such variable. These snippets appear in the text section, so can’t easily be totalled-up, but an unscientific test on one source file suggests that they make up about 7% of the text section.

These exception-handling tables do affect embedded systems more than desktops or servers — on which the exception-handling tables are never demand-paged in unless they’re used — but the unwinding snippets are still there affecting your paging bandwidth and filling your instruction cache.

But you don’t get to choose

In a sense, choosing which of the two distinct languages (C++-with-exceptions or C++-without-exceptions) to use to program your embedded system is a choice you only get to make when starting the system from scratch. Or, at least, converting a codebase from C++-without-exceptions to C++-with-exceptions is a herculean and error-prone task, and a half-converted one is a bug tarpit, because of the exception-safety issues. If the actual goal is to use a library written in C++-with-exceptions with a main program written in C++-without-exceptions, the best way is to write a wrapper for each library function that does a catch(...) and reports errors through the existing non-exception error handling mechanism. With a bit of luck you can then continue to compile the main program with -fno-exceptions (or similar) and avoid the space overhead.
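A sketch of what one such wrapper might look like. The library function lib::parse_port() and the Status codes are invented for the example; in a real system Status would be whatever error mechanism the main program already has.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical library function written in C++-with-exceptions.
namespace lib {
int parse_port(const std::string& text) {
    int port = std::stoi(text);  // may throw invalid_argument or out_of_range
    if (port < 1 || port > 65535) throw std::range_error("port out of range");
    return port;
}
}  // namespace lib

// Hypothetical error codes standing in for the main program's existing
// non-exception error handling.
enum Status { STATUS_OK = 0, STATUS_NO_MEMORY = 1, STATUS_FAILED = 2 };

// The wrapper: no exception escapes, and the result comes back through a
// pointer. On failure *port_out is left untouched.
Status parse_port_wrapped(const char* text, int* port_out) {
    try {
        *port_out = lib::parse_port(text);
        return STATUS_OK;
    } catch (const std::bad_alloc&) {
        return STATUS_NO_MEMORY;
    } catch (...) {
        return STATUS_FAILED;
    }
}
```

The file containing the wrappers has to be built with exceptions enabled, since it contains the try and the catch. The main program then sees only parse_port_wrapped(), which is an ordinary function returning an error code.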

Even this doesn’t really work when the library is something like Boost, which is chock-full of throws in headers that really you need to be including in any code that uses the library. (Including for things like locking a non-recursive mutex that’s already locked, which in my book is more like a programming error than an exceptional condition, and would be better as an assertion. But that’s a different rant.)


CC0 To the extent possible under law, the author of this work has waived all copyright and related or neighboring rights to this work.