Whitebait, Kleftiko, Ekmek Special: April 2009

Thursday, 23 April 2009

One Template To Rule Them All

How do you write a template that can be instantiated for any type? Here’s a header file which purports to implement a tracing (logging) mechanism...

#include <stdio.h>

class Tracer {};

inline const Tracer& operator<<(const Tracer& r, int i)
{ printf("%d", i); return r; }

inline const Tracer& operator<<(const Tracer& r, const char *s)
{ printf("%s", s); return r; }

template <class T>
inline const Tracer& operator<<(const Tracer& r, T *p)
{ printf("%p", p); return r; }

class NullTracer {};

template <typename T>
inline const NullTracer& operator<<(const NullTracer& n, T) { return n; }

#ifdef DEBUG
#define TRACE Tracer()
#else
#define TRACE NullTracer()
#endif

...and here’s some example client code:

class X
{
    enum { A, B } m_state;

    void Foo() { TRACE << "x" << this << " is in state " << m_state << "\n"; }

public:
    X();
};

(I used this rather nifty script to do the syntax-colouring.)

What’s going on here is that, in debug builds (with -DDEBUG on the compiler command-line), TRACE expands to an expression that creates an object that acts a bit like std::cout, in that various different types of object can be sent to it and each printed in a sensible and type-safe manner. Just as with std::cout, the precedence and associativity of the << operator get everything printed in the right order.

In release builds, meanwhile, TRACE expands to an expression that creates an object of a different class, whose operator<< ensures that whatever type is sent to it, nothing is printed — and, indeed, the entire expression optimises away to nothing.

Except there’s a problem. The code above doesn’t actually compile in release builds (without -DDEBUG). Here’s what GCC 4.3.3 says about it:

wtf.cpp: In member function ‘void X::Foo()’: wtf.cpp:30: error: no match for ‘operator<<’ in ‘operator<< [with T = const char*](((const NullTracer&)((const NullTracer*)operator<< [with T = X*](((const NullTracer&)((const NullTracer*)operator<< [with T = const char*](((const NullTracer&)((const NullTracer*)(& NullTracer()))), ((const char*)"x")))), this))), ((const char*)" is in state ")) << ((X*)this)->X::m_state’ wtf.cpp:10: note: candidates are: const Tracer& operator<<(const Tracer&, int) wtf.cpp:13: note: const Tracer& operator<<(const Tracer&, const char*)

So naturally I knew at once what the problem was. Not. The error suggests that it can’t find an operator<< to do the job — but surely there’s one right there that can do any such job. Yet it isn’t even listed as a candidate.

While composing my GCC Bugzilla entry in my head, I tried to whittle down the problem. To my surprise, it was the enum which was causing the problem: trace statements with only strings and pointers compiled just fine. (Those experienced with what Abrahams and Gurtovoy call “debugging the error novel” might just about be able to make that out from the error message.)

So what’s up with m_state? What’s so very special about its type that it isn’t matched by template<typename T>? Well, there is one special thing about its type. It’s unnamed. When I wrote m_state into that code, neither its declaration, nor any of its uses, actually needed to name its type — so I applied Occam’s Razor and didn’t give it a name. And it turns out that that’s the problem: you can’t instantiate a template with an unnamed type as a template-argument. I can’t find a statement to that effect in Stroustrup-3, but it’s right there in the ISO C++03 standard ISO/IEC 14882:2003 at 14.3.1 [temp.arg.type]:

A local type, a type with no linkage, an unnamed type or a type compounded from any of these types shall not be used as a template-argument for a template type-parameter.

This does make perfect sense when you think about it a bit more: if templates could be instantiated with unnamed types, how would their names be mangled? Or, in other words, how could the compiler and linker conspire to arrange that multiple instantiations of NullTracer<unnamed-enum-type>, in multiple translation units, ended up as one instantiation in the final program? Without name matching, there’d be no way to do that — at least within the constraints that the C++ committee set for themselves, specifically that C++ compilers be usable with C-era linkers by dint of name mangling. The same mangling considerations are probably also the reasoning behind the ban on string literals as template arguments.

And the reason it works in non-debug builds, with the real Tracer, is that in that case no template is instantiated with m_state’s actual type: integral promotion makes it an int, and the overload for int is called. Nor can you fix up NullTracer by adding a template specialisation for int: the template specialisation rules aren’t quite the same as the overload resolution rules, and the compiler is doomed to pick the generic operator<<, and have it fail, before it tries the integral promotions. (SFINAE doesn’t help us, as again it is applied only with the enum type, not the promoted type.)

So what's needed is another overload, one that catches unnamed enum types:

inline const NullTracer& operator<<(const NullTracer& n, int) { return n; }

And now everything works as expected: unnamed enums (and, in fact, ints too, not that it matters) are passed via the non-templated overload, and everything else goes via the templated overload.

Wednesday, 15 April 2009

Debugging The Invisible

The Empeg codebase, and the Empeg coding style for embedded C++, don’t use C++ exceptions. The original reason was that ARM support in GCC was still a bit sketchy when Empeg started, and exceptions plain didn’t work.

But I think that even with a toolchain in which C++ exceptions work properly, they should still be avoided, and not just in embedded systems either. Here’s why.

Exception safety

You need to write your code in exception-safe style: a style which neither the compiler, nor other tools, can enforce. Some would say that “write exception-safe code” is just another way of saying “don’t write buggy code”, or “don’t be a crap programmer”, but many other ways of saying that — “write const-correct code”, “don’t leak memory” — can be automatically checked, and thus a unit-test suite (perhaps with Valgrind) can give you great confidence that your code fulfils those criteria, and remains in fulfilment under subsequent maintenance. You can’t easily unit-test for exception safety.

Another consideration is that many, perhaps most, C++ developers used to be C developers. Most things these developers need to learn when switching to C++, are completely new things. But exception-safety is something they need to learn which changes the way they’d write perfectly reasonable straight-line C code, so that’s always going to be trickier for everyone.

Scott Meyers didn’t even bother writing exception-safe code until the third edition of Effective C++ (search for “dfc”).

Invisible control flow

The Empeg C++ coding style, like others, forbade non-const reference parameters. The reason is, that doing so forces by-reference objects which may be modified to go via pointers instead. This, in turn, forces call sites to use “&” when a function may modify the value, and no “&” when it doesn’t. This puts this essential information about the data flow at the call site, not just at the declaration — and, when debugging, it’s usually the call site and not the declaration that you’re looking at. So banning non-const references improves the understandability of the data flow by making it visible.

Exceptions are, in a way, even worse than non-const references: they bestow on your code unasked, not just data flow you can’t see, but control flow you can’t see. Consider, in particular, a function that doesn’t contain a try or a throw, but calls functions which do (or may do). There’s a second control flow through all that code, which you didn’t write — the one where something below it in the call-chain throws, and something above it catches — but which has a crucial effect on whether your code is buggy or not.

And again, you can’t unit-test for it: even if you use gcov to test that your tests cover the normal control flow of your code, it can’t tell you whether you’ve also exercised all the invisible, secondary control flows. (In theory, a much smarter gcov-like tool could, though.)

Joel agrees.

Speed, size, exceptions: pick two

In D&E, Stroustrup that claims a design goal for C++ exceptions was “No added cost (in time or space) to code that does not throw an exception” (16.2), but later on in 16.9.1 he admits that people have figured out how to avoid cost in time in the no-exception case, but not cost in space.

A medium-sized C++ program I have here (amd64-linux, GCC 4.3.2) has a text section of 548K and exception-handling tables of 127K. But even that 23% increase isn’t the whole story, as is revealed by disassembly: that try-less, throw-less, but non-leaf function — if it has local variables with destructors — also gains snippets of unwinding code for each such variable. These snippets appear in the text section, so can’t easily be totalled-up, but an unscientific test on one source file suggests that they make up about 7% of the text section.

These exception-handling tables do affect embedded systems more than desktops or servers — on which the exception-handling tables are never demand-paged in unless they’re used — but the unwinding snippets are still there affecting your paging bandwidth and filling your instruction cache.

But you don’t get to choose

In a sense, choosing which of the two distinct languages — C++-with-exceptions or C++-without-exceptions — to use to program your embedded system, is a choice you only get to make when starting the system from scratch. Or, at least, converting a codebase from C++-without-exceptions to C++-with-exceptions is a herculean and error-prone task, and a half-converted one is a bug tarpit, because of the exception-safety issues. If the actual goal is to use a library written in C++-with-exceptions with a main program written in C++-without-exceptions, the best way is to write wrappers for each library function, that catch(...) and report errors through the existing non-exception error handling mechanism. With a bit of luck you can then continue to compile the main program with -fno-exceptions (or similar) and avoid the space overhead.

Even this doesn’t really work when the library is something like Boost, which is chock-full of throws in headers that really you need to be including in any code that uses the library. (Including for things like locking a non-recursive mutex that’s already locked, which in my book is more like a programming error than an exceptional condition, and would be better as an assertion. But that’s a different rant.)

Sunday, 12 April 2009

Autoconf, #ifdef, and #if

The standard macros available in GNU Autoconf for “checking for...” C/C++ functions and header files, AC_CHECK_FUNCS and AC_CHECK_HEADERS, have a couple of things going for them: their syntax and semantics essentially haven’t changed for years, perhaps ever, and everyone — well, everyone who writes configure scripts, which if it isn’t you should be a big hint as to how much you’ll get from this blog — already knows how they work. Which is to say, they end up embedding in your generating header, typically config.h, lines which look like this:

/* Define to 1 if you have the <linux/inotify.h> header file. */
/* #undef HAVE_LINUX_INOTIFY_H */

/* Define to 1 if you have the `localtime_r' function. */
#define HAVE_LOCALTIME_R 1

That’s all you need to start making your code conditional on which facilities are available:

#include "config.h"
    [...]
#ifdef HAVE_LOCALTIME_R
    localtime_r(&start, &stm);
#else
    // This code isn't thread-safe and could
    // get the wrong time -- but the result
    // would just be an odd-looking filename, not 
    // incorrect operation, so we live with it
    stm = *localtime(&start);
#endif

But there’s a problem. In fact, there’re two problems. One of the best features of C and C++, one of the few things that lets the old guard lord it over today’s arriviste scripting languages, the pythons and the rubies, is that almost everything is checked at compile-time. If you mistype an identifier, say localtime_r, as localtimer or whatever else, your code won’t compile, even if it’s on a rarely- or never-used path through the code. Unless you’re unlucky enough to hit the exact name of another existing identifier (and even then the type system may save you, unless you’re unlucky again), your mistyping is caught right there as you’re writing the code, not in QA — or, horrors, in production.

The same can’t be said, though, of identifiers in #ifdef. It’s perfectly valid to use a name there that doesn’t exist — nor could it be otherwise, as there’s nothing else it does except distinguish existent from non-existent names. So if you fat-finger it — HAVE_LOCALTIMER, maybe — the code compiles without even a warning, and just does the wrong thing, or at least the less-optimal thing. Even into production.

The second problem is config.h itself: if you forget to #include it, again your code compiles without a warning, but all your features get turned off! There are ways around this, such as by having a little script that checks your sources using grep, or by using GCC’s -include command-line option to force the inclusion, but none is ideal. (For instance, the latter gets dependencies wrong if config.h changes.)

Really what you want is warnings at compile-time if you mistype the identifier, or forget the header. In order to do this, you’re going to need to use #if rather than #ifdef; in other words, you’re going to need a config.h with lines like this:

/* Define to 1 if you have <linux/inotify.h>, or to 0 if you don't. */
#define HAVE_LINUX_INOTIFY_H 0

/* Define to 1 if you have localtime_r(), or to 0 if you don't. */
#define HAVE_LOCALTIME_R 1

And to do that, you’re going to have to use custom Autoconf macros instead of the standard AC_CHECK_ ones. Which is the point at which most people would give up, as Autoconf macros expand into shell code but are themselves written in the macro language m4, whose syntax can be charitably described as quixotic, and is regularly described using rather shorter and more Anglo-Saxon words than that. So here, I’ve done it for you, have these ones:

AC_DEFUN([PDH_CHECK_HEADER], [
AC_MSG_CHECKING([for <$1>])
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[#include <$1>
]])],
          [AC_MSG_RESULT([yes]); pdh_haveit=1; pdh_yesno="yes"],
          [AC_MSG_RESULT([no]);  pdh_haveit=0; pdh_yesno="no"])
AC_DEFINE_UNQUOTED(AS_TR_CPP([HAVE_$1]), $pdh_haveit,
          [Define to 1 if you have <$1>, or to 0 if you don't.])
AS_TR_SH([ac_cv_header_$1])=$pdh_yesno
])

AC_DEFUN([PDH_CHECK_HEADERS], [
m4_foreach_w([pdh_header], $1, [PDH_CHECK_HEADER(pdh_header)])
])

AC_DEFUN([PDH_CHECK_FUNC], [
AC_MSG_CHECKING([for $1()])
AC_LINK_IFELSE([AC_LANG_PROGRAM([], [[(void)$1();]])],
        [AC_MSG_RESULT([yes]); pdh_haveit=1; pdh_yesno="yes"],
        [AC_MSG_RESULT([no]);  pdh_haveit=0; pdh_yesno="no"])
AC_DEFINE_UNQUOTED(AS_TR_CPP([HAVE_$1]), $pdh_haveit,
        [Define to 1 if you have $1(), or to 0 if you don't.])
AS_TR_SH([ac_cv_func_$1])=$pdh_yesno
])

AC_DEFUN([PDH_CHECK_FUNCS], [
m4_foreach_w([pdh_func], $1, [PDH_CHECK_FUNC(pdh_func)])
])

That bracketfest defines macros PDH_CHECK_HEADERS and PDH_CHECK_FUNCS which do the same job as the standard AC_CHECK_HEADERS and AC_CHECK_FUNCS...

PDH_CHECK_HEADERS([io.h poll.h errno.h fcntl.h sched.h net/if.h stdint.h stdlib.h unistd.h shlwapi.h pthread.h ws2tcpip.h sys/poll.h sys/time.h sys/types.h sys/socket.h sys/syslog.h netinet/in.h netinet/tcp.h sys/inotify.h sys/resource.h linux/unistd.h linux/inotify.h linux/dvb/dmx.h linux/dvb/frontend.h])
PDH_CHECK_FUNCS([mkstemp localtime_r setpriority getaddrinfo inotify_init gettimeofday gethostbyname_r sync_file_range sched_setscheduler])

...but leave you with identifiers in config.h that you can use #if on rather than #ifdef, so you can enjoy compile-time warnings if you get any of them wrong.

Ah, except you can’t just yet, because there’s one final wrinkle. The C and C++ standards allow undefined identifiers in #if just as in #ifdef — they don’t expand to nothing, giving the syntax errors you’d initially hope for, but are defined as evaluating to zero. So, back to square one. Except, fortunately, the authors of GCC appear to be as sceptical about the usefulness of that preprocessor feature as I am, and they provide an optional warning which can be emitted whenever the preprocessor is obliged to evaluate an undefined identifier as zero: -Wundef. So, if you use the configury macros above, and add -Wundef to your GCC command-lines, you can at last get the compiler to point it out to you when you mistype your configure macros.

When I wrote these new macros, I did have a nagging feeling that it was all just NIH syndrome, and that it was silly makework to be rewriting core bits of Autoconf for this marginal benefit. But once I’d done it, re-compiling Chorale showed up two warnings, both places where a mistyped identifier was causing real and otherwise hard-to-find problems. So I felt retrospectively justified in my tinkering. The moral of the story: if you’ve fixed a bug, you’ve done a day’s work; but if you’ve fixed a bug by ensuring that it, and its whole class of fellow bugs, can never happen undetected again — then, you’ve done a good day’s work.

Whitebait, Kleftiko, Ekmek Special