Not in fact any relation to the famous large Greek meal of the same name.

Tuesday 25 August 2015

C++: Deduced templates are smarter than explicit templates

Suppose you’re writing a C++ class that encapsulates the idea of a callback, a routine that’s invoked when an asynchronous operation completes. (For instance, such a thing would be useful when implementing promises.) Effectively what you want is a lightweight version of std::function, perhaps a bit like this:

template <typename RET, typename ...ARGS>
class Callback
{
    typedef RET (*pfn)(void*, ARGS...);

    void *m_ptr;
    pfn m_pfn;

public:
    Callback() :  m_ptr(NULL), m_pfn(NULL) {}
    Callback(void *ptr, pfn ppfn) : m_ptr(ptr), m_pfn(ppfn) {}

    RET operator()(ARGS... args) const { return (*m_pfn)(m_ptr, args...); }
};

And here’s what a corresponding callback-using interface might look like:

typedef Callback<unsigned int, unsigned int, unsigned int, const std::string&> BDC;

class CDA
{
public:
    unsigned browse(uint32_t starting_index, BDC callback)
    {
        return callback(2, starting_index+1, "hello");
    }
};

All that, of course, uses C++11 variadic templates, in order to encompass callbacks with zero, one or several arguments. But in fact the technique described here works equally well with good old C++98 (where you’d need separate callback classes for each different number of arguments).

Now the client – the code supplying the callback – is itself probably a C++ class, in which case we don’t want to pass a bare C function pointer into Callback’s constructor. What we want is the ability to pass a method of our object as a Callback – or, in other words, to bind a method and a class instance together so that they function as a Callback.

So how do you write this bind() function? What you’d like to write is something like this (where the final template parameter is not a type, but the pointer-to-member itself, thus ensuring that a different Callback is instantiated for each different method called):

template <typename T, typename RET, typename ...ARGS, RET (T::*FN)(ARGS...)>
Callback<RET, ARGS...> bind(T *t)
{
    //...
}

That would probably work, but would look a mess at the call site, because you’d have to explicitly state all the template parameters:

auto c = bind<unsigned, unsigned, unsigned, std::string&, &Foo::onBrowseDone>(this);

Really you’d like all those other types to be worked out from the pointer-to-member, whose own type does after all incorporate all the information:

auto c = bind<&Foo::onBrowseDone>(this);

Sadly you can’t do that, because you can’t parameterise bind() on just the pointer-to-member – because you can’t even name the necessary pointer-to-member type without using the names of other types, which aren’t defined yet:

template <RET (T::*FN)(ARGS...)>  // doesn't compile
Callback<RET,ARGS...> bind(T *t)
{
    //...
}

Such are promises: all lies and jest.

But wait: it turns out that there are template invocations in C++ (both ’98 and ’11) that are more powerful than these explicit invocations. These are the deduced templates, and they happen in two contexts: when a call is made to an overloaded function, and the parameter types are used to pick which overload to use; and when a partially-specialised class or function is used, and a particular specialisation must be chosen. The first of those doesn’t help us much, but we can work with the second.

If we declare a class Binder, parameterised on any type, but then specialise it for the generic type of pointers-to-member, then the C++ compiler is required to use what is essentially a full functional-programmer’s unification algorithm to match the caller’s type against the list of specialisations. This unification is much like what you see in ML’s pattern-matching, or Erlang’s “=” operator. Truly devious people use it for things such as SFINAE and template meta-programming, but here we can get it to deduce all the other needed types from the pointer-to-member’s type alone:

template <typename T>
class Binder;

template <typename RET, typename T, typename ...ARGS>
class Binder<RET (T::*)(ARGS...)>
{
    template <RET (T::*FN)(ARGS...)>
    static RET caller(void *ptr, ARGS... args)
    {
        return (((T*)ptr)->*FN)(args...);
    }

public:
    template <RET (T::*FN)(ARGS...)>
    static Callback<RET,ARGS...> bind(T *t)
    {
        return Callback<RET,ARGS...>((void*)t, &caller<FN>);
    }
};

#define BIND(x,y) Binder<typeof(x)>::bind<x>(y)

It’s not very obvious what’s up with the template parameters there, so to make it explicit: first we declare class Binder as a template class with one template parameter, which is a type. Then we specialise class Binder for those types which match the pattern RET (T::*)(ARGS...) for arbitrary RET, T, and ARGS – which is precisely the types of pointers-to-member. (And that’s it: there’s only one specialisation of Binder.)

Inside class Binder is static method bind(), which is a template method: it’s parameterised again. This time the template parameter is not a type: it’s a non-type template parameter, a value, an actual pointer-to-member. So both bind() and caller() are instantiated once for each different member called.

That difference between the two template parameters, of Binder and bind(), isn’t very obvious from the syntax – but compare it to this slightly contrived but analogous example, where again the outer template argument is a type, and the inner one is a value of that type:

template <typename T>
struct IntFuncs
{
    template <T N>
    static T IntFunc()
    {
        return N;
    }
};

unsigned answer = IntFuncs<unsigned>::IntFunc<42u>();

The only awkwardness when using class Binder – the only thing separating it from the idealised call to bind() – is the need to specify the pointer-to-member twice: once as a type, once as a value. But that repetition can be parcelled-up in a macro, as seen in BIND() above. And now the client code looks very straightforward:

class Foo
{
    unsigned onBrowseDone(unsigned int, unsigned int, const std::string&);

public:
    void fnord(CDA *cda)
    {
        cda->browse(3, BIND(&Foo::onBrowseDone, this));
    }
};

Once again the Law of Conservation of Ick applies: we’ve removed ickiness from both the API (the callback-caller) and the client (the callback-supplier), while somewhat concentrating it in the definitions of classes Callback and particularly Binder. But as Binder only needs writing once (or indeed even less often than that, if you just nick it from this blog post, Creative Commons Zero and all that) and it’s generic enough to be used be many APIs and clients, that seems like a good trade-off.

About Me

Cambridge, United Kingdom
Waits for audience applause ... not a sossinge.
CC0 To the extent possible under law, the author of this work has waived all copyright and related or neighboring rights to this work.