Not in fact any relation to the famous large Greek meal of the same name.

Monday 31 May 2010

Portability, Or Cut-And-Paste?

Suppose you’ve got some functionality with two completely different implementations on different platforms:

// Event.h v1
class Event // Win32 waitable event
{

  void *m_hEvent;
public:
  Event() : m_hEvent(::CreateEvent()) {}

  void Wake();
  ...
};
versus
// Event.h v2
class Event // Posix waitable event
{
  int m_fd[2];
public:
  Event() { ::pipe(m_fd); } // or maybe eventfd on Linux
  void Wake();
  ...
};

And, by the way, if you’ve got a class where only some of the implementation is different on different platforms, you’ve first got an opportunity for refactoring to extract that part of the implementation as a self-contained class. Portability and maintainability are aided if each of your classes is either wholly portable, or wholly platform-specific. But as it is, you’ve got a more immediate problem, which is that there’s two classes both called Event, and any translation unit that includes both headers isn’t going to compile. Here’s one solution:

// Event.h v3
#ifdef WIN32
class Event
{
  ...
};
#else
class Event
{
  ...
};
#endif
This, plus its close relative
// Event.h v4
#ifdef WIN32
#include "win32/Event.h"
#else
#include "posix/Event.h"
#endif
could well be the most commonly-encountered solutions to this problem in real-world code. But they’re not without problems. One problem is with code analysis tools, including such things as production of automated documentation with Doxygen. Really you want such tools to analyse your whole codebase, and if they do they’ll get hopelessly confused with the two classes both called Event. (You can set up preprocessor defines in Doxygen so that it only sees one Event class -- but that only means the other doesn’t get documented, or that you have two entire sets of Doxygen output for the two platforms, neither of which sounds desirable.)

To solve this problem, you’ve going to have to give the classes different names:

// Event.h v5
class Win32Event
{
  ...
};

class PosixEvent
{
  ...
};

No, no, no, this is C++, there’s a built-in language facility especially for namespaces, you don’t have to reinvent it in the class names themselves:

// Event.h v6
namespace win32 {
class Event
{
  ...
};
} // namespace win32

namespace posix {
class Event
{
  ...
};
} // namespace posix

(Why the comments on the “}” end-of-namespace lines? Because the error messages you get from C++ when you accidentally miss out an end-of-namespace brace are often insane and impenetrable, especially if you do it in a header file, and it’s convenient to be able to grep namespace *.h *.cpp and check whether braces appear in matching pairs in the grep output.)

Notice that both declarations are parsed on every compilation on either platform. This might require you to move some method definitions (those calling such unportable APIs as ::CreateEvent and ::pipe) into a cpp file. But even compiling the other platform’s declarations helps ensure that changes on one platform won’t break the other platform without somebody noticing. And a Doxygen run, or other automated code analysis, will at least get to see both sets of declarations even if not both sets of definitions.

While going down the route of helping platform developers avoid breaking other platforms, it’s been conspicuous so far that there’s nothing in Event.h making sure that the Win32 and Posix implementations keep the same API. Without such enforced consistency, there’s a risk of client code unwittingly using non-portable APIs. The way to enforce the consistency, of course, is to derive from an abstract base class containing the API:

// Event.h v7
class EventAPI {
public:
  virtual ~EventAPI() {}
  virtual void Wake() = 0;
  ...
};

namespace win32 {
class Event: public EventAPI
{
  ...
};
} // namespace win32

namespace posix {
class Event: public EventAPI
{
  ...
};
} // namespace posix

This is, after all, exactly the sort of thing you’d do if you had two different sorts of Event that were both used in the same build of the program. Really the fact that no one individual binary will contain instances of both classes, doesn’t mean that the quest for well-factored design should be thrown out of the window and replaced with cut-and-paste.

Now, though, all the method calls have become virtual function calls. For most of any given codebase, that won’t matter, but there’ll always be hot paths and/or embedded systems where it does, and indeed an event class might quite plausibly be on such a critical path. And after all, compiling the client code itself provides a, potentially large, body of checks that the API has remained consistent over time. It’s reasonable to adopt the position that using design techniques to discourage unthinkingly changing core APIs in a non-portable way isn’t actually necessary, so long as the compilation-failure results of such a change are always speedily available from all target platforms, such as from an automated-build or continuous-integration system.

If, conversely, your class is only used in situations where nobody’s really counting individual machine cycles, you could hide the implementations altogether, at the cost of ruling out stack objects and member variables of type Event and requiring a heap (new) allocation every time one is created:

// Event.h v8
class Event {
public:
  virtual ~Event() {}
  virtual void Wake() = 0;
  ...
};

std::auto_ptr<Event> CreateEvent();

// Event.cpp
std::auto_ptr<Event> CreateEvent()
{
  ... return whichever derived class of Event is appropriate ...
}

But for now let’s assume you can’t really conceal the two platform-specific declarations, and go back to Event.h version 6 or version 7. Let’s look at how client code gets to pick which of the two implementations it uses. For a start, not like this:

class NominallyPortableDomainAbstraction
{
#ifdef WIN32
  win32::Event m_event;
#else
  posix::Event m_event;
#endif
  ...
  m_event.Wake();
  ...
};

Flouting our design rule that each class is either wholly portable or wholly platform-specific, this sort of thing couples portability decisions into all client code, which is very undesirable. This is much better:

// Event.h v9
... as before in v6 or v7

#ifdef WIN32
typedef win32::Event Event;
#else
typedef posix::Event Event;
#endif

// Client code
class NominallyPortableDomainAbstraction
{
  Event m_event;
  ...
  m_event.Wake();
  ...
};

It’s still not great, though: if there are lots of classes involved, lots of code ends up inside the #ifdefs. How can we minimise the amount of #ifdef’d code? Well, here’s one way:

// Event.h v10
... as before in v6 or v7

#ifdef WIN32
using namespace win32:
#else
using namespace posix;
#endif

But this, of course, introduces everything from those namespaces into the surrounding namespace. Not only is that namespace pollution, it’s potentially misleading for client code, as there may be Win32-specific classes or functions -- ones with no Posix equivalent, for use in Win32 situations only -- which are now accessible in portable code without the telltale win32:: prefix. (And, of course, vice versa for Posix-specific ones.) Really we want to explicitly enumerate which classes and APIs are intended to be portable. So we can do this:

// Event.h v11
... as before in v6 or v7

#ifdef WIN32
namespace platform = win32;
#else
namespace platform = posix;
#endif

using platform::Event;
using platform::PollThread;
...

The intention is that the using declarations list precisely those facilities which are available on all platforms, but with differing implementations on each. Non-portable classes or functions can stay in namespace win32 (or namespace posix) so that any use of them in client code immediately flags that code up as itself non-portable.

There is one problem with this neat solution, though: it doesn’t actually compile. Or rather, once you do the same thing in several different header files, you’ll find that multiple namespace X = Y; statements for the same namespace X aren’t allowed, even if Y is the same each time. So, sadly, you have to come up with a different name for “platform” each time (or centralise the using in one header file, causing potentially-undesirable widespread dependency on that file):

// Event.h v12
... as before in v6 or v7

#ifdef WIN32
namespace eventimpl = win32;
#else
namespace eventimpl = posix;
#endif

using eventimpl::Event;
using eventimpl::PollThread;
...

There’s only one remaining wrinkle, which is that you can’t forward-declare a class-name if that name only exists due to using. So if there are header files that traffic in Event* or Event& and could otherwise be satisfied with a forward declaration class Event; and avoid including Event.h, then you can’t use Event.h version 12. (Version 9 is also out, as you can’t forward-declare typedefs either.) The best you can do is probably this:

// Event.h v13
... as before in v6 or v7

#ifdef WIN32
namespace eventimpl = win32;
#else
namespace eventimpl = posix;
#endif

class Event: public eventimpl::Event {};
class PollThread: public eventimpl::PollThread {};
...

although naturally that only works as-written if the eventimpl base classes don’t have constructors other than the default constructor; if they did, you’d have to write forwarding constructors in each derived class, making the code a lot less neat.

1 comment:

  1. This is quite a ... familiar problem. Are you going to make recommendations?

    ReplyDelete

About Me

Cambridge, United Kingdom
Waits for audience applause ... not a sossinge.
CC0 To the extent possible under law, the author of this work has waived all copyright and related or neighboring rights to this work.