Not in fact any relation to the famous large Greek meal of the same name.

Wednesday, 25 September 2013

A Git workflow at EI

Electric Imp have used Git from the very beginning of the company, and in that time we’ve evolved what I at least reckon is a useful way of using it, a useful workflow.

It’s ended up similar to, but not quite the same as, Vincent Driessen’s “Gitflow” model, and this blog post purposely uses similar diagrams, terminology, and colour-coding to that one, to make comparisons easier (though hopefully it also stands alone, for those who haven’t read it).

The big picture

There’s a single central Git repository, origin, from which all releases are made and in which all tags reside. Because Git is “decentralised”, each developer has one or more local repositories too.

This diagram, like Vincent Driessen’s original, is drawn with oldest at the top, newest at the bottom, which is the opposite of the convention used by gitk.

Quick summary of differences from “Gitflow”

  • The yellow (main, integration) branch is, for historical reasons, called master;
  • The blue (deployment) branch is called production;
  • Bug-fixes are cherry-picked out from yellow to release branches wherever possible, rather than being merged from release branches back to yellow;
  • Pink feature branches (those done by single individuals, at least) are done as bow-shaped merges;
  • Because of the bow-shaped merges, yellow is never merged out to feature branches: if a feature branch needs some new stuff that’s landed on yellow, it gets rebased on top of yellow;
  • Because we do two kinds of releases from the same codebase – server deployments which are lightweight and rapid, and client firmware upgrades which are more heavyweight and intrusive – there are two kinds of green release branch which are treated slightly differently. (But the server deployment one works much like the “Gitflow” equivalent.)
These are mostly fairly minor differences. (But notice how there are very few non-rebased merges.)

The two long-lived branches and their relationship

The integration branch (“master”, yellow) and the deployment branch (“production”, blue) are the only branches that continue to get new commits indefinitely.

All new work happens on master; work that consists of only one or two commits goes straight in, and work that’s more involved than that lands via the merging of a feature branch, of which more below.

A one-commit story:

$ git checkout master
$ git pull
hack ... hack ... hack
test ... test ... test
$ git commit
$ git pull --rebase
test ... test ... test
$ git push

The Jenkins continuous-integration server runs whenever new commits are made to master: it builds the whole codebase for all relevant platforms, runs all the unit tests, runs all the integration tests, and finally runs some system-tests on a test farm of real hardware. The quality bar for pushes to master is clean runs on all of these test suites; any failures are stop-the-line emergencies. If a build or test is failing, the very next push must be the fix, or other developers can’t continue pushing (because they can’t know whether their own work passes that test or not). This is usually known as “do not commit on red” – although with Git, it’s actually the “push”, not “commit”, operation that’s the relevant one.

Keeping a separate production branch achieves the goal that “There is a known production branch, so you don’t have to think. If you checkout the equivalent of production, it’s either exactly what’s currently in production or it’s what’s about to be in production.”

Also, “The production branch is known-good. It is never a mistake to push the production branch to production servers, ever.” This eases communication with the Operations team. New work is never done directly onto production: it arrives there due to either merges or cherry-picks from master (possibly via an intermediate release or hot-fix branch).

Feature branches

Feature branches are usually short-lived, and indeed usually exist as named branches only in developers’ local repositories. (With Git, if you merge a branch locally into master and then push the result, the branching structure is pushed to origin and becomes part of permanent history, but the branch name isn’t pushed, and doesn’t appear in the origin repository except perhaps in the commit comment of the merge.)

Feature branches are usually named with the developer’s initials and a brief hint to the branch’s purpose: for instance, pdh-regexp was my branch for implementing a regular-expressions feature.

Starting a feature branch:

$ git checkout master
$ git pull
$ git checkout -b pdh-modbus
hack ... hack ... hack
test ... test ... test
There are two exceptions to the above description, both (hopefully) pretty rare. The first is that, if a feature branch is getting so big or so long-lived that it could do with living on the origin server too, purely as a backup strategy, then its developer can push it to origin. Prefixing the name with the initials, though, makes clear that it’s a private branch, in the sense that it is likely to get forcibly rebased by that developer, so caveat emptor.
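As a rough sketch of that first exception: the following backs up a private branch to origin, then rebases it and force-pushes the rewritten history. It runs in a throwaway sandbox, and the temporary directory, the local bare origin.git, and the file names are all invented for illustration rather than taken from our real setup.

```shell
set -eu
# Throwaway sandbox: a local bare repository stands in for origin (assumption).
work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q --bare origin.git
git init -q dev; cd dev
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
echo base > base.txt; git add base.txt; git commit -q -m "base"
git push -q origin master

# Back up a long-lived private branch by pushing it to origin.
git checkout -q -b pdh-modbus
echo one > modbus.txt; git add modbus.txt; git commit -q -m "modbus: part 1"
git push -q -u origin pdh-modbus

# Meanwhile master moves on.
git checkout -q master
echo other > other.txt; git add other.txt; git commit -q -m "unrelated work"
git push -q origin master

# Rebasing rewrites the branch's commits, so the backup must be force-pushed.
git checkout -q pdh-modbus
git rebase -q master
git push -q --force origin pdh-modbus
```

Plain --force is used here because it works on any Git version; newer versions also offer --force-with-lease, which refuses to overwrite commits you haven’t fetched. Either way, anyone else who has checked out the branch will find its history changed underneath them, which is exactly the caveat above.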

The second exception is when several developers are working on the same feature. This is also probably relatively rare (Kanban and Agile encourage single-developer, or single-pair, working), but it doesn’t fit the same model, because a branch that gets commits from two different sources can’t be rebased without messing up the other developers. So in that situation, you’d keep the feature branch on origin, the co-operating developers would pull it using git pull --rebase and push it using git push. Once the feature is reviewed, QA’d, and delivered, the collaborative feature branch can be merged to master. This is the only situation in which a non-rebased branch gets merged to master. (“Gitflow” also suggests the use of developer-to-developer, not developer-to-origin, Git pulls and pushes for managing this case, but that sounds to me like a recipe for confusion, plus it’s hard to do with a rebase workflow.)
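A minimal sketch of the shared-branch case, again in a throwaway sandbox: two clones (hypothetically named alice and bob) work on a branch called modbus that lives on a local bare origin.git. All of those names are assumptions made for the example.

```shell
set -eu
# Throwaway sandbox with two developers' clones (all names are illustrative).
work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q --bare origin.git
git init -q alice; cd alice
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
echo base > base.txt; git add base.txt; git commit -q -m "base"
git push -q origin master

# The shared feature branch is created once and kept on origin.
git checkout -q -b modbus
git push -q -u origin modbus
cd "$work"
git clone -q -b modbus "$work/origin.git" bob

# Alice pushes a commit; Bob commits locally without having seen it.
cd "$work/alice"
echo a > a.txt; git add a.txt; git commit -q -m "alice's work"
git push -q origin modbus
cd "$work/bob"
echo b > b.txt; git add b.txt; git commit -q -m "bob's work"

# Bob replays his own unpushed commit on top of Alice's, then pushes.
git pull -q --rebase origin modbus
git push -q origin modbus
git log --oneline
```

Only Bob’s unpushed commit is rewritten, so nothing already on origin changes and the shared branch stays linear without any merge commits.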

Once a feature on a branch is complete (and reviewed, and tested), the feature branch can be merged back to master. This is done by rebasing the feature branch on top of master, then doing a no-fast-forward (--no-ff) merge; the thinking behind that style of merge, and full information and walk-throughs of how to perform one, can be found at Bow-shaped branches: a Git workflow.
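For a quick feel of the shape this produces, here is a small sketch in a scratch repository (the temporary directory, file names, and commit messages are made up for the example; the linked article remains the full walk-through).

```shell
set -eu
# Scratch repository; nothing here touches a real project.
work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo; cd repo
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m "base"

# A feature branch with two commits...
git checkout -q -b pdh-modbus
echo one > modbus.txt; git add modbus.txt; git commit -q -m "modbus: part 1"
echo two >> modbus.txt; git commit -q -am "modbus: part 2"

# ...while master moves on in the meantime.
git checkout -q master
echo other > other.txt; git add other.txt; git commit -q -m "unrelated work"

# Step 1: rebase the feature branch on top of the current master.
git checkout -q pdh-modbus
git rebase -q master

# Step 2: merge with --no-ff, so a merge commit records where the branch was.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'pdh-modbus'" pdh-modbus

git log --graph --oneline
```

Without --no-ff the merge would simply fast-forward and the branch would vanish into a straight line; with it, the graph shows the feature’s commits as a bow leaving and rejoining master at adjacent commits.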

Because, in order to do a bow-shaped merge, every feature branch eventually gets rebased on top of master, there shouldn’t be any merges from master out to a feature branch. If the feature branch needs some functionality that only landed on master after the feature branch started, it should be rebased on top of master instead. Indeed, it’s good practice to rebase all your feature branches on top of master fairly regularly, as it eases and subdivides the final rebasing process that happens before the delivery merge.

Notice that with the bow-shaped merge construction, although there can be several current unmerged feature branches at any time – mostly in developers’ local repositories – the merging process serialises them completely (by always rebasing before pushing), so that Git permanent history never contains overlapping or nested ones. This makes it easier to find problems using git bisect.
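To illustrate the bisecting part, here is a toy sketch: a scratch repository with ten commits in a line, where commit 7 (an arbitrary choice) introduces a marker file called BUG that stands in for a real regression. The file names and the test command are assumptions for the example.

```shell
set -eu
# Scratch repository with a linear history (purely illustrative).
work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo; cd repo
git symbolic-ref HEAD refs/heads/master

# Ten commits; the seventh adds the BUG marker file.
i=1
while [ "$i" -le 10 ]; do
  echo "$i" > counter.txt
  if [ "$i" -eq 7 ]; then echo broken > BUG; fi
  git add -A
  git commit -q -m "commit $i"
  i=$((i + 1))
done

good=$(git rev-list HEAD | tail -n 1)   # the oldest commit, known to be good
git bisect start HEAD "$good" >/dev/null
# The test command exits 0 on a good commit and 1 on a bad one.
git bisect run sh -c 'test ! -e BUG' >/dev/null
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
git log -1 --format='first bad commit: %s' "$culprit"
```

In real use the test command would be a build-and-test script rather than a file check, but the mechanics are the same.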

Two different release patterns

Electric Imp has a single repository from which all parts of the system are built: this eases system testing, and the addition of system-wide features, but it does mean that two different types or cadences of “release” happen from the same codebase.

Server releases are deployed to our cloud service. As is best-practice in the server software culture, this is (close to) continuous deployment. Releases are made really quite often, sometimes several times per day – so often, in fact, that it’s pointless even to tag or number them (we’d be in the hundreds). This is achievable because it’s relatively easy for automated testing to cover the entire gamut of server functionality, because upgrades themselves and reverts or hot-fixes are so straightforward as to be virtually push-button, and because (assuming the revert script works as-tested) the impact of a “bad” release is relatively minor. The pace of server releases demands a lightweight release process.

None of those considerations apply to client firmware releases: covering the gamut of firmware functionality can require custom hardware, upgrades get downloaded over the Internet and programmed into flash memory (which is a bit disruptive and can be time-consuming) – and, in theory at least, a “bad” release could be quite awkward to recover from (requiring careful actions by individual end-users). So firmware releases are performed with considerably more caution: the QA, beta-test, and qualification process for a newly-made release branch typically takes a number of weeks. This is (by our standards at least) a heavyweight release process.

Another important difference is that the end-user can at any time get bored of the device, put it away in a drawer for an arbitrary length of time, then rekindle their interest, retrieve the device, and try to use it. This means that the current server release must work with all previous client releases (at least enough for them to upgrade themselves), a criterion fortunately not present in the reverse direction. This concern makes it worth our while keeping the total number of client releases down (and getting cross when “beta” or “test” releases go out without being tagged).

The heavyweight release process

The heavyweight release process, which we use for firmware releases, is based mainly on an abundance of caution.

Once the required collection of new functionality has landed on master, a new release branch is made. This is named after the first release that’s expected to be made from the branch: every release is numbered, with (for instance) releases 25, 25.1, and 25.2 all coming from the release-25-dev branch.

Once the branch is made, it is subjected to the unblinking eye of QA – even a culture of good unit-tests, integration tests, and system tests does not rule out the need for exploratory testing before release.

For major new functionality there may even be a closed beta process, where end-users hand-picked for both eagerness and cluefulness get given tagged beta releases from the branch to supplement our internal testing.

Once a release branch is made, the only subsequent changes are bug fixes. If and when issues are found on a release branch, we adopt GCC’s rule that fixes must (wherever possible) be made on master first and then cherry-picked out to the release branch. This is what ensures that the fix will also end up in subsequent releases: unlike in “Gitflow”, the release branch is not merged back to master.

And if (horrors!) an issue should crop up in a release after it is tagged and rolled out, it again gets fixed on master first and cherry-picked out to the release branch. A point release gets tagged and rolled out: release-27.1, say.

Only if master has moved on so much in the meantime that the fix for master doesn’t apply on the branch would fixing take place directly on the release branch.
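The heavyweight flow described above can be sketched as follows, in a scratch repository. The branch name release-25-dev follows the naming scheme above; the tag names release-25 and release-25.1, the file names, and the use of -x are assumptions for illustration, not a record of our exact commands.

```shell
set -eu
# Scratch repository; file names and commit messages are invented.
work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo; cd repo
git symbolic-ref HEAD refs/heads/master
echo v1 > firmware.txt; git add firmware.txt; git commit -q -m "feature work"

# Cut the release branch from master and (after QA) tag the release.
git branch release-25-dev master
git tag release-25 release-25-dev

# master carries on towards the next release.
echo new > next.txt; git add next.txt; git commit -q -m "work for next release"

# A bug is found: fix it on master first...
echo fixed >> firmware.txt; git commit -q -am "fix: firmware bug"
fix=$(git rev-parse HEAD)

# ...then cherry-pick it out to the release branch and tag a point release.
git checkout -q release-25-dev
git cherry-pick -x "$fix" >/dev/null
git tag release-25.1

git log --oneline --decorate --all
```

The -x flag appends a “cherry picked from commit ...” line to the new commit message, which makes it easier to check later that every fix on a release branch also exists on master.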

The lightweight release process

The lightweight release process, which we use for server releases, is based on responding with alacrity to new requirements or to current events – for instance, unexpected load on the servers might require new logging or instrumentation to be added basically immediately.

Releases are made so often that they don’t even get names (and nobody would remember or use them if they did). So to indicate the current state of the production servers, a deployment branch is used. (This is the same as the “blue branch” of “Gitflow”, except that we call it production rather than master.) It’s also the case that, because when we upgrade the server everyone gets it straightaway, previous versions are dead and gone: they don’t hang around in the way that previous firmware releases do. To a much larger extent than with firmware, at any given time only the most recent release matters at all.

As for updating production: if major replumbing or massive new functionality has landed in the server code, it might sometimes be useful to use the heavyweight process – except, with the success event being merging out to production rather than tagging and releasing. More often, though, the necessary alacrity is achieved by a reduced process: picking a suitable version of master, testing it (perhaps by deploying it to staging servers), applying fixes directly to master where necessary, and then simply merging out to production and pushing.

Hot-fixes, small patches to the production code done for emergency situations, can be written on master, cherry-picked locally into production, passed by code-reviewers and/or QA, and then pushed to origin/production. (In “Gitflow”, hot-fixes are landed via a short-lived hot-fix branch. That would be useful where a hot-fix itself consists of a series of commits, not just one – but that seems like it would rarely actually happen.)
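Both ways of updating production can be sketched in a throwaway sandbox like this. The local bare origin.git, the file names, and the commit messages are invented for the example, and the review and staging steps are only indicated by comments.

```shell
set -eu
# Throwaway sandbox: a local bare repository stands in for origin (assumption).
work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q --bare origin.git
git init -q dev; cd dev
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
echo v1 > server.txt; git add server.txt; git commit -q -m "server v1"
git branch production
git push -q origin master production

# Routine release: pick a (tested) commit on master and merge it out.
echo v2 > server.txt; git commit -q -am "server v2"
git push -q origin master
candidate=$(git rev-parse master)
git checkout -q production
git merge -q "$candidate"
git push -q origin production

# Hot-fix: written on master first, alongside other unreleased work...
git checkout -q master
echo feature > unreleased.txt; git add unreleased.txt
git commit -q -m "unreleased work"
echo "v2 hotfix" > server.txt; git commit -q -am "hot-fix: emergency logging"
hotfix=$(git rev-parse HEAD)
git push -q origin master

# ...then cherry-picked locally into production, reviewed, and pushed.
git checkout -q production
git cherry-pick -x "$hotfix" >/dev/null
git push -q origin production
```

After the hot-fix, production contains the fix but not the unreleased work sitting on master, which is the point of cherry-picking rather than merging in an emergency.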

Scaling out to enormous development organisations

All of the above assumes that the development organisation is small enough to operate as a single team. Above a certain size, this starts to become awkward: even the rare bad commits on master start to happen too often, and the (lock-free but not wait-free) bow-shaped merge process starts to become a bottle-neck.

In this situation, all you can do is introduce more process (and hope that the increase in developer numbers offsets the decrease in per-developer productivity – an outcome far from guaranteed). What you end up doing is dividing into teams and running the heavyweight release process – but, instead of releasing directly, releasing to an internal “meta-integration” branch where the “best available” versions of each team’s work are combined, to then face further automated and manual testing before actual release.

Really enormous organisations would end up with meta-meta-integration branches, or worse. Releases become great tides that ripple through the organisation, to be taken at the flood or omitted as necessary: the magic phrase to Google for to read more about Agile-in-the-large seems to be “release train”...

CC0 To the extent possible under law, the author of this work has waived all copyright and related or neighboring rights to this work.