Not in fact any relation to the famous large Greek meal of the same name.

Sunday 18 February 2024

System-testing embedded code in Rust, part one: Infrastructure

Previously on #rust:

One of the goals of the Rust crates I’ve been working on, cotton-netif and cotton-ssdp, is for them to be useful on embedded systems: microcontroller-based devices only capable of running simple real-time operating systems, as opposed to full-size Linux systems.

There is good support in Rust itself for targeting such platforms: it’s relatively easy to write such code (with the no_std attribute), not much harder to cross-compile it (using Cargo), and even running it on the target is well-supported (the probe-rs folks do great work there). But if you’ve read some of the other posts here, you’ll be familiar with the idea that software isn’t done until it can be repeatably shown to be done. Someone — perhaps not Themis, the goddess of Justice, as pictured; more likely Laminar, the goddess of CI — must solemnly, dispassionately, objectively weigh the code’s activities against (some representation of) its specification, and hold it in judgement if it falls short.

Less fancifully, what’s needed is an automated way for CI (running on a rich, non-embedded host platform) to run the code on a genuine, embedded, target platform and check its functionality. Of course, it’s best to arrange that as much code as possible is abstracted away from the hardware so that it can be unit-tested on the host, run through Miri and Valgrind and other dynamic-analysis tools on the host, and just plain debugged on the host, where everything is a little less awkward. But the proof of the pudding is still in the eating: only tests that run on the target can be the final arbiter.

(In particular, whether the target hardware appears well-documented or not, it’s all too easy to have misconceptions about how peripherals behave, leading to a situation where the code and the host-side unit-test agree about what’s going on, but they’re both wrong because the actual hardware does something completely different.)
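To make that abstraction concrete, here’s the shape of the thing, as an illustrative sketch only (the trait and names are invented for this post, not Cotton’s actual API): the logic is written against a trait, the firmware supplies a hardware-backed implementation, and the host-side unit tests supply a scripted fake.

/// Illustrative sketch only -- not Cotton's actual API. Logic is written
/// against this trait rather than against real hardware registers.
pub trait LinkStatus {
    /// Is the network link currently up?
    fn link_up(&self) -> bool;
}

/// Pure logic, unit-testable (and Miri-able, and debuggable) on the host.
pub fn describe_link(status: &impl LinkStatus) -> &'static str {
    if status.link_up() {
        "link up"
    } else {
        "link down"
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    /// A scripted fake standing in for the hardware.
    struct FakeLink(bool);

    impl LinkStatus for FakeLink {
        fn link_up(&self) -> bool {
            self.0
        }
    }

    #[test]
    fn reports_link_down() {
        assert_eq!(describe_link(&FakeLink(false)), "link down");
    }
}

The point of the target-side system-tests is precisely to catch the cases where the real hardware disagrees with FakeLink’s idea of how hardware behaves.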

At Electric Imp we had (what became) a large subsystem of Python scripts which our CI system ran, and which in turn installed the newly-built device firmware on some dozens of Imp devices, running through system-tests including thorough regression-testing of the wifi connectivity, the “curated” peripherals, and really the entirety of the firmware functionality. This worked extremely well, and over the years caught simply oodles of bugs which had passed unit-testing but would have in some way scuppered our customers’ actual devices — but all of that Python always felt like an add-on to the main C++ build system, needing different skills to maintain. As Rust (or at least Cargo), by contrast, quite rightly represents testing as a first-class language feature, I wondered whether Cargo’s own facilities could be used to system-test embedded Rust without having to invent lots of extra infrastructure bolted on the side.

In this blog post I’ll focus on getting the testing infrastructure set up, literally just far enough to get a target-side test that prints “Hello, World” and a host-side test runner which checks that it has done so. Adding actual tests for the crates’ functionality (SSDP, to start with) will likely come in a subsequent post; CI considerations in yet another.

Goals of the Cotton automated system-tests

  1. Joel Test #2 compatibility — Joel Spolsky, writing some years ago now, has some pithy questions to ask software development teams. “Can you make a build in one step?” is always valid to ask, for the reasons he lists — mostly about not forgetting intermediate steps — but also because in practice what’s easy is what most people will do. If the easy thing to do is run a command that builds absolutely everything, then most of the time developers will run that command. (Except perhaps if it starts to take too long, but that’s a different issue.) And the more that’s built by the one command that everybody uses each day, the harder it is to accidentally introduce a bug that affects some builds or facets of the system but not others.
  2. Easy for me/anyone/CI to test everything on the host — Because most of the code compiles for the host (potentially even parts that are only useful on the target), host-side development should remain straightforward. In particular, it mustn’t require the presence of a cross-toolchain, or nightly Rust, or any target hardware.

    It certainly mustn’t require people using the crates from crates.io in normal host-side builds to install or attach anything special. (But Cargo makes sure of that anyway.)

    Someone wanting to work on a Cotton crate in combination with their own project can check out the Cotton repository alongside their own, and use a path dependency when importing the Cotton crate, just as they would for a one-crate upstream repository (see the sketch just after this list).

  3. Easy for me/anyone/CI to test everything on one target — Even if Cotton eventually targets many different embedded systems, almost all embedded developers will only have one type of hardware attached to their development host at any one time. It must be straightforward to run all the tests that can run on (say) an STM32F746-Nucleo, but none that requires different hardware.
  4. Easy for me/anyone/CI to test one crate on one target — Eventually several crates (not just SSDP) will share the same system-test infrastructure; it must remain possible to test just one crate at a time, for the sake of development cycle time.

  5. Possible for CI to test everything on many/all targets — Once Cotton targets several different embedded systems, each one should have its tests run by CI. But this mustn’t require one CI host per target — it must be possible to attach several development boards to the same CI host and have it run the right tests on the right boards. Notice that this goal is for it to be “possible”, not “easy”: having lots of different development boards attached is going to be uncommon, so it’s okay for it to need slightly more awkward setup.
  6. Separation of concerns — The cotton-ssdp crate, say, can probably be tested on each of several different target devices. Adding a new target device mustn’t require changes to cotton-ssdp itself.
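
As an aside on goal 2: the path-dependency arrangement mentioned there needs nothing Cotton-specific in the downstream project, just Cargo’s ordinary syntax, along these lines (the paths here are illustrative):

Cargo.toml (downstream project; illustrative paths)
[dependencies]
cotton-ssdp = { path = "../cotton/cotton-ssdp" }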

Outline of the solution

You can see the merge that creates this infrastructure at commit 181a8fdc and the whole tree at that commit here on Github.

There were definitely a few false starts along the way to achieving those goals. The first thing I tried was “per-package targets”; this is a Rust facility that in theory should make it possible to mark certain packages in a workspace as building for a different platform than the rest of the workspace. The inspiration for the feature was people building web apps where the server end compiles to x86_64 or ARM64 or whatever’s cheap in AWS these days, but the client end compiles to WASM to run in browsers. It’s more-or-less exactly what’s needed here too — but sadly it’s more complicated to implement than you’d first think, and is only available in nightly Rust where it doesn’t work very well. (When I tried it, cargo test kept trying to run my STM32 binaries on the host.)
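
For the curious, the per-package-target syntax looks something like this in a package manifest (a sketch of the unstable, nightly-only feature as documented; Cotton doesn’t use it):

Cargo.toml (nightly-only sketch; not used in Cotton)
cargo-features = ["per-package-target"]

[package]
name = "stm32f746-nucleo"
version = "0.0.1"
edition = "2021"
# Always cross-compile this package, whatever the rest of the workspace does
forced-target = "thumbv7em-none-eabi"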

Without per-package targets, I was going to need different invocations of Cargo for host and device builds (because each invocation of Cargo can only build for a single platform). So I tried having a build.rs build script that, when invoked for the host platform, re-runs Cargo for the device platform. Yee-hah, right? Sheer cowboyery. It doesn’t work, because Cargo deadlocks trying to rebuild the same crate it’s already building. Can’t really blame it, either.

I looked for a while into setting rustflags in Cargo.toml, but that can’t be set per-package in a single workspace, let alone per-test in a single package. Each different set of rustflags must currently be a separate invocation of Cargo.

The answer seems to be, to take parts of each of those ideas:

  • Have the workspace as a whole continue to build native for the host,
  • and have a build script that re-invokes Cargo for each target platform,
  • but have each target platform’s root crate be separate, and never built for the host, using an exclusion in the workspace Cargo.toml:
    Cargo.toml
    [workspace]
    members = [
        "cotton-netif",
        "cotton-ssdp",
        "systemtests",
    ]
    
    exclude = [
        "cross",
    ]

The resulting workspace structure looks like this:

cotton
├── Cargo.toml
├── cotton-netif
│   ├── Cargo.toml
│   └── ...
├── cotton-ssdp
│   ├── Cargo.toml
│   └── ...
├── cross
│   └── stm32f746-nucleo
│       ├── .cargo
│       │   └── config.toml
│       ├── Cargo.toml
│       ├── memory.x
│       └── src
│           └── bin
│               └── hello.rs
└── systemtests
    ├── build.rs
    ├── Cargo.toml
    ├── src
    │   └── lib.rs
    └── tests
        └── stm32f746-nucleo.rs
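
(The .cargo/config.toml inside the STM32 crate is what points that crate at the ARM target and at its runner; its contents are along these lines, a sketch of the usual cortex-m-rt/defmt/probe-run arrangement rather than a verbatim copy of the repository file.)

cross/stm32f746-nucleo/.cargo/config.toml (sketch, not verbatim)
[build]
target = "thumbv7em-none-eabi"   # Cortex-M7, soft-float ABI

[target.thumbv7em-none-eabi]
runner = "probe-run --chip STM32F746ZGTx"
rustflags = [
    "-C", "link-arg=-Tlink.x",   # cortex-m-rt's linker script (uses memory.x)
    "-C", "link-arg=-Tdefmt.x",  # defmt's linker script
]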


The cross subdirectory is excluded from the root workspace (in the root Cargo.toml), and the crates inside it (at present, only stm32f746-nucleo) are built by a recursive Cargo invocation in systemtests/build.rs. Recursively invoking Cargo is a bit subtle: you need to unset a bunch of environment variables so that the sub-Cargo runs mostly as a new top-level Cargo (otherwise, the deadlocking issues reappear). Here’s the build-script section that achieves that, with the cross-compilation guarded by a (Cargo) feature called arm:

systemtests/build.rs (partial)
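    // Imports this snippet relies on, from earlier in build.rs:
    //     use std::collections::HashMap;
    //     use std::env;
    //     use std::io::{self, Write};
    //     use std::process::Command;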
    if env::var("CARGO_FEATURE_ARM").is_ok() {
        /* Run the inner Cargo without any Cargo-related environment variables
         * from this outer Cargo.
         */
        let filtered_env: HashMap<String, String> = env::vars()
            .filter(|(k, _)| !k.starts_with("CARGO"))
            .collect();
        let child = Command::new("cargo")
            .arg("build")
            .arg("-vv")
            .arg("--all-targets")
            .arg("--target")
            .arg("thumbv7em-none-eabi")
            .current_dir("../cross/stm32f746-nucleo")
            .env_clear()
            .envs(&filtered_env)
            .output()
            .expect("failed to cross-compile for ARM");
        io::stdout().write_all(&child.stderr).unwrap();
        io::stdout().write_all(&child.stdout).unwrap();
        assert!(child.status.success());
    }

The obvious downside here is that build scripts aren’t run with a live terminal: unless the outer Cargo is invoked with -vv, the script’s standard output and standard error are written only to files, and shown only if the script fails. If the recursive Cargo invocation succeeds, all you see is a long pause in your build — though at least if it fails, you do see its error output.

For the time being, this acts as extra encouragement to keep complex code out of the only-compiled-for-target crates — though if need be you can always do a normal top-level Cargo build inside the STM32 crate, to see live standard output and standard error:

cargo -C cross/stm32f746-nucleo build --target thumbv7em-none-eabi

Host-side runner for a device-side test

To start with (and to validate the rest of the solution), there is only one actual system-test in the first merge: writing the “Hello World” binary onto an STM32F746-Nucleo development board, running it, and checking the output. Cargo will have run our build script before running the test, so we know that our device-side binaries have all been built and are up-to-date (another Joel Test benefit) — and it’s just a question of using (the excellent) probe-run, which is built on probe-rs and already knows about STM32 development boards, to write the binary to the STM32 chip and then run it. The STM32 binary uses defmt-rtt for its logging output, via RTT (the Real-Time Transfer system), support for which is again built into probe-run (no semihosting! no UARTs!), so the device side is as simple as this:

cross/stm32f746-nucleo/src/bin/hello.rs
#![no_std]
#![no_main]
 
use defmt_rtt as _; // global logger
use panic_probe as _;
use cortex_m::asm;
 
#[cortex_m_rt::entry]
fn main() -> ! {
    defmt::println!("Hello STM32F746 Nucleo!");
 
    loop {
        asm::bkpt()
    }
}
and the host side only a little less simple:
systemtests/tests/stm32f746-nucleo.rs
use assertables::*;
use serial_test::*;
use std::env;
use std::path::Path;
use std::process::Command;
 
use std::io::{self, Write};
 
#[test]
#[serial]
#[cfg_attr(miri, ignore)]
fn arm_stm32f7_hello() {
    let elf = Path::new(env!("CARGO_MANIFEST_DIR")).join(
        "../cross/stm32f746-nucleo/target/thumbv7em-none-eabi/debug/hello",
    );
 
    let mut cmd = Command::new("probe-run");
    if let Ok(serial) = env::var("COTTON_PROBE_STM32F746_NUCLEO") {
        cmd.arg("--probe");
        cmd.arg(serial);
    }
    let output = cmd
        .arg("--chip")
        .arg("STM32F746ZGTx")
        .arg(elf)
        .output()
        .expect("failed to execute probe-run");
 
    println!("manifest: {}", env!("CARGO_MANIFEST_DIR"));
    println!("status: {}", output.status);
    io::stdout().write_all(&output.stderr).unwrap();
    io::stdout().write_all(&output.stdout).unwrap();
    assert!(output.status.success());
 
    let stdout = String::from_utf8_lossy(&output.stdout);
    assert_contains!(stdout, "Hello STM32F746 Nucleo");
}

Notice the use of serial_test to defeat Rust’s default behaviour of running multiple integration tests in parallel — that wouldn’t end well if they were all competing for one single physical development board. (Although obviously there’s only one test at the moment.)

Clearly much more complex logging and checking could (and will) be implemented via the same mechanism, but as a proof-of-concept this suffices for this initial blog post.

Does this meet our goals?

  1. Joel Test — the commands:
    cargo build
    cargo test
    cargo build-all-features --all-targets
    cargo test-all-features --all-targets
    all do as expected, and none need any cross-toolchains, special hardware, or even nightly Rust. (The all-features ones work because the arm feature is carefully excluded from all-features builds.) The same generic “CI for Rust” scripts that worked for simple old host-side Cotton continue to work fine with the exciting new multi-platform Cotton.
  2. Test everything on the host — as above, the easy, everyday commands build and test all the host-compatible crates. The device-only crates can be built with:
    cargo build -F arm
    cargo test -F arm
    This needs the thumbv7em-none-eabi target to be installed for the current Rust toolchain:
    rustup target add thumbv7em-none-eabi
    but does not need any hardware, nor nightly Rust.
  3. Test everything on one target — The Cargo.toml in the systemtests crate declares a Cargo feature stm32f746-nucleo, which depends on the arm feature and enables the integration test whose host side is shown above (the feature wiring is sketched just after this list). So building and running this test (i.e., running all the tests for that particular development board) looks like this:
    cargo build -F arm,stm32f746-nucleo
    cargo test -F arm,stm32f746-nucleo
    This needs the cross-toolchain installed as above, and of course it needs an actual physical STM32F746-Nucleo development board attached via USB to the host computer running the tests.
  4. Test one crate on one target — Because the system-tests infrastructure is shared, this would have to be accomplished by careful naming of tests, and the use of Cargo’s wildcard test name option:
    cargo test -F arm,stm32f746-nucleo --test '*ssdp*'
  5. Test everything on many targets — the main issue here is that, if several different (probe-rs compatible) development boards are attached to the same host, the probe-run command needs to be told each time which one to use. This is the idea behind the optional COTTON_PROBE_STM32F746_NUCLEO environment variable seen in the host-side runner above: a CI or other setup that has several development boards attached needs to specify, via these environment variables, the unique “probe identifier” of each one. This feature will get more of a workout in a future blog post when a second target platform is added.
  6. Separation of concerns — So far, this is good but not perfect. Device-side code that can be compiled for the host goes in the top-level workspace (perhaps alongside host-side code that can’t be compiled for the device). Device-side code that can’t be compiled for the host goes in one of the crates under the “cross” directory. (Maybe one day those too should become workspaces?) Entire device-side applications (or, at the very least, example ones) can live there too.

    At some level it would be appealing for (say) system-tests for SSDP to go somewhere under the cotton-ssdp crate. But device-side tests are inherently device-specific, and given N crates and M devices and a potentially N×M-sized testing matrix, it seemed better to organise the tests by device rather than by crate-under-test, on the basis that all engineers working with embedded Rust (in the device directories) are surely also familiar with host-side Rust, but the reverse (engineers working on host-side Rust in, say, the SSDP crate being familiar with embedded Rust) seems less guaranteed.

    Also, all the code in the systemtests crate, including much of build.rs, is generic across any crate wanting system-testing, and isn’t Cotton-specific. As of this blog post, no better advice is being offered here than to copy-and-paste it into your own projects, but in the future it would be more Rust-like to offer this functionality in free-standing crates that other projects desiring system-testing could just import in the normal way.
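
As promised under goal 3 above, here’s roughly what the feature wiring in systemtests/Cargo.toml amounts to (a sketch, not the verbatim file):

systemtests/Cargo.toml (sketch, not verbatim)
[features]
# Cross-compiling for ARM at all is opt-in, so plain `cargo test`
# needs no cross-toolchain (and all-features builds exclude it)
arm = []
# Each supported board gets a feature which implies `arm`
stm32f746-nucleo = ["arm"]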

That Github link once again

You can see the merge that creates this infrastructure at commit 181a8fdc and the whole tree at that commit here on Github.

Similar blogs elsewhere

Ferrous Systems have a series of blog posts covering very similar topics, though they use cargo-xtask to construct their single build commands.


Friday 9 February 2024

Perceived angular bisection

Given collinear points A, B, C in the plane, the locus of points P such that ∠APB = ∠BPC ≠ 0 is the unique generalised circle through B which inverts A to C.

Those angles are also trivially equal (both zero) for all P collinear with A, B, C and not between A and C. The locus of other points P is a true circle, except when AB = BC, in which case it is the straight line that is the perpendicular bisector of AC through B; "generalised" circle includes this case as a notional "circle of zero curvature".

Equivalently, the locus is the circle through B centred at a point O chosen such that OA·OC = OB² (again with the exception when AB = BC).

This result answers the question: given three points in a line and no other information — three lamp-posts in an otherwise-dark landscape, for instance — whereabouts are the viewpoints from which the middle one appears, through perspective, to be exactly halfway in-between the other two?

Proof

Consider such a point P. Without loss of generality, assume AB < BC (the AB=BC case is trivial, and the AB > BC case is symmetrical with this one).

Draw the unique circle through P and B whose centre is collinear with A, B, and C (e.g. by constructing the perpendicular bisector of PB, and marking the centre where that bisector intersects the line through A and C). Call this centre O.

Label the angles as shown: ∠OPA = c; ∠APB = d, which is also ∠BPC; ∠OCP = e; ∠OBP = g.

Then consider:

In △CPB, the angle at B is 180° − g (because O, B, and C are collinear), so e + d + (180° − g) = 180°, giving e + d = g. But △POB is isosceles (OP and OB are both radii of the circle we just drew), so c + d = g. Therefore e = c, i.e. ∠OPA = ∠OCP.

But the triangles OAP and OPC also have the angle at O in common, so they have two angles in common; therefore they are similar, and the ratios of corresponding sides are equal: OA/OP = OP/OC.

And OP = OB, because both are radii of the circle, so OA/OB = OB/OC, or OA·OC = OB², which is one definition of the circle of inversion.

Note that the position of O is the same whichever P we pick; it doesn't depend on any of the angles. (This is easier to see if we rearrange OA·OC = OB² into OA = AB²/(BC − AB) — it's uniquely specified by the positions of A, B, and C.) So the circle does represent the locus of all P.
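
As a quick numerical check (my own example, not from the original argument): put A, B, C at 0, 1, 3 on a number line, so AB = 1 and BC = 2. Then OA = AB²/(BC − AB) = 1²/(2 − 1) = 1, placing O at −1, with radius OB = 2; and indeed OA·OC = 1 × 4 = 4 = OB².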

Bit off-topic for this blog though eh

Yes. But it is a problem I've been thinking about for a while — in fact, ever since I read that, given the original formulation (three lamp-posts on a darkened plain), you can't even tell from a photograph whether or not they're evenly-spaced, because perspective means that they could be anywhere. But if your photograph has four lamp-posts in it, you can tell whether or not they're evenly-spaced, because the cross-ratio is projection-invariant.

So I wondered whether, given three lamp-posts, there was always a point from which they appear evenly-spaced. It felt sort-of plausible that there were such points, but I had no intuition what their locus looked like. The first time I attacked this problem I plotted the points by brute force, and was surprised to see an apparently perfect circle. With the cosine rule and lots of ghastly slog (and Wolfram Alpha) I found the equation of the circle and the position of O. But that was clearly not The Book's proof, so I shelved the draft blog post — for some years. Then I happened to be reading about circle inversion, and suddenly realised it was talking about the three-lamp-posts circle. Eventually I was able to use circle-inversion techniques to come up with a geometrical, rather than algebraic, proof, which I'm much happier with.

And if you're dying to see how bad the ghastly brute-force proof looked, here's a part of it (p is AB and q is BC):

p²q²x² − 2p²qx³ + p²x⁴ + p²x²y² + 2pq²x³ + 2pq²xy² − 4pqx⁴ − 4pqx²y² + 2px⁵ + 4px³y² + 2pxy⁴ + q²x⁴ + 2q²x²y² + q²y⁴ − 2qx⁵ − 4qx³y² − 2qxy⁴ + x⁶ + 3x⁴y² + 3x²y⁴ + y⁶ = p²q²x² − 2p²qx³ − 2p²qxy² + p²x⁴ + 2p²x²y² + p²y⁴ + 2pq²x³ − 4pqx⁴ − 4pqx²y² + 2px⁵ + 4px³y² + 2pxy⁴ + q²x⁴ + q²x²y² − 2qx⁵ − 4qx³y² − 2qxy⁴ + x⁶ + 3x⁴y² + 3x²y⁴ + y⁶

(I didn't set that by hand; there used to be an online LaTeX-to-MathML converter, but it seems to have since been retired.)
