Previously on #rust:
There is good support in Rust itself for targeting such platforms: it’s relatively easy to write such code (with the no_std attribute), and not even much harder to cross-compile it (using Cargo) and even run it on the target (the probe-rs folks do great work there). But if you’ve read some of the other posts here, you’ll be familiar with the idea that software isn’t done until it can be repeatably shown to be done. Someone (perhaps not Themis, the goddess of Justice, as pictured; more likely Laminar, the goddess of CI) must solemnly, dispassionately, objectively weigh the code’s activities against (some representation of) its specification, and hold it in judgement if it falls short.
Less fancifully, what’s needed is an automated way for CI (running on a rich, non-embedded host platform) to run the code on a genuine, embedded, target platform and check its functionality. Of course, it’s best to arrange that as much code as possible is abstracted away from the hardware so that it can be unit-tested on the host, run through Miri and Valgrind and other dynamic-analysis tools on the host, and just plain debugged on the host, where everything is a little less awkward. But the proof of the pudding is still in the eating: only tests that run on the target can be the final arbiter.
(In particular, whether the target hardware appears well-documented or not, it’s all too easy to have misconceptions about how peripherals behave, leading to a situation where the code and the host-side unit-test agree about what’s going on, but they’re both wrong because the actual hardware does something completely different.)
At Electric Imp we had (what became) a large subsystem of Python scripts which our CI system ran, and which in turn installed the newly-built device firmware on some dozens of Imp devices, running through system-tests including thorough regression-testing of the wifi connectivity, the “curated” peripherals, and really the entirety of the firmware functionality. This worked extremely well, and over the years caught simply oodles of bugs which had passed unit-testing but would have in some way scuppered our customers’ actual devices. But all of that Python always felt like an add-on to the main C++ build system, needing different skills to maintain. As Rust (or at least Cargo), by contrast, quite rightly represents testing as a first-class language feature, I wondered whether Cargo’s own facilities could be used to system-test embedded Rust without having to invent lots of extra infrastructure bolted on the side.
In this blog post I’ll focus on getting the testing infrastructure set up, literally just far enough to get a target-side test that prints “Hello, World” and a host-side test runner which checks that it has done so. Adding actual tests for the crates’ functionality (SSDP, to start with) will likely come in a subsequent post; CI considerations in yet another.
Goals of the Cotton automated system-tests
- Joel Test #2 compatibility: Joel Spolsky, writing some years ago now, has some pithy questions to ask software development teams. “Can you make a build in one step?” is always valid to ask, for the reasons he lists (mostly about not forgetting intermediate steps), but also because in practice what’s easy is what most people will do. If the easy thing to do is run a command that builds absolutely everything, then most of the time developers will run that command. (Except perhaps if it starts to take too long, but that’s a different issue.) And the more that’s built by the one command that everybody uses each day, the harder it is to accidentally introduce a bug that affects some builds or facets of the system but not others.
- Easy for me/anyone/CI to test everything on the host: because most of the code compiles for the host (potentially even parts that are only useful on the target), host-side development should remain straightforward. In particular, it mustn’t require the presence of a cross-toolchain, or nightly Rust, or any target hardware.
It certainly mustn’t require people who use the crates from crates.io in normal host-side builds, in the normal way, to install or attach anything special. (But Cargo makes sure of that anyway.)
Someone wanting to work on a Cotton crate in combination with their own project can check out the Cotton repository alongside their own, and use a path dependency when importing the Cotton crate (just as they would for a one-crate upstream repository).
- Easy for me/anyone/CI to test everything on one target: even if Cotton eventually targets many different embedded systems, almost all embedded developers will only have one type of hardware attached to their development host at any one time. It must be straightforward to run all the tests that can run on (say) an STM32F746-Nucleo, but none that requires different hardware.
- Easy for me/anyone/CI to test one crate on one target: eventually several crates (not just SSDP) will share the same system-test infrastructure; it must remain possible to test just one crate at a time, for the sake of development cycle time.
- Possible for CI to test everything on many/all targets: once Cotton targets several different embedded systems, each one should have its tests run by CI. But this mustn’t require one CI host per target; it must be possible to attach several development boards to the same CI host and have it run the right tests on the right boards. Notice that this goal is for it to be “possible”, not “easy”: having lots of different development boards attached is going to be uncommon, so it’s okay for it to need slightly more awkward setup.
- Separation of concerns: the cotton-ssdp crate, say, can probably be tested on each of several different target devices. Adding a new target device mustn’t require changes to cotton-ssdp itself.
Outline of the solution
You can see the merge that creates this infrastructure at commit 181a8fdc and the whole tree at that commit here on GitHub.
There were definitely a few false starts along the way to achieving those goals. The first thing I tried was “per-package targets”; this is a Rust facility that in theory should make it possible to mark certain packages in a workspace as building for a different platform than the rest of the workspace. The inspiration for the feature was people building web apps where the server end compiles to x86_64 or ARM64 or whatever’s cheap in AWS these days, but the client end compiles to WASM to run in browsers. It’s more-or-less exactly what’s needed here too — but sadly it’s more complicated to implement than you’d first think, and is only available in nightly Rust where it doesn’t work very well. (When I tried it, cargo test kept trying to run my STM32 binaries on the host.)
Without per-package targets, I was going to need different invocations of Cargo for host and device builds (because each invocation of Cargo can only build for a single platform). So I tried having a build.rs build script that, when invoked for the host platform, re-runs Cargo for the device platform. Yee-hah, right? Sheer cowboyery. It doesn’t work, because Cargo deadlocks trying to rebuild the same crate it’s already building. Can’t really blame it, either.
I looked for a while into setting rustflags in Cargo.toml, but that can’t be set per-package in a single workspace, let alone per-test in a single package. Each different set of rustflags must currently be a separate invocation of Cargo.
The answer seems to be, to take parts of each of those ideas:
- Have the workspace as a whole continue to build native for the host,
- and have a build script that re-invokes Cargo for each target platform,
- but have each target platform’s root crate be separate, and never built for the host, using an exclusion in the workspace Cargo.toml:

Cargo.toml:

    [workspace]
    members = [
        "cotton-netif",
        "cotton-ssdp",
        "systemtests",
    ]
    exclude = [
        "cross",
    ]

The resulting workspace structure looks like this:

    cotton
    ├── Cargo.toml
    ├── cotton-netif
    │   ├── Cargo.toml
    │   └── ...
    ├── cotton-ssdp
    │   ├── Cargo.toml
    │   └── ...
    ├── cross
    │   └── stm32f746-nucleo
    │       ├── .cargo
    │       │   └── config.toml
    │       ├── Cargo.toml
    │       ├── memory.x
    │       └── src
    │           └── bin
    │               └── hello.rs
    └── systemtests
        ├── build.rs
        ├── Cargo.toml
        ├── src
        │   └── lib.rs
        └── tests
            └── stm32f746-nucleo.rs

The cross subdirectory is excluded from the root workspace (in the root Cargo.toml), and the crates inside it (at present, only stm32f746-nucleo) are built by a recursive Cargo invocation in systemtests/build.rs; recursively invoking Cargo is a bit subtle, as you need to unset a bunch of environment variables in order that the sub-Cargo runs mostly as a new top-level Cargo (otherwise, the deadlocking issues reappear). Here’s the build-script section that achieves that, with the cross-compilation guarded by a (Cargo) feature called arm:
systemtests/build.rs (partial):

    if env::var("CARGO_FEATURE_ARM").is_ok() {
        /* Run the inner Cargo without any Cargo-related environment variables
         * from this outer Cargo.
         */
        let filtered_env: HashMap<String, String> = env::vars()
            .filter(|(k, _)| !k.starts_with("CARGO"))
            .collect();
        let child = Command::new("cargo")
            .arg("build")
            .arg("-vv")
            .arg("--all-targets")
            .arg("--target")
            .arg("thumbv7em-none-eabi")
            .current_dir("../cross/stm32f746-nucleo")
            .env_clear()
            .envs(&filtered_env)
            .output()
            .expect("failed to cross-compile for ARM");
        io::stdout().write_all(&child.stderr).unwrap();
        io::stdout().write_all(&child.stdout).unwrap();
        assert!(child.status.success());
    }

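The environment filtering is the subtle part of that snippet, so here is a minimal sketch of it pulled out into a function that can be exercised without spawning Cargo at all. The function name without_cargo_vars and the sample variables are hypothetical, not taken from the Cotton repository; only the filtering rule (drop every variable whose name starts with "CARGO") comes from the build script above.

```rust
use std::collections::HashMap;

/// Keep only the variables whose names do not start with "CARGO".
/// (Hypothetical helper: mirrors the filter used in the build script.)
fn without_cargo_vars<I>(vars: I) -> HashMap<String, String>
where
    I: IntoIterator<Item = (String, String)>,
{
    vars.into_iter()
        .filter(|(k, _)| !k.starts_with("CARGO"))
        .collect()
}

fn main() {
    // Made-up stand-in for the environment the outer Cargo gives a build script.
    let outer = vec![
        ("PATH".to_string(), "/usr/bin".to_string()),
        ("HOME".to_string(), "/home/me".to_string()),
        ("CARGO_FEATURE_ARM".to_string(), "1".to_string()),
        ("CARGO_MANIFEST_DIR".to_string(), "/work/systemtests".to_string()),
    ];
    let inner = without_cargo_vars(outer);
    for (name, value) in &inner {
        println!("{}={}", name, value);
    }
}
```

Note that a plain prefix match is deliberately blunt: it would also drop an unrelated variable that merely happens to start with "CARGO".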
The obvious downside here is that build scripts aren’t run with a live terminal: unless the outer Cargo is invoked with -vv, the script’s standard output and standard error are written only to files, and shown only if the script fails. If the recursive Cargo invocation succeeds, all you see is a long pause in your build; if it fails, you do at least see its error output.
For the time being, this acts as extra encouragement to keep complex code out of the only-compiled-for-target crates — though if need be you can always do a normal top-level Cargo build inside the STM32 crate, to see live standard output and standard error:
cargo -C cross/stm32f746-nucleo build --target thumbv7em-none-eabi
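One possible way to make the inner Cargo’s output visible in an ordinary build, which is not what the Cotton build script does, would be to re-emit the captured output as cargo:warning= lines, which Cargo displays for build scripts of local packages. The helper name as_cargo_warnings below is hypothetical, and whether the extra noise is worth it is a judgement call; treat this as a sketch under those assumptions.

```rust
/// Turn captured child output into `cargo:warning=` directives,
/// one per non-blank line. (Hypothetical helper, not from the Cotton repo.)
fn as_cargo_warnings(captured: &[u8]) -> Vec<String> {
    String::from_utf8_lossy(captured)
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| format!("cargo:warning={}", line))
        .collect()
}

fn main() {
    // Made-up sample of what an inner Cargo might have written to stderr.
    let captured = b"   Compiling hello v0.1.0\n\n    Finished dev profile\n";
    // A build script would print these lines to its standard output.
    for directive in as_cargo_warnings(captured) {
        println!("{}", directive);
    }
}
```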
Host-side runner for a device-side test
To start with (and to validate the rest of the solution), there is only one actual system-test in the first merge: writing the “Hello World” binary onto an STM32F746-Nucleo development board, running it, and checking the output. Cargo will have run our build script before running the test, so we know that our device-side binaries have all been built and are up-to-date (another Joel Test benefit). It’s then just a question of using (the excellent) probe-run, which is built on probe-rs and already knows about STM32 development boards, to write the binary to the STM32 chip and then run it. The STM32 binary uses defmt-rtt for its logging output:
cross/stm32f746-nucleo/src/bin/hello.rs:

    #![no_std]
    #![no_main]

    use defmt_rtt as _; // global logger
    use panic_probe as _;
    use cortex_m::asm;

    #[cortex_m_rt::entry]
    fn main() -> ! {
        defmt::println!("Hello STM32F746 Nucleo!");
        loop {
            asm::bkpt()
        }
    }

systemtests/tests/stm32f746-nucleo.rs:

    use assertables::*;
    use serial_test::*;
    use std::env;
    use std::path::Path;
    use std::process::Command;
    use std::io::{self, Write};

    #[test]
    #[serial]
    #[cfg_attr(miri, ignore)]
    fn arm_stm32f7_hello() {
        let elf = Path::new(env!("CARGO_MANIFEST_DIR")).join(
            "../cross/stm32f746-nucleo/target/thumbv7em-none-eabi/debug/hello",
        );

        let mut cmd = Command::new("probe-run");
        if let Ok(serial) = env::var("COTTON_PROBE_STM32F746_NUCLEO") {
            cmd.arg("--probe");
            cmd.arg(serial);
        }
        let output = cmd
            .arg("--chip")
            .arg("STM32F746ZGTx")
            .arg(elf)
            .output()
            .expect("failed to execute probe-run");

        println!("manifest: {}", env!("CARGO_MANIFEST_DIR"));
        println!("status: {}", output.status);
        io::stdout().write_all(&output.stderr).unwrap();
        io::stdout().write_all(&output.stdout).unwrap();
        assert!(output.status.success());
        let stdout = String::from_utf8_lossy(&output.stdout);
        assert_contains!(stdout, "Hello STM32F746 Nucleo");
    }

Notice the use of serial_test to defeat Rust’s default behaviour of running multiple integration tests in parallel — that wouldn’t end well if they were all competing for one single physical development board. (Although obviously there’s only one test at the moment.)
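To illustrate why that serialisation matters, here is a std-only sketch of the same idea: a single global lock standing in for the one physical board, with a counter that records how many users ever held the board at once. This is not how serial_test is implemented, and the names BOARD, with_board and run_parallel are hypothetical; it only demonstrates the property that #[serial] is being relied on to provide.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

// One lock per physical board (hypothetical stand-in for #[serial]).
static BOARD: Mutex<()> = Mutex::new(());
// How many callers are using the board right now, and the most ever seen.
static ACTIVE: AtomicUsize = AtomicUsize::new(0);
static MAX_ACTIVE: AtomicUsize = AtomicUsize::new(0);

/// Run `f` with exclusive access to the board.
fn with_board<T>(f: impl FnOnce() -> T) -> T {
    // Recover the guard even if an earlier user panicked while holding it.
    let _guard = BOARD.lock().unwrap_or_else(|e| e.into_inner());
    let now = ACTIVE.fetch_add(1, Ordering::SeqCst) + 1;
    MAX_ACTIVE.fetch_max(now, Ordering::SeqCst);
    let result = f();
    ACTIVE.fetch_sub(1, Ordering::SeqCst);
    result
}

/// Start `n` threads that all want the board; return the peak concurrency.
fn run_parallel(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|i| thread::spawn(move || with_board(|| i * 2)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    MAX_ACTIVE.load(Ordering::SeqCst)
}

fn main() {
    println!("peak simultaneous board users: {}", run_parallel(4));
}
```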
Clearly much more complex logging and checking could (and will) be implemented via the same mechanism, but as a proof-of-concept this suffices for this initial blog post.
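As more tests arrive, the path and command-line construction in the runner above could plausibly be factored into small pure functions, which are then themselves testable on the host without any hardware. The sketch below assumes that refactoring; the helper names elf_path and probe_run_args are hypothetical, while the directory layout and the --probe and --chip arguments are the ones already used in the runner.

```rust
use std::path::{Path, PathBuf};

/// Location of a device binary built by the recursive Cargo invocation,
/// relative to the systemtests crate. (Hypothetical helper.)
fn elf_path(manifest_dir: &Path, board: &str, target: &str, binary: &str) -> PathBuf {
    manifest_dir
        .join("..")
        .join("cross")
        .join(board)
        .join("target")
        .join(target)
        .join("debug")
        .join(binary)
}

/// Arguments for probe-run: an optional probe selector, then chip and ELF.
/// (Hypothetical helper; same argument order as the runner above.)
fn probe_run_args(probe: Option<&str>, chip: &str, elf: &Path) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(serial) = probe {
        args.push("--probe".to_string());
        args.push(serial.to_string());
    }
    args.push("--chip".to_string());
    args.push(chip.to_string());
    args.push(elf.display().to_string());
    args
}

fn main() {
    // "/work/systemtests" is a made-up manifest directory for illustration.
    let elf = elf_path(
        Path::new("/work/systemtests"),
        "stm32f746-nucleo",
        "thumbv7em-none-eabi",
        "hello",
    );
    let args = probe_run_args(None, "STM32F746ZGTx", &elf);
    println!("probe-run {}", args.join(" "));
}
```

A test would then pass the result to Command::new("probe-run").args(...) exactly as before.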
Does this meet our goals?
- ✅ Joel Test: the commands

    cargo build
    cargo test
    cargo build --all-features --all-targets
    cargo test --all-features --all-targets

all do as expected, and none need any cross-toolchains, special hardware, or even nightly Rust. (The all-features ones work because the arm feature is carefully excluded from all-features builds.) The same generic “CI for Rust” scripts that worked for simple old host-side Cotton continue to work fine with the exciting new multi-platform Cotton.
- ✅ Test everything on the host: as above, the easy, everyday commands build and test all the host-compatible crates. The device-only crates can be built with:

    cargo build -F arm
    cargo test -F arm

This needs the thumbv7em-none-eabi target to be installed for the current Rust toolchain:

    rustup target add thumbv7em-none-eabi

but does not need any hardware, nor nightly Rust.
- ✅ Test everything on one target: the Cargo.toml in the systemtests declares a Cargo feature stm32f746-nucleo, which depends on feature arm and enables the integration test whose host side is shown above. So building and running this test (i.e., running all the tests for that particular development board) looks like this:

    cargo build -F arm,stm32f746-nucleo
    cargo test -F arm,stm32f746-nucleo

This needs the cross-toolchain installed as above, and of course it needs an actual physical STM32F746-Nucleo development board attached via USB to the host computer running the tests.
- ✅ Test one crate on one target: because the system-tests infrastructure is shared, this would have to be accomplished by careful naming of tests, and the use of Cargo’s wildcard test name option:

    cargo test -F arm,stm32f746-nucleo --test '*ssdp*'

- ✅ Test everything on many targets: the main issue here is that, if several different (probe-rs compatible) development boards are attached to the same host, the probe-run command needs to be told each time which one to use. This is the idea behind the optional COTTON_PROBE_STM32F746_NUCLEO environment variable seen in the host-side runner above: a CI or other setup with several development boards attached needs to set these environment variables to the unique “probe identifier” of each board. This feature will get more of a workout in a future blog post when a second target platform is added.
- ✅ Separation of concerns: so far, this is good but not perfect. Device-side code that can be compiled for the host goes in the top-level workspace (perhaps alongside host-side code that can’t be compiled for the device). Device-side code that can’t be compiled for the host goes in one of the crates under the “cross” directory. (Maybe one day those too should become workspaces?) Entire device-side applications (or, at the very least, example ones) can live there too.
At some level it would be appealing for (say) system-tests for SSDP to go somewhere under the cotton-ssdp crate. But device-side tests are inherently device-specific, and given N crates and M devices and a potentially N×M-sized testing matrix, it seemed better to keep the tests by device rather than by crate-under-test, on the basis that all engineers working with embedded Rust (in the device directories) are surely also familiar with host-side Rust, but the reverse (engineers working on host-side Rust in, say, the SSDP crate being familiar with embedded Rust) seems less guaranteed.
Also, all the code in the systemtests crate, including much of build.rs, is generic across any crate wanting system-testing, and isn’t Cotton-specific. As of this blog post, no better advice is being offered here than to copy-and-paste it into your own projects, but in the future it would be more Rust-like to offer this functionality in free-standing crates that other projects desiring system-testing could just import in the normal way.
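Returning to the many-targets case: once a second board exists, the per-board environment variables could be derived from the board name instead of being spelled out in each test. The sketch below assumes a naming convention generalised from the single COTTON_PROBE_STM32F746_NUCLEO variable (upper-case the board name and replace anything non-alphanumeric with an underscore); that convention and the helper names probe_env_var and probe_for are assumptions, not something the Cotton repository defines.

```rust
use std::env;

/// Name of the environment variable holding the probe identifier for a
/// board, e.g. "stm32f746-nucleo" -> "COTTON_PROBE_STM32F746_NUCLEO".
/// (Assumed convention, generalised from the one variable in the runner.)
fn probe_env_var(board: &str) -> String {
    let suffix: String = board
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() {
                c.to_ascii_uppercase()
            } else {
                '_'
            }
        })
        .collect();
    format!("COTTON_PROBE_{}", suffix)
}

/// Look up the probe identifier for a board. The lookup function is a
/// parameter so that the logic can be tested without touching the real
/// process environment.
fn probe_for(board: &str, lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
    lookup(&probe_env_var(board))
}

fn main() {
    let board = "stm32f746-nucleo";
    match probe_for(board, |name| env::var(name).ok()) {
        Some(id) => println!("{}: using probe {}", board, id),
        None => println!("{}: {} not set, using the default probe", board, probe_env_var(board)),
    }
}
```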
That GitHub link once again
You can see the merge that creates this infrastructure at commit 181a8fdc and the whole tree at that commit here on GitHub.
Similar blogs elsewhere
Ferrous Systems have a series of blog posts covering very similar topics, though they use cargo-xtask to construct their single build commands.