Not in fact any relation to the famous large Greek meal of the same name.

Monday 18 March 2024

System-testing embedded code in Rust, part two: Things I learned testing SSDP

With the basic system-test infrastructure now in place thanks to the previous post in this series, it’s time to wire the STM32F746-Nucleo development board up to Ethernet and start testing actual code: the cotton-ssdp crate.

Having said that, the first thing to do after plugging in Ethernet is to write a test that verifies basic connectivity. If packets can’t flow between the Nucleo and the rest of the network for any reason, then there’s no point disparaging the SSDP code for its failure to communicate. The basic test will establish that simple networking is operational: that the Ethernet interface sees link (i.e., that the Ethernet cable is actually connected to the Nucleo, and also to something at the other end), and also that DHCP can succeed and the Nucleo obtain an IP address.

And of course writing that code isn’t throwaway effort: every other network-related test will need to do all those things first before getting on with more specific tasks. So the setup code will form part of all subsequent SSDP test binaries.

As always, I encountered problems along the way, because Rust. But all of those problems eventually had solutions, also because Rust.

The aim of the tests

It’s worth just going over what the goals are here. Why spend development time on writing these tests, and on cabling up these test rigs? What is the payback?

My answer is that I’d like to be able to work on cotton-ssdp, and eventually other similar crates, knowing that I have automated testing that verifies new functionality and defends against regressions in existing functionality. When I push a new feature branch to the CI server, I want it to tell me, as straightforwardly and clearly as possible, whether or not my changes are OK for main.

I don’t think the testing constitutes a promise that the code is completely bug-free (even less, that the functionality it provides is objectively useful). But it does, at the very least, constitute a promise that certain types or certain show-stopping severities of bug are absent. Tom DeMarco, writing in Peopleware, describes Gilb’s Law: “Anything you need to quantify can be measured in some way that is superior to not measuring it at all”. Something similar applies here: any software you need to test can be tested automatically in some way that is superior to not testing it at all.

As a target, “superior to doing nothing” is not a very high bar to clear. These system-tests aren’t very comprehensive in terms of, say, line coverage of the cotton-ssdp crate. But then the crate, after all, is thoroughly unit-tested. These system-tests are more about testing the platform integration code — here, with the RTIC real-time operating system, and the smoltcp networking stack — and about systematically verifying the crate’s original goal of being useful to implementers of embedded systems.

Concretely, the tests presented here really just check that the device can discover resources on the network, and advertise its own resources. Once that two-way communication is proven to work, everything else about exactly what is communicated is already covered by the unit tests.

I’m not saying that the existence of a unit test automatically renders useless any system testing of the same function. Unit tests and system tests have different (human, organisational) readership — typically, unit tests are only interesting to developers, whereas system tests are often high-level enough and visible enough to serve as technology demonstrations to project managers and beyond — and both audiences are entitled to ask for and to see evidence of all claimed functionality. But in this case, the developers, project managers and beyond are all me, and anyway adding a huge variety of tests would clog up the narrative of this blog post, which focuses more on describing the framework.

Being a good citizen of Ethernet: MAC addresses

We’ll need to start by getting this Nucleo board onto the local Ethernet. In order to participate in SSDP, it’s going to need an IP address, as handed out by the DHCP server in my router. But in order to even participate in Ethernet enough to communicate with the DHCP server, it’s going to need its own Ethernet address. This is also called a hardware address or MAC address: it’s 48 bits (six bytes) long, every device on an Ethernet network has one, and it’s often printed on the back of routers and suchlike as twelve hex digits separated by colons. Some networking hardware comes with an officially-allocated MAC address built-in (as a company, you can get ranges of them allocated to you, like IP addresses), but STM32s don’t, probably because ST Micro sell a lot of STM32s, many of them into designs (such as the Electric Imp) where they never even use their Ethernet circuitry, and it’d be a waste of a finite resource for ST Micro to allocate each one its own MAC address from the fixed pool.

For our purposes it’d be overkill to get an official address block allocated (though you’d need to if you intended to sell actual products), so it’s fortunate that an alternative way of obtaining an address is possible. One of those 48 bits is set to zero in every official (“Universally Administered”) address, but can be set to one to indicate a “Locally Administered” address, i.e. one chosen by the local network administrator. Which is also me! So in the tests, we set that bit, then pick a device-specific value for the other 47 bits (in fact 46 as there’s another reserved one), and use the result as our MAC address. So long as the 46 device-specific bits are chosen randomly enough, the chance of an accidental collision is suitably negligible.

So we need a calculation that always provides the same answer when performed on the same device, but always provides different answers when performed on different devices. Fortunately each STM32 does include a unique chip ID, burned into each individual die at chip-manufacture time, as described in the STM32F74x reference manual (“RM0385”) section 41.1.

But it’s not a good idea to just use the raw chip ID as the MAC address, for several reasons: it’s the wrong size, it’s quite predictable (it’s not 96 random bits per chip, it encodes the die position on the wafer, so two different STM32s might have IDs that differ only in one or two bits, meaning we can’t just pick any 46 bits from the 96 in case we accidentally pick seldom-changing ones) — and, worst of all, if anyone were to use the same ID for anything else later, they might be surprised if it were very closely correlated with the device’s MAC address.

So the thing to do is to hash the unique ID along with a key, or salt, which indicates what we’re using it for. You can see this on GitHub or right here:

use core::hash::Hasher; // trait providing write() and finish()

pub fn stm32_unique_id() -> &'static [u32; 3] {
    // SAFETY: this address is only valid when running on STM32
    unsafe {
        let ptr = 0x1ff0_f420 as *const [u32; 3];
        &*ptr
    }
}

pub fn unique_id(salt: &[u8]) -> u64 {
    let id = stm32_unique_id();
    let key1 = (u64::from(id[0]) << 32) + u64::from(id[1]);
    let key2 = u64::from(id[2]);
    let mut h = siphasher::sip::SipHasher::new_with_keys(key1, key2);
    h.write(salt);
    h.finish()
}

pub fn mac_address() -> [u8; 6] {
    let mut mac_address = [0u8; 6];
    let r = unique_id(b"stm32-eth-mac").to_ne_bytes();
    mac_address.copy_from_slice(&r[0..6]);
    mac_address[0] &= 0xFE; // clear multicast bit
    mac_address[0] |= 2; // set local bit
    mac_address
}

This is quite a general mechanism for producing device-specific but deterministic unique IDs; it’s also used by the SSDP test for generating the UUID for the example resource. And it can be made more sophisticated by hashing-in more information. Want a unique, but deterministic, Wifi MAC address that’s per-network to defeat tracking? Hash in the Wifi network name, or the router's MAC address. (It’s harder to avoid tracking on Ethernet, because you need the MAC address before you even start using DHCP, i.e. before you know anything about the network you’re joining. But of course passive tracking isn’t really a threat model on Ethernet, where you must have chosen to plug in the cable.) Want different UUIDs for different UPnP services? Hash in the service name. One thing you mustn’t do, though, is to use these IDs as cryptographic identifiers, as the hash function in question hasn’t been analysed for the collision-resistance and irreversibility properties which you need for such identifiers.
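
To try the salted-hash idea without a Nucleo attached, here is a minimal host-side sketch. It rests on some loudly-stated assumptions: FAKE_CHIP_ID is a made-up value standing in for the real chip ID, the salt "example-resource-uuid" is invented for illustration, and the standard library’s deprecated std::hash::SipHasher is swapped in for the siphasher crate purely so that the example builds with no dependencies.

```rust
#[allow(deprecated)]
use std::hash::{Hasher, SipHasher};

/// Hypothetical 96-bit chip ID; on real hardware this is read from the STM32.
const FAKE_CHIP_ID: [u32; 3] = [0x0020_0031, 0x3436_5105, 0x3237_3833];

/// Hash the chip ID (used as the key) together with a purpose-specific salt.
#[allow(deprecated)]
fn unique_id(chip_id: &[u32; 3], salt: &[u8]) -> u64 {
    let key1 = (u64::from(chip_id[0]) << 32) + u64::from(chip_id[1]);
    let key2 = u64::from(chip_id[2]);
    // Stand-in hasher (assumption): std's deprecated SipHasher, not siphasher
    let mut h = SipHasher::new_with_keys(key1, key2);
    h.write(salt);
    h.finish()
}

/// Derive a locally-administered, unicast MAC address from the chip ID.
fn mac_address(chip_id: &[u32; 3]) -> [u8; 6] {
    let r = unique_id(chip_id, b"stm32-eth-mac").to_ne_bytes();
    let mut mac = [0u8; 6];
    mac.copy_from_slice(&r[0..6]);
    mac[0] &= 0xFE; // clear multicast bit
    mac[0] |= 0x02; // set locally-administered bit
    mac
}

fn main() {
    let mac = mac_address(&FAKE_CHIP_ID);
    let text: Vec<String> = mac.iter().map(|b| format!("{b:02x}")).collect();
    println!("MAC address: {}", text.join(":"));

    // A different salt gives an unrelated ID from the same chip.
    let uuid_seed = unique_id(&FAKE_CHIP_ID, b"example-resource-uuid");
    println!("UUID seed: {uuid_seed:016x}");
}
```

Whatever the hash produces, the two bit-twiddling lines guarantee that the first byte marks the address as unicast and locally administered, and running it twice with the same chip ID gives the same answer.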

Spike then refactor: Adding smoltcp support

Getting the cotton-ssdp crate tested on embedded devices was always going to involve porting it to use smoltcp as an alternative to the standard-library socket implementation that it currently uses. I fondly imagined that the result would look like the idealised “triangle of initialisation”:

fn main() {
    let mut a = A::new();
    let mut b = B::new(&a);
    let mut c = C::new(&a, &b);
    let mut d = D::new(&a, &b, &c);
    ...
    do_the_thing(&a, &b, &c, &d, ...);
}

so, in this case, perhaps:

fn main() {
    let mut stm32 = Stm32::<STM32::F746, STM32::Power::Reliable3V3>::default();
    let mut smoltcp = Smoltcp::new(stm32.ethernet());
    run_dhcp_test(&mut stm32, &mut smoltcp);
}

I didn’t quite get to that level of neatness — there’s a lot of boilerplate in most embedded software, and anyway that pseudocode appears to allocate everything on the stack, where overruns are a runtime failure, as opposed to being in the data segment, where overrun would be (as an improvement) a link-time failure.

But the first version of the DHCP test was over 550 lines of boilerplate, so most of the commits on the pdh-stm32-ssdp branch are just about tidying away the common code to make the intent of the test more obvious.

Once DHCP was working, implementing a simple test for SSDP could commence. (And adding a second network-related test spurred the factoring-out of yet more common code between the two.) Rather than immediately wade in to changing cotton-ssdp itself, though, I was able to use the exposed networking traits which cotton-ssdp already contained, to start implementing smoltcp support directly in the test. This was a fortuitous case of “pre-adaptation”: the abstractions that let cotton-ssdp’s core be agnostic on the matter of mio versus tokio, turned out to be exactly those needed for also abstracting away smoltcp. (Well, okay, that wasn’t completely fortuitous, as I did have embedded systems in mind even when developing hosted cotton-ssdp.)

Once the test was working, I could then move those trait implementations into cotton-ssdp, hiding them behind a new Cargo feature smoltcp in order not to introduce needless new dependencies for those using cotton-ssdp on hosted platforms.

I’m not sure I’m quite your (IPv4) type

If you look at the cotton-ssdp additions for smoltcp support, you’ll see that only a small part of it (lines 220-300) is the actual smoltcp API usage. A much larger part is taken up by conversions between types representing IP addresses.

The issue is that networking APIs are typically system-specific, and not present on many embedded systems, so the Rust people very sensibly and usefully left those functions out of the embedded, no_std configuration. But, a bit less usefully for our purposes, they also left out the types representing IPv4 and IPv6 addresses. These types are not platform-specific (they’re straight from the RFCs) but, as they weren’t available to no_std builds of smoltcp, the smoltcp people were forced to invent their own versions.

Subsequently a crate called “no-std-net” was released, containing IP address types (structurally) identical to the standard-library ones when built with no_std, and just renaming the standard-library ones when built hosted. The cotton-ssdp crate uses the no-std-net names.

Now in fairness the Rust people did then realise that this situation wasn’t ideal, and standardised types are set to land in Rust 1.77 — but that’s much newer than most people’s minimum-supported-Rust-version (MSRV), and anyway tons of smoltcp users in the field are still using the smoltcp versions. So conversions were necessary.

Rust has very well-defined idioms for type conversion, via implementing the From trait — so on the face of it, all that’s needed is to implement From for the standard types on the smoltcp types, and vice versa. That doesn’t work, though, because of Rust’s “Orphan Rule”, which allows trait implementations only in the crate defining the trait or the one defining the type being extended — and cotton doesn’t define From, std, or smoltcp. The best we can do, it seems, is to invent yet another set of “Generic” IP address types so that we can define the conversions both ways. This leads to lots of double-conversions; stm32f746-nucleo-ssdp-rtic has the likes of

no_std_net::IpAddr::V4(GenericIpv4Address::from(ip).into())

and

GenericSocketAddr::from(sender.endpoint).into()

but it keeps the core code looking sane.
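
The shape of that workaround can be sketched in a few lines. This is an illustration under assumptions, not the cotton-ssdp code: the stack module here is a hypothetical stand-in for smoltcp’s address type, and std::net::Ipv4Addr plays the part of the no-std-net type. In a real build both of those types are foreign, which is exactly what makes the locally-owned middle type necessary.

```rust
use std::net::Ipv4Addr;

/// Hypothetical stand-in for the network stack's own IPv4 address type.
mod stack {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Ipv4Address(pub [u8; 4]);
}

/// Locally-defined bridge type: because this crate owns it, the orphan
/// rule permits From implementations to and from both foreign types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GenericIpv4Address([u8; 4]);

impl From<Ipv4Addr> for GenericIpv4Address {
    fn from(ip: Ipv4Addr) -> Self {
        Self(ip.octets())
    }
}

impl From<GenericIpv4Address> for Ipv4Addr {
    fn from(ip: GenericIpv4Address) -> Self {
        Ipv4Addr::from(ip.0)
    }
}

impl From<stack::Ipv4Address> for GenericIpv4Address {
    fn from(ip: stack::Ipv4Address) -> Self {
        Self(ip.0)
    }
}

impl From<GenericIpv4Address> for stack::Ipv4Address {
    fn from(ip: GenericIpv4Address) -> Self {
        stack::Ipv4Address(ip.0)
    }
}

fn main() {
    // The SSDP multicast address, as the stack would hand it over
    let from_stack = stack::Ipv4Address([239, 255, 255, 250]);

    // Double conversion: stack type -> generic type -> standard type
    let std_ip: Ipv4Addr = GenericIpv4Address::from(from_stack).into();
    println!("standard-library form: {std_ip}");

    // ...and the same trick in the other direction
    let back: stack::Ipv4Address = GenericIpv4Address::from(std_ip).into();
    println!("stack form: {back:?}");
}
```

Each hop is a from() followed by an into(), which is where the double-conversions in the test binary come from.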

Parts of the generic-types code are more complex than I’d hoped because the smoltcp stack itself can be built either with or without IPv6 (and IPv4 for that matter), using Cargo features. As it stands, cotton-ssdp only uses IPv4, so it depends on smoltcp with the IPv4 feature. But users of cotton-ssdp will of course be using it in some wider system, and, because of the way Rust’s crate feature resolution works, if any part of that system enables the IPv6 feature of smoltcp, then everyone gets a smoltcp with IPv6 enabled. But the enumerated IP address types inside smoltcp (such as wire::Endpoint) change if IPv6 is enabled! That means that cotton-ssdp has no way of knowing whether its usage of smoltcp will result in it getting handed the one-variant versions of those enumerated types, or the two-variant versions. That’s why the conversions accepting those smoltcp types must look like this:

        // smoltcp may or may not have been compiled with IPv6 support, and
        // we can't tell
        #[allow(unreachable_patterns)]
        match ip {
            wire::IpAddress::Ipv4(v4) => ...
            _ => ...
        }

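in order to compile cleanly however many variants ip has: if there’s only one variant, it needs the allow() to silence the unreachable-pattern lint on the “_ =>” arm (a warning by default, but fatal to the build under a deny-warnings policy), and if there are two or more, it needs the “_ =>” arm in order to compile at all.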
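
Here is a self-contained sketch of the same situation, with a hypothetical IpAddress enum standing in for the smoltcp one. The feature name "proto-ipv6" is only there to mimic a feature-dependent variant; built standalone, that feature is off and the enum has a single variant, which is the case where the wildcard arm is unreachable. For simplicity the allow() sits on the whole function rather than on the match itself.

```rust
/// Hypothetical stand-in for an address enum whose set of variants
/// depends on which Cargo features some other crate has enabled.
#[derive(Debug, Clone, Copy)]
pub enum IpAddress {
    Ipv4([u8; 4]),
    #[cfg(feature = "proto-ipv6")]
    Ipv6([u8; 16]),
}

/// Extract the IPv4 octets, whichever shape the enum turns out to have.
// With one variant, the wildcard arm below is unreachable, hence the
// allow(); with two variants, the wildcard arm is what makes the match
// exhaustive.
#[allow(unreachable_patterns)]
pub fn ipv4_octets(ip: IpAddress) -> Option<[u8; 4]> {
    match ip {
        IpAddress::Ipv4(v4) => Some(v4),
        _ => None,
    }
}

fn main() {
    let ip = IpAddress::Ipv4([192, 168, 0, 2]);
    println!("{:?}", ipv4_octets(ip));
}
```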

“Don’t Panic” in large, friendly letters

The first version of the DHCP test worked much like the existing “Hello World” test, which was refactored to match: spawn probe-run as a child process, listen to (and run tests against) its trace output, and then shut it down (in a Drop handler) once all tests have passed.

This was a plausible first stab, and worked well every time that the tests passed, but turned out not to be sound if a test ever failed. (Edit: A previous version of this page blamed the way that a failing test panics, for not unwinding the stack, meaning Drop handlers don’t run. But that’s not actually the case — panics do unwind the stack and do run Drop handlers — so it’s now not clear why the following change was needed. But at least it’s still a valid way of doing it, even though it’s not required.)

I found the solution on the Eric Opines blog: run the test inside panic::catch_unwind(). That function takes a closure to run, and returns a Result indicating either successful exit or a (contained) panic. This required refactoring the DeviceTest struct to also use a closure, but actually the resulting tests look quite neat and well-defined — here’s “Hello World”:

fn arm_stm32f746_nucleo_hello() {
    nucleo_test(
        "../cross/stm32f746-nucleo/target/thumbv7em-none-eabi/debug/stm32f746-nucleo-hello",
        |t| {
            t.expect("Hello STM32F746 Nucleo", Duration::from_secs(5));
        },
    );
}

The two parameters to nucleo_test() are of course the compiled binary to run — remembering that the build system ensures that these are up-to-date before starting the test — and a closure containing the body of the test. The closure gets passed the DeviceTest object, on which the available methods are expect(), which waits up to the given timeout for a message to appear on the device’s (virtualised, RTT) standard output (and if the timeout elapses without seeing it, fails the test), and expect_stderr() which does just the same for the device’s standard-error stream.
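
As a rough idea of how such a wrapper can guarantee shutdown, here is a minimal host-only sketch of the catch_unwind() pattern. It is not the real nucleo_test(): FakeDevice, run_device_test() and the SHUTDOWNS counter are all hypothetical names invented for this example, and for brevity the closure takes no argument rather than being passed a DeviceTest.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts shutdowns, so the example can show that cleanup really ran.
static SHUTDOWNS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the child process talking to the device.
struct FakeDevice;

impl FakeDevice {
    fn start() -> Self {
        FakeDevice
    }

    fn shut_down(self) {
        SHUTDOWNS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Run `body`, always shut the device down, then re-raise any panic so
/// that the surrounding test still fails.
fn run_device_test<F>(body: F)
where
    F: FnOnce() + panic::UnwindSafe,
{
    let device = FakeDevice::start();
    let result = panic::catch_unwind(body);
    device.shut_down();
    if let Err(payload) = result {
        panic::resume_unwind(payload);
    }
}

fn main() {
    // A passing test body: the device is shut down afterwards
    run_device_test(|| assert_eq!(1 + 1, 2));

    // A failing test body: the panic still propagates, but only after
    // the device has been shut down
    let outcome = panic::catch_unwind(|| {
        run_device_test(|| panic!("expectation timed out"));
    });
    assert!(outcome.is_err());
    assert_eq!(SHUTDOWNS.load(Ordering::SeqCst), 2);
}
```

The failing case prints the usual panic message to standard error before carrying on, which is expected.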

Because panic::catch_unwind() requires its closure to be “unwind-safe”, so does nucleo_test(); so far, this hasn’t been an issue in practice, so I haven’t looked deeply into what to do about it otherwise. The DHCP test, at least on the host side, is just as straightforward:

fn arm_stm32f746_nucleo_dhcp() {
    nucleo_test(
        "../cross/stm32f746-nucleo/target/thumbv7em-none-eabi/debug/stm32f746-nucleo-dhcp-rtic",
        |t| {
            t.expect_stderr("(HOST) INFO  success!", Duration::from_secs(30));
            t.expect("DHCP config acquired!", Duration::from_secs(10));
        },
    );
}

The closure pattern was so appealing that I made the SSDP test use the same design — not least to ensure that it, too, was correctly shut down even following a failing test:

fn arm_stm32f746_nucleo_ssdp() {
    nucleo_test(
        "../cross/stm32f746-nucleo/target/thumbv7em-none-eabi/debug/stm32f746-nucleo-ssdp-rtic",
        |nt| {
            nt.expect_stderr("(HOST) INFO  success!", Duration::from_secs(30));
            nt.expect("DHCP config acquired!", Duration::from_secs(10));
            ssdp_test(
                Some("cotton-test-server-stm32f746".to_string()),
                |st| {
                    nt.expect("SSDP! cotton-test-server-stm32f746",
                              Duration::from_secs(20));
                    st.expect_seen("stm32f746-nucleo-test",
                              Duration::from_secs(10));
                }
            );
        }
    );
}

The implementation of ssdp_test() itself is a little more involved, because it must spawn a temporary background thread to start and run the host’s SSDP engine, which communicates with the one on the device. The two parameters are an optional SSDP notification-type to advertise to the device, and a closure to contain the body of the test. The method available on the SsdpTest object passed to the closure is expect_seen(), which waits with a timeout for someone on the network (hopefully, the device under test) to advertise a specific notification-type. In the SSDP test above, the nt.expect() line checks that the device has seen the host’s advertisement, and the st.expect_seen() line checks that the host has seen the one from the device.

Those two events can occur in either order in practice, but both DeviceTest and SsdpTest buffer-up notifications, so that an expectation that has already come to pass before the expect call is made, completes immediately. In the future it might be interesting to investigate using async/await to express the asynchronous nature of this test more explicitly.
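
The buffering idea can be sketched with a standard-library channel. This is a simplified illustration rather than the real DeviceTest: BufferedExpect is a hypothetical name, its expect() returns a bool instead of failing the test, and it keeps every line it has ever seen (an assumption; the real implementation may well differ).

```rust
use std::sync::mpsc::{self, Receiver};
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for a test object that remembers
/// every line of output seen so far.
struct BufferedExpect {
    incoming: Receiver<String>,
    seen: Vec<String>,
}

impl BufferedExpect {
    fn new(incoming: Receiver<String>) -> Self {
        Self { incoming, seen: Vec::new() }
    }

    /// True if a line containing `needle` has already arrived, or arrives
    /// before the timeout elapses.
    fn expect(&mut self, needle: &str, timeout: Duration) -> bool {
        let deadline = Instant::now() + timeout;
        loop {
            // Check the buffer first, so that past events complete at once
            if self.seen.iter().any(|line| line.contains(needle)) {
                return true;
            }
            let remaining = deadline.saturating_duration_since(Instant::now());
            match self.incoming.recv_timeout(remaining) {
                Ok(line) => self.seen.push(line),
                Err(_) => return false, // timed out, or the sender went away
            }
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let mut t = BufferedExpect::new(rx);

    // Both messages arrive before any expectation is checked
    tx.send("DHCP config acquired!".to_string()).unwrap();
    tx.send("SSDP! cotton-test-server-stm32f746".to_string()).unwrap();

    // Checking them in the "wrong" order still succeeds
    assert!(t.expect("SSDP!", Duration::from_millis(100)));
    assert!(t.expect("DHCP config acquired!", Duration::from_millis(100)));

    // Something that never arrives times out
    assert!(!t.expect("never printed", Duration::from_millis(50)));
}
```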

The SSDP test only works if the Nucleo board and the test running on the host can exchange packets. Typically this means that they must be on the same Ethernet network or, if the host is on Wifi, that the Wifi network must be bridged to the Ethernet (e.g., by cabling the Nucleo to one of the Ethernet LAN sockets on the Wifi router).

Putting it all together

As of the merge of the pdh-stm32-ssdp branch, the following command runs the system-tests on an attached STM32F746-Nucleo:

cargo test -F arm,stm32f746-nucleo

And for those without a Nucleo, the following commands still work, building all the device code but testing only the host code:

cargo build -F arm,stm32f746-nucleo
cargo test -F arm

And for those without even the cross-compiler installed, which is probably most people, the following commands still work, building and testing only the host code:

cargo build
cargo test
cargo build-all-features --all-targets
cargo test-all-features --all-targets

This all makes it easy for a developer to determine, before pushing to the central git server, whether their branch is likely to be OK for main. But the most definitive answer to that question is only available when going to the trouble of using a local Nucleo development board. If you’re working on something you think probably won’t affect embedded builds, what you really want is to not faff about with development boards (particularly multiple ones): what you want is for continuous integration to perform all of these system-tests as part of its mission to answer the question of whether your branch is OK for main. Adding a CI runner that can run these tests automatically on every push is the topic of the third post in this series.

CC0 To the extent possible under law, the author of this work has waived all copyright and related or neighboring rights to this work.