^photo by John Robert Shepherd under CC-BY-2.0
Software design is a related but decidedly different skill set from chip design. That means that whether a silicon product succeeds or fails can have nothing to do with the merit or otherwise of the chip design itself, and instead come down to the merit or otherwise of the accompanying software. A sufficiently large chip buyer, aiming their own product at a sufficiently lucrative market, can often overcome poor “developer experience” by throwing internal software engineering at the problem: for a product that eventually sells in the millions, it can indeed be worth shaving cents off the bill-of-materials cost by specifying a chip that’s cheaper but harder to use than the alternative, even when taking into account the increased spending on software development.
But if you’re a chipmaker, many of your customers will not be in that position. Others may believe that they are, but still underestimate the software costs, and release the product in a bug-ridden state or not at all. Those customers ultimately sell fewer of their own products to end-users and thus buy fewer of your chips.
So the product isn’t finished just because the chip itself is sitting there, taped-out and fabbed and packaged and black and square and on reels of 1,000. For any complex or high-value chip — a microcontroller, for instance — the product is not complete until there’s also a software story, usually in the form of a Software Development Kit (SDK) and accompanying documentation. But a chipmaker staffed unremittingly and at too high a level with only chip-design experts may not even, corporately, realise when the state of the art in software has moved on. So here is a checklist — in the spirit of the famous “Joel Test” — to aid hardware specialists in assessing the maturity of a chipmaker’s software process; I’ve named it after the home-town of a chipmaker I once worked for.
The Austin Test
1. Could your SDK have been written using only your public documentation?
Sometimes when the silicon comes back from the fab, it doesn’t work quite the way the designers expected. There’s no shame in that. And quite often when that happens, there’s a software workaround for whatever the issue is, so you put that workaround into your SDK. There’s no shame in that either. But if the problem and the workaround are not documented, you’ve just laid a trap for anyone who’s not using your entire SDK in just the way you expected. (See Questions 4 and 8.)
Perhaps your customer is using Rust, or Micropython. Perhaps they have a range of products, based on a range of different chips, of which yours is just one. If there’s “magic” hidden in your C SDK to quietly work around chip issues, then those customers are going to have a bad time.
(The original Mastodon post of which this blog post is a lengthy elaboration.)
2. Is your register documentation generated from the chip source?
I’m pretty sure that even the chipmakers with the most mature and sensible software operations don’t actually do this: they don’t have an automated process that ingests Verilog or VHDL and emits Markdown or RTF or some other output that gets pasted straight into the “Register Layout” sections of their public documentation. (You can sort of tell they don’t, from the changelogs in reference manuals.) But it’s the best way of guaranteeing accuracy — certainly superior to having human beings painstakingly compare the two.
Because I’ve worked at chip companies, I do realise that one reason not to do this is that not everything is public. Because silicon design cycles are so long, and significant redesign is so arduous, what chipmakers do is speculatively design all manner of stuff onto the chip, then test it and only document the parts that actually work or that they can find uses for. Sometimes this is visible as mysterious gaps in memory maps or in peripheral-enable registers; sometimes it’s less visible. I can personally vouch that there is a whole bunch of stuff in the silicon of the DisplayLink DL-3000 chip that has never actually been used by the published firmware or software. But this is easily dealt with by equipping the automated process with a filter that just lets through the publicly-attested blocks. It’s still a win to have an automated process for the documentation just of those blocks!
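As a rough illustration of that filter, here is a minimal Python sketch. It assumes the register description has already been extracted from the chip source into plain data; the `Block` and `Register` types, the `public` flag, the block names and the addresses are all hypothetical, chosen only to match the RTC.SSR example used later in this post. Extracting the data from Verilog or VHDL is the hard part and is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Register:
    name: str
    offset: int
    description: str


@dataclass
class Block:
    name: str
    base: int
    public: bool  # hypothetical flag: is this block publicly attested?
    registers: List[Register]


def render_markdown(blocks):
    """Emit a Markdown register table for each public block only."""
    lines = []
    for block in blocks:
        if not block.public:
            continue  # the filter: unpublished blocks never reach the docs
        lines.append(f"## {block.name} (base 0x{block.base:08X})")
        lines.append("")
        lines.append("| Offset | Register | Description |")
        lines.append("|---|---|---|")
        for reg in sorted(block.registers, key=lambda r: r.offset):
            lines.append(f"| 0x{reg.offset:02X} | {reg.name} | {reg.description} |")
        lines.append("")
    return "\n".join(lines)


# Illustrative data only, not taken from any real chip.
CHIP = [
    Block("RTC", 0x58000000, True,
          [Register("SSR", 0x28, "Sub-second register")]),
    Block("EXPERIMENTAL_DSP", 0x58010000, False,
          [Register("CTRL", 0x00, "Never documented")]),
]

print(render_markdown(CHIP))
```

The point of the design is that the filter is a single, auditable line in the generator, so nothing can leak into (or silently vanish from) the public documentation by hand-editing.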
3. Are the C headers for your registers generated from the chip source?
This is again essentially the question, do you have a process that inherently guarantees correctness, or do you have human employees laboriously curate correctness? The sheer volume of C headers for a sophisticated modern microcontroller can be enormous, and if it’s not automatically-generated then you have only your example code — or worse, customers building their own products — to chase out corner cases where they’re incorrect.
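A header generator need not be sophisticated to beat hand-curation. The Python sketch below turns a register list into a C header; the function name `render_c_header`, the macro naming scheme and the example offsets are my own assumptions for illustration, not any vendor's actual convention.

```python
def render_c_header(block_name, base, registers):
    """Render a C header for one peripheral block.

    `registers` maps register name -> byte offset from `base`.
    The macro style here is one hypothetical convention among many.
    """
    guard = f"{block_name}_REGS_H"
    lines = [
        f"#ifndef {guard}",
        f"#define {guard}",
        "",
        "#include <stdint.h>",
        "",
        f"#define {block_name}_BASE 0x{base:08X}UL",
    ]
    # Emit registers in address order so diffs between revisions stay readable.
    for name, offset in sorted(registers.items(), key=lambda item: item[1]):
        lines.append(
            f"#define {block_name}_{name} "
            f"(*(volatile uint32_t *)({block_name}_BASE + 0x{offset:02X}UL))"
        )
    lines += ["", f"#endif /* {guard} */"]
    return "\n".join(lines) + "\n"


# Illustrative input; in a real flow this would come from the chip source.
print(render_c_header("RTC", 0x58000000, {"SSR": 0x28, "TR": 0x00}))
```

Because the same input data can feed both this generator and the documentation generator, the headers and the reference manual cannot drift apart.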
Really these first three questions are closely interlinked: if you find that you can’t write your SDK against only publicly-attested headers, that should be a big hint that you’ve filtered-out too much: that your customers won’t be able to write their own code against those headers either.
4. Are your register definitions available in machine-readable form (e.g. SVD)?
ARM’s SVD (“System View Description”) format was created as a machine-readable description of the register layout of Cortex-M-based microcontrollers, for the consumption of development tools, so that a debugger, for instance, could describe “a write to address 0x5800_0028” more helpfully as “a write to RTC.SSR”. But the utility of such a complete description is not limited to debuggers: in the embedded Rust ecosystem, the peripheral access crates or PACs (the crates of low-level register read and write functions that make it possible to target each specific microcontroller, a sort of crowd-sourced SDK) are themselves generated straight from SVD files. (Higher-level abstractions are then layered on top by actual software engineers.)
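To make the debugger example concrete, here is a small Python sketch that builds an address-to-name map from an SVD document using only the standard library. The embedded SVD fragment is a cut-down, made-up device containing just enough to reproduce the address above; the helper names `build_address_map` and `describe_write` are hypothetical. Real SVD files also use features such as `derivedFrom`, clusters and register arrays, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# A made-up, heavily trimmed SVD fragment (real files carry far more detail).
SVD_SNIPPET = """<device>
  <name>EXAMPLE</name>
  <peripherals>
    <peripheral>
      <name>RTC</name>
      <baseAddress>0x58000000</baseAddress>
      <registers>
        <register>
          <name>TR</name>
          <addressOffset>0x00</addressOffset>
        </register>
        <register>
          <name>SSR</name>
          <addressOffset>0x28</addressOffset>
        </register>
      </registers>
    </peripheral>
  </peripherals>
</device>
"""


def build_address_map(svd_text):
    """Map absolute register address -> 'PERIPHERAL.REGISTER'."""
    root = ET.fromstring(svd_text)
    address_map = {}
    for peripheral in root.iter("peripheral"):
        pname = peripheral.findtext("name")
        base = int(peripheral.findtext("baseAddress"), 0)
        for register in peripheral.iter("register"):
            rname = register.findtext("name")
            offset = int(register.findtext("addressOffset"), 0)
            address_map[base + offset] = f"{pname}.{rname}"
    return address_map


def describe_write(address_map, address):
    """Describe a write symbolically when the address is a known register."""
    name = address_map.get(address)
    if name is None:
        return f"a write to address 0x{address:08X}"
    return f"a write to {name}"


registers = build_address_map(SVD_SNIPPET)
print(describe_write(registers, 0x5800_0028))
```

The same parsed structure could just as easily drive a header generator, a documentation generator, or a linter, which is what makes the file useful beyond debuggers.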
Even for C/C++ codebases that are more directly compatible with the vendor’s SDK, it might sometimes be preferable to generate at least the low-level register-accessing code in-house rather than use the vendor SDK: for instance, to generate test mocks of the register-access API or, for code targeting a range of similar microcontrollers, to separate common peripherals (e.g. the STM32 GPIO block, which is identical across a large number of different STM32 variants) from per-target peripherals where each chip variant needs its own code (e.g. the STM32 “RCC” clock-control block).
At Electric Imp we did both of those things: our range of internet-of-things products spanned several generations of (closely-related) microcontroller, with all of these products remaining in support and building in parallel from the same codebase for many years, and so we needed a better answer to Question 8 than our silicon vendor provided at the time. (In Rust terms, we needed something that looked like stm32-metapac, not stm32-rs.) And using the SVD files to generate test mocks of the register API let us achieve good unit-test coverage even of the lowest-level device-driver code (a topic I hope to return to in a future blog post).
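The idea behind a register mock can be shown in a few lines. This is a Python sketch of the concept only, not the C/C++ code we actually used: `MockRegisters` and `gpio_set_pin` are hypothetical names, and in a real flow the list of register names would be generated from the SVD file rather than typed in by hand.

```python
class MockRegisters:
    """Stand-in for memory-mapped registers: stores values, logs accesses."""

    def __init__(self, names):
        self._values = {name: 0 for name in names}
        self.log = []  # ordered record of every access, for test assertions

    def read(self, name):
        value = self._values[name]
        self.log.append(("read", name, value))
        return value

    def write(self, name, value):
        value &= 0xFFFFFFFF  # registers are assumed to be 32 bits wide
        self._values[name] = value
        self.log.append(("write", name, value))


def gpio_set_pin(regs, pin):
    """Hypothetical driver routine: set one output bit by read-modify-write."""
    regs.write("GPIOA.ODR", regs.read("GPIOA.ODR") | (1 << pin))


# A unit test can now check both the final state and the access sequence.
regs = MockRegisters(["GPIOA.ODR"])
regs.write("GPIOA.ODR", 0x1)
gpio_set_pin(regs, 3)
print(regs.log)
```

Because the driver takes the register interface as a parameter, the same driver code runs against real hardware or against the mock, and the access log lets a test assert that the driver did not clobber bits it was not meant to touch.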
Basically, having a good SVD file available (perhaps itself generated from the chip source) gives your customers an “escape hatch” if they find their needs aren’t met by your published SDK. Although SVD was invented by ARM to assist takeup of their very successful Cortex-M line, it is so obviously useful that SVD files are becoming the standard way of programmatically defining the peripheral registers of non-ARM-based microcontrollers too.
5. Is your pinmux definition available in machine-readable form?
Most microcontrollers have far more peripherals on-board than they have pins available to dedicate to them, so several different peripherals or signals are multiplexed onto each physical pin; that way, customers interested in, say, UART-heavy designs and those interested in SPI-heavy designs, can all use the same microcontroller and just set the multiplex, the pinmux, appropriately to connect the desired peripherals with the available pins. Often, especially in low-pin-count packages, up to sixteen different functions can be selected on each pin.
This muxing information, like the register definitions themselves, is helpful metadata about the chip and, thus, about software targeting it. A machine-readable version of this information can be used to make driver code more readable, and more amenable to linting or other automated checks for correctness.
Pinmux information in datasheets is typically organised into a big table where each row is a pin and the columns are the signals available; at the very least, having this mapping available as a CSV or similar would make it easy to invert it in order to allow the reverse lookup: for this signal, or this peripheral, what pins is it available on? Laboriously manually creating that inverse map was always one of the first tasks to be done whenever a new microcontroller crossed my path at Electric Imp.
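With the table in CSV form, the inversion is a few lines of Python. The column layout below (a `pin` column followed by one column per alternate function) and the pin and signal names are assumptions made up for this sketch; a real vendor file would need its own column handling.

```python
import csv
import io
from collections import defaultdict

# Hypothetical datasheet-style table: one row per pin, one column per function.
PINMUX_CSV = """pin,AF0,AF1,AF2
PA0,UART1_TX,SPI1_SCK,
PA1,UART1_RX,SPI1_MISO,TIM2_CH1
PB3,SPI1_SCK,UART2_TX,
"""


def invert_pinmux(csv_text):
    """Return signal -> list of (pin, function-select) pairs."""
    signal_to_pins = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        pin = row.pop("pin")
        for function, signal in row.items():
            if signal:  # skip empty cells: no signal on that selection
                signal_to_pins[signal].append((pin, function))
    return dict(signal_to_pins)


by_signal = invert_pinmux(PINMUX_CSV)
print(by_signal["SPI1_SCK"])
```

The same parsed table can also back a lint check, for example rejecting board configurations that route two signals to one pin.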
6. Are all the output pins tristated when the chip is held in reset, and on cold boot?
Honestly this question is a specific diss-track of one particular microcontroller which failed to do this. I mean, lots of perfectly sensible microcontrollers tristate everything on cold boot except the JTAG or SWD pins, and that exception is completely reasonable. But this part drove a square wave out of some of its output pins while held in reset. It’s hard to fathom how that could have come about, without some pretty fundamental communication issues inside that chipmaker about what a reset signal even is and what it’s for.
(The microcontroller in question was part of a product line later sold-on to Cypress Semi; it might have been more fitting to sell it to Cypress Hill, as the whole thing was insane in the brain.)
7. Is it straightforward to get notifications when your public documentation changes (e.g. new datasheet revision released)?
I’ve said a few somewhat negative things about chipmakers so far, so here comes a solidly positive one: Renesas do this really well. Every time you download a datasheet or chip manual from their website, you get a popup offering to email you whenever a newer version is released of the document that you’ve just downloaded. Particularly at the start of a chip’s lifecycle, when these documents may include newly-discovered chip errata that require customer code changes, this service can be a huge time-saver. Chipmakers who don’t already do this should seriously consider it: it’s not rocket-science to implement, especially as compared to, say, the designing of a microcontroller.
8. Is it straightforward to use your SDK as a software component in a larger system?
The “developer experience” of a microcontroller SDK typically looks like: “Thank you for choosing the Arfle Barfle 786DX/4, please select how you’d like it configured, clickety-click, ta-da! here’s the source for a hello-world binary, now just fill in these blanks with your product’s actual functionality.” And so it should, of course, because that’s where every customer starts out. But it’s not where every customer ends up, especially in the case of a successful product (which, really, is the case you ought to be optimising for): there’s a bigger, more complex version 2 of the customer’s hardware, or there’s a newer generation of your microcontroller that’s faster or has more SRAM — or perhaps you’ve “won the socket” to replace your competitor’s microcontroller in a certain product, but the customer has a big wodge of their own existing hard-fought code that they’d like to bring across to the new version of that product.
In all of these cases, your SDK suddenly stopped being a framework with little bits of customer functionality slotted-in in well-documented places, and started being just one, somewhat replaceable, component in a larger framework organised by your customer. You don’t get to pick the source-tree layout. You probably don’t get to write main() (or the reset handler); if there’s things that need doing there for your specific chip to work, then see Question 1.
Being a software component in a larger system doesn’t preclude also having a friendly out-of-box developer experience; it just means that, when customers peer beneath the hood of the software framework that you’ve built them, what they see is that the core of the framework is a small collection of well-designed fundamental components (built with all the usual software-engineering values such as separation-of-concerns, composability, unit-testing, and documentation) which can be used on their own without the customer needing to understand your entire SDK.
When I worked at Sigmatel on their SDKs for microcontroller-DSPs for the portable media player market, it was clear that successful customers came in two types: there were the opportunist, low-engineering folks who took the SDK’s example media player firmware, stuck their own branding on it if that, and shipped the whole thing straight out again to their own end-users; and the established, high-engineering folks who already had generations of media-player firmware in-house, and just wanted the device drivers so that they could launch a new product with the familiar UI of their existing products. And there was nobody in-between these extremes, so it was not a good use of time to try to serve that middle-ground well.
In a sense the importance of a good answer to this question was emphasised by Finnish architect — that’s buildings architect, not software architect — Eliel Saarinen, who famously said “Always design a thing by considering it in its next larger context — a chair in a room, a room in a house, a house in an environment, an environment in a city plan.” (Quoted posthumously by his son, also an architect.)
I wish I’d seen that quote when I was just starting in software engineering. One of the most useful and widely-applicable tenets that I’ve been learning the hard way since, is this: Always keep in mind whether you’re designing a system, or whether you’re designing a component in a larger system. Hint: it’s never the former.