The Embedded Maintenance Trap

Most embedded bugs aren’t hardware problems. They’re design problems, born from code that grew organically across board revisions, tightly knotted around pin numbers, register addresses, and brittle HAL calls scattered everywhere. As the product scales, things get messy and the technical debt compounds. This article is about keeping embedded software loosely coupled, with room for extension and scalability.
Embedded products are supposed to live long lives with little or no maintenance (often 10 to 20 years). But over time, hardware revisions happen. MCUs get discontinued. Pin assignments shift. New team members arrive. And through all of it, the codebase has to hold together.
The typical trajectory looks like this:
- A prototype gets written, with direct register writes and HAL calls sprinkled throughout the application logic.
- All tests pass on the simulator and on the prototype board.
- It works. It ships. Three years later, the board changes, the market demands an update, or a new MCU appears with much lower power and cost at higher performance. Customers will love the upgrade.
- You look back at the legacy code and find that motor logic, display timing, and sensor reads are all entangled with pin numbers and peripheral names that no longer exist.
Some symptoms of a codebase in this state:
- Application logic directly calls GPIO, I2C, SPI with hardcoded pin numbers
- Global driver singletons accessed from anywhere in the codebase
- #ifdef chains multiplying across files to handle hardware variants
- Unit testing is impossible without the actual target hardware
- Bugs only surface late - on hardware, in the field
Most embedded problems are not hardware problems, they’re design problems. The hardware is just where they become visible.
The Anti-Patterns in Detail
Direct Hardware Access Buried in Application Logic
The most common offender: application code that “knows” things it shouldn’t (direct coupling), like:
- GPIO pin numbers
- peripheral base addresses
- HAL function names
all of it leaking up into business logic.
// Application logic shouldn't know about GPIOA pin 5
void motor_start() {
    GPIO_SetPin(GPIOA, 5);    // What is pin 5? Why GPIOA?
    TIM2->CR1 |= TIM_CR1_CEN; // And what's TIM2 doing here?
}
Change the board, change the timer, or port to a different MCU family, and you’re rewriting this logic from scratch. Worse, you can’t test motor_start() without the hardware in front of you.
Hidden Dependencies and Magic Globals
Singleton HALs and global driver instances are a related problem. Functions that silently access hardware through global state are impossible to reason about in isolation: you can never be sure what a function does without tracing its entire dependency chain through globals, which works against good software testing principles.
Dependency Inversion - What It Actually Means
The Dependency Inversion Principle (DIP) is the “D” in SOLID, but you don’t need to care about the acronym. The idea is simple and practical:
High-level logic should not depend on low-level hardware details. Both should depend on an abstraction. In other words, your application logic should not know which pin an LED is connected to.
In embedded terms, the high-level parts are:
- your state machine
- your control loop
- your protocol handler

The low-level details are:
- GPIO writes
- I2C transactions
- timer configurations
DIP says they shouldn’t be directly coupled.
Abstractions in C and C++ can take different forms. In C, a struct of function pointers acts as an interface. In C++, abstract base classes serve the same purpose. The implementation or the actual hardware access code lives behind that interface, invisible to the application.
Violates DIP
void led_on() {
    HAL_GPIO_WritePin(LED_PORT, LED_PIN, GPIO_PIN_SET);
}
DIP Applied
typedef struct {
    void (*on)(void);
    void (*off)(void);
} led_t;
// App only sees led_t
The application code depends on led_t. It has no idea whether the LED is on GPIOA, GPIOB, active-high, or active-low. That’s someone else’s problem - and that’s the point.
Dependency Injection - The Missing Piece
Dependency Inversion tells you what your code should depend on (abstractions, not concretions). Dependency Injection (DI) tells you how those abstractions actually reach your code.
Don’t let your module create or find its own dependencies. Have them passed in from the outside. Someone else gives your code what it needs.
If you’ve ever passed a driver struct into an initialization function, or written a BSP that hands hardware handles to application modules - you’ve already done dependency injection. You just didn’t call it that.
// Module receives its dependency - doesn't create it
void app_init(led_t *status_led) {
    status_led->on();
}

// main.c or BSP wires it together
int main(void) {
    led_t hw_led = { .on = hw_led_on, .off = hw_led_off };
    app_init(&hw_led);
}
The application module has no idea it’s talking to a real LED. It just calls the function pointer. Who sets that up? main(), the BSP, or a system init layer - wherever is appropriate. That’s the injection point.
The Full Pattern - DIP + DI Together
DIP defines the abstraction; DI wires the implementation to it. Used together, they fully decouple your hardware from your application logic. Here’s the complete pattern for an LED:
// led.h - the abstraction
// This is the interface. Application code only includes this.
typedef struct {
    void (*on)(void);
    void (*off)(void);
    void (*toggle)(void);
} led_t;

// hw_led.c - the hardware implementation
#include "led.h"

static void hw_led_on(void)     { HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET); }
static void hw_led_off(void)    { HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_RESET); }
static void hw_led_toggle(void) { HAL_GPIO_TogglePin(LED_GPIO_Port, LED_Pin); }

led_t hw_led = {
    .on = hw_led_on,
    .off = hw_led_off,
    .toggle = hw_led_toggle
};

// heartbeat.c - pure application logic
#include "led.h"

void heartbeat_task(led_t *led) {
    led->toggle();
    // No hardware knowledge. No pin numbers. No HAL includes.
}
heartbeat_task is now completely portable. It’ll compile and run on any platform - STM32, RP2040, ESP32, or a Linux host - as long as you provide a valid led_t.
Testing Without Hardware
This is where the design pays for itself immediately. Once your application logic depends only on abstractions, you can provide a fake implementation that runs on your development machine.
// fake_led.c - for host-side testing
#include "led.h"
#include <stdio.h>

static int led_state = 0;

static void fake_led_on(void)  { led_state = 1; printf("[LED] ON\n"); }
static void fake_led_off(void) { led_state = 0; printf("[LED] OFF\n"); }
static void fake_led_toggle(void) {
    led_state ^= 1;
    printf("[LED] %s\n", led_state ? "ON" : "OFF");
}

led_t fake_led = {
    .on = fake_led_on,
    .off = fake_led_off,
    .toggle = fake_led_toggle
};
Now run your application logic - state machines, control algorithms, protocol handlers - on your laptop. No MCU. No debugger. No JTAG. Fast feedback loops. CI-friendly tests. Bugs caught before the board even arrives.
What This Unlocks
- Run unit tests with a standard C compiler on the host
- Integrate into GitHub Actions, GitLab CI, or any pipeline
- Test corner cases without triggering real hardware behavior
- Multiple developers can work without needing physical boards
- Faster iteration - compile and test in seconds, not minutes
Performance Concerns - Addressed Honestly
The first pushback from embedded engineers is always about performance:
- “Function pointers have overhead.”
- “Virtual functions are slow.”
These are legitimate concerns, so let’s look at them honestly.
A function pointer call is a single indirect branch - on most Cortex-M cores, that’s 1-3 clock cycles. For a heartbeat LED or a sensor poll running at 100 Hz, this is unmeasurable noise. For a context where every cycle matters (bit-banged SPI at 10 MHz, tight DSP loops, ISRs), you wouldn’t be using abstractions at all - you’d be writing the implementation directly.
For everything else, there are zero-cost options. Static structs placed in flash with const are read directly. Link-time binding lets the linker resolve the implementation at build time. Compile-time DI using C++ templates or preprocessor configuration can achieve the same separation with no runtime cost.
Measure before you optimize. In my personal experience with embedded work, the performance bottleneck is almost never the abstraction layer - it’s the algorithm, the protocol, or the peripheral.
DI and DIP in C++
In C++, the same pattern uses abstract base classes.
// C++ - constructor injection

// Abstraction
class ILed {
public:
    virtual void on() = 0;
    virtual void off() = 0;
    virtual void toggle() = 0;
    virtual ~ILed() = default;
};

// Hardware implementation
class GpioLed : public ILed {
public:
    void on() override { HAL_GPIO_WritePin(LED_PORT, LED_PIN, GPIO_PIN_SET); }
    void off() override { HAL_GPIO_WritePin(LED_PORT, LED_PIN, GPIO_PIN_RESET); }
    void toggle() override { HAL_GPIO_TogglePin(LED_PORT, LED_PIN); }
};

// Application receives dependency via constructor
class HeartbeatTask {
    ILed& led_;
public:
    explicit HeartbeatTask(ILed& led) : led_(led) {}
    void run() { led_.toggle(); }
};

// main.cpp
GpioLed hw_led;
HeartbeatTask task(hw_led); // injection - no heap needed
This pattern is safe for embedded:
- no heap allocation
- no RTTI
- no exceptions
- vtable overhead of typically a single pointer indirection
Embedded Linux and Yocto Context
On Embedded Linux - whether built with Yocto, Buildroot, or a custom distro - the same principles apply at the userspace level. Drivers are accessed through device files, sysfs, or character devices. Applications that hardcode these paths and access patterns become fragile across BSP changes.
DI in userspace means your application receives a hardware interface (a file descriptor wrapper, a driver abstraction struct, a class hierarchy for a sensor) rather than constructing it directly. The BSP layer or board configuration code does the wiring. Swapping hardware variants becomes a matter of changing what gets injected at startup - not rewriting the application.
This also makes your application layers testable in a Linux host environment without requiring the target board, using mock file descriptors, pipe-based fakes, or simple simulation stubs.
When Not To Use This
Design principles are tools, not religion. There are legitimate cases where DI and DIP add unnecessary complexity:
- Bootloaders and startup code with extreme size constraints
- ISRs where every instruction is accounted for and no indirection is acceptable
- Quick throwaway prototypes that will be completely rewritten
- Extremely resource-constrained MCUs (sub-8KB flash) where even a vtable pointer is a budget item
If your code cannot tolerate a function pointer call, that limitation should be measurable and visible in the timing budget or ISR deadline. Otherwise, the maintenance and testability costs of tight coupling almost always outweigh the runtime overhead of abstraction.
Migrating an Existing Codebase
You don’t need a big-bang rewrite of your legacy code. The path forward is incremental, and it starts at module boundaries:
- Identify a natural seam - a driver, a peripheral abstraction, a communication module
- Wrap the existing HAL - create a thin struct or class that exposes only what the application needs
- Replace direct calls - update the module to receive the abstraction instead of calling HAL directly
- Add a fake - write a simple host-side implementation and run a first test
- Repeat at the next boundary - work outward, module by module
Each step is independently valuable. Even wrapping just one driver creates a seam you can test through. You don’t need to refactor the entire codebase to see the benefit.
“High‑quality embedded software is not defined by low‑level register manipulation; it is defined by a clean, well‑structured software architecture.”
Summary
- DIP separates what your application does from how the hardware implements it
- DI delivers those implementations cleanly, from the outside, without hidden globals
- Together they make your code portable across board revisions and testable without hardware
- Works in bare-metal C, C++, and Embedded Linux - no frameworks required
- The overhead is negligible in almost every real-world embedded context
- Migration is incremental - start at one module boundary and work outward
The goal is software that survives hardware changes, supports a growing team, can be tested in CI, and doesn’t collapse under its own weight after three board revisions. That’s worth a struct of function pointers.