Pytest for Embedded Developers: Reliable, Maintainable Test Design

Embedded software testing used to be dominated by C and C++ with a sprinkling of Bash scripts. In recent years, however, firmware and embedded-software testing has been changing rapidly, embracing Python's flexibility and ease of use. Combined with CI/CD integration, testing has never been faster or more reliable.

Testing Firmware with PyTest: A Beginner’s Guide

1. Introduction: Why Firmware Testing Is Changing

Firmware development has relied on manual testing, hardware-in-the-loop setups, and unit tests written in C. These approaches work, but they have limitations:

graph TD
    A[Traditional Firmware Testing] --> B[Manual Testing]
    A --> C[Hardware-in-the-Loop]
    A --> D[C Unit Tests]
    
    B --> E[❌ Slow Feedback]
    B --> F[❌ Human Error]
    
    C --> G[❌ Expensive Setup]
    C --> H[❌ Limited Availability]
    
    D --> I[❌ Verbose Code]
    D --> J[❌ Hard to Maintain]
    
    style E fill:#ffcccc
    style F fill:#ffcccc
    style G fill:#ffcccc
    style H fill:#ffcccc
    style I fill:#ffcccc
    style J fill:#ffcccc

Key challenges:

  • Slow feedback loops: Flashing firmware to hardware and manually verifying behavior takes time
  • Limited test coverage: Writing comprehensive tests in C is verbose and time-consuming
  • Difficult debugging: Tracking down issues on embedded devices can be painful
  • Poor CI/CD integration: Hardware-dependent tests don’t run well in continuous integration pipelines

💡 Key Insight

Modern firmware projects need fast, maintainable, automated testing. Python and pytest deliver exactly that.

2. Why Python for Firmware Testing

Python works well for firmware testing because:

mindmap
  root((Python for<br/>Firmware Testing))
    Rapid Development
      Concise Syntax
      Quick Iterations
      Less Boilerplate
    Rich Ecosystem
      pyserial
      pytest
      Mock Libraries
      Protocol Tools
    Excellent Tooling
      Fixtures
      Parametrization
      Detailed Reports
      IDE Support
    Easy Abstraction
      Hardware Mocking
      Interface Layers
      Clean APIs

  • Rapid test development: Python’s concise syntax lets you write tests quickly
  • Rich ecosystem: Libraries for serial communication (pyserial), protocol handling, mocking, and more
  • Excellent tooling: pytest provides powerful features like fixtures, parametrization, and detailed reporting
  • Easy hardware abstraction: Python can interface with hardware while keeping tests maintainable

🎯 Key Point

You don’t need to test firmware only in C. Python handles test orchestration, drives interfaces, and verifies behavior while your firmware runs on target or in simulation.

⚠️ Common Misconception

“Python is too slow for real-time testing!” — While Python isn’t used for real-time control, it’s perfect for test orchestration, protocol handling, and verification. The firmware still runs in C at full speed.

3. PyTest in the Testing Pyramid

Firmware testing follows the testing pyramid:

(Diagram: the testing pyramid, with many fast unit tests at the base, fewer integration tests in the middle, and a small number of system tests at the top.)

Where pytest fits:

  • Unit level: Test compiled C functions directly using Python bindings (via ctypes or CFFI)
  • Integration level: Test communication between modules, protocol handling, state machines
  • System level: Drive hardware interfaces and verify end-to-end behavior

📊 Test Distribution

Aim for:

  • 70% Unit tests (fast, no dependencies)
  • 20% Integration tests (some mocking)
  • 10% System/E2E tests (hardware or full simulation)

This approach tests at the right level of abstraction.

4. Testing C Firmware Artifacts: ELF Files, Shared Libraries, and Functions

To test C firmware with Python, you need to make your code callable. Here are the main approaches:

graph LR
    A[C Firmware Code] --> B{Compilation Strategy}
    
    B --> C[Shared Library<br/>.so / .dll]
    B --> D[Static Library<br/>.a / .lib]
    B --> E[Standalone Binary<br/>ELF]
    
    C --> F[Python ctypes]
    C --> G[Python CFFI]
    
    D --> H[Link into Test Binary]
    
    E --> I[Serial/Network<br/>Communication]
    E --> J[Hardware Interface]
    
    F --> K[✅ Direct Function Calls]
    G --> K
    H --> K
    I --> L[✅ Black Box Testing]
    J --> L
    
    style K fill:#90EE90
    style L fill:#90EE90

Approach 1: Compile to Shared Libraries

Compile your firmware modules as shared libraries (.so on Linux, .dll on Windows):

# Linux/macOS
gcc -shared -fPIC -o libfirmware.so firmware.c

# Windows (MinGW)
gcc -shared -o libfirmware.dll firmware.c

# Windows (MSVC)
cl /LD firmware.c /Fe:libfirmware.dll

Then load and call functions from Python using ctypes:

import ctypes
import platform

# Load the shared library (cross-platform)
if platform.system() == 'Windows':
    lib = ctypes.CDLL('./libfirmware.dll')
else:
    lib = ctypes.CDLL('./libfirmware.so')

# Define function signature
lib.calculate_checksum.argtypes = [ctypes.POINTER(ctypes.c_uint8), ctypes.c_size_t]
lib.calculate_checksum.restype = ctypes.c_uint16

# Call the function
data = (ctypes.c_uint8 * 10)(*range(10))
checksum = lib.calculate_checksum(data, 10)

⚠️ Critical: Type Matching

Always define argtypes and restype for C functions. Mismatched types can cause crashes or silent data corruption that’s hard to debug!

🖥️ Platform-Specific Type Sizes

Be careful with types that have platform-dependent sizes:

  • int, long - Size varies (32-bit vs 64-bit)
  • size_t - Use ctypes.c_size_t (matches platform)
  • float vs double - Use ctypes.c_float and ctypes.c_double

Always use explicit fixed-width types like uint8_t, uint16_t, uint32_t in your C code for portability!

Common ctypes type mapping issues and solutions:

# ❌ Problem: Using wrong float type
lib.get_temperature.restype = ctypes.c_double  # C function returns float!
temp = lib.get_temperature()  # Wrong value!

# ✅ Solution: Match the C type exactly
lib.get_temperature.restype = ctypes.c_float  # Now correct
temp = lib.get_temperature()

# ❌ Problem: Platform-dependent long
lib.process_data.argtypes = [ctypes.c_long]  # Size varies!

# ✅ Solution: Use fixed-width types
lib.process_data.argtypes = [ctypes.c_int32]  # Always 32-bit

# ❌ Problem: Incorrect size_t usage
lib.buffer_size.restype = ctypes.c_int  # Wrong on 64-bit!

# ✅ Solution: Use c_size_t
lib.buffer_size.restype = ctypes.c_size_t  # Platform-appropriate

# ❌ Problem: Pointer type mismatch
lib.read_buffer.argtypes = [ctypes.c_void_p]  # Too generic!

# ✅ Solution: Use specific pointer types
lib.read_buffer.argtypes = [ctypes.POINTER(ctypes.c_uint8)]

🔍 Debug Type Issues

If you get strange values or crashes, check:

  1. Print ctypes.sizeof(ctypes.c_long) vs your C sizeof(long) (see the snippet below)
  2. Use a debugger to inspect actual C return values
  3. Enable Python warnings: python -Wd your_test.py
  4. Consider using CFFI instead for better type safety
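
For the first check, a minimal sketch:

import ctypes

# Compare these against sizeof(long) and sizeof(size_t) printed from your C code
print(ctypes.sizeof(ctypes.c_long))    # 8 on 64-bit Linux, 4 on 64-bit Windows
print(ctypes.sizeof(ctypes.c_size_t))  # matches the platform's size_t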

Approach 2: Use CFFI for Complex Interfaces

For more complex scenarios, CFFI provides a better interface:

from cffi import FFI

ffi = FFI()
ffi.cdef("""
    typedef struct {
        uint8_t status;
        uint16_t value;
    } sensor_data_t;
    
    sensor_data_t read_sensor(void);
""")

lib = ffi.dlopen('./libfirmware.so')
result = lib.read_sensor()

💡 Pro Tip

CFFI is better for complex structs, callbacks, and maintaining type safety. Use ctypes for simple function calls, CFFI for everything else.
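
Since the tip above mentions callbacks, here is a minimal CFFI callback sketch (register_event_callback is a hypothetical firmware function, not part of the examples above):

from cffi import FFI

ffi = FFI()
ffi.cdef("""
    typedef void (*event_cb_t)(uint8_t event_id);
    void register_event_callback(event_cb_t cb);
""")
lib = ffi.dlopen('./libfirmware.so')

received = []

@ffi.callback("void(uint8_t)")
def on_event(event_id):
    received.append(event_id)

lib.register_event_callback(on_event)  # firmware can now call back into Python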

Approach 3: Test via Serial/Network Interfaces

For firmware running on actual hardware or emulators:

import serial

def test_firmware_response():
    # The context manager closes the port even if the assertion fails
    with serial.Serial('/dev/ttyUSB0', 115200, timeout=1) as port:
        port.write(b'GET_STATUS\n')
        response = port.readline()
        assert response == b'OK\n'

5. Defining a Clean Test Architecture

A well-organized test architecture separates concerns and makes tests maintainable:

Test Directory Structure:

| Directory | Purpose | Examples |
|---|---|---|
| unit/ | Pure C function tests | test_protocol.py, test_utilities.py |
| integration/ | Module interaction tests | test_state_machine.py, test_communication.py |
| system/ | End-to-end tests | test_firmware_behavior.py |
| fixtures/ | Shared test fixtures | hardware.py, firmware_loader.py |
| mocks/ | Mock implementations | mock_hardware.py |
| conftest.py | pytest configuration | Shared fixtures and settings |

tests/
├── unit/                  # Pure C function tests
│   ├── test_protocol.py
│   └── test_utilities.py
├── integration/           # Module interaction tests
│   ├── test_state_machine.py
│   └── test_communication.py
├── system/               # End-to-end tests
│   └── test_firmware_behavior.py
├── fixtures/             # Shared test fixtures
│   ├── hardware.py
│   └── firmware_loader.py
├── mocks/               # Mock implementations
│   └── mock_hardware.py
└── conftest.py          # Shared pytest configuration

📁 Organization

Organize by test level (unit/integration/system), not by feature. This makes it easier to run specific test categories and understand scope.

Key principles:

  • Keep unit tests fast and isolated
  • Use fixtures to manage setup/teardown
  • Abstract hardware access into reusable components
  • Organize by test level, not by feature

⚠️ Avoid This

Don’t organize tests by C source file names (test_module1.py, test_module2.py). This creates tight coupling between test and implementation structure.
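
A minimal conftest.py for this layout might look like the following sketch (the library path and the firmware_reset function are assumptions, not part of the examples above):

# tests/conftest.py
import ctypes
import pytest

@pytest.fixture(scope='session')
def firmware_lib():
    """Load the compiled firmware library once for the whole run."""
    return ctypes.CDLL('./build/libfirmware.so')  # path is an assumption

@pytest.fixture
def firmware(firmware_lib):
    """Per-test wrapper; reset state so tests stay independent."""
    firmware_lib.firmware_reset()  # hypothetical C reset hook
    return firmware_lib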

6. Writing Behavior-Driven PyTest Test Cases

Good tests describe what the firmware should do, not how it does it. Use descriptive test names and arrange tests clearly:

graph TD
    A[Test Case] --> B{Well Structured?}
    
    B -->|Yes| C[Arrange<br/>Set up test conditions]
    C --> D[Act<br/>Execute the behavior]
    D --> E[Assert<br/>Verify the outcome]
    E --> F[✅ Clear, Maintainable Test]
    
    B -->|No| G[❌ Common Problems]
    G --> H[Testing implementation<br/>details]
    G --> I[Multiple behaviors<br/>in one test]
    G --> J[Unclear test names]
    G --> K[Missing setup/cleanup]
    
    style F fill:#90EE90
    style H fill:#FF6B6B
    style I fill:#FF6B6B
    style J fill:#FF6B6B
    style K fill:#FF6B6B

def test_led_blinks_on_button_press():
    """When button is pressed, LED should blink 3 times"""
    # Arrange
    firmware.reset()
    firmware.set_led_state('off')
    
    # Act
    firmware.button_press()
    
    # Assert
    blinks = firmware.get_led_blink_count(timeout=1.0)
    assert blinks == 3

def test_invalid_command_returns_error_code():
    """Firmware should return ERR_INVALID for unknown commands"""
    # Act
    response = firmware.send_command('INVALID_CMD')
    
    # Assert
    assert response.status == ErrorCode.ERR_INVALID
    assert 'unknown command' in response.message.lower()

✅ The AAA Pattern

Arrange: Set up test preconditions Act: Execute the behavior being tested Assert: Verify the expected outcome

This pattern keeps tests readable.

Best practices:

  • Use the Arrange-Act-Assert pattern
  • One logical assertion per test (but multiple assert statements are fine)
  • Test names should read like specifications
  • Include docstrings explaining the expected behavior

💡 Test Naming Convention

Use the format: test_<what>_<condition>_<expected_result>

Examples:

  • test_sensor_reading_when_initialized_returns_valid_value
  • test_uart_transmission_with_invalid_data_raises_error
  • test_state_machine_in_idle_state_accepts_start_command
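
Parametrization complements these naming conventions: one test covers many cases, and each case shows up individually in the report. A sketch (the crc_calculate API and the expected values are illustrative):

import pytest

@pytest.mark.parametrize("data,expected_crc", [
    (b"",             0x0000),  # illustrative expected values
    (b"\x01\x02\x03", 0x6131),
])
def test_crc_calculate_matches_reference_values(firmware, data, expected_crc):
    assert firmware.crc_calculate(data) == expected_crc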

7. Using PyTest Fixtures for Firmware Lifecycle Management

Fixtures handle test setup and teardown automatically. They’re crucial for firmware testing:

sequenceDiagram
    participant PT as PyTest
    participant FX as Fixture
    participant TEST as Test Function
    participant FW as Firmware
    
    PT->>FX: Setup fixture
    FX->>FW: Initialize device
    FX->>FW: Connect
    FX->>FW: Reset
    FX-->>TEST: Yield device
    
    Note over TEST: Test executes here
    
    TEST->>FW: Run test operations
    FW-->>TEST: Return results
    
    TEST-->>FX: Test completes
    FX->>FW: Cleanup
    FX->>FW: Disconnect
    FX-->>PT: Fixture teardown

import pytest

@pytest.fixture
def firmware_device():
    """Initialize firmware device for testing"""
    device = FirmwareDevice(port='/dev/ttyUSB0')
    device.connect()
    device.reset()
    
    yield device  # Test runs here
    
    # Cleanup
    device.disconnect()

@pytest.fixture
def loaded_firmware(firmware_device):
    """Device with firmware loaded and initialized"""
    firmware_device.load_firmware('build/firmware.hex')
    firmware_device.wait_for_boot(timeout=5.0)
    return firmware_device

def test_sensor_reading(loaded_firmware):
    """Test uses the loaded_firmware fixture"""
    reading = loaded_firmware.read_sensor()
    assert 0 <= reading <= 1023

🎯 Fixture Scopes

  • function (default): Run before/after each test
  • class: Shared across test class
  • module: Shared across file
  • session: Shared across entire test run

Use wider scopes for expensive setup (like hardware initialization) to speed up tests!

Advanced fixture patterns:

@pytest.fixture(scope='module')
def shared_hardware():
    """Expensive setup shared across all tests in module"""
    hw = HardwareSimulator()
    hw.initialize()
    yield hw
    hw.shutdown()

@pytest.fixture(params=['uart', 'spi', 'i2c'])
def communication_interface(request):
    """Parametrized fixture to test multiple interfaces"""
    interface = Interface(protocol=request.param)
    yield interface
    interface.close()

⚠️ Fixture Anti-Pattern

Don’t put test logic in fixtures! Fixtures should only handle setup and teardown. Keep assertions and test logic in test functions.

8. Abstracting Hardware and Communication Protocols

Never let tests depend directly on hardware details. Use abstraction layers:

graph TD
    subgraph "❌ Wrong: Direct Hardware Coupling"
        A[Test Code] -->|Direct calls| B[Serial Port]
        A -->|Direct calls| C[SPI Device]
        A -->|Direct calls| D[I2C Device]
    end
    
    subgraph "✅ Right: Hardware Abstraction Layer"
        E[Test Code] -->|Uses interface| F[Hardware Abstraction Layer]
        F -->|Implements| G[Serial Interface]
        F -->|Implements| H[SPI Interface]
        F -->|Implements| I[I2C Interface]
        F -->|Implements| J[Mock Interface]
        
        G --> K[Real Hardware]
        H --> K
        I --> K
        J --> L[No Hardware Needed!]
    end
    
    style A fill:#FF6B6B
    style B fill:#FF6B6B
    style C fill:#FF6B6B
    style D fill:#FF6B6B
    
    style E fill:#90EE90
    style F fill:#90EE90
    style J fill:#FFD700
    style L fill:#FFD700

# Bad: Test directly depends on serial port details
def test_firmware_bad():
    port = serial.Serial('/dev/ttyUSB0', 115200)
    port.write(b'\x02GET_STATUS\x03')
    response = port.read(100)
    assert response[0] == 0x06

# Good: Test uses abstraction layer
def test_firmware_good(firmware):
    status = firmware.get_status()
    assert status.is_ok()

🏗️ Dependency Inversion

High-level test code shouldn’t depend on low-level hardware details. Both should depend on abstractions (interfaces).

Creating abstraction layers:

class FirmwareInterface:
    """Abstract interface for firmware communication"""
    
    def send_command(self, cmd: str) -> Response:
        raise NotImplementedError
    
    def read_sensor(self, sensor_id: int) -> int:
        raise NotImplementedError

class SerialFirmwareInterface(FirmwareInterface):
    """Concrete implementation using serial port"""
    
    def __init__(self, port: str, baudrate: int = 115200):
        self.serial = serial.Serial(port, baudrate)
    
    def send_command(self, cmd: str) -> Response:
        self.serial.write(f"{cmd}\n".encode())
        raw_response = self.serial.readline()
        return Response.parse(raw_response)

class MockFirmwareInterface(FirmwareInterface):
    """Mock implementation for fast testing"""
    
    def __init__(self):
        self.state = {}
    
    def send_command(self, cmd: str) -> Response:
        # Simulate firmware behavior
        return Response(status='OK', data=self.state.get(cmd))
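
The Response type used in these examples is left open; a minimal sketch could be:

from dataclasses import dataclass
from typing import Any

@dataclass
class Response:
    status: str
    data: Any = None
    message: str = ''

    def is_ok(self) -> bool:
        return self.status == 'OK'

    @classmethod
    def parse(cls, raw: bytes) -> 'Response':
        # Assumes a simple "STATUS message" line protocol (an assumption)
        status, _, message = raw.decode().strip().partition(' ')
        return cls(status=status, message=message)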

💡 Benefits of Abstraction

  • Tests run without hardware (faster CI/CD)
  • Easy to test error conditions
  • Same tests work with different hardware
  • Simpler test code

Same test code, different environments:

| Environment | Interface Type | Benefits |
|---|---|---|
| Development | Mock | Fast execution, no setup |
| CI/CD | Mock | No hardware needed |
| Hardware Test | Real | Actual verification |

flowchart LR
    A[Same Test Code] --> B{Environment}
    B -->|Development| C[Mock Interface]
    B -->|CI/CD| C
    B -->|Hardware Test| D[Real Interface]
    
    C --> E[Fast Execution<br/>No Setup Needed]
    D --> F[Real Verification<br/>On Actual Hardware]
    
    style C fill:#FFD700
    style D fill:#90EE90
    style E fill:#FFD700
    style F fill:#90EE90

Now tests can work with either real hardware or mocks!
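
Wiring that choice into a fixture keeps test code identical across environments. A sketch reusing the interface classes above (the CI environment variable and port are assumptions):

import os
import pytest

@pytest.fixture
def firmware():
    """Mock in CI, real serial interface on a bench machine."""
    if os.getenv('CI') == 'true':
        yield MockFirmwareInterface()
    else:
        iface = SerialFirmwareInterface('/dev/ttyUSB0')
        try:
            yield iface
        finally:
            iface.serial.close()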

9. Handling Asynchrony, Timing, and Non-Determinism

Firmware often involves timing-dependent behavior. Here’s how to handle it:

Polling Pattern:

  1. Start waiting for event
  2. Check condition repeatedly (poll)
  3. If condition met → Success
  4. If max time reached → Timeout with clear error message
  5. Keep checking with small intervals

Polling with Timeouts

import time

def wait_for_condition(check_func, timeout=5.0, interval=0.1):
    """Wait for a condition to become true"""
    start = time.time()
    while time.time() - start < timeout:
        if check_func():
            return True
        time.sleep(interval)
    return False

def test_led_eventually_turns_on(firmware):
    firmware.trigger_led()
    
    assert wait_for_condition(
        lambda: firmware.get_led_state() == 'on',
        timeout=2.0
    ), "LED did not turn on within 2 seconds"

⏱️ Timeout Guidelines

  • Unit tests: < 100ms timeout
  • Integration tests: 1-5 seconds
  • System tests: 10-30 seconds

Make timeouts configurable for slower hardware!
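
One way to make timeouts configurable is a custom command-line option (a sketch; the option name is illustrative):

# conftest.py
import pytest

def pytest_addoption(parser):
    parser.addoption("--timeout-scale", type=float, default=1.0,
                     help="multiply test timeouts, e.g. 2.0 for slow hardware")

@pytest.fixture
def timeout_scale(request):
    return request.config.getoption("--timeout-scale")

# In a test: wait_for_condition(check, timeout=2.0 * timeout_scale)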

Using pytest-timeout

import pytest

@pytest.mark.timeout(10)  # requires the pytest-timeout plugin
def test_firmware_responds_quickly(firmware):
    """Test will fail if it takes more than 10 seconds"""
    response = firmware.long_running_operation()
    assert response.success

Handling Non-Deterministic Timing

graph TD
    A[Timing Issue] --> B{Solution Strategy}
    
    B --> C[Multiple Samples]
    B --> D[Statistical Analysis]
    B --> E[Retry Logic]
    B --> F[Mock Time]
    
    C --> G[✅ Test average behavior]
    D --> H[✅ Test variance/jitter]
    E --> I[✅ Handle flaky operations]
    F --> J[✅ Control time in tests]
    
    style G fill:#90EE90
    style H fill:#90EE90
    style I fill:#90EE90
    style J fill:#90EE90

def test_sensor_reads_within_range(firmware):
    """Test multiple readings for stability"""
    readings = [firmware.read_sensor() for _ in range(10)]
    
    # Check all readings are within expected range
    assert all(900 <= r <= 1100 for r in readings)
    
    # Check variance isn't too high
    import statistics
    assert statistics.stdev(readings) < 50

⚠️ Avoid Flaky Tests

Never use fixed time.sleep() to wait for firmware! Always poll with timeouts. Fixed sleeps either waste time (too long) or cause flaky failures (too short).

Mocking Time-Dependent Behavior

from unittest.mock import patch
import time

def test_timeout_mechanism(firmware):
    """Test that timeout works correctly"""
    with patch('time.time') as mock_time:
        # Simulate time passing
        mock_time.side_effect = [0, 1, 2, 3, 6]  # Jumps to 6 seconds
        
        result = firmware.operation_with_timeout(timeout=5)
        assert result == 'TIMEOUT'

💡 Testing Race Conditions

For timing-sensitive code, test with both minimal and extended delays. Use stress testing to expose race conditions that occur rarely.
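
A simple stress pattern repeats a timing-sensitive sequence many times so that rare races surface. A sketch reusing wait_for_condition from above (the firmware commands and state names are illustrative):

import pytest

@pytest.mark.slow
@pytest.mark.parametrize("iteration", range(100))
def test_start_stop_sequence_is_race_free(firmware, iteration):
    """Repeat a fast start/stop cycle to flush out rare races."""
    firmware.send_command('START')
    firmware.send_command('STOP')
    assert wait_for_condition(lambda: firmware.get_state() == 'idle', timeout=1.0)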

10. Structuring, Naming, and Classifying Firmware Tests

Good organization makes tests easy to find and run selectively.

Three ways to organize tests:

| Method | Example | Purpose |
|---|---|---|
| Naming | test_uart_send_returns_success() | Clear, descriptive test names |
| Markers | @pytest.mark.hardware | Filter tests by category |
| Classes | class TestSensorInterface: | Group related tests |

Naming Conventions

# Unit tests: test_<module>_<function>_<scenario>
def test_crc_calculate_returns_correct_checksum():
    pass

def test_crc_calculate_handles_empty_buffer():
    pass

# Integration tests: test_<feature>_<scenario>
def test_uart_communication_sends_and_receives_data():
    pass

# System tests: test_<behavior>_<condition>
def test_device_boots_successfully_after_power_cycle():
    pass

📝 Naming Best Practice

Test names should be so clear that you can understand what failed without looking at the code. Think: “Test validates that [BEHAVIOR] when [CONDITION]”

Using Pytest Markers

Markers let you run specific test groups:

| Command | Runs | Use Case |
|---|---|---|
| pytest -m unit | Only unit tests | Fast feedback |
| pytest -m "not hardware" | Skip hardware tests | CI/CD pipelines |
| pytest -m "integration and not slow" | Fast integration tests | Quick validation |

import pytest

# Define custom markers in pytest.ini or conftest.py
# [pytest]
# markers =
#     unit: Unit tests (fast, no hardware)
#     integration: Integration tests (may use mocks)
#     hardware: Tests requiring real hardware
#     slow: Tests that take >1 second

@pytest.mark.unit
def test_checksum_calculation():
    pass

@pytest.mark.integration
@pytest.mark.slow
def test_state_machine_transitions():
    pass

@pytest.mark.hardware
@pytest.mark.slow
def test_actual_sensor_reading():
    pass

Run specific test categories:

pytest -m unit              # Only unit tests
pytest -m "not hardware"    # Everything except hardware tests
pytest -m "integration and not slow"  # Fast integration tests

🎯 Marker Strategy

Use markers to categorize tests by:

  • Speed: @pytest.mark.slow
  • Dependencies: @pytest.mark.hardware, @pytest.mark.network
  • Test Level: @pytest.mark.unit, @pytest.mark.integration
  • Features: @pytest.mark.bluetooth, @pytest.mark.sensor

Test Classes for Grouping

class TestProtocolHandling:
    """Group related protocol tests"""
    
    def test_valid_packet_parsing(self, firmware):
        pass
    
    def test_invalid_checksum_rejected(self, firmware):
        pass
    
    def test_timeout_on_incomplete_packet(self, firmware):
        pass

class TestSensorInterface:
    """Group sensor-related tests"""
    
    @pytest.fixture
    def sensor(self, firmware):
        return firmware.get_sensor(id=0)
    
    def test_read_returns_valid_range(self, sensor):
        pass
    
    def test_calibration_adjusts_readings(self, sensor):
        pass

💡 When to Use Classes

Use test classes when:

  • Tests share common fixtures
  • You want to group related functionality
  • You need class-scoped setup/teardown

Don’t use classes just for organization—flat functions are fine!

11. Common Anti-Patterns and How to Avoid Them

| Anti-Pattern | Problem | Solution |
|---|---|---|
| Testing implementation | Tests break on refactoring | Test behavior instead |
| Test dependencies | Tests fail randomly | Make tests independent |
| Complex tests | Hard to debug failures | One assertion per test |
| Hidden dependencies | Flaky test behavior | Use explicit fixtures |
| Poor cleanup | State pollution | Proper teardown |

Anti-Pattern 1: Testing Implementation Details

# Bad: Tests internal implementation
def test_buffer_index_increments():
    buffer = firmware.get_internal_buffer()
    assert buffer.index == 0
    firmware.write_byte(0x42)
    assert buffer.index == 1  # Brittle!

# Good: Tests behavior
def test_can_write_and_read_bytes():
    firmware.write_byte(0x42)
    assert firmware.read_byte() == 0x42

⚠️ Why This Matters

Testing implementation details creates brittle tests that break when you refactor. Test the public API and observable behavior instead.

Anti-Pattern 2: Tests That Depend on Execution Order

sequenceDiagram
    participant T1 as Test 1
    participant T2 as Test 2
    participant State as Shared State
    
    rect rgb(255, 200, 200)
    Note over T1,State: ❌ Anti-Pattern
    T1->>State: Modify state
    T2->>State: Expects T1's state
    Note over T2: Fails if T1 doesn't run!
    end
    
    rect rgb(200, 255, 200)
    Note over T1,T2: ✅ Good Pattern
    Note over T1: Setup own state
    T1->>T1: Test behavior
    Note over T2: Setup own state
    T2->>T2: Test behavior
    end

# Bad: Tests depend on each other
def test_step_1_initialize():
    firmware.initialize()
    
def test_step_2_configure():  # Fails if step_1 doesn't run first!
    firmware.configure()

# Good: Each test is independent
def test_initialize_and_configure():
    firmware.initialize()
    firmware.configure()
    assert firmware.is_ready()

🔑 Test Independence

Every test should pass when run alone or in any order. Use fixtures to ensure clean state.

Anti-Pattern 3: Overly Complex Tests

# Bad: Test does too much
def test_everything():
    firmware.init()
    firmware.configure_uart(115200)
    firmware.configure_spi(SPI_MODE_0)
    data = firmware.read_sensor()
    firmware.process_data(data)
    result = firmware.get_result()
    firmware.send_via_uart(result)
    # ... 50 more lines ...

# Good: Split into focused tests
def test_sensor_reading_returns_valid_data():
    data = firmware.read_sensor()
    assert 0 <= data <= 1023

def test_data_processing_calculates_average():
    data = [100, 200, 300]
    result = firmware.process_data(data)
    assert result == 200

📏 Test Size Guideline

If your test has more than 3-5 assertions or tests multiple behaviors, it’s probably too complex. Split it up!

Anti-Pattern 4: Hidden Test Dependencies

# Bad: Hidden dependency on global state
config_file = 'config.json'

def test_a():
    write_config(config_file, {'mode': 'A'})
    assert firmware.get_mode() == 'A'

def test_b():  # Might fail if config_file wasn't cleaned up
    assert firmware.get_mode() == 'default'

# Good: Use fixtures for clean state
@pytest.fixture
def clean_config():
    config_file = 'test_config.json'
    yield config_file
    if os.path.exists(config_file):
        os.remove(config_file)

def test_a(clean_config):
    write_config(clean_config, {'mode': 'A'})
    firmware.load_config(clean_config)
    assert firmware.get_mode() == 'A'

⚠️ Hidden State is Evil

Global variables, shared files, and persistent state are the top causes of flaky tests. Always use fixtures to manage test dependencies explicitly.

Anti-Pattern 5: Ignoring Teardown

flowchart TD
    A[Test Starts] --> B[Initialize Hardware]
    B --> C[Run Test Logic]
    C --> D{Proper Cleanup?}
    
    D -->|No| E[❌ Hardware Left in Bad State]
    E --> F[Next Test Fails]
    F --> G[Debugging Nightmare]
    
    D -->|Yes| H[✅ Clean Teardown]
    H --> I[Next Test Starts Fresh]
    I --> J[Reliable Test Suite]
    
    style E fill:#FF6B6B
    style F fill:#FF6B6B
    style G fill:#FF6B6B
    
    style H fill:#90EE90
    style I fill:#90EE90
    style J fill:#90EE90

# Bad: Leaves hardware in bad state
def test_sensor():
    firmware.power_on_sensor()
    reading = firmware.read_sensor()
    assert reading > 0
    # Sensor left powered on!

# Good: Proper cleanup
@pytest.fixture
def powered_sensor(firmware):
    firmware.power_on_sensor()
    yield firmware
    firmware.power_off_sensor()

def test_sensor(powered_sensor):
    reading = powered_sensor.read_sensor()
    assert reading > 0

🧹 Cleanup

Always use fixtures with yield for setup/teardown. Teardown code runs even if the test fails, keeping state clean for the next test.

12. Making Firmware Tests CI-Friendly

Continuous integration is crucial for catching regressions early. Here’s how to make your firmware tests work in CI:

graph TD
    A[CI Pipeline] --> B{Test Strategy}
    
    B --> C[Fast Tests<br/>Unit + Mocked Integration]
    B --> D[Hardware Tests<br/>Real Device Required]
    
    C --> E[Run on Every Commit]
    C --> F[Use Test Doubles]
    C --> G[Parallel Execution]
    
    D --> H[Run on Schedule/Manual]
    D --> I[Use Hardware Runner]
    D --> J[Sequential Execution]
    
    E --> K[✅ Fast Feedback<br/>< 5 minutes]
    H --> L[✅ Complete Validation<br/>10-30 minutes]
    
    style K fill:#90EE90
    style L fill:#FFD700

Separate Hardware-Dependent Tests

# conftest.py
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "hardware: tests requiring physical hardware"
    )

# In CI, run: pytest -m "not hardware"
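
You can also skip hardware tests automatically when no device is attached (a sketch; the device path is an assumption):

# conftest.py
import os
import pytest

def pytest_runtest_setup(item):
    """Skip hardware-marked tests when no board is connected."""
    if 'hardware' in item.keywords and not os.path.exists('/dev/ttyUSB0'):
        pytest.skip("no hardware attached")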

🎯 CI Strategy

Fast Pipeline (every commit): Unit + mocked integration tests (~5 min) Full Pipeline (nightly/release): All tests including hardware (~30 min)

Use Test Doubles for Hardware

flowchart LR
    A[Factory Pattern] --> B{Environment?}
    
    B -->|CI=true| C[Mock Hardware]
    B -->|CI=false| D[Real Hardware]
    
    C --> E[Fast Tests<br/>No Setup]
    D --> F[Real Validation<br/>Actual Hardware]
    
    style C fill:#FFD700
    style D fill:#90EE90
    style E fill:#FFD700
    style F fill:#90EE90

# hardware_abstraction.py
def get_hardware_interface():
    """Factory function that returns real or mock hardware"""
    if os.getenv('CI') == 'true':
        return MockHardwareInterface()
    else:
        return RealHardwareInterface()

💡 Environment Detection

Use environment variables to automatically switch between real and mock implementations. This keeps test code identical across environments.

Generate Test Reports

Test reports document your test results and make them accessible to your team. The most common approach is using CI/CD pipelines to automatically generate and archive reports.

When to generate reports:

  • On every commit/pull request (fast feedback)
  • Nightly builds (comprehensive testing)
  • Before releases (quality gate)
  • After manual test runs (debugging)

Where reports are used:

  • GitHub Actions, GitLab CI, Jenkins, Azure DevOps
  • Stored as CI artifacts
  • Published to coverage services (Codecov, Coveralls)
  • Shared via pull request comments

Basic CI Report Generation

Here’s a GitHub Actions workflow that compiles firmware, runs tests, and generates reports:

# .github/workflows/test.yml
# This file goes in your repository at .github/workflows/test.yml
name: Firmware Tests

on: [push, pull_request]  # Run on every push and PR

jobs:
  test:
    runs-on: ubuntu-latest  # Can be: ubuntu-latest, windows-latest, macos-latest
    steps:
      - uses: actions/checkout@v2
      
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'  # Change to match your Python version
      
      - name: Install dependencies
        run: |
          pip install pytest pytest-html pytest-cov
          
      - name: Compile firmware libraries
        run: |
          # Adjust this to match your build process
          gcc -shared -fPIC -o libfirmware.so src/*.c
          
      - name: Run tests
        run: |
          # Skip hardware-dependent tests; generate an HTML report and
          # XML coverage output for coverage services
          pytest -m "not hardware" \
            --html=report.html \
            --cov=firmware \
            --cov-report=xml
      
      - name: Upload coverage
        uses: codecov/codecov-action@v2
        with:
          files: ./coverage.xml
          
      - name: Archive test reports
        if: always()  # Run even if tests fail
        uses: actions/upload-artifact@v2
        with:
          name: test-reports
          path: |
            report.html
            coverage.xml
          retention-days: 30

Customization Based on Common Needs

1. Multi-Platform Testing (Windows, Linux, macOS)

jobs:
  test:
    strategy:
      matrix:
        # Test on multiple OS and Python versions
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: ['3.8', '3.9', '3.10']
    runs-on: ${{ matrix.os }}  # Dynamic OS selection from matrix
    steps:
      - uses: actions/checkout@v2
      
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}  # Dynamic version from matrix
      # ... rest of steps (install, compile, test)

2. Custom Build Steps for Different Firmware Architectures

- name: Compile firmware
  run: |
    # ARM Cortex-M (real hardware target)
    arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -c src/*.c
    
    # Or x86 simulation (for CI testing without hardware)
    gcc -shared -fPIC -DSIMULATION -o libfirmware.so src/*.c
    
    # Or use your existing build system
    make test-lib  # Calls your Makefile or CMake configuration

3. Coverage Thresholds (Fail if Coverage Too Low)

- name: Run tests with coverage check
  run: |
    # Measure coverage for the 'firmware' package, print it in the
    # terminal, and fail the build if coverage is below 80%
    pytest \
      --cov=firmware \
      --cov-report=term \
      --cov-fail-under=80

4. Multiple Test Suites with Different Speeds

jobs:
  fast-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Run unit tests only
        run: pytest -m unit --maxfail=1  # Stop on first failure (fail fast)
  
  slow-tests:
    runs-on: ubuntu-latest
    needs: fast-tests  # Only run if fast tests pass (dependency)
    steps:
      - name: Run integration tests
        run: pytest -m integration --timeout=300  # 5 minute timeout per test

5. Scheduled Hardware Tests

on:
  push:
    branches: [main]  # Run on pushes to main branch
  schedule:
    - cron: '0 2 * * *'  # Run at 2 AM UTC daily (cron format: min hour day month weekday)
    
jobs:
  hardware-tests:
    runs-on: self-hosted  # Use self-hosted runner with physical hardware attached
    steps:
      - uses: actions/checkout@v2
      
      - name: Run hardware tests
        run: pytest -m hardware --verbose  # Only run tests marked with @pytest.mark.hardware

6. Publishing Reports to GitHub Pages

- name: Deploy reports to GitHub Pages
  if: github.ref == 'refs/heads/main'  # Only deploy from main branch
  uses: peaceiris/actions-gh-pages@v3
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}  # Automatically provided by GitHub
    publish_dir: ./htmlcov                      # Source directory with HTML reports
    destination_dir: coverage-reports           # Target path on gh-pages branch
    # Reports will be available at: https://username.github.io/repo/coverage-reports/

7. Custom Report Formats and Locations

- name: Generate multiple report formats
  run: |
    # HTML report for humans, JUnit XML for CI tools (Jenkins, etc.),
    # an HTML coverage report, and a missing-line summary in the terminal
    pytest \
      --html=reports/test-report.html \
      --junitxml=reports/junit.xml \
      --cov=firmware \
      --cov-report=html:reports/coverage \
      --cov-report=term-missing

Common Customization Factors:

| Need | Customization |
|---|---|
| Different Python versions | Add to matrix.python-version |
| Cross-platform testing | Add to matrix.os |
| Custom compiler flags | Modify the Compile firmware step |
| Specific test categories | Change the -m marker in the pytest command |
| Coverage threshold | Add --cov-fail-under=X |
| Report retention | Change retention-days in the upload step |
| Conditional runs | Use if: conditions on steps |
| Hardware access | Use runs-on: self-hosted |
| Report publishing | Add deploy steps |

💡 Pro Tip

Start with a simple workflow and add complexity as needed. Test your workflow changes in a feature branch before merging to main.

📊 CI Metrics to Track

  • Test execution time
  • Code coverage percentage
  • Number of tests passing/failing
  • Flaky test rate
  • Hardware availability uptime

Optimize Test Execution Time

graph TD
    A[Test Optimization] --> B[Parallel Execution]
    A --> C[Smart Caching]
    A --> D[Conditional Running]
    
    B --> B1[pytest-xdist<br/>Run tests in parallel]
    C --> C1[Cache compiled libraries<br/>Cache dependencies]
    D --> D1[Skip slow tests on PRs<br/>Full suite on main]
    
    E[Results] --> F[✅ 10x Faster CI]
    E --> G[✅ Better Developer Experience]
    E --> H[✅ More Frequent Testing]
    
    style F fill:#90EE90
    style G fill:#90EE90
    style H fill:#90EE90
# Run slow tests only on main branch
import os
import pytest

def pytest_collection_modifyitems(config, items):
    if os.getenv('CI_BRANCH') != 'main':
        skip_slow = pytest.mark.skip(reason="Slow test skipped on PR builds")
        for item in items:
            if "slow" in item.keywords:
                item.add_marker(skip_slow)

⚡ Performance Tips

  • Use pytest-xdist for parallel execution: pytest -n auto
  • Cache compiled firmware between runs
  • Run only changed tests first (fail fast)
  • Use module-scoped fixtures for expensive setup

Cache Build Artifacts

# In CI config - speeds up builds by caching compiled artifacts
- name: Cache compiled firmware
  uses: actions/cache@v2
  with:
    path: |                                                   # Directories/files to cache
      build/
      *.so
    key: ${{ runner.os }}-firmware-${{ hashFiles('src/**/*.c') }}  # Cache invalidates when C files change
    # Cache is shared across workflow runs for the same OS and source file hash

Complete CI/CD Pipeline Architecture

flowchart TD
    A[Push to Branch] --> B[CI Trigger]
    
    B --> C[Fast Pipeline]
    C --> D[Lint & Format Check]
    D --> E[Compile Firmware]
    E --> F[Unit Tests]
    F --> G[Mocked Integration Tests]
    
    G --> H{Pass?}
    H -->|Yes| I[✅ Allow Merge]
    H -->|No| J[❌ Block Merge]
    
    I --> K[Merge to Main]
    K --> L[Full Pipeline]
    
    L --> M[All Tests<br/>Including Hardware]
    M --> N[Generate Artifacts]
    N --> O[Deploy to Staging]
    
    O --> P{Manual Approval}
    P -->|Yes| Q[Deploy to Production]
    
    style C fill:#FFD700
    style I fill:#90EE90
    style J fill:#FF6B6B
    style L fill:#4A90E2

🎯 Pipeline Tips

  1. Fail Fast: Run fastest tests first
  2. Use Parallelism: Speed up test execution
  3. Clear Feedback: Make failures obvious
  4. Retry Flaky Tests: Auto-retry once before failing
  5. Save Artifacts: Keep logs and reports for debugging
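
For tip 4, the pytest-rerunfailures plugin can retry automatically, either globally (pytest --reruns 1) or per test (assuming the plugin is installed):

import pytest

@pytest.mark.flaky(reruns=1)  # provided by pytest-rerunfailures
def test_occasionally_flaky_sensor(firmware):
    assert firmware.read_sensor() >= 0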

💪 Remember

Every test improves your firmware’s reliability and your team’s productivity.

Happy testing! 🚀