Software Architecture for Embedded Linux

Goal: Learn how to structure an embedded Linux application so it survives real-world complexity — multiple sensors, a display, network access, error recovery, and a team of developers — without collapsing into unmaintainable spaghetti.

Related Tutorials

Architecture patterns from this reference appear in: Data Logger | Qt App Launcher | Doom on Pi | Kiosk Service | MCU RT Controller


Every tutorial in this course teaches a single vertical slice — read a sensor, draw on a display, write a driver. But your project needs all of them running together: a sensor polling loop, a display update loop, a network logger, error recovery, and a way to start it all at boot. This is where most student projects fail — not because the individual pieces do not work, but because nobody planned how they fit together.

This reference covers the architectural decisions you face when building a real embedded Linux application: how many processes, how they communicate, how they recover from failure, and how to organize the code so it stays maintainable.


1. The Three Mistakes

Before the patterns, here are the three architectural mistakes that break almost every first embedded Linux project:

Mistake 1: The God Process

Everything in one main.c — sensor reading, display rendering, network logging, error handling, configuration parsing — in a single 2000-line file with global variables.

Why it fails: One bug crashes everything. Blocking I/O (network timeout, slow sensor) freezes the display. Adding a feature means reading the entire codebase. Two developers cannot work on it simultaneously.

Mistake 2: Too Many Processes

Every function gets its own process: one for the sensor, one for the display, one for logging, one for configuration, one for the watchdog. They communicate through a web of pipes, shared memory segments, and signal handlers.

Why it fails: Debugging 8 cooperating processes is exponentially harder than debugging 2. Startup ordering becomes fragile. Shared state management becomes a distributed systems problem on a single board. IPC overhead dominates on a single-core SoC.

Mistake 3: No Error Plan

The application works perfectly on the lab bench but crashes in production when: the sensor disconnects, the SD card fills up, the network drops, or the display driver returns an unexpected error code. Every read() call succeeds during development. None of them are checked.

Why it fails: An embedded device runs unattended. The first unhandled error is the last frame your user sees.


2. Process Architecture Patterns

The most important decision: how many processes, and what does each one do?

Pattern A: Monolithic (One Process)

┌─────────────────────────────────┐
│          Application            │
│  ┌─────────┐   ┌──────────┐     │
│  │ Sensor  │   │ Display  │     │
│  │ Thread  │   │ Thread   │     │
│  └────┬────┘   └────┬─────┘     │
│       └─── shared ──┘           │
│            state                │
└─────────────────────────────────┘

One process with multiple threads. Sensor data is shared via mutex-protected variables. The display thread reads the shared state and renders.

When to use:

  • Small projects with 1-2 data sources and 1 output
  • Tight latency requirements (no IPC overhead)
  • Single developer

Examples from this course:

  • level_sdl2.c — one thread polls the IMU, the main loop renders
  • qt_dashboard — QTimer polls sensors, QML renders via property bindings

Risks: A crash in any thread kills the entire application. A blocking call in one thread can starve others if you do not manage priorities.


Pattern B: Supervisor + Workers (Two-Three Processes)

┌──────────────────────────────────────────┐
│  Supervisor (systemd or custom)          │
│  - starts workers                        │
│  - restarts on crash                     │
│  - manages startup ordering              │
└────┬──────────────┬──────────────────────┘
     │              │
┌────┴─────┐   ┌────┴─────┐
│  Sensor  │   │  Display │
│  Service │   │  Service │
│  (C/Py)  │   │  (Qt/SDL)│
└────┬─────┘   └────┬─────┘
     │              │
     └──── IPC ─────┘
      (pipe, socket,
       shared file)

Two or three processes, each responsible for a single concern. systemd supervises them (restarts on crash, manages ordering). They communicate through a simple IPC mechanism.

When to use:

  • Medium projects with independent failure domains
  • When you want the display to survive a sensor crash (or vice versa)
  • Team of 2-3 developers (each owns a service)

Examples from this course:

  • Doom kiosk — doom_kiosk.sh orchestrates DRM overlay + touch overlay + IMU input + Doom as separate processes
  • Data Logger — logger service + systemd supervision + overlayfs persistence
  • Qt App Launcher — launcher process manages child app processes

Risks: IPC complexity. Startup ordering bugs. But each process is small, testable, and independently restartable.


Pattern C: Pipeline (Producer → Processor → Consumer)

┌──────────┐    ┌───────────┐    ┌──────────┐
│  Sensor  │ ──►│  Process  │ ──►│  Display │
│  Reader  │pipe│  Filter   │pipe│  Render  │
└──────────┘    └───────────┘    └──────────┘

Data flows in one direction through a chain of processes connected by pipes or sockets. Each stage transforms the data and passes it forward.

When to use:

  • Data flows naturally in one direction (sensor → filter → display)
  • Each stage has different resource requirements (CPU-bound processing vs I/O-bound display)
  • You want to swap or add stages without changing the others

Examples from this course:

  • Camera pipeline — libcamera-vid → ffmpeg → display
  • Ball detection — capture → threshold → morphology → contour → PID → servo (stages are toggleable)

Risks: Latency accumulates across stages. Backpressure is tricky — what happens when the display is slower than the sensor?
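Each stage in such a pipeline can be an ordinary filter program: read stdin, transform, write stdout. Here is a minimal sketch of a middle stage in C; the names (ema_update, run_filter) and the choice of an exponential moving average are illustrative, not from any specific tutorial:

```c
#include <stdio.h>

/* One pipeline stage: smooth a noisy sensor stream with an exponential
 * moving average. ema_update/run_filter are illustrative names. */
float ema_update(float prev, float sample, float alpha)
{
    return alpha * sample + (1.0f - alpha) * prev;
}

/* Read one float per line from `in`, write the smoothed value to `out`.
 * A real stage would wire this to stdin/stdout in main(). */
void run_filter(FILE *in, FILE *out, float alpha)
{
    float sample, state = 0.0f;
    int primed = 0;

    while (fscanf(in, "%f", &sample) == 1) {
        state = primed ? ema_update(state, sample, alpha) : sample;
        primed = 1;
        fprintf(out, "%.3f\n", state);
        fflush(out);    /* keep latency low when `out` is a pipe */
    }
}
```

Note that blocking pipes give you backpressure for free: if the next stage stalls, fprintf eventually blocks once the pipe buffer fills, which throttles this stage. Decide deliberately whether that behavior, or dropping samples, is what your pipeline needs.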


Choosing a Pattern

| Factor | Monolithic | Supervisor + Workers | Pipeline |
|---|---|---|---|
| Complexity | Low | Medium | Medium |
| Fault isolation | None | Per-process | Per-stage |
| Latency | Lowest | IPC overhead | Cumulative |
| Team scalability | 1 developer | 2-3 developers | 2-4 developers |
| Best for | Dashboards, gauges | Kiosks, appliances | Data processing |
Tip: Start monolithic, split when you have a reason. Do not pre-architect a three-process system for a project that reads one sensor and draws one gauge. Add processes when you need fault isolation, independent restart, or team parallelism — not because "microservices are good practice."


3. Inter-Process Communication

When you split into multiple processes, you need IPC. Here is when to use each mechanism:

Pipes (Anonymous and Named)

# Anonymous: parent → child only
sensor_reader | display_renderer

# Named (FIFO): unrelated processes
mkfifo /tmp/sensor_pipe
sensor_reader > /tmp/sensor_pipe &
display_renderer < /tmp/sensor_pipe &

Use when: Data flows in one direction. Simple text or binary stream. No need for bidirectional communication.

Course example: The camera pipeline uses pipes between libcamera-vid and ffmpeg.

Unix Domain Sockets

int fd = socket(AF_UNIX, SOCK_STREAM, 0);
struct sockaddr_un addr = { .sun_family = AF_UNIX, .sun_path = "/tmp/sensor.sock" };
unlink(addr.sun_path);                        /* remove a stale socket from a previous run */
bind(fd, (struct sockaddr *)&addr, sizeof(addr));
listen(fd, 4);                                /* then accept() clients in a loop */

Use when: Bidirectional communication. Multiple clients connect to one server. Request-response patterns.

Advantages over TCP: No network stack overhead. File-based permissions for access control. Can pass file descriptors between processes.

Shared Memory

int fd = shm_open("/sensor_data", O_CREAT | O_RDWR, 0600);
ftruncate(fd, sizeof(struct SensorState));
struct SensorState *state = mmap(NULL, sizeof(*state), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

Use when: High-frequency data sharing (100+ Hz). Lowest latency. Multiple readers.

Warning: Requires explicit synchronization (mutex, semaphore). The most powerful IPC mechanism and the easiest to get wrong. A missed lock creates data races that are nearly impossible to debug.

Course example: level_sdl2.c uses shared state between sensor and render threads (with mutex).
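For cross-process sharing, the usual fix is to put the lock inside the shared segment itself, initialized as process-shared. A sketch, assuming a struct SensorState like the mapping in the snippet above; the field and function names are illustrative:

```c
#include <pthread.h>

/* The lock lives inside the mapped segment, so every process that maps
 * it locks the same object. Field names are illustrative. */
struct SensorState {
    pthread_mutex_t lock;
    float accel_x, accel_y, accel_z;
};

/* Call once, in the process that creates the segment.
 * PTHREAD_PROCESS_SHARED is what makes the mutex valid across
 * processes, not just threads. Returns 0 on success. */
int sensor_state_init(struct SensorState *s)
{
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    int rc = pthread_mutex_init(&s->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc == 0 ? 0 : -1;
}

/* Readers and writers both hold the lock for the whole access, so a
 * reader never sees a half-updated sample. */
void sensor_state_write(struct SensorState *s, float x, float y, float z)
{
    pthread_mutex_lock(&s->lock);
    s->accel_x = x;
    s->accel_y = y;
    s->accel_z = z;
    pthread_mutex_unlock(&s->lock);
}
```

A production version would also consider PTHREAD_MUTEX_ROBUST, so a process that crashes while holding the lock does not deadlock every other process.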

Shared Files (sysfs Pattern)

# Writer (sensor service)
echo "23.5" > /run/sensor/temperature

# Reader (display service)
cat /run/sensor/temperature

Use when: Low-frequency data (<1 Hz). Human-readable for debugging. Natural fit for /run or /tmp on an embedded device. The "sysfs pattern" — the same architecture the kernel uses to expose hardware state.

Advantages: Zero IPC code. Any language can read a file. inotify provides change notification. Survives process restarts (data persists in the file).

Course example: The kernel's sysfs (/sys/class/thermal/thermal_zone0/temp) is exactly this pattern. The Qt launcher's SystemInfo class reads these files.
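One subtlety of the shared-file pattern deserves a sketch: a reader can catch the writer mid-write and see a half-written value. Writing to a temporary file and rename()-ing it over the target makes each update atomic, because rename() within one filesystem is atomic. atomic_write is a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: atomically replace `path` with `data`. Readers
 * see either the old file or the new one, never a partial write,
 * because rename() within one filesystem is atomic. */
int atomic_write(const char *path, const char *data)
{
    char tmp[256];
    if (snprintf(tmp, sizeof(tmp), "%s.tmp", path) >= (int)sizeof(tmp))
        return -1;                     /* path too long */

    FILE *f = fopen(tmp, "w");
    if (!f)
        return -1;

    int ok = (fputs(data, f) != EOF);
    ok = (fclose(f) == 0) && ok;       /* fclose can fail on a full disk */
    if (!ok) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, path);          /* the atomic swap */
}
```

The temp file must live on the same filesystem as the target (here it does, same directory), otherwise rename() fails with EXDEV.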

Signals

kill(child_pid, SIGUSR1);  // notify child

Use when: Simple notifications (wake up, reload config, shut down gracefully). No data payload.

Warning: Signals are asynchronous interrupts. They are hard to reason about, easy to mishandle, and cannot carry data beyond the signal number. Prefer sockets or files for data transfer.

Choosing IPC

Need to share data?
├── One direction → pipe
├── Bidirectional → socket
├── High frequency (>100 Hz) → shared memory + mutex
├── Low frequency (<1 Hz) → shared file
└── Just a notification → signal
Warning: D-Bus is the standard IPC bus for desktop Linux (GNOME, systemd). It is powerful but heavy for embedded — it requires a daemon, XML introspection files, and serialization overhead. Use it when integrating with existing D-Bus services (NetworkManager, BlueZ). Avoid it for custom application-to-application communication on a single-purpose embedded device.


4. Layered Architecture

Within a single process, organize code into layers. Each layer depends only on the layer below it, never above:

┌─────────────────────────────────────┐
│  UI / Presentation                  │
│  QML, SDL2 rendering, terminal      │
│  "How it looks"                     │
├─────────────────────────────────────┤
│  Application Logic                  │
│  State machines, control loops,     │
│  business rules                     │
│  "What it does"                     │
├─────────────────────────────────────┤
│  Hardware Abstraction               │
│  Sensor readers, actuator writers,  │
│  sysfs/devfs wrappers               │
│  "How to talk to hardware"          │
└─────────────────────────────────────┘

Why layers matter

Without layers, the display code reads directly from /sys/bus/iio/devices/iio:device0/in_accel_x_raw. When you switch from IIO to a custom chardev driver, you change the display code. When you add a second sensor, you change the display code. Every change touches everything.

With layers, the hardware abstraction returns a struct SensorData { float accel_x, accel_y, accel_z; }. The application logic does not know or care whether this came from IIO, a chardev, a network socket, or a test file. Switching the sensor backend means changing one file, not twenty.

Practical example: the level display

/* hardware_abstraction.h */
struct ImuData { float roll, pitch; };
int imu_open(const char *device);
struct ImuData imu_read(int fd);

/* app_logic.h */
struct DisplayState { float roll, pitch; bool sensor_ok; };
struct DisplayState update_state(struct ImuData raw, struct DisplayState prev);

/* ui_render.h */
void render_horizon(SDL_Renderer *r, struct DisplayState state);

The level_sdl2.c tutorial does this implicitly — the sensor read, state update, and render are separate functions. Making this separation explicit (separate files or at least separate sections with clear interfaces) is what prevents the God Process problem.
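To make the payoff concrete, here is what the middle layer can look like as a pure function. This sketch varies slightly from the headers above (it adds a valid flag to ImuData to represent a failed read) and uses stub types so it compiles on its own:

```c
#include <stdbool.h>

/* Stub types standing in for the real headers; the sketch adds a
 * `valid` flag to ImuData to represent a failed read. */
struct ImuData { float roll, pitch; bool valid; };
struct DisplayState { float roll, pitch; bool sensor_ok; };

/* The application-logic layer as a pure function: no hardware calls,
 * no rendering. On a bad read it keeps the last known values and
 * flags the fault, which the UI layer turns into a warning. */
struct DisplayState update_state(struct ImuData raw, struct DisplayState prev)
{
    struct DisplayState next = prev;
    next.sensor_ok = raw.valid;
    if (raw.valid) {
        next.roll = raw.roll;
        next.pitch = raw.pitch;
    }
    return next;
}

/* The main loop then just composes the layers:
 *     state = update_state(imu_read(fd), state);
 *     render_horizon(renderer, state);
 */
```

Because update_state touches no hardware, it can be unit-tested on the host with fabricated ImuData values, which is exactly the "can each component be tested alone" criterion from the checklist later in this reference.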

How much layering is enough?

For a 200-line lab exercise: functions in one file with clear separation is fine. For a 2000-line project: separate source files with header interfaces. For a 10,000-line product: separate libraries or modules.

Do not create abstraction layers for a project that fits in one screen. The goal is to make the code maintainable, not to satisfy a UML diagram.


5. State Management

The second most common architectural problem after "too many globals" is "state scattered everywhere." When the sensor fails, what state is the display in? When the user presses a button during a network timeout, what happens?

Explicit State Machine

For any system with more than two modes of operation, draw the state machine first:

                   ┌───────────┐
         ┌─────────│   INIT    │
         │         └─────┬─────┘
         │               │ sensors OK
         │         ┌─────▼─────┐
         │    ┌────│  RUNNING  │◄────┐
         │    │    └─────┬─────┘     │
         │    │          │ sensor    │ sensor
         │    │          │ error     │ recovered
         │    │    ┌─────▼─────┐     │
         │    │    │  DEGRADED │─────┘
         │    │    └─────┬─────┘
         │    │          │ fatal error
         │    │    ┌─────▼─────┐
         │    └───►│   ERROR   │
         │         └─────┬─────┘
         │               │ restart
         └───────────────┘

In code, this is a simple enum and a switch:

enum AppState { STATE_INIT, STATE_RUNNING, STATE_DEGRADED, STATE_ERROR };

enum AppState state = STATE_INIT;
struct SensorData data;          /* whatever your hardware layer returns */

while (running) {
    switch (state) {
    case STATE_INIT:
        if (sensor_open() == 0) state = STATE_RUNNING;
        else state = STATE_ERROR;
        break;
    case STATE_RUNNING:
        data = sensor_read();
        if (data.error) { state = STATE_DEGRADED; break; }
        render(data);
        break;
    case STATE_DEGRADED:
        render_warning("Sensor lost — last known values");
        if (sensor_reconnect() == 0) state = STATE_RUNNING;
        break;
    case STATE_ERROR:
        render_error("Fatal error — restarting in 5s");
        sleep(5);
        state = STATE_INIT;
        break;
    }
}

This is more code than ignoring errors. It is also the difference between a device that recovers and one that shows a frozen screen until someone power-cycles it.

State in Qt (QML)

Qt's property binding system is a built-in state management framework. Instead of tracking state in global variables, you expose state as Q_PROPERTYs and QML reacts automatically:

// C++ backend
Q_PROPERTY(bool sensorOk READ sensorOk NOTIFY sensorOkChanged)

// QML
Text {
    text: backend.sensorOk ? backend.temperature.toFixed(1) + " °C" : "SENSOR ERROR"
    color: backend.sensorOk ? "#dcdcdc" : "#ff4444"
}

The QML engine tracks which properties are used by which expressions and redraws only what changed. This eliminates manual dirty-flag management.


6. Error Handling Strategy

The Two Rules

Rule 1: At system boundaries, check everything. File opens, sensor reads, network connections, memory allocations. Every open(), read(), write(), ioctl(), mmap() can fail. Check them.

Rule 2: Inside your own code, trust your invariants. If your function takes a valid struct SensorData*, do not check for NULL — fix the caller. Validate at the boundary, trust internally.

Boundary errors in embedded Linux

| Boundary | Common failures | Strategy |
|---|---|---|
| Sensor read | Device disconnected, driver unloaded, bus error | Retry 3x, then degrade |
| File I/O | SD card full, file locked, permission denied | Log error, use fallback path |
| Network | Timeout, DNS failure, connection reset | Exponential backoff, offline mode |
| Display | DRM master lost, resolution changed | Hide/show window, re-init |
| Child process | Crash, hang, unexpected exit code | Restart with Restart=on-failure |
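The "retry 3x, then degrade" strategy fits in a small helper. A sketch — the callback signature and all names here are hypothetical:

```c
/* Hypothetical callback signature: returns 0 on success, -1 on
 * failure, and fills *out on success. */
typedef int (*read_fn)(float *out);

/* "Retry 3x, then degrade" as code: try the boundary operation a few
 * times, then report failure so the caller can enter STATE_DEGRADED
 * instead of exiting. */
int read_with_retry(read_fn try_read, int attempts, float *out)
{
    for (int i = 0; i < attempts; i++) {
        if (try_read(out) == 0)
            return 0;
        /* a real version would usleep() a backoff interval here */
    }
    return -1;   /* degrade, do not crash */
}

/* Demo stub for illustration: fails twice, then succeeds. */
static int demo_calls;
static int demo_flaky_read(float *out)
{
    if (++demo_calls < 3)
        return -1;
    *out = 42.0f;
    return 0;
}
```

The helper keeps the retry policy in one place; callers decide what "degrade" means (last known value, warning screen, offline mode).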

systemd as error handler

For multi-process architectures, systemd provides error handling for free:

[Service]
Restart=on-failure     # restart if process exits non-zero
RestartSec=3           # wait 3 seconds before restart
WatchdogSec=30         # kill and restart if no heartbeat for 30s
StartLimitBurst=5      # after 5 restarts in StartLimitIntervalSec...
StartLimitIntervalSec=60  # ...stop trying (prevents restart storms)

This is the "supervisor" in Pattern B. You do not need to write restart logic — systemd does it better than you will.

Watchdog integration

For critical applications, integrate with the hardware watchdog:

int wd = open("/dev/watchdog", O_WRONLY);
while (running) {
    do_work();
    write(wd, "k", 1);   /* heartbeat — must happen every N seconds */
}
/* If we stop heartbeating, the hardware reboots the system */

Or let systemd manage it (see WatchdogSec above). systemd pings the hardware watchdog for you — your service only needs to ping systemd via sd_notify("WATCHDOG=1").
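Under the hood, sd_notify() is a one-line datagram protocol: send the text to the unix socket named by $NOTIFY_SOCKET. Linking libsystemd and calling sd_notify() is the normal route; this dependency-free sketch (function name notify_systemd is hypothetical) shows what it does, omitting the abstract-namespace socket case for brevity:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Minimal stand-in for sd_notify(0, "WATCHDOG=1"): send one datagram
 * to the unix socket systemd names in $NOTIFY_SOCKET. (libsystemd's
 * sd_notify() also handles abstract sockets, which start with '@';
 * this sketch skips them.) */
int notify_systemd(const char *msg)
{
    const char *path = getenv("NOTIFY_SOCKET");
    if (!path || path[0] != '/')
        return -1;                     /* not running under systemd */

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    ssize_t n = sendto(fd, msg, strlen(msg), 0,
                       (struct sockaddr *)&addr, sizeof(addr));
    close(fd);
    return n == (ssize_t)strlen(msg) ? 0 : -1;
}
```

Calling notify_systemd("WATCHDOG=1") once per loop iteration is the whole heartbeat; WatchdogSec then does the rest.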


7. Project Structure

When a project grows beyond 3-4 files, organize it predictably. Here is a structure that works for embedded Linux projects:

my_project/
├── CMakeLists.txt          # top-level build
├── src/
│   ├── main.c              # entry point, argument parsing, init
│   ├── sensor.c / .h       # hardware abstraction for sensors
│   ├── display.c / .h      # rendering (SDL2, Qt, framebuffer)
│   ├── state.c / .h        # application state machine
│   └── config.c / .h       # configuration file parsing
├── services/
│   └── my-app.service      # systemd unit file
├── overlay/
│   └── my-device.dts       # device tree overlay (if needed)
└── tests/
    └── test_state.c        # unit tests for state logic

Key principles:

  • main.c is thin. Parse arguments, open resources, start the main loop. No business logic.
  • One file per concern. sensor.c knows how to read hardware. display.c knows how to render. They do not know about each other.
  • Headers define interfaces. sensor.h declares struct SensorData and sensor_read(). The implementation can change (IIO → chardev → test stub) without touching display code.
  • Services live in the repo. The systemd unit file is part of the project, not an afterthought.

For Qt projects

my_qt_app/
├── CMakeLists.txt
├── main.cpp                # QObject backends + main()
├── Main.qml                # root window
├── MyComponent.qml         # reusable component
├── services/
│   └── my-qt-app.service
└── config/
    └── kms.json            # EGLFS KMS configuration (optional)

The Qt build system (qt_add_qml_module) embeds QML into the binary, so no runtime file deployment is needed. The C++ backend follows the same layered pattern: hardware reads in one QObject, application logic in another, QML handles presentation.


8. Systemd Service Design

systemd is not just a process launcher — it is your application's supervisor, logger, and dependency manager. Use it properly:

Service dependencies

[Unit]
After=network-online.target    # wait for network
Wants=network-online.target    # but don't fail if no network
After=dev-bmi160.device        # wait for sensor device node

After= controls startup order. Wants= / Requires= control whether a dependency failure stops your service.

Conflicts for display exclusion

On EGLFS, only one app can hold DRM master. Use Conflicts= to enforce this:

Conflicts=qt-dashboard.service level-sdl2.service

Starting your service automatically stops conflicting services. This prevents "black screen" bugs from two apps fighting for the display.

Environment for display apps

Environment=SDL_VIDEODRIVER=kmsdrm
Environment=QT_QPA_PLATFORM=eglfs
ExecStartPre=/bin/sh -c 'echo 0 > /sys/class/vtconsole/vtcon1/bind'

The ExecStartPre suppresses the text console so graphics fill the screen.

Socket activation (advanced)

For network services, systemd can listen on a socket and start your service only when a connection arrives:

# my-server.socket
[Socket]
ListenStream=8080

# my-server.service
[Service]
ExecStart=/usr/local/bin/my-server
StandardInput=socket

Your service does not need to manage socket lifecycle — systemd does it. This also means zero-downtime restarts: systemd holds the socket while restarting the service.
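Because the accepted connection arrives as fd 0 (and, if the unit also sets StandardOutput=socket, replies go out on fd 1), the service body is plain file-descriptor code. A hedged sketch; handle_connection is a hypothetical name, and the fds are parameters so the same code runs under socket activation or in a host-side test with pipes:

```c
#include <string.h>
#include <unistd.h>

/* handle_connection is a hypothetical name. Under socket activation,
 * `in` and `out` are fds 0 and 1; in a test they can be pipe ends.
 * This one just echoes the request back. */
int handle_connection(int in, int out)
{
    char buf[256];
    ssize_t n;

    while ((n = read(in, buf, sizeof(buf))) > 0) {
        if (write(out, buf, (size_t)n) != n)
            return -1;                /* client went away mid-reply */
    }
    return n == 0 ? 0 : -1;           /* 0 bytes = clean close (EOF) */
}
```

Keeping the fds as parameters is the same testability principle as the layering section: the protocol logic runs on the host without systemd or a network.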


9. Real-World Architecture Examples

Example 1: Temperature Dashboard (Monolithic)

┌─────────────────────────────────────────────┐
│  qt_dashboard (single process)              │
│                                             │
│  QTimer (10Hz) ──► poll sysfs ──► Q_PROPERTY│
│                                    │        │
│                              QML binding    │
│                                    │        │
│                              render gauge   │
└─────────────────────────────────────────────┘
systemd: Restart=on-failure

One process, one timer, property bindings handle state. Works because there is only one data source and one output.

Example 2: Doom Kiosk (Supervisor + Workers)

┌──────────────────────────────────────────────┐
│  doom_kiosk.sh (supervisor)                  │
│  ├── drm_overlay (background, sleeps)        │
│  ├── doom_touch_overlay.py (background)      │
│  ├── imu_doom.py (background)                │
│  └── chocolate-doom (foreground, waited)     │
│       on exit → kill all children            │
└──────────────────────────────────────────────┘
systemd: Restart=on-failure, Conflicts=other-display

Four processes, each doing one thing. The shell script manages lifecycle. systemd handles the outer restart. Each component can be developed and tested independently.

Example 3: Industrial Data Logger (Pipeline + Persistence)

sensor_reader ──pipe──► data_filter ──pipe──► csv_writer
                                         /data/log.csv
                                         (overlayfs)
                                         systemd-journal
                                         (structured logs)

Three-stage pipeline. csv_writer uses overlayfs for power-loss safety. systemd-journal captures structured logs. Each stage is a simple program that does one thing.

Example 4: App Launcher (Process Manager)

┌──────────────────────────────────────────────┐
│  qt_launcher (manager process)               │
│  ├── show home screen (SwipeView)            │
│  ├── on tap: hide window → QProcess child    │
│  ├── on child exit: show window              │
│  └── system info polling (QTimer, 2s)        │
│                                              │
│  Children: qt_dashboard, chocolate-doom,     │
│            level_sdl2, pong_fb               │
└──────────────────────────────────────────────┘
systemd: Restart=on-failure, Conflicts=other-display

The launcher is a process manager — it starts and stops children, handling DRM master handoff. Each child is an independent application that knows nothing about the launcher.


10. Architecture Decision Checklist

Before writing code, answer these questions:

  • [ ] How many processes? One per failure domain. If the sensor crashing should not kill the display, they are separate processes.
  • [ ] How do they communicate? Choose the simplest IPC that works. Pipe for one-way data, socket for request-response, file for low-frequency state.
  • [ ] What happens when X fails? For every external dependency (sensor, network, display), write down the failure mode and recovery strategy.
  • [ ] Who restarts what? systemd restarts services. Your application restarts internal state machines. Do not mix these responsibilities.
  • [ ] Where is the state? Draw the state machine. If you cannot draw it, you do not understand your own application.
  • [ ] What is the startup order? Draw the dependency graph. Encode it in systemd After= / Wants= / Conflicts=.
  • [ ] Can each component be tested alone? If your display code cannot run without a real sensor, your abstraction layers are missing.

Summary

| Principle | What It Means |
|---|---|
| Start monolithic, split when needed | Do not pre-architect; add processes when you have a reason |
| One process per failure domain | Sensor crash should not freeze the display |
| Simplest IPC that works | Pipe > socket > shared memory, in order of preference |
| Layers, not spaghetti | Hardware abstraction → app logic → UI, each in separate files |
| Explicit state machines | enum State + switch beats scattered if checks |
| Check every boundary | open(), read(), ioctl() can all fail |
| Let systemd supervise | Restart=on-failure is better than your hand-rolled watchdog loop |
| Test components alone | If it needs the real hardware to test, you are missing an abstraction |