
Lesson 7: Architecture & Integration

Óbuda University -- Linux in Embedded Systems

"Draw the whole system. Where does data flow? Where can it fail? How did we get here?"


Problem First

Your data logger starts before the network is ready -- the first cloud upload fails.

The display service starts before the sensor driver has probed -- stale data for 10 seconds.

Logs are split between journalctl, dmesg, a CSV file, and stdout -- nobody can reconstruct what happened when the device rebooted at 3 AM.

These are not coding bugs. They are architecture bugs.

No amount of fixing individual services will help if the startup order is fragile, the data flow is undefined, and the failure modes are unplanned.


Today's Map

  • Block 1 (45 min): Full-stack system view: stack from hardware to app, hardware contract, data flow as architecture, four architecture views, layering rules.
  • Block 2 (45 min): Draw your system plus history: architecture exercise, peer review, Unix/Linux timeline, obsolete vs emerging components, regulatory landscape.

What Is Architecture?

Architecture defines three things before code is written:

  • Which components exist -- services, drivers, hardware blocks
  • How they communicate -- protocols, interfaces, data formats
  • What happens when something breaks -- recovery, fallback, degradation

A system without documented architecture works by accident. It breaks the moment someone changes something that was never written down.


The Full Stack

 +-------------------------------------------------------+
 |                  Application Layer                     |
 |  Product logic: read sensors, render display, upload   |
 +--------------------------+----------------------------+
 |       Init System        |     System Services        |
 |  systemd: ordering,      |  logging, networking,      |
 |  supervision, restart    |  monitoring                |
 +--------------------------+----------------------------+
 |                  Linux Kernel                          |
 |  Drivers (I2C, SPI, GPIO, DRM)  |  Scheduler  |  FS  |
 +--------------------------+------+---------+---+-------+
 |                  Bootloader                            |
 |  U-Boot / RPi firmware: HW init, kernel loading       |
 +--------------------------+----------------------------+
 |                  Hardware                              |
 |  SoC (CPU, GPU, peripherals)  |  Sensors, Displays   |
 +-------------------------------------------------------+

Each layer has one job and communicates only with its neighbors.


Layer Responsibilities

 +----------------+------------------------------------------+-----------------------------------+
 | Layer          | Owns                                     | Stops At                          |
 +----------------+------------------------------------------+-----------------------------------+
 | Bootloader     | Clock init, RAM init, load kernel        | Kernel handoff                    |
 | Kernel         | Drivers, scheduling, memory, filesystems | Stable /dev/ and sysfs interfaces |
 | Init (systemd) | Service ordering, supervision, restart   | Application readiness             |
 | Application    | Product behavior, data processing        | User-visible output               |
 +----------------+------------------------------------------+-----------------------------------+

When a layer reaches into another layer's job, the system becomes fragile.



Operating system architecture layers: hardware, kernel (drivers, scheduler, memory management), system libraries, and user-space applications.


Block 1

Full-Stack System View


The Hardware Contract

Three layers form a contract:

 +---------------------+
 |   Application       |  reads /dev/mcp9808
 |   (Python script)   |  or /sys/bus/iio/...
 +----------+----------+
            |  stable interface (read/write syscalls)
 +----------v----------+
 |   Kernel Driver      |  knows register map,
 |   (mcp9808.ko)       |  timing, bus protocol
 +----------+----------+
            |  I2C bus address from Device Tree
 +----------v----------+
 |   Device Tree        |  declares: "MCP9808 at
 |   (.dtb / overlay)   |  address 0x18 on i2c-1"
 +---------------------+

The app does not know the I2C address. The driver does not know what the app does with the temperature. The Device Tree knows only where the device sits on the bus, not who talks to it.
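
In user space, this contract reduces to reading files. A minimal sketch of the application side, assuming the bound driver registers a hwmon device and reports the name "mcp9808" (the device name and sysfs layout depend on which driver binds; the `hwmon_root` parameter exists only so the lookup can be exercised off-target):

```python
import glob
import os

def read_temp_c(hwmon_root="/sys/class/hwmon"):
    """Read the sensor temperature through the kernel hwmon interface.

    The hwmon index (hwmon0, hwmon1, ...) is assigned at probe time and
    varies per board and per boot, so we search by device name instead
    of hard-coding a path. "mcp9808" as the reported name is an
    assumption; check your driver's actual name attribute.
    """
    for name_file in glob.glob(os.path.join(hwmon_root, "hwmon*", "name")):
        with open(name_file) as f:
            if f.read().strip() != "mcp9808":
                continue
        # temp1_input reports millidegrees Celsius as an integer string
        temp_file = os.path.join(os.path.dirname(name_file), "temp1_input")
        with open(temp_file) as f:
            return int(f.read()) / 1000.0
    raise RuntimeError("mcp9808 hwmon node not found -- did the driver probe?")
```

Note that the script never touches the I2C bus: the stable interface is the file, exactly as the contract demands.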


When the Contract Breaks

 +--------------+-----------------------------------+--------------------------------+
 | Broken Layer | Symptom                           | Example                        |
 +--------------+-----------------------------------+--------------------------------+
 | Device Tree  | Driver never probes               | Wrong I2C address in overlay   |
 | Driver       | Interface changes between kernels | sysfs attribute renamed in 6.x |
 | Application  | Bypasses driver                   | App reads /dev/mem directly    |
 +--------------+-----------------------------------+--------------------------------+

Respecting the contract means your application runs on any board that provides the same driver interface -- Raspberry Pi, custom PCB, or industrial SoM.

Portability comes from discipline, not from luck.


Data Flow as Architecture

The most useful way to view an embedded system is as a data pipeline:

 +--------+    +---------+    +----------+    +---------+    +---------+
 | Sensor |--->| Driver  |--->| Service  |--->| Storage |--->| Display |
 | (HW)   |    | (kernel)|    | (user)   |    | (CSV/DB)|    | (OLED)  |
 +--------+    +---------+    +----------+    +---------+    +---------+
    I2C bus      /dev/ or        Python        /data/         SPI/DRM
                 sysfs           process       log.csv

Every embedded product follows this pattern. Thinking in data flow reveals the questions that matter.
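
The pipeline view maps directly onto code structure: one function per stage, each owning exactly one job. A stubbed sketch (the hard-coded value stands in for the sysfs read; names are illustrative):

```python
def driver_read():
    # Kernel boundary stand-in: a real app would read this from sysfs
    return 23500  # millidegrees Celsius, integer

def service_convert(millideg):
    # The user-space service owns the single format conversion
    return millideg / 1000.0

def storage_write(value, sink):
    # Storage stage: format one CSV field; sink is any list-like object
    sink.append(f"{value:.3f}")

def run_pipeline(sink):
    # Sensor -> driver -> service -> storage, one sample per call
    storage_write(service_convert(driver_read()), sink)

rows = []
run_pipeline(rows)
```

Because each stage has one input and one output, any stage can be replaced or tested in isolation, which is the practical payoff of drawing the system as a pipeline.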


Data Flow Questions

At every arrow in the pipeline, ask:

 +---------------------------------------+-------------------------------------+
 | Question                              | Why It Matters                      |
 +---------------------------------------+-------------------------------------+
 | Where is data buffered?               | Determines memory usage and latency |
 | What format is the data?              | Raw ADC counts? Degrees C? JSON?    |
 | Who owns the format conversion?       | Avoids duplicate conversions        |
 | What happens when this link breaks?   | Defines failure behavior            |
 | How much latency does this stage add? | Sets end-to-end timing              |
 +---------------------------------------+-------------------------------------+

A pipeline that is documented can be debugged by anyone. A pipeline that exists only in the developer's head becomes unmaintainable.


Example: Temperature Monitoring

 +----------+      +-------------+      +--------------+
 | MCP9808  |--I2C-| mcp9808     |--/dev| Logger       |
 | sensor   |      | kernel drv  |      | (Python)     |
 +----------+      +-------------+      +---+-----+----+
                                            |     |
                              +-------------+     +----------+
                              |                              |
                        +-----v------+              +--------v-------+
                        | CSV file   |              | Cloud endpoint |
                        | /data/     |              | HTTP POST      |
                        +------------+              +----------------+

                        +-------------+      +--------------+
                        | mcp9808     |--/dev| Display svc  |---SPI---OLED
                        | kernel drv  |      | (updater)    |
                        +-------------+      +--------------+

Each arrow is a potential failure point. The architecture must define what happens when any link breaks.


 +---------------------+------------------------------------------------+
 | Link Breaks         | System Behavior                                |
 +---------------------+------------------------------------------------+
 | Sensor disconnected | Driver returns error; logger records "N/A"     |
 | Network down        | Logger buffers locally; retries on reconnect   |
 | Display fails       | System continues logging (display is optional) |
 | CSV disk full       | Rotate logs; alert via network                 |
 | Cloud rejects data  | Queue locally; log rejection reason            |
 +---------------------+------------------------------------------------+

Rule: No single component failure should take the whole system down. The architecture must define graceful degradation.
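
A sketch of that rule for the logger stage, with the sensor read and the cloud push injected as callables so each failure can be simulated (class and parameter names are illustrative, not a prescribed API):

```python
import collections

class Logger:
    """Graceful degradation sketch: no single link failure stops logging.

    read_sensor and push_cloud are injected; in a real logger they
    would wrap the driver read and the HTTP client.
    """
    def __init__(self, read_sensor, push_cloud, max_buffer=1000):
        self.read_sensor = read_sensor
        self.push_cloud = push_cloud
        self.rows = []                                       # CSV stand-in
        self.pending = collections.deque(maxlen=max_buffer)  # cloud retry queue

    def tick(self):
        try:
            value = f"{self.read_sensor():.3f}"
        except OSError:
            value = "N/A"        # sensor gone: record the gap, keep running
        self.rows.append(value)
        self.pending.append(value)
        try:
            while self.pending:
                self.push_cloud(self.pending[0])
                self.pending.popleft()   # drop only after a confirmed push
        except OSError:
            pass                 # network down: samples stay queued for retry
```

The bounded deque also answers the "disk full" class of failures in miniature: when the buffer is full, the oldest sample is dropped rather than the process crashing.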


Four Architecture Views

A single diagram cannot capture a system. Different questions need different views:

 +--------------------+-----------------------------+------------------------------------------+
 | View               | Answers                     | Example Question                         |
 +--------------------+-----------------------------+------------------------------------------+
 | Boot / Startup     | What runs when?             | "Why does the logger fail on cold boot?" |
 | Process / Service  | Who supervises what?        | "What restarts the display service?"     |
 | Data Flow          | How does data move?         | "Why is the OLED showing stale data?"    |
 | Failure / Recovery | What happens when X breaks? | "What caused the 3 AM reboot?"           |
 +--------------------+-----------------------------+------------------------------------------+

Without all four, debugging degrades to guessing.


View 1: Boot / Startup

 Power On
    |
    v
 U-Boot .............. 1 s
    |
    v
 Linux Kernel ........ 2 s
    |   (drivers probe, Device Tree parsed)
    v
 systemd ............. 5 s
    |   (mount filesystems, start networking)
    v
 data-logger.service . 0.5 s
    |
    v
 App Ready ........... Total: ~8.5 s
                       Budget: 10 s

Every stage has a time budget. If boot exceeds the budget, find which stage grew and fix it.


View 2: Process / Service

 systemd (PID 1)
    |
    +-- data-logger.service
    |     Restart=always, WatchdogSec=10
    |
    +-- display-updater.service
    |     Restart=always, After=data-logger.service
    |
    +-- sshd.service
    |     Restart=on-failure (remote access)
    |
    +-- NetworkManager.service
    |     Restart=on-failure
    |
    +-- systemd-watchdog
          HW watchdog, reboot if systemd hangs

Every service has an explicit restart policy and dependency ordering. No service starts "whenever it feels like it."
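
A minimal unit file matching the tree above might look like this (service name and paths are illustrative; WatchdogSec= only works if the process pings systemd with sd_notify("WATCHDOG=1") within each interval, which is why Type=notify is set):

```ini
# /etc/systemd/system/data-logger.service  (illustrative sketch)
[Unit]
Description=Sensor data logger
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/python3 /opt/logger/main.py
Restart=always
RestartSec=3
WatchdogSec=10

[Install]
WantedBy=multi-user.target
```

display-updater.service would then declare After=data-logger.service, making the ordering in the tree explicit instead of accidental.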


View 3: Data Flow

(See the temperature monitoring pipeline from earlier.)

Key additions to document:

  • Buffering: logger holds 1000 samples in memory before writing to CSV
  • Format: driver provides millidegrees (integer); logger converts to degrees (float)
  • Rate: sensor polled every 1 s; display updated every 2 s; cloud push every 60 s

Different rates at different stages create a natural fan-out that must be designed, not discovered.
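
One way to make that fan-out explicit is a single loop with per-stage due times instead of three independent timers. A simulation sketch with injected callbacks (a real service would sleep until the nearest due time rather than tick every second):

```python
def run_schedule(total_seconds, stages):
    """Drive stages with different periods from a single loop.

    stages maps a name to (period_seconds, callback). The loop
    simulates one tick per second and fires each stage when due.
    """
    due = {name: 0 for name in stages}
    for t in range(total_seconds):
        for name, (period, callback) in stages.items():
            if t >= due[name]:
                callback(t)
                due[name] = t + period

fired = {"poll": [], "display": [], "cloud": []}
run_schedule(6, {
    "poll":    (1,  fired["poll"].append),     # sensor poll every 1 s
    "display": (2,  fired["display"].append),  # display update every 2 s
    "cloud":   (60, fired["cloud"].append),    # cloud push every 60 s
})
```

Running six simulated seconds fires poll six times, display three times, and cloud once: the designed fan-out, visible in code rather than discovered in the field.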


View 4: Failure / Recovery

 +--------------------+-----------------+------------------+
 | Failure            | Detection       | Recovery         |
 +--------------------+-----------------+------------------+
 | Service crash      | systemd         | Restart (3 s)    |
 +--------------------+-----------------+------------------+
 | System hang        | HW watchdog     | Reboot (15 s)    |
 +--------------------+-----------------+------------------+
 | Bad firmware       | A/B partition   | Rollback (30 s)  |
 | update             | boot counter    |                  |
 +--------------------+-----------------+------------------+
 | Sensor failure     | Driver error    | Log "N/A",       |
 |                    | code            | continue running |
 +--------------------+-----------------+------------------+
 | Disk full          | inotify /       | Log rotation,    |
 |                    | threshold check | alert            |
 +--------------------+-----------------+------------------+

Why You Need All Four Views

 +------------------+--------------------------------------+
 | You Have Only... | You Cannot Debug...                  |
 +------------------+--------------------------------------+
 | Boot view        | Why sensor data stopped at 3 AM      |
 | Data flow view   | Why the system hangs during boot     |
 | Process view     | Where data gets corrupted in transit |
 | Failure view     | Why startup takes 30 seconds         |
 +------------------+--------------------------------------+

Each view is a lens on the same system. Together they give full coverage of engineering and operational questions.


Layering Rules

These rules exist because every team that violates them rediscovers the same problems:

  • Kernel drivers expose raw data -- not "temperature too high"
      A bug in threshold logic then crashes a process, not the kernel

  • Business logic stays in user space -- where crashes are recoverable

  • Service supervision stays in systemd -- not in application code
      Restart policies, ordering, and the watchdog belong to init

  • Cross-layer shortcuts are documented exceptions
      mmap for DMA? Fine -- but document why and isolate it

Layering Violations and Their Cost

 +---------------------------------+-----------------------------------------------+
 | Violation                       | Consequence                                   |
 +---------------------------------+-----------------------------------------------+
 | Business logic in kernel driver | Bug causes kernel panic, not app crash        |
 | Application supervises itself   | Must reimplement systemd restart logic        |
 | App reads /dev/mem directly     | Loses portability, risks security holes       |
 | Driver formats data as JSON     | Kernel does string processing (slow, fragile) |
 +---------------------------------+-----------------------------------------------+

The rule: each layer does its job and trusts the adjacent layers to do theirs.


View 5: Performance View

The four views tell you what runs, how data flows, and what happens when things break. The performance view tells you whether it runs fast enough.

 +----------------------------+-----------------------------+-------------------------------+
 | Metric                     | Where to Measure            | Tool                          |
 +----------------------------+-----------------------------+-------------------------------+
 | Pipeline latency per stage | Each arrow in the data flow | clock_gettime instrumentation |
 | CPU usage per process      | Process view                | mpstat -P ALL, pidstat -t     |
 | Hotspot functions          | Inside each service         | perf record → perf report     |
 | Blocking calls             | System boundary             | strace -c                     |
 | Scheduling jitter          | RT threads                  | cyclictest                    |
 +----------------------------+-----------------------------+-------------------------------+
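
Per-stage latency can be captured with a monotonic clock around each stage. A sketch in Python, where time.monotonic_ns() uses CLOCK_MONOTONIC on Linux so samples are immune to wall-clock jumps (the "convert" stage name is illustrative):

```python
import time

def timed(name, fn, stats):
    """Wrap one pipeline stage and record its latency in nanoseconds."""
    def wrapper(*args):
        start = time.monotonic_ns()
        result = fn(*args)
        # Append the elapsed time for this stage to the shared stats dict
        stats.setdefault(name, []).append(time.monotonic_ns() - start)
        return result
    return wrapper

stats = {}
# Hypothetical "convert" stage: millidegrees -> degrees
convert = timed("convert", lambda raw: raw / 1000.0, stats)
value = convert(23500)
# stats["convert"] now holds one latency sample in nanoseconds
```

Wrapping every stage the same way yields the per-arrow numbers the table above asks for, without changing any stage's behavior.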

Reference: Performance Profiling


Quick Architecture Checks

Five yes/no questions to validate any embedded Linux architecture:

  1. Can one service fail without taking the whole system down?
  2. Are all dependencies explicit in systemd unit files?
  3. Can you trace one sensor value through the full stack?
  4. Is each layer's ownership clear and documented?
  5. Have you measured latency at each pipeline stage?

If any answer is "no" or "I don't know," the architecture has a gap.



Linux kernel subsystems: process management, memory management, filesystems, networking, and device drivers — all running in privileged mode.


Block 1 Summary

  • The stack is layered -- hardware, bootloader, kernel, init, application
  • The hardware contract -- Device Tree + driver + app interface = portability
  • Data flow is the backbone -- every product is a sensor-to-output pipeline
  • Four views are needed -- boot, process, data flow, failure/recovery
  • Layering rules prevent cross-layer bugs -- each layer has one job

Block 2

Draw Your System + History


Exercise: Draw Your Level Display System (25 min)

Task: In your team, draw the complete architecture of your IMU level display system.

Include:

  • All hardware components (IMU, display, SoC, buses)
  • All software layers (driver, service, application)
  • Data flow with latency at each stage
  • Failure modes at each link

Use all four views: boot, process, data flow, failure/recovery.

Paper, whiteboard, or digital -- your choice.


Exercise Template

 +--------+       +-----------+       +------------+
 | BMI160 |--SPI--| spi driver|--/dev-| level_app  |
 | IMU    |       | (kernel)  |       | (Python)   |
 +--------+       +-----------+       +-----+------+
                                            |
                                     +------v-------+
                                     | SDL2 / DRM   |
                                     | framebuffer  |
                                     +------+-------+
                                            |
                                     +------v-------+
                                     | OLED / LCD   |
                                     | display (HW) |
                                     +--------------+

Add: latency per arrow, failure mode per component, boot order, restart policy.


Peer Review (10 min)

Swap diagrams with another team.

Review checklist:

  • [ ] All four views present (boot, process, data flow, failure)?
  • [ ] Latency annotated at each pipeline stage?
  • [ ] Failure mode defined for each component?
  • [ ] Service dependencies explicit?
  • [ ] Can you trace a single IMU sample from sensor to pixel?

Write two strengths and two gaps on the diagram. Return it.


How Did We Get Here?

Embedded Linux did not appear overnight. Understanding history explains why certain tools exist -- and which ones are disappearing.

 1969  Unix (Bell Labs, PDP-7)
   |
 1972  Rewritten in C --> first portable OS
   |
 1977  BSD adds TCP/IP, sockets --> the Internet
   |
 1983  GNU project (Stallman) --> free tools
   |
 1988  POSIX standard --> portable API
   |
 1991  Linux 0.01 (Torvalds) --> the kernel
   |
 1999  BusyBox + uClibc --> Linux on small ARM
   |
 2003  Buildroot --> reproducible embedded builds

From Hobby to Everywhere

 2006  Android announced --> Linux goes mobile
   |
 2010  Yocto Project --> industrial embedded Linux
   |
 2015  Device Tree mandatory for ARM --> no more board files
   |
 2017  100% of Top 500 supercomputers run Linux
   |
 2022  Rust support merged into mainline (Linux 6.1)
   |
 2024  PREEMPT_RT fully merged into mainline (Linux 6.12)
   |
 Today Linux runs on everything from $0.50 MCUs
       to Mars rovers

From a Finnish student's hobby to the OS that runs the world -- in 33 years.



Unix/Linux timeline: from Bell Labs PDP-7 (1969) through BSD, GNU, POSIX, and Linux (1991) to today's embedded and cloud deployments.


What's Obsolete

 +------------------+--------------+-------+------------------------------------------+
 | Old Technology   | Replaced By  | Year  | Why                                      |
 +------------------+--------------+-------+------------------------------------------+
 | devfs            | udev         | 2004  | Dynamic, rules-based device management   |
 | fbdev (/dev/fb0) | DRM/KMS      | ~2012 | Atomic page flips, VSync, multi-plane    |
 | sysvinit         | systemd      | ~2014 | Parallel boot, dependencies, watchdog    |
 | HAL daemon       | udev + sysfs | ~2010 | Simpler, kernel-integrated               |
 | Board files (C)  | Device Tree  | 2015  | HW description separated from code       |
 +------------------+--------------+-------+------------------------------------------+

If you find tutorials built on the "old" column, the approach may still work, but it is not recommended for new designs.


What's Emerging

 +----------------+------------------------------------------------------+--------------------------------------------+
 | Technology     | What It Is                                           | Why It Matters                             |
 +----------------+------------------------------------------------------+--------------------------------------------+
 | eBPF           | Attach programs to kernel events without recompiling | Production tracing, zero overhead when off |
 | Rust in kernel | Memory-safe driver language (since Linux 6.1)        | Fewer use-after-free, buffer overflow bugs |
 | Zephyr RTOS    | Modern RTOS for MCUs                                 | Covers what PREEMPT_RT cannot reach        |
 | RISC-V         | Open ISA, no license fees                            | Buildroot/Yocto support; gaining momentum  |
 +----------------+------------------------------------------------------+--------------------------------------------+

These are not speculative -- they are in mainline kernels and production systems today.


Regulatory Landscape

Embedded devices increasingly face mandatory regulations:

 +-------------------------+-----------------------------------+------------------------------------------+
 | Standard                | Scope                             | Key Requirement                          |
 +-------------------------+-----------------------------------+------------------------------------------+
 | IEC 62443               | Industrial cybersecurity          | Security lifecycle management with a     |
 |                         |                                   | zone/conduit model, covering the entire  |
 |                         |                                   | product life from design to decommission |
 +-------------------------+-----------------------------------+------------------------------------------+
 | ETSI EN 303 645         | Consumer IoT                      | No default passwords, mandatory security |
 |                         |                                   | updates for a minimum of 5 years,        |
 |                         |                                   | vulnerability disclosure process         |
 +-------------------------+-----------------------------------+------------------------------------------+
 | EU Cyber Resilience Act | All connected products sold       | Security by design, 5-year update        |
 |                         | in the EU                         | obligation, applies to all connected     |
 |                         |                                   | products starting 2027                   |
 +-------------------------+-----------------------------------+------------------------------------------+

These are not optional. The EU CRA affects what you can legally sell starting 2027.

Design for compliance from the start -- retrofitting security is expensive.


Connecting History to Today

 +--------------------------------+--------------------------------------------------+
 | Historical Decision            | Impact on Your Work Today                        |
 +--------------------------------+--------------------------------------------------+
 | Unix rewritten in C (1972)     | Linux is portable across ARM, x86, RISC-V        |
 | "Everything is a file" (1970s) | You read sensors via /dev/ and /sys/             |
 | POSIX standard (1988)          | Same API on embedded Linux and desktop Linux     |
 | GPL license (1991)             | Linux is free; you must share kernel changes     |
 | Device Tree (2015)             | Hardware described in data, not compiled in code |
 | PREEMPT_RT (2024)              | Real-time is no longer a separate patchset       |
 +--------------------------------+--------------------------------------------------+

History is not trivia -- it explains why things are the way they are.


Key Takeaways

  • Architecture is a system view, not a code file -- define components, communication, and failure modes before writing code

  • Data flow is the backbone -- every embedded product is a pipeline from sensor to output

  • Four views give full coverage -- boot, process, data flow, failure/recovery

  • Respect the layers -- kernel drivers expose raw data; business logic lives in user space

  • 30 years of history shaped today's tools -- learn from what was replaced and why

  • Regulations are becoming mandatory -- design for compliance from the start


Hands-On Next

Apply these architecture concepts in practice:

Capstone: Level Display System Build a complete sensor-to-display pipeline that exercises all four architecture views.

You will:

  • Document the boot sequence with timing budgets
  • Define systemd services with explicit dependencies
  • Trace data from IMU sensor to rendered pixels
  • Plan failure modes and recovery for each component

This is where architecture stops being theory and becomes engineering.