Lesson 9: Reliability, Updates & Security
Óbuda University -- Linux in Embedded Systems
"The power goes out during an update. Your device is on the public internet. What happens?"
Today's Map
- Block 1 (30 min): Storage & filesystems — storage types (SD, eMMC, NOR/NAND), why ext4 loses data on power cut, write endurance, embedded-friendly filesystems.
- Block 2 (45 min): Reliability — read-only rootfs, overlayfs, A/B partition updates, watchdogs (hardware + systemd), design rules.
- Block 3 (45 min): IoT networking and security — physical layers, MQTT, IoT architecture, attack surfaces, CVE case studies, hardening, OTA frameworks.
Storage & Filesystems
Why your filesystem choice determines whether the device survives a power cut
Embedded Storage Types
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ SD Card │ │ eMMC │ │ NOR Flash │ │ NAND Flash │
│ │ │ │ │ │ │ │
│ Removable │ │ Soldered │ │ Soldered │ │ Soldered │
│ FTL inside │ │ FTL inside │ │ No FTL │ │ No FTL │
│ Block dev │ │ Block dev │ │ MTD device │ │ MTD device │
│ /dev/mmcXX │ │ /dev/mmcXX │ │ /dev/mtdX │ │ /dev/mtdX │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
▲ ▲ ▲ ▲
Prototyping Production Bootloader, Large storage
Hobby boards Consumer HW small config Raw NAND + UBI
Pi, BeagleBone Phones, TVs Execute-in-place Industrial
Your Pi boots from an SD card — a removable flash device with a built-in Flash Translation Layer (FTL). Production devices typically use soldered eMMC or raw NAND.
How Flash Memory Works
Flash memory cannot overwrite data in place. It must erase a large block before writing new data.
HDD / RAM: Flash (NAND / NOR):
───────────────── ─────────────────────────────
Write anywhere, any time 1. Erase entire block (128 KB)
Overwrite single byte 2. Write pages (4 KB) into block
No wear from writes 3. Block wears out after N cycles
P/E cycle = one Program/Erase cycle. Each erase physically degrades the floating-gate transistors. After enough cycles, the cell can no longer reliably hold a charge.
| Flash type | Typical P/E endurance | Why |
|---|---|---|
| SLC (1 bit/cell) | 100,000 cycles | One voltage level, easy to distinguish |
| MLC (2 bits/cell) | 3,000-10,000 | Four levels, tighter margins |
| TLC (3 bits/cell) | 1,000-3,000 | Eight levels, most consumer SD cards |
Consumer SD cards are mostly TLC — cheap but the lowest endurance.
FTL: The Flash Translation Layer
Raw flash is hard to use: you can't overwrite, blocks wear out, bad blocks appear. The FTL is a firmware layer inside SD cards and eMMC that hides this complexity.
┌─────────────────────────────────────────┐
│ Linux sees: /dev/mmcblk0 │
│ "Just a block device, like a hard disk"│
└──────────────┬──────────────────────────┘
│ block read/write
▼
┌─────────────────────────────────────────┐
│ FTL firmware (inside the card) │
│ │
│ • Logical → physical block mapping │
│ • Wear leveling (spread writes evenly) │
│ • Bad block management │
│ • Garbage collection │
│ • Write buffering │
└──────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Raw NAND flash cells │
└─────────────────────────────────────────┘
NOR and raw NAND have no FTL — Linux must manage wear leveling itself (via UBI/UBIFS or JFFS2).
Storage Comparison
| Property | SD Card | eMMC | NOR Flash | NAND Flash |
|---|---|---|---|---|
| Capacity | 1-512 GB | 4-256 GB | 256 KB - 64 MB | 128 MB - 8 GB |
| Read speed | 25-100 MB/s | 100-400 MB/s | 50-100 MB/s | 20-100 MB/s |
| Write speed | 10-60 MB/s | 50-200 MB/s | 0.1-1 MB/s | 5-40 MB/s |
| Write endurance | 1K-10K cycles | 3K-100K cycles | 100K cycles | 1K-10K cycles |
| Boot from? | Yes (slow) | Yes (fast) | Yes (XIP) | Yes (with loader) |
| Removable? | Yes | No | No | No |
| Filesystem | ext4, F2FS | ext4, F2FS | JFFS2, raw | UBIFS + UBI |
| Typical cost | Low | Medium | High/byte | Low/byte |
Pi 4: boots from SD card or USB. Pi CM4: has eMMC option (faster, more reliable).
What Happens When Power Dies Mid-Write
Application: write(fd, data, 4096)
│
▼
┌─────────────────────────────────────────┐
│ Page cache (RAM) │
│ Data sits here -- NOT on disk yet │
│ Linux delays writes for performance │
└──────────────┬──────────────────────────┘
│ writeback (kernel decides when)
▼
┌─────────────────────────────────────────┐
│ Block device (SD / eMMC / flash) │
│ 4 KB block partially written │◄── POWER DIES HERE
└─────────────────────────────────────────┘
Result: file contains garbage. Metadata (directory entries, inode tables) may also be corrupted. Filesystem may not mount on next boot.
ext4: The Desktop Filesystem
ext4 uses a journal — a write-ahead log that records what it intends to do before doing it.
Normal write:
1. Write intent to journal ──► "will update inode 42 + block 1000"
2. Write actual data to disk
3. Mark journal entry complete
Power cut after step 1:
──► On boot, replay journal ──► metadata consistent
──► But DATA may be lost (only metadata is journaled by default)
| ext4 mode | Journals | Power-cut safety |
|---|---|---|
| journal (data=journal) | metadata + data | Safest, slowest, 2× write amplification |
| ordered (default) | metadata only | Metadata safe, data may be stale |
| writeback | metadata only | Fastest, data can be garbage after crash |
ext4 default mode protects the filesystem structure, NOT your data.
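The journal mode is selected at mount time. An illustrative /etc/fstab entry that enables full data journaling on a writable data partition (device, mount point, and options are an example, not a required layout):

```
/dev/mmcblk0p3  /data  ext4  defaults,data=journal,noatime  0  2
```

`data=journal` doubles write amplification, so reserve it for small partitions holding data you cannot afford to lose; `noatime` avoids a metadata write on every read.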
Why ext4 Is Problematic for Embedded
| Problem | Impact |
|---|---|
| Write amplification | Journal writes + data writes = 2-3× more wear |
| Large erase blocks | Flash erases 128-512 KB at a time, ext4 writes 4 KB |
| No wear leveling awareness | ext4 doesn't know about flash internals |
| Recovery time | Journal replay on large partition can take minutes |
| Not power-cut safe for data | Default mode loses data, only metadata survives |
ext4 works on SD cards and eMMC because the FTL inside the card handles wear leveling and bad block management. But it's still not ideal.
Write Endurance: The Hidden Limit
SD card rated 3,000 P/E cycles. System logs 1 MB/minute. Card has 32 GB.
Wear-leveled writes per day:
1 MB/min × 60 × 24 = 1,440 MB/day
FTL spreads writes across the card:
32,000 MB ÷ 1,440 MB/day = ~22 days per full pass
Card lifetime:
3,000 cycles × 22 days = ~180 years ✓ (if only data writes)
But: journal writes, temp files, swap, apt cache multiply actual writes 5-10×.
Realistic lifetime: 180 years ÷ 10 = ~18 years ✓ (still OK)
With swap on SD: Swap writes 10-100× more ──► ~2-18 months ✗
Rule: Never enable swap on flash. Use zram (compressed RAM) if needed.
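The endurance arithmetic above can be sketched in a few lines, assuming the FTL spreads writes perfectly evenly (numbers are the hypothetical ones from this example, not a datasheet):

```python
def sd_lifetime_years(capacity_mb, writes_mb_per_day, pe_cycles, amplification=1):
    """Estimate flash lifetime assuming perfect wear leveling.
    amplification models journal writes, temp files, apt cache, etc."""
    days_per_full_pass = capacity_mb / writes_mb_per_day   # one P/E cycle per cell
    total_days = pe_cycles * days_per_full_pass / amplification
    return total_days / 365

# 32 GB card, 3,000 P/E cycles, 1 MB/min of logging (1,440 MB/day)
base = sd_lifetime_years(32_000, 1_440, 3_000)                      # ~180 years
real = sd_lifetime_years(32_000, 1_440, 3_000, amplification=10)    # ~18 years
swap = sd_lifetime_years(32_000, 1_440 * 100, 3_000, amplification=10)  # months
print(f"{base:.0f} y / {real:.0f} y / {swap * 12:.1f} months")
```

Plug in your own logging rate and card size before trusting an SD card in the field.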
Embedded-Friendly Filesystems
| Filesystem | Type | Best for | Key property |
|---|---|---|---|
| SquashFS | Read-only, compressed | Root filesystem images | 50-70% smaller, no writes = no corruption |
| EROFS | Read-only, compressed | Newer alternative to SquashFS | Lower CPU overhead, better random read |
| F2FS | Read-write, log-structured | eMMC/SD data partitions | Designed for flash, lower write amplification |
| JFFS2 | Read-write, log-structured | Raw NOR flash (<64 MB) | Wear leveling built-in, mounts slow |
| UBIFS | Read-write, journaling | Raw NAND flash | Works on top of UBI volume manager |
| tmpfs | RAM-backed | /tmp, overlayfs upper | Zero flash wear, lost on reboot |
The safest filesystem is the one that never writes. That leads us to the most important embedded reliability technique...
Read-Only Root Filesystem
Now that you know why writes corrupt — eliminate them
The Core Idea
You just saw: writes can corrupt files, metadata, or the entire filesystem on power loss. Even ext4's journal only saves the structure, not your data.
The simplest solution: don't write to the root filesystem at all.
If the root filesystem is never modified during operation, it cannot be corrupted by a power cut.
- No dirty write buffers to lose
- Every boot is identical — known-good state
- Rollback is trivial — the image never changes
Consumer routers, set-top boxes, and ATMs all use this pattern.
But applications need to write... /tmp, /var/log, /etc all expect writes. Solution: Overlayfs.
How Overlayfs Works
┌─────────────────────────────────────────┐
│ Merged View ( / ) │
│ Applications see a normal writable │
│ filesystem -- reads and writes work │
└──────────┬──────────────┬───────────────┘
│ │
┌──────────▼──────┐ ┌────▼───────────────┐
│ Upper Layer │ │ Lower Layer │
│ (read-write) │ │ (read-only) │
│ │ │ │
│ tmpfs in RAM │ │ SD card rootfs │
│ or separate │ │ base image │
│ partition │ │ never modified │
└─────────────────┘ └────────────────────┘
Overlayfs Read and Write Paths
READ: WRITE:
───────────────────────── ─────────────────────────
App reads /etc/hostname App writes /etc/hostname
│ │
▼ ▼
Check upper layer Copy-up from lower to
── found? ──► return it upper (if needed)
│ no │
▼ ▼
Check lower layer Write lands in upper
── found? ──► return it layer ONLY
│ no │
▼ Lower layer is NEVER
File not found modified
Upper layer shadows lower. If a file exists in both, the upper version wins.
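The lookup and copy-up rules above can be modeled as a toy dictionary overlay (this is a teaching sketch of the semantics, not the kernel implementation):

```python
class ToyOverlay:
    """Toy model of overlayfs semantics: upper shadows lower, writes never
    touch the lower layer."""
    def __init__(self, lower):
        self.lower = lower    # read-only base image (never modified)
        self.upper = {}       # writable layer (tmpfs in RAM)

    def read(self, path):
        if path in self.upper:        # check upper first -- it wins
            return self.upper[path]
        if path in self.lower:        # fall through to the base image
            return self.lower[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.upper[path] = data       # writes land in the upper layer ONLY

lower = {"/etc/hostname": "factory-default"}
fs = ToyOverlay(lower)
fs.write("/etc/hostname", "sensor-42")
print(fs.read("/etc/hostname"))    # sensor-42  (upper shadows lower)
print(lower["/etc/hostname"])      # factory-default  (lower untouched)
```

Discarding `fs.upper` is exactly what a reboot does when the upper layer is tmpfs: the base image reappears pristine.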
Reboot = Clean Slate + Persistent Data
If the upper layer is RAM-backed (tmpfs), power cycle discards all changes. System boots with pristine lower layer -- known-good state, every time.
But sensor logs, calibration, and config must survive. Solution: separate data partition.
┌────────────────────────────────────────┐
│ Partition Layout │
├────────────────────────────────────────┤
│ /dev/mmcblk0p1 │ boot │ FAT32 │
│ /dev/mmcblk0p2 │ rootfs │ ext4 RO │
│ /dev/mmcblk0p3 │ data │ ext4 RW │
└────────────────────────────────────────┘
Corrupted data partition does NOT prevent boot. Use fsync() after every critical write.
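For those critical writes, the standard pattern is write-to-temp, fsync, then atomic rename — a sketch (the file path is illustrative):

```python
import os

def durable_write(path, data: bytes):
    """Crash-safe file update: a power cut leaves either the old file or the
    new file on disk -- never a torn, half-written one."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # force data out of the page cache to flash
    os.replace(tmp, path)         # atomic rename on POSIX filesystems
    # fsync the directory so the rename itself survives a power cut
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

durable_write("config.json", b'{"interval_s": 60}')   # illustrative payload
```

Note the trade-off: every `fsync()` costs flash wear, so reserve this pattern for data that genuinely must survive (calibration, config), not high-rate telemetry.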
A/B Partition Updates
Atomic, verifiable, recoverable
The Update Problem
Deploying firmware to a remote device is dangerous. If an update is interrupted -- power cut, network drop, corrupted download -- the device must NOT end up unbootable.
Three principles of safe updates:
| Principle | Meaning |
|---|---|
| Atomic | Fully succeeds or fully fails -- no half-updated state |
| Verifiable | Check cryptographic signature + checksum before switching |
| Recoverable | If new image fails, automatically fall back to previous |
The most common strategy: A/B partition layout with automatic rollback.
A/B Partition Layout
┌─────────────────────────────────────────────┐
│ Storage Layout │
├──────────────────┬──────────────────────────┤
│ boot partition │ Bootloader + flags │
├──────────────────┼──────────────────────────┤
│ Partition A │ rootfs v2.1 [ACTIVE] │
├──────────────────┼──────────────────────────┤
│ Partition B │ rootfs v2.0 [INACTIVE] │
├──────────────────┼──────────────────────────┤
│ data partition │ Persistent user data │
└──────────────────┴──────────────────────────┘
Cost: doubles the rootfs storage requirement. Benefit: the device is never bricked by a bad update.
A/B Update Flow
1. Device checks server ──► "v2.2 available"
│
2. Download image to INACTIVE partition (B)
(active partition A is untouched)
│
3. Verify checksum + signature
── fail? ──► discard, keep running A
│
4. Set boot flag: "try B next boot"
│
5. Reboot into B
│
6. B boots OK? ──► Mark B as GOOD
B fails? ──► Watchdog fires
──► Reboot into A (rollback)
Key: never modify the running partition.
What "Atomic" Really Means
WRONG (in-place update): RIGHT (A/B update):
┌──────────────┐ ┌──────────────┐
│ rootfs v2.1 │ ◄── overwrite │ rootfs v2.1 │ untouched
│ │ in progress │ [ACTIVE] │
│ CORRUPTED │ ├──────────────┤
│ if power │ │ rootfs v2.2 │ ◄── write
│ cuts here │ │ [INACTIVE] │ here
└──────────────┘ └──────────────┘
Device bricked. Power cut? A still works.
In-place updates are never safe for field-deployed devices.
The Boot Flag State Machine
┌─────────┐ update written ┌───────────┐
│ A good │ ─────────────────► │ try B │
│ B idle │ │ A fallback│
└─────────┘ └─────┬─────┘
▲ │
│ watchdog timeout │ B boots OK
│ (B failed) ▼
│ ┌───────────┐
└──────────────────────── │ B good │
rollback │ A idle │
└───────────┘
The bootloader reads these flags on every boot to decide which partition to load.
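The flag logic can be sketched as a few lines of pseudologic in Python (names are illustrative; real systems persist these flags in the U-Boot environment or eMMC boot registers, and frameworks like RAUC manage them for you):

```python
def select_partition(flags):
    """Bootloader-side decision: which rootfs to load on this boot?"""
    tries_left = flags["boot_count"] < flags.get("max_tries", 3)
    if flags.get("try") and tries_left:
        flags["boot_count"] += 1   # persisted, so a crash loop is counted
        return flags["try"]        # tentatively boot the new image
    return flags["good"]           # rollback: boot the known-good image

flags = {"good": "A", "try": "B", "boot_count": 0}
assert select_partition(flags) == "B"   # attempt 1 on the new image
assert select_partition(flags) == "B"   # attempt 2 (previous boot hung)
assert select_partition(flags) == "B"   # attempt 3
assert select_partition(flags) == "A"   # B never marked itself good -> rollback
```

If the new image boots and the application runs, userspace clears `try`, sets `good = "B"`, and resets the counter — that is the "Mark B as GOOD" step.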
Watchdogs
The last line of defense
Why Watchdogs Exist
Even with a reliable filesystem and robust updates, software can still hang: memory leak over days, deadlock from a rare race condition, driver stuck in infinite retry loop.
On a remote device, nobody is watching. Watchdogs recover automatically.
┌──────────────────────────────────────────────┐
│ Hardware Watchdog Timer │
│ │
│ ┌──────┐ kick ┌──────┐ kick ┌──────┐ │
│ │ 15s │ ──────► │ 15s │ ──────► │ 15s │ │
│ │ │ reset │ │ reset │ │ │
│ └──────┘ └──────┘ └──────┘ │
│ │
│ If no kick arrives: 0s ──► HARD REBOOT │
└──────────────────────────────────────────────┘
Two Levels of Defense
┌─────────────────────────────────────────────┐
│ Level 1: systemd software watchdog │
│ │
│ Monitors individual services │
│ Restarts hung service within seconds │
│ "My app crashed" ──► restart just the app │
├─────────────────────────────────────────────┤
│ Level 2: Hardware watchdog (SoC timer) │
│ │
│ Monitors the entire system │
│ Forces hard reboot if systemd itself hangs │
│ "Everything is frozen" ──► full reboot │
└─────────────────────────────────────────────┘
Level 1 catches app failures. Level 2 catches kernel/init failures. Together: defense in depth.
Hardware Watchdog in Practice
On Raspberry Pi:
# Enable the hardware watchdog kernel module
sudo modprobe bcm2835_wdt
# Opening the device starts the timer!
cat /dev/watchdog
# System reboots in ~15 seconds if nothing kicks
Warning: opening /dev/watchdog starts the countdown. If your kicking process crashes, the hardware will reboot the system. This is a feature, not a bug.
Try It Now: Watchdog Status (5 min)
Check if the hardware watchdog is available and inspect its configuration:
# Is the watchdog device present?
ls -la /dev/watchdog*
# Check watchdog state (if module is loaded)
cat /sys/class/watchdog/watchdog0/state 2>/dev/null
# See the timeout value
cat /sys/class/watchdog/watchdog0/timeout 2>/dev/null
# Check if systemd is configured to use it
systemctl show -p WatchdogDevice -p RuntimeWatchdogUSec
Is the watchdog currently active? What is the timeout? Do NOT open /dev/watchdog unless you are prepared for a reboot.
Tutorial: Data Logger Appliance — Watchdogs section
systemd Watchdog Integration
[Service]
Type=notify
ExecStart=/usr/bin/my-sensor-app
WatchdogSec=30
Restart=on-watchdog
RestartSec=5
| Setting | Meaning |
|---|---|
| WatchdogSec=30 | App must call sd_notify("WATCHDOG=1") every 30s |
| Restart=on-watchdog | If notification stops, restart the service |
| RestartSec=5 | Wait 5s before restarting |
Application side — the app must notify systemd periodically:
import sdnotify                 # third-party: pip install sdnotify

n = sdnotify.SystemdNotifier()
n.notify("READY=1")             # Type=notify: tell systemd startup is complete
while running:
    do_work()
    n.notify("WATCHDOG=1")      # "I'm alive" — must arrive within WatchdogSec
systemd handles service-level recovery; hardware watchdog handles system-level recovery.
Combined Watchdog Architecture
┌──────────────────────────────────┐
│ my-sensor-app │
│ calls sd_notify("WATCHDOG=1") │
│ every 30 seconds │
└──────────┬───────────────────────┘
│ notify
▼
┌──────────────────────────────────┐
│ systemd │
│ monitors service health │
│ kicks /dev/watchdog │
└──────────┬───────────────────────┘
│ kick
▼
┌──────────────────────────────────┐
│ Hardware watchdog (bcm2835_wdt) │
│ 15s timeout │
│ no kick ──► hard reboot │
└──────────────────────────────────┘
App crash --> systemd restarts app. systemd crash --> hardware reboots system.
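systemd can drive the hardware watchdog itself, closing the loop in the diagram above. A sketch of the relevant `/etc/systemd/system.conf` settings (the 15-second value is illustrative and must not exceed your hardware timer's maximum):

```ini
[Manager]
# systemd opens /dev/watchdog and kicks it; if systemd itself hangs,
# the SoC timer forces a hard reboot
RuntimeWatchdogSec=15
```

Set this on the target, run `systemctl daemon-reexec`, and verify with `systemctl show -p RuntimeWatchdogUSec` as in the exercise above.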
Reliability Design Rules
Lessons from thousands of field deployments
Rules 1 & 2: Storage and Separation
Rule 1: Treat storage as failure-prone. SD cards wear out. eMMC has write limits. Design assuming writes can fail at any point.
Rule 2: Separate mutable data from system image.
┌──────────────────────────────────────────┐
│ System image (read-only) │
│ ── kernel, libs, binaries, config ── │
│ Cannot be corrupted during operation │
├──────────────────────────────────────────┤
│ Data partition (read-write) │
│ ── sensor logs, user config, state ── │
│ May corrupt, but won't prevent boot │
└──────────────────────────────────────────┘
Use fsync() for critical writes. Use checksums for integrity verification.
Rules 3 & 4: Rollback and Observability
Rule 3: Design rollback before first release.
✗ Ship v1.0 ──► field brick ──► add rollback to v1.1
(too late -- v1.1 can't reach bricked devices)
✓ Build rollback into v1.0 ──► safe updates forever
Rule 4: Make failures observable.
| Metric | What It Reveals |
|---|---|
| Uptime | Unexpected reboots (watchdog fired?) |
| Last sensor read | Sensor cable disconnected? |
| Watchdog kick count | System under stress? |
| Partition boot count | Rollback happened? |
A device that fails silently is worse than one that fails loudly.
Testing Reliability (Not Just Function)
| Functional Test | Reliability Test |
|---|---|
| "Does it read the sensor?" | "Does it boot after power cut?" |
| "Does the update install?" | "Does it rollback if update fails?" |
| "Does the service start?" | "Does watchdog restart it if hung?" |
| Passes once = OK | Must pass 5 consecutive times |
Power-cut test: run system for 60s, pull power cable (no shutdown), reconnect. Verify: system boots, app starts, data before last fsync() is present. Repeat 5 times.
Putting It Together: Storage Strategy
┌──────────────────────────────────────────────────────┐
│ Prototyping (Pi + SD card) │
│ │
│ SD: /boot (FAT32) + / (ext4 RO) + /data (F2FS) │
│ RAM: tmpfs for /tmp, overlayfs upper │
└──────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────┐
│ Production (custom board + eMMC) │
│ │
│ eMMC: /boot + rootfs-A (SquashFS) + rootfs-B │
│ + /data (F2FS) │
│ RAM: tmpfs for /tmp and logs │
│ Optional NOR: bootloader + recovery image │
└──────────────────────────────────────────────────────┘
RO root eliminates corruption. F2FS reduces wear. SquashFS compresses 50-70%. Never swap on flash.
Block 1 & 2 Summary
| Mechanism | Protects Against |
|---|---|
| Read-only rootfs | Filesystem corruption from power loss |
| Overlayfs | Need for writes without touching base image |
| A/B partitions | Bricked device from failed update |
| Hardware watchdog | Frozen system with no operator |
| systemd watchdog | Hung application service |
| SquashFS / EROFS | Compressed, corruption-proof root image |
| F2FS | Flash wear from journaling overhead |
| No swap on flash | Premature storage death |
Reliability is designed, not assumed. Storage choice is part of that design.
Failure Mode Exercise
Apply what you just learned to three real scenarios
Exercise Overview
Format: Teams of 3-4 students, 30 minutes total
Each team designs a recovery strategy for three failure scenarios.
Deliverable per scenario:
1. What state is the device in after the failure?
2. What mechanism detects the problem?
3. What recovery action happens automatically?
4. What data is lost (acceptable losses)?
Prepare a 3-minute presentation of your strategies.
Scenario 1: Power Loss During Update
┌──────────────────────────────────────────┐
│ SITUATION │
│ │
│ Remote weather station, solar powered. │
│ Cloud passes over panel during OTA │
│ update. Power dies at 73% of image │
│ write to flash. │
│ │
│ QUESTIONS │
│ ── What partition layout prevents brick? │
│ ── What happens on next boot? │
│ ── How does the device retry the update? │
│ ── What if this happens 3 times? │
└──────────────────────────────────────────┘
Scenario 2: SD Card Corruption
┌──────────────────────────────────────────┐
│ SITUATION │
│ │
│ Industrial data logger running 24/7. │
│ After 18 months of continuous logging, │
│ the SD card develops bad sectors. │
│ Boot fails -- kernel panic on rootfs │
│ mount. │
│ │
│ QUESTIONS │
│ ── How could the system have detected │
│ degradation before total failure? │
│ ── What storage strategy survives this? │
│ ── How do you get the device running │
│ again without physical access? │
│ ── What changes for eMMC vs SD card? │
└──────────────────────────────────────────┘
Scenario 3: Runaway Process
┌──────────────────────────────────────────┐
│ SITUATION │
│ │
│ Smart irrigation controller in a field. │
│ After a rare sensor glitch, the main │
│ application enters an infinite loop. │
│ CPU usage hits 100%. The MQTT client │
│ stops reporting. Valves are stuck open. │
│ │
│ QUESTIONS │
│ ── What detects the hung application? │
│ ── What stops the valves from flooding? │
│ ── How do you distinguish "busy" from │
│ "hung" programmatically? │
│ ── What is the safe default valve state? │
└──────────────────────────────────────────┘
Discussion Framework
For each scenario, work through this checklist:
┌──────────────────────────────────────────┐
│ 1. DETECT │
│ What mechanism notices the failure? │
│ (watchdog / health check / monitor) │
├──────────────────────────────────────────┤
│ 2. CONTAIN │
│ How do you prevent cascading damage? │
│ (safe defaults / kill switch / fence) │
├──────────────────────────────────────────┤
│ 3. RECOVER │
│ What action restores operation? │
│ (restart / rollback / reboot) │
├──────────────────────────────────────────┤
│ 4. REPORT │
│ How does the team learn it happened? │
│ (log / alert / metric / LED) │
└──────────────────────────────────────────┘
Presentation & Discussion (15 min)
Each team presents their three recovery strategies (3 minutes per team).
Class evaluates:
- Did the strategy cover detection, containment, recovery, and reporting?
- Is there a failure mode the team missed?
- What is the worst-case data loss?
- Would you trust this device in a field deployment?
There is no single correct answer -- but there are answers that ignore failure modes.
Key Takeaways
- Reliability is designed, not assumed -- if you did not engineer it, it will not survive the field
- Updates are an engineering discipline -- atomic, verifiable, recoverable, or do not ship
- Watchdogs are a standard embedded safety tool -- not optional, not a nice-to-have
- Test failure recovery, not just function -- pull the power cable, kill the process, corrupt the image
- Every write is a risk -- minimize writes, separate mutable from immutable, always fsync()
Block 3
Embedded Networking, Protocols, and IoT Architecture
Embedded Networking -- The Starting Point
Linux provides the full network stack out of the box:
- Same socket API as desktop Linux
- Managed by systemd-networkd or NetworkManager
- Same ip, ss, tcpdump tools you already know
The challenge is not "can Linux do networking?"
The challenge is choosing the right physical layer and protocol for your constraints -- power, bandwidth, range, and cost.
Physical Layer Comparison
| Technology | Range | Bandwidth | Power |
|---|---|---|---|
| Ethernet | 100 m (cable) | 100 Mbps+ | Medium |
| WiFi (2.4/5 GHz) | ~50 m indoor | 50--300 Mbps | High |
| BLE 5.0 | ~100 m | 2 Mbps | Very low |
| LoRa | 2--15 km | 0.3--50 kbps | Very low |
| Cat-M1 / NB-IoT | Cellular | 100 kbps--1 Mbps | Low--Med |
Physical Layer -- Use Cases
| Technology | Best For |
|---|---|
| Ethernet | Factory, fixed install, high throughput |
| WiFi | Consumer devices, flexible placement |
| BLE 5.0 | Wearables, beacons, short-range sensors |
| LoRa | Agriculture, remote monitoring |
| Cat-M1 / NB-IoT | Wide-area, licensed spectrum |
Design rule: Choose the lowest-power, lowest-bandwidth technology that meets your requirements. WiFi is convenient but drains batteries.
The Networking Stack
How data moves from your application to the wire:
The OSI/TCP-IP networking stack: application protocols (MQTT, HTTP) at the top, transport (TCP/UDP) and network (IP) in the middle, physical layer at the bottom.
Transport Layer: TCP vs UDP
Application protocols ride on top of transport protocols that handle delivery mechanics.
TCP vs UDP: TCP guarantees ordered delivery with retransmission; UDP trades reliability for lower latency and overhead.
TCP -- Connection-Oriented
Client Server
│ │
│──── SYN ─────────────────────▶│
│◀─── SYN+ACK ─────────────────│
│──── ACK ─────────────────────▶│
│ │
│ (connection established) │
│ │
│──── Data ────────────────────▶│
│◀─── ACK ─────────────────────│
│ │
- 3-way handshake before any data flows
- Guaranteed delivery -- retransmits lost packets
- Ordered -- data arrives in sequence
- Higher overhead -- connection state, ACKs, retransmit timers
UDP -- Connectionless
Client Server
│ │
│──── Data ────────────────────▶│
│──── Data ────────────────────▶│
│──── Data ────────────────────▶│
│ │
│ (no handshake, no ACKs, │
│ no guarantee of delivery) │
│ │
- No connection setup -- just send
- Best-effort -- packets may be lost or reordered
- Lower overhead -- no connection state, no retransmits
- Ideal for constrained devices on lossy networks
QUIC -- The Modern Middle Ground
QUIC: a modern transport protocol built on UDP, adding reliability, multiplexed streams, and built-in TLS 1.3 encryption.
- Built on top of UDP but adds reliability
- Multiplexed streams -- no head-of-line blocking
- Built-in TLS 1.3 -- encrypted by default
- Emerging for edge-to-cloud (HTTP/3)
Transport Layer Summary
| Protocol | Connection | Reliability | Overhead | Used By |
|---|---|---|---|---|
| TCP | Yes (handshake) | Guaranteed | Higher | HTTP, MQTT, SSH |
| UDP | No | Best-effort | Lower | CoAP, DNS, NTP |
| QUIC | Yes (over UDP) | Guaranteed | Medium | HTTP/3, cloud APIs |
When you pick MQTT, you get TCP. When you pick CoAP, you get UDP. The application protocol makes this choice for you.
Application Protocols for IoT
| Protocol | Model | Overhead | Best For |
|---|---|---|---|
| MQTT | Pub/sub, broker | Low (~15–30 B typical) | IoT telemetry, events |
| CoAP | REST-like, UDP | Very low | Constrained devices |
| HTTP/REST | Req/response, TCP | High (~500+ B) | Cloud APIs, dashboards |
| gRPC | RPC, HTTP/2, Protobuf | Medium (binary) | Service-to-service |
For most embedded sensor applications: MQTT. For very constrained MCUs needing REST semantics: CoAP.
MQTT in 30 Seconds
Sensor ──publish──▶ Broker (mosquitto) ──deliver──▶ Dashboard
topic: "factory/line3/temperature"
payload: {"temp_c": 23.5, "ts": 1700000000}
Quality of Service levels:
| QoS | Name | Guarantee | Speed |
|---|---|---|---|
| 0 | Fire and forget | May lose messages | Fastest |
| 1 | At least once | Guaranteed, possible dupes | Balanced |
| 2 | Exactly once | Strongest guarantee | Slowest |
Pragmatic default: QoS 1 for sensor telemetry.
MQTT -- Why It Wins for IoT
- Tiny overhead — 2–5 byte fixed header; typical total overhead ~15–30 bytes including topic (vs ~500 bytes for HTTP headers)
- Pub/sub model -- decouple producers from consumers
- Retained messages -- new subscribers get last known value
- Last Will and Testament -- broker notifies when device disconnects
- Widespread support -- mosquitto, HiveMQ, AWS IoT Core, Azure IoT Hub
# Python example (paho-mqtt)
import paho.mqtt.client as mqtt
client = mqtt.Client()
client.connect("broker.local", 1883)
client.publish("sensors/temp", '{"c": 23.5}', qos=1)
IoT Architecture: Edge, Gateway, Cloud
┌─────────────────┐
│ Edge Devices │ BLE / LoRa / Wired
│ ┌─────┐ │ │
│ │ S1 │ S2 SN │─────────┤
│ └─────┘ │ │
└─────────────────┘ ▼
┌───────────────┐ WiFi / 4G
│ Gateway │──────────┐
│ Aggregation │ │
│ Local Rules │ ▼
└───────────────┘ ┌──────────────┐
│ Cloud │
│ ┌──────────┐ │
│ │ Broker │ │
│ │ DB │ │
│ │ Dashboard│ │
│ └──────────┘ │
└──────────────┘
This pattern scales from 1 to 100,000 devices.
Why Three Tiers?
| Tier | Role | Example |
|---|---|---|
| Edge | Sense, actuate, preprocess | Temperature sensor, motor controller |
| Gateway | Aggregate, filter, protocol translate | RPi running MQTT broker + rules |
| Cloud | Store, analyze, visualize, alert | AWS IoT Core + Grafana dashboard |
Why not connect sensors directly to the cloud?
- Many sensors lack WiFi/cellular (BLE, LoRa only)
- Direct cloud connections per sensor don't scale (10,000 TLS handshakes)
- Gateway provides local intelligence when cloud is unreachable
Edge Computing -- Process Locally, Send Summaries
A temperature sensor sending every second:
- 86,400 messages/day per sensor
- × 10,000 sensors = 864 million messages/day

An edge gateway sending min/max/average per minute:
- 1,440 messages/day per sensor
- 60× reduction in bandwidth and cloud cost
Edge (raw): 23.5, 23.6, 23.5, 23.7, 23.4, ... (1 Hz)
│
[gateway]
│
Cloud (summary): {"min":23.4, "max":23.7, "avg":23.54} (1/min)
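The gateway-side aggregation step is a few lines of code — a minimal sketch using the sample values from the diagram:

```python
def summarize(samples):
    """Collapse one window of raw 1 Hz readings into a single summary message
    for the cloud -- the 60x bandwidth reduction happens here."""
    return {"min": min(samples),
            "max": max(samples),
            "avg": round(sum(samples) / len(samples), 2)}

minute = [23.5, 23.6, 23.5, 23.7, 23.4]   # would be 60 samples in practice
print(summarize(minute))   # {'min': 23.4, 'max': 23.7, 'avg': 23.54}
```

In a real gateway this summary would be published once per minute over MQTT (QoS 1) instead of forwarding every raw sample.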
Edge Computing -- What Stays Local?
| Process Locally | Send to Cloud |
|---|---|
| Threshold alarms (> 80C? act NOW) | Aggregated time-series data |
| Filtering and denoising | Anomaly reports |
| Protocol translation (BLE to MQTT) | Trend analysis inputs |
| Local display updates | Long-term storage |
| Safety-critical decisions | Fleet-wide dashboards |
Rule of thumb: If it needs sub-second response, do it at the edge. If it needs historical context, send it to the cloud.
Mini Exercise: Protocol Selection
Your level-display device streams measurement data to a remote dashboard. It is connected via Ethernet in a factory.
Discuss (2 minutes):
- Which protocol would you choose? Why?
- What QoS level? Why?
- How often should you send data? Trade-off?
Networking Summary
- Physical layer -- choose lowest-power, lowest-bandwidth that works
- Transport -- TCP for reliability (MQTT, HTTP), UDP for low overhead (CoAP)
- Application protocol -- MQTT for most IoT, CoAP for constrained devices
- Architecture -- Edge, Gateway, Cloud -- three tiers for scalability
- Edge computing -- process locally, send summaries (10--100x cost reduction)
Block 3
Security, OTA Updates, and Containers
"It Works on My Bench"
Your data logger works perfectly:
- Reads sensor correctly
- Displays on OLED
- Logs to SD card

But in production, it also:
- Has an IP address on the public internet
- Runs 24/7 unattended
- Contains credentials for your cloud service
- Can be physically accessed by anyone in the building
Security is not an add-on. It is a design constraint.
Three Attack Surfaces
Every embedded device has exactly three attack surfaces:
| Surface | Examples | Difficulty |
|---|---|---|
| Physical | JTAG, UART console, SD card | Requires physical access |
| Network | Open ports, unencrypted traffic | Remote, scalable |
| Software | CVEs, unsigned firmware, defaults | Remote, automatable |
Physical Network Software
┌───────┐ ┌────────┐ ┌──────────┐
│ JTAG │ │ Open │ │ Default │
│ UART │ │ ports │ │ passwords│
│ SD │ │ Plain │ │ Old libs │
│ USB │ │ text │ │ No signing│
└───────┘ └────────┘ └──────────┘
Threat Modeling -- The Most Important Security Practice
Before writing any security code, think systematically about what can go wrong:
- List all interfaces -- network ports, debug headers, USB, wireless
- For each: who can access it? What can they do?
- Rank by likelihood x impact
- Mitigate the top risks first
A device with an open UART console giving root access is not "physically secure" -- it takes 30 seconds and a $3 USB-to-serial adapter to compromise.
CVE Case Study: Mirai Botnet (2016)
What happened:
- Malware scanned the internet for IoT devices (cameras, routers)
- Tried default credentials (admin/admin, root/root)
- Infected ~600,000 devices
- Launched DDoS attacks reaching 1.2 Tbps
Attack vector: default Telnet credentials (admin/admin). Telnet transmits passwords in plaintext and was open to the internet.
Fix: disable Telnet, change default credentials, use SSH keys.
Root cause: Default credentials never changed. No requirement for unique passwords at first boot.
Lesson: The most devastating IoT attack in history exploited the most basic mistake.
CVE Case Study: Heartbleed (2014)
CVE-2014-0160 -- OpenSSL TLS Heartbeat Extension
Client: "Hey, echo back 'hello' (5 bytes)"
Server: "hello" ← correct
Attacker: "Hey, echo back 'hi' (64000 bytes)"
Server: "hi" + 63998 bytes of server memory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Private keys, session tokens, passwords
Root cause: Missing bounds check on the memcpy() length parameter. The fix was two lines of C code — but the vulnerability affected every OpenSSL installation worldwide.
Fix: Update OpenSSL (or use mbedTLS for embedded).
CVE Case Study: Unsigned Firmware
Generic pattern seen in routers, PLCs, medical devices:
Legitimate flow:
Server ──[signed firmware v2.1]──▶ Device ──[verify + flash]──▶ OK
Attack flow:
Attacker ──[malicious binary]──▶ Device ──[no verification]──▶ Owned
Root cause: No cryptographic signature verification on firmware images.
Fix: Sign all firmware with Ed25519/RSA. Verify before flashing.
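A verify-before-flash sketch, using stdlib HMAC-SHA256 as a stand-in for a real asymmetric signature -- production systems use Ed25519/RSA so the device holds only a public key, never the signing key. Key and function names are illustrative.

```python
import hmac, hashlib

SIGNING_KEY = b"factory-signing-key"  # illustrative; never ship a shared secret

def sign(image: bytes) -> bytes:
    # stand-in for an Ed25519/RSA signature over the firmware image
    return hmac.new(SIGNING_KEY, image, hashlib.sha256).digest()

def flash_if_valid(image: bytes, signature: bytes) -> bool:
    # constant-time comparison avoids timing side channels
    if not hmac.compare_digest(sign(image), signature):
        return False  # reject before touching the standby partition
    # ...write image to the standby partition here...
    return True

fw = b"firmware v2.1"
print(flash_if_valid(fw, sign(fw)))            # True: legitimate update
print(flash_if_valid(b"malicious", sign(fw)))  # False: tampered image rejected
```

The key property: the decision to flash happens only after verification, so an unsigned or tampered image never reaches storage.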
Every one of these attacks exploited a known, preventable weakness.
The Hardening Checklist
| # | Rule | How |
|---|---|---|
| 1 | Minimal rootfs | Buildroot -- include only what you need |
| 2 | Read-only rootfs | Mount root as read-only + overlayfs |
| 3 | No default passwords | Unique credential at first boot |
| 4 | Signed firmware | Cryptographic signature on all images |
The Hardening Checklist (cont.)
| # | Rule | How |
|---|---|---|
| 5 | TLS everywhere | No plaintext MQTT, HTTP, or telnet |
| 6 | Disable debug interfaces | Remove JTAG, UART in production |
| 7 | Regular CVE scanning | Monitor NVD feeds for your libs |
| 8 | Least privilege | Run services as non-root; use capabilities |
Mirai = violated #3. Heartbleed = violated #7. Unsigned firmware = violated #4.
Basic hygiene prevents the majority of real-world attacks.
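Rule #3 (the one Mirai exploited) costs a few lines at first boot. A minimal sketch -- the provisioning flow (label, QR code, forced change on first login) is up to the product:

```python
# Checklist rule #3: generate a unique credential at first boot instead of
# shipping a factory default. Uses the CSPRNG-backed `secrets` module.
import secrets, string

def first_boot_password(length: int = 16) -> str:
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

pw = first_boot_password()
print(pw)  # unique per device -- print on the label or show via QR code
```

Never use `random` for this: it is seeded predictably and not cryptographically secure.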
TLS on Embedded: What Goes Wrong
TLS is mandatory (#5 on the checklist), but embedded devices make it hard:
| Challenge | Why It Happens | Consequence |
|---|---|---|
| Clock drift | No battery-backed RTC, NTP unavailable at boot | Certificate validation fails (cert appears expired or not-yet-valid) |
| RAM pressure | TLS handshake needs ~200 KB RAM | On 256 KB MCUs, TLS may not fit alongside the application |
| CA bundle management | Root certificates expire every 5--20 years | Devices in the field for 10+ years need CA bundle updates |
| Certificate rotation | Server certs change every 90 days (Let's Encrypt) | Pinned certs break; must pin CA, not leaf cert |
Practical rule: Use mbedTLS (not OpenSSL) on constrained devices. Sync time before TLS. Pin the CA certificate, not the server certificate. Test certificate expiry handling — it will happen.
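The clock-drift failure is easy to see: certificate validation compares the device clock against the cert's notBefore/notAfter window. A sketch with made-up dates, assuming a device that boots with its clock at the 1970 epoch:

```python
# Why clock drift breaks TLS: a cert is only valid inside its time window,
# and a device with no RTC boots at the epoch -- every cert looks invalid.
from datetime import datetime, timezone

def cert_time_valid(now, not_before, not_after):
    return not_before <= now <= not_after

nb = datetime(2024, 1, 1, tzinfo=timezone.utc)   # notBefore
na = datetime(2025, 1, 1, tzinfo=timezone.utc)   # notAfter

epoch  = datetime(1970, 1, 1, tzinfo=timezone.utc)  # no RTC, no NTP yet
synced = datetime(2024, 6, 1, tzinfo=timezone.utc)  # after NTP sync

print(cert_time_valid(epoch, nb, na))    # False -- handshake fails
print(cert_time_valid(synced, nb, na))   # True  -- sync time before TLS
```

This is why "sync time before TLS" is a boot-ordering requirement, not an optimization.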
OTA Updates -- Essential but Dangerous
OTA (Over-The-Air) updates are essential for fixing vulnerabilities.
But the update mechanism itself is an attack vector.
| | Easy Updates | No Updates |
|---|---|---|
| Pro | Patch vulnerabilities fast | No attack surface from updater |
| Con | Updater can be exploited | Unpatched CVEs forever |
You must have OTA. You must secure it.
OTA Frameworks
| Framework | Approach | A/B Partition | Signing |
|---|---|---|---|
| SWUpdate | Image or package | Yes | RSA/CMS |
| RAUC | Image-based | Yes | X.509 |
| Mender | Image or package | Yes | RSA |
| Manual dd | Raw image | Manual | None |
All serious frameworks support rollback on failed update.
Never use manual dd in production.
A/B Update Flow
Device Server
│ │
│── Check for update (auth) ────────▶│
│◀── Signed image v2.1 available ───│
│ │
│── Download to partition B │
│── Verify signature │
│── Set bootloader: try B │
│── Reboot into B │
│ │
│── Health check... │
│ │
├── [PASS] ── Confirm B as active │
│ Report success ───────▶│
│ │
└── [FAIL] ── Reboot, fallback to A │
Report failure ───────▶│
A/B Partitions -- Why?
┌──────────────────────────────────────────┐
│ Boot │ Rootfs A │ Rootfs B │ Data│
│ loader │ (active) │ (standby) │ │
└──────────────────────────────────────────┘
│ ▲
│ update │
└──────────────┘
- Never modify the running partition -- write to standby
- Atomic switch -- bootloader changes one pointer
- Automatic rollback -- if health check fails, revert
- Zero downtime -- device is always running a known-good image
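The decision logic shared by the updater and bootloader fits in a few lines. A minimal sketch -- partition names and the boolean health check are illustrative; real frameworks (RAUC, SWUpdate) track this state in bootloader environment variables:

```python
# A/B update decision logic: write to standby, try it once, confirm only
# after a passing health check; otherwise fall back to the known-good slot.
def apply_update(state, download_ok, signature_ok, health_ok):
    """state = {'active': 'A' or 'B'}; returns the partition the device ends on."""
    standby = "B" if state["active"] == "A" else "A"
    if not (download_ok and signature_ok):
        return state["active"]        # never touch the running partition
    # bootloader marks standby as "try once" and reboots into it
    if health_ok:
        state["active"] = standby     # confirm: one atomic pointer flip
    return state["active"]            # failed health check: fall back unchanged

print(apply_update({"active": "A"}, True, True, True))   # B
```

Note that the active slot changes in exactly one place -- that single assignment is the "atomic switch" from the list above.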
Virtualization vs Containers
As devices grow complex -- multiple apps, web UIs, different dependencies -- isolation matters.
| | Virtual Machine | Container |
|---|---|---|
| Isolation | Strong (own kernel) | Medium (shared kernel) |
| Overhead | High (full OS) | Low (MB) |
| Startup | Seconds to minutes | Milliseconds to seconds |
| Own kernel | Yes | No |
| Embedded use | Mixed-criticality | Multi-service gateway |
Containers in Embedded
Containers use Linux kernel features for isolation:
┌─────────────────────────────────────┐
│ Host Linux Kernel │
├───────────┬───────────┬─────────────┤
│ Container │ Container │ Container │
│ App A │ App B │ MQTT Broker│
│ (Python) │ (Node.js) │ (mosquitto) │
└───────────┴───────────┴─────────────┘
Namespaces: PID, mount, network, user
cgroups: CPU, memory, I/O limits
- Docker/Podman for development
- balena, snap, OCI runtimes for embedded deployment
When to Use What
| Scenario | Recommendation |
|---|---|
| Single-app appliance (level display) | No containers -- keep it simple |
| Multi-service gateway (logger + web + MQTT) | Containers -- isolate and update independently |
| Mixed-criticality (RTOS control + Linux HMI) | VM / hypervisor -- strongest isolation |
Containers require ~64 MB+ RAM and a full kernel with namespace/cgroup support.
On very constrained devices, use systemd sandboxing instead:
DynamicUser=, ProtectSystem=, MemoryMax=
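A minimal unit-file sketch using those directives -- the service name and binary path are illustrative:

```ini
# /etc/systemd/system/logger.service -- sandboxed without containers
[Service]
ExecStart=/usr/bin/datalogger
DynamicUser=yes          # ephemeral, unprivileged UID allocated at start
ProtectSystem=strict     # filesystem read-only for this service
ProtectHome=yes          # /home invisible
PrivateTmp=yes           # private /tmp namespace
MemoryMax=64M            # cgroup memory cap
NoNewPrivileges=yes      # no setuid escalation
```

This gives most of a container's isolation (namespaces + cgroups) at near-zero overhead.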
Threat Model Exercise -- Your Data Logger
Consider the level display project from the tutorials.
Step 1: List attack vectors (2 min)
- Physical: UART console, SD card, USB port, case access
- Network: SSH, MQTT port, web UI, unencrypted traffic
- Software: default Pi password, old kernel, unsigned updates
Step 2: Rate each -- Likelihood x Impact
Step 3: Propose mitigations for top 3 risks
Threat Model -- Example Analysis
| Vector | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Default SSH password | High | High | Key-only SSH, disable password auth |
| Unencrypted MQTT | High | Medium | TLS on port 8883 |
| Open UART console | Medium | High | Disable in production build |
| Old OpenSSL | Medium | High | CVE scanning, auto-rebuild |
| SD card theft | Low | High | Encrypted rootfs, signed boot |
Easiest win from the hardening checklist: #3 -- no default passwords.
Design Challenge
You are deploying 500 temperature sensors in a factory.
- Draw the network architecture (edge, gateway, cloud)
- Choose the physical layer and protocol -- justify
- List the top 3 security measures you would implement
- How would you handle OTA updates? Which framework?
(Group discussion -- 10 minutes)
Quick Checks
- Name two advantages of MQTT over HTTP for IoT telemetry.
- What is the difference between MQTT QoS 0 and QoS 1?
- What made the Mirai botnet possible? What is the fix?
- Why is a read-only rootfs a security measure?
- What is the risk of OTA updates without signature verification?
Key Takeaways
- Right protocol for your constraints -- MQTT for most IoT, CoAP for constrained
- Edge computing reduces bandwidth and cloud costs by 10--100x
- Three attack surfaces -- physical, network, software -- model all three
- Basic hygiene prevents most real attacks (Mirai, Heartbleed)
- Harden systematically -- minimal rootfs, read-only root, signed firmware, TLS
- OTA is security-critical -- use a framework with signing and rollback
- Containers for multi-service; VMs for mixed-criticality; nothing for simple appliances
Hands-On Next
Put this theory into practice:
Security Audit and Hardening -- audit ports, harden SSH, set up a basic firewall on your Raspberry Pi.
Data Logger Appliance -- build a reliable, updatable data logger with MQTT telemetry and OTA support.
Your bench prototype becomes a production-ready appliance.