IoT, Networking, and Security
Goal: Understand how embedded devices communicate, and how to protect them from real-world threats.
Your level display works perfectly on the bench. Now ship 10,000 units. They need remote monitoring, OTA updates, and must survive hostile networks. One compromised device can become a botnet node attacking others. How do you connect them safely?
1. Embedded Networking
Embedded Linux provides a full network stack out of the box: the same socket API as desktop Linux, with interfaces managed by systemd-networkd or NetworkManager.
Physical layer comparison:
| Technology | Range | Bandwidth | Power | Use Case |
|---|---|---|---|---|
| Ethernet | 100 m (cable) | 100 Mbps+ | Medium | Factory, fixed install |
| WiFi (2.4/5 GHz) | ~50 m indoor | 50–300 Mbps | High | Consumer, flexible placement |
| BLE 5.0 | ~100 m | 2 Mbps | Very low | Wearables, beacons, short-range sensors |
| LoRa | 2–15 km | 0.3–50 kbps | Very low | Agriculture, remote monitoring |
| Cat-M1 / NB-IoT | Cellular coverage | 100 kbps–1 Mbps | Low–Medium | Wide-area, licensed spectrum |
Design rule: Choose the lowest-power, lowest-bandwidth technology that meets your requirements. WiFi is convenient but drains batteries.
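As a rough sketch of this design rule, the comparison table can be encoded as data and filtered by requirement. The numeric limits are the optimistic upper bounds from the table (cellular range is treated as unlimited, which is an assumption), and the power ranking and the `pick_link` helper are illustrative names, not part of any library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    max_range_m: float    # optimistic upper bound taken from the table
    max_rate_kbps: float  # optimistic upper bound taken from the table
    power_rank: int       # 0 = very low ... 3 = high (illustrative ordering)


LINKS = [
    Link("Ethernet", 100, 100_000, 2),
    Link("WiFi", 50, 300_000, 3),
    Link("BLE 5.0", 100, 2_000, 0),
    Link("LoRa", 15_000, 50, 0),
    Link("Cat-M1 / NB-IoT", math.inf, 1_000, 1),  # assumption: coverage everywhere
]


def pick_link(range_m, rate_kbps):
    """Return the lowest-power link that meets both requirements, or None."""
    candidates = [
        link for link in LINKS
        if link.max_range_m >= range_m and link.max_rate_kbps >= rate_kbps
    ]
    # Tie-break on bandwidth so the least capable adequate link wins.
    candidates.sort(key=lambda link: (link.power_rank, link.max_rate_kbps))
    return candidates[0] if candidates else None


print(pick_link(5_000, 1).name)   # remote sensor, tiny payloads
print(pick_link(30, 500).name)    # short range, moderate data rate
```

A real selection would also weigh cost, regulatory limits, and the fact that LoRa's maximum range and maximum data rate cannot be reached at the same time.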
Information-Theoretic Limits
The bandwidth column above represents raw physical-layer capacity. The theoretical maximum data rate for any channel is given by the Shannon-Hartley theorem:
\[C = B \log_2(1 + \text{SNR})\]
where \(C\) is the channel capacity (bits/s), \(B\) is the bandwidth (Hz), and SNR is the signal-to-noise ratio (linear, not dB).
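Example (LoRa): With \(B = 125\text{ kHz}\) and SNR \(= -10\text{ dB}\) (= 0.1 linear):
\[C = 125\,000 \cdot \log_2(1 + 0.1) \approx 125\,000 \cdot 0.1375 \approx 17.2\text{ kbit/s}\]
Practical LoRa data rates (0.3–50 kbit/s depending on spreading factor and bandwidth) stay below this bound for a given bandwidth and SNR; the upper end of that range needs a wider channel or a much better SNR. LoRa's tolerance for very low SNR (it works below 0 dB) explains its long range, but Shannon's law shows the price: very low capacity. This is why LoRa payloads are typically kept to tens of bytes.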
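The LoRa figure can be checked numerically. The sketch below assumes only the Shannon-Hartley formula and the usual dB-to-linear conversion; the function name `shannon_capacity` is illustrative.

```python
import math


def shannon_capacity(bandwidth_hz, snr_db):
    """Upper bound on the data rate in bits/s for a given bandwidth and SNR."""
    snr_linear = 10 ** (snr_db / 10)  # convert dB to a linear power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)


capacity = shannon_capacity(125e3, -10)
print(f"{capacity / 1e3:.1f} kbit/s")  # 17.2 kbit/s
```

Raising the SNR to 0 dB at the same bandwidth would lift the bound to 125 kbit/s, which shows how strongly range (low SNR) trades against capacity.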
Error detection in protocols: Most network protocols use checksums to detect transmission errors. CRC-32 (used in Ethernet frames and in many storage and archive formats) uses the polynomial \(x^{32} + x^{26} + \cdots + 1\) and detects all burst errors up to 32 bits in length, which gives a very high detection probability for the packet sizes typical of IoT applications. TCP and UDP use a weaker 16-bit ones'-complement checksum instead.
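A short illustration of CRC-based error detection, using Python's standard `zlib.crc32`, which implements the same CRC-32 polynomial as Ethernet. The frame contents and the `is_intact` helper are made up for the example.

```python
import zlib

frame = b"factory/line3/temperature:23.5"
crc = zlib.crc32(frame)  # sender appends this 32-bit value to the frame


def is_intact(data, expected_crc):
    """Receiver side: recompute the CRC and compare it with the transmitted one."""
    return zlib.crc32(data) == expected_crc


# Simulate a single bit flipped in transit
corrupted = bytearray(frame)
corrupted[5] ^= 0x01

print(is_intact(frame, crc))             # True
print(is_intact(bytes(corrupted), crc))  # False
```

A CRC detects accidental corruption only. It offers no protection against deliberate tampering, because an attacker can simply recompute it.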
Info
Shannon capacity and CRC theory are covered in depth in information theory courses. The key takeaway for embedded engineers: channel capacity is a hard physical limit that no protocol optimization can exceed.
2. Protocols
2.1 Transport Layer: TCP, UDP, and QUIC
Application protocols (MQTT, HTTP, CoAP) don't send data directly over the network — they ride on top of transport protocols that handle delivery mechanics:
| Protocol | Connection | Reliability | Overhead | Used By |
|---|---|---|---|---|
| TCP | Connection-oriented (3-way handshake) | Guaranteed delivery, ordered | Higher | HTTP, MQTT, SSH, gRPC |
| UDP | Connectionless | Best-effort, no ordering guarantee | Lower | CoAP, DNS, video streaming, NTP |
| QUIC | Connection-oriented (over UDP) | Guaranteed, multiplexed streams | Medium | HTTP/3, modern cloud APIs |
Embedded relevance: CoAP uses UDP because the lower overhead suits constrained devices on lossy networks. MQTT uses TCP because reliable delivery matters for telemetry — you don't want to lose sensor readings. QUIC is emerging for edge-to-cloud communication where HTTP/3 is used.
Choosing TCP vs UDP is choosing reliability vs latency — but in practice, the application protocol makes this choice for you. When you pick MQTT, you get TCP. When you pick CoAP, you get UDP.
2.2 Application Protocols
| Protocol | Model | Overhead | Best For |
|---|---|---|---|
| MQTT | Pub/sub, broker-mediated | Low (2-byte minimum fixed header) | IoT telemetry, sensor data, event streams |
| CoAP | REST-like, UDP-based | Very low | Constrained devices, request/response over lossy networks |
| HTTP/REST | Request/response, TCP | High (~500+ bytes headers) | Cloud APIs, web dashboards, non-constrained devices |
| gRPC | RPC, HTTP/2, Protobuf | Medium (binary) | Service-to-service, typed APIs, streaming |
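To make the MQTT overhead figure concrete, the sketch below hand-encodes a QoS 0 PUBLISH packet following the MQTT 3.1.1 wire format (MQTT 5 adds a properties field, which is ignored here). The helper names are illustrative; a real client library does this for you.

```python
def encode_remaining_length(n):
    """MQTT variable-length integer: 7 data bits per byte, MSB marks continuation."""
    out = bytearray()
    while True:
        byte, n = n % 128, n // 128
        if n > 0:
            byte |= 0x80
        out.append(byte)
        if n == 0:
            return bytes(out)


def mqtt_publish_qos0(topic, payload):
    """Build a PUBLISH packet with QoS 0, no retain, no duplicate flag."""
    topic_bytes = topic.encode("utf-8")
    body = len(topic_bytes).to_bytes(2, "big") + topic_bytes + payload
    # 0x30 = packet type 3 (PUBLISH) in the high nibble, all flags cleared
    return bytes([0x30]) + encode_remaining_length(len(body)) + body


payload = b'{"temp_c": 23.5}'
packet = mqtt_publish_qos0("factory/line3/temperature", payload)
print(f"payload {len(payload)} B, packet {len(packet)} B")
```

The fixed header is two bytes for bodies under 128 bytes; the rest of the overhead is the topic name, which is why short topic names matter on constrained links.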
MQTT in 30 seconds:
```
Sensor ──publish──▶ Broker (mosquitto) ──deliver──▶ Dashboard
         topic:   "factory/line3/temperature"
         payload: {"temp_c": 23.5, "ts": 1700000000}
```
- QoS 0: Fire and forget (fastest, may lose messages)
- QoS 1: At least once (guaranteed delivery, possible duplicates)
- QoS 2: Exactly once (slowest, strongest guarantee)
MQTT in Code — Python Example
A minimal publisher that sends sensor data, and a subscriber that receives it:
Publisher (runs on the embedded device):
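```python
import json
import time

import paho.mqtt.client as mqtt

# paho-mqtt 2.x: pass the callback API version explicitly (1.x used mqtt.Client())
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.tls_set()  # enable TLS with the system CA store; never send plaintext
client.username_pw_set("sensor01", "secretpass")  # placeholder; do not hard-code real credentials
client.connect("broker.example.com", 8883)
client.loop_start()  # background network thread: handles ACKs and keepalive

while True:
    payload = json.dumps({"temp_c": 23.5, "ts": int(time.time())})
    client.publish("factory/line3/temperature", payload, qos=1)
    time.sleep(10)
```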
Subscriber (runs on the dashboard server):
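```python
import paho.mqtt.client as mqtt


def on_connect(client, userdata, flags, reason_code, properties):
    # Subscribe here so the subscription is restored after a reconnect
    client.subscribe("factory/#", qos=1)  # wildcard: all factory topics


def on_message(client, userdata, msg):
    print(f"{msg.topic}: {msg.payload.decode()}")


# paho-mqtt 2.x: pass the callback API version explicitly (1.x used mqtt.Client())
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.tls_set()
client.username_pw_set("dashboard", "secretpass")  # placeholder credentials
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.com", 8883)
client.loop_forever()
```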
Install with pip install paho-mqtt. Use a local broker for testing: sudo apt install mosquitto && mosquitto -v.
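The `factory/#` wildcard in the subscriber follows MQTT's topic filter rules: `+` matches exactly one level and `#` matches all remaining levels. A simplified matcher, written from scratch for illustration (it ignores the special handling of topics that start with `$`), might look like this:

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard: matches everything from here on
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    return len(filter_levels) == len(topic_levels)


print(topic_matches("factory/#", "factory/line3/temperature"))                    # True
print(topic_matches("factory/+/temperature", "factory/line3/temperature"))        # True
print(topic_matches("factory/+/temperature", "factory/line3/zone1/temperature"))  # False
```

Narrow filters are also a security measure: a dashboard that only needs temperatures should not subscribe to `#`.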
Note
For most embedded sensor applications, MQTT with QoS 1 is the pragmatic default. CoAP is better when you need REST semantics on a constrained MCU.
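Because QoS 1 means "at least once", a consumer may see the same reading twice and should handle that gracefully. One possible approach, sketched below, is to drop repeats based on a per-device sequence number. The `seq` field and the `Deduplicator` class are hypothetical; MQTT itself does not provide an application-level sequence number.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers recently seen (device_id, seq) pairs in a bounded cache."""

    def __init__(self, max_entries=1024):
        self._seen = OrderedDict()
        self._max = max_entries

    def is_new(self, device_id, seq):
        key = (device_id, seq)
        if key in self._seen:
            return False  # duplicate delivery: caller should skip it
        self._seen[key] = True
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # evict the oldest entry
        return True


dedup = Deduplicator()
for device_id, seq in [("sensor01", 1), ("sensor01", 2), ("sensor01", 2), ("sensor01", 3)]:
    if dedup.is_new(device_id, seq):
        print("store", device_id, seq)
```

The bounded cache keeps memory use predictable on a gateway; the trade-off is that a duplicate arriving after its entry was evicted would be stored again.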
3. IoT Architecture
The Edge → Gateway → Cloud pattern scales from 1 to 100,000 devices:
```mermaid
graph LR
    subgraph "Edge Devices"
        S1[Sensor Node 1]
        S2[Sensor Node 2]
        S3[Sensor Node N]
    end
    subgraph "Gateway"
        GW[Edge Gateway<br/>Aggregation + Local Rules]
    end
    subgraph "Cloud"
        BROKER[MQTT Broker]
        DB[(Time-Series DB)]
        DASH[Dashboard]
    end
    S1 -->|BLE/LoRa| GW
    S2 -->|BLE/LoRa| GW
    S3 -->|BLE/LoRa| GW
    GW -->|WiFi/4G + MQTT| BROKER
    BROKER --> DB
    DB --> DASH
```
Edge computing: Process data locally, send summaries. A temperature sensor that sends a reading every second generates 86,400 messages/day. An edge gateway that sends only min/max/average per minute generates 1,440: a 60× reduction in message count, with a corresponding drop in bandwidth and cloud cost.
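The per-minute aggregation can be sketched in a few lines. The readings here are simulated, and the function and field names (`summarize_per_minute`, `minute_start`) are illustrative rather than taken from any gateway framework.

```python
from statistics import mean


def summarize_per_minute(readings):
    """Collapse (unix_ts, value) readings into one min/max/avg summary per minute."""
    buckets = {}
    for ts, value in readings:
        buckets.setdefault(ts // 60, []).append(value)
    return [
        {"minute_start": minute * 60, "min": min(vals), "max": max(vals), "avg": mean(vals)}
        for minute, vals in sorted(buckets.items())
    ]


# One simulated day of 1 Hz readings
readings = [(ts, 20.0 + (ts % 60) * 0.1) for ts in range(86_400)]
summaries = summarize_per_minute(readings)
print(len(readings), "->", len(summaries))  # 86400 -> 1440
```

Each summary carries three values instead of one, so the byte-level saving is somewhat smaller than the 60× drop in message count.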
Mini Exercise: Protocol Selection
Your level display streams measurement data to a remote dashboard. The display is connected via Ethernet in a factory.
- Which protocol would you choose? Why?
- What QoS level? Why?
- How often should you send data? What's the trade-off?
4. Attack Surfaces
Security in embedded systems starts with understanding where an attacker can interact with your device. Unlike a server locked in a data center, an embedded device is often physically accessible — sitting in a factory, mounted on a wall, or deployed in a public space. This means you must defend against not only network-based attacks (which are common for all connected devices) but also physical attacks (opening the case, connecting to debug ports) and software-level exploits (vulnerable libraries, unsigned firmware). Thinking systematically about these surfaces — before writing any security code — is called threat modeling, and it is the most important security practice in embedded engineering.
Every embedded device has three attack surfaces:
| Surface | Examples | Difficulty |
|---|---|---|
| Physical | JTAG debug port, UART console, SD card swap | Requires physical access |
| Network | Open ports, unencrypted traffic, weak authentication | Remote, scalable |
| Software | CVEs in libraries, unsigned firmware, default credentials | Remote, automatable |
Threat modeling basics:
- List all interfaces (network ports, debug headers, USB, wireless)
- For each: who can access it? What can they do?
- Rank by likelihood × impact
- Mitigate the top risks first
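The ranking step can be done on paper, but a small script keeps it repeatable as the list grows. The threats and their ratings below are illustrative examples, not an assessment of any particular device.

```python
LEVEL = {"Low": 1, "Medium": 2, "High": 3}

# (threat, likelihood, impact): example ratings only
threats = [
    ("Exposed JTAG header", "Low", "Medium"),
    ("Default SSH password", "High", "High"),
    ("SD card swap", "Low", "High"),
    ("Unencrypted MQTT traffic", "High", "Medium"),
    ("Open UART root console", "Medium", "High"),
]


def risk_score(threat):
    _, likelihood, impact = threat
    return LEVEL[likelihood] * LEVEL[impact]


def rank(threat_list):
    """Sort threats by likelihood x impact, highest risk first."""
    return sorted(threat_list, key=risk_score, reverse=True)


for name, likelihood, impact in rank(threats):
    print(f"{risk_score((name, likelihood, impact))}  {name}")
```

The scores are coarse by design; their purpose is to force an explicit ordering so that mitigation effort goes to the top of the list first.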
Warning
A device with an open UART console giving root access is not "physically secure" — it takes 30 seconds and a $3 USB-to-serial adapter to compromise.
5. CVE Case Studies
Mirai Botnet (2016)
What happened: Malware scanned the internet for IoT devices (cameras, routers) with default credentials (admin/admin, root/root). Infected ~600,000 devices. Launched DDoS attacks reaching 1.2 Tbps.
Root cause: Default credentials never changed, telnet open to the internet.
One-line fix: Require unique passwords at first boot; disable telnet.
Heartbleed (2014, CVE-2014-0160)
What happened: OpenSSL bug allowed reading up to 64 KB of server memory per request — including private keys, session tokens, passwords.
Root cause: Missing bounds check on TLS heartbeat extension. Two lines of C code.
One-line fix: Update OpenSSL (or use a minimal TLS library like mbedTLS for embedded).
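The missing check is easy to state in code. The sketch below models the heartbeat message layout from RFC 6520 (1-byte type, 2-byte claimed payload length, payload, at least 16 bytes of padding) and shows the validation that the vulnerable code lacked. It is an illustration of the logic in Python, not OpenSSL's actual code; in C, skipping this check let `memcpy` read past the received record into adjacent heap memory.

```python
import struct

HEADER_LEN = 3    # 1-byte message type + 2-byte claimed payload length
MIN_PADDING = 16  # RFC 6520 requires at least 16 bytes of padding


def heartbeat_response_payload(record):
    """Return the payload to echo back, or None if the message must be discarded."""
    if len(record) < HEADER_LEN + MIN_PADDING:
        return None  # too short to be a valid heartbeat message
    _msg_type, claimed_len = struct.unpack_from("!BH", record, 0)
    # The bounds check: the claimed length must fit inside what was actually received
    if HEADER_LEN + claimed_len + MIN_PADDING > len(record):
        return None  # discard silently, as RFC 6520 requires
    return record[HEADER_LEN:HEADER_LEN + claimed_len]


honest = b"\x01" + (5).to_bytes(2, "big") + b"hello" + bytes(16)
malicious = b"\x01" + (0xFFFF).to_bytes(2, "big") + b"x" + bytes(16)
print(heartbeat_response_payload(honest))     # b'hello'
print(heartbeat_response_payload(malicious))  # None
```

The general lesson for embedded C code is to never trust a length field supplied by the peer without comparing it to the size of the buffer actually received.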
Unsigned Firmware Updates
What happened (generic pattern): Attacker intercepts firmware update, replaces binary, device installs malicious code. Seen in routers, industrial PLCs, medical devices.
Root cause: No cryptographic signature verification on firmware images.
One-line fix: Sign all firmware with Ed25519/RSA; verify signature before flashing.
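A minimal sketch of sign-then-verify with Ed25519, assuming the third-party `cryptography` package is installed (`pip install cryptography`). The key is generated in-process only to keep the example self-contained; in a real deployment the private key stays on the build server and only the public key is stored on the device. The image bytes and the `firmware_is_authentic` helper are placeholders.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def firmware_is_authentic(public_key, image, signature):
    """Device side: accept the image only if the signature verifies."""
    try:
        public_key.verify(signature, image)
        return True
    except InvalidSignature:
        return False


# Build server side (demo key; a real private key never leaves the signing host)
signing_key = Ed25519PrivateKey.generate()
image = b"placeholder firmware image v2.1"
signature = signing_key.sign(image)

# Device side: only the public key is provisioned
device_key = signing_key.public_key()
tampered = image + b"\x00"

print(firmware_is_authentic(device_key, image, signature))     # True
print(firmware_is_authentic(device_key, tampered, signature))  # False
```

Verification must happen before anything is written to the active partition, and the public key itself must be stored where an attacker cannot replace it.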
Note
Every one of these attacks exploited a known, preventable weakness. The lesson: follow the hardening checklist below.
6. Hardening Checklist
The following checklist distills the most impactful security measures for embedded Linux devices. None of these are exotic or expensive — they are basic hygiene that prevents the vast majority of real-world attacks. The CVE case studies in the previous section show that Mirai, Heartbleed, and unsigned firmware attacks all exploited the absence of these basic measures. Apply them systematically, starting from item 1, and your device will be more secure than the majority of deployed IoT products.
| # | Rule | How |
|---|---|---|
| 1 | Minimal rootfs | Use Buildroot — include only what you need. Fewer packages = fewer CVEs. |
| 2 | Read-only rootfs | Mount root as read-only + overlayfs for runtime data. See Reliability and Updates Section 1 for mechanism. |
| 3 | No default passwords | Force unique credential at first boot or use key-only SSH. |
| 4 | Signed firmware | Cryptographically sign all update images. Verify before flashing. |
| 5 | TLS everywhere | Encrypt all network traffic. No plaintext MQTT, HTTP, or telnet. |
| 6 | Disable debug interfaces | Remove or disable JTAG, UART console in production builds. |
| 7 | Regular CVE scanning | Monitor NVD/CVE feeds for your kernel version and libraries. |
| 8 | Principle of least privilege | Run services as non-root. Use Linux capabilities instead of full root. |
Info
Secure boot chain (verified boot from ROM → bootloader → kernel → rootfs) is covered in Boot Architectures Section 4. We won't repeat it here.
7. OTA Updates vs Security
OTA (Over-The-Air) updates are essential for fixing vulnerabilities — but the update mechanism itself is an attack vector.
The tension: Easy updates = potential attack surface. No updates = unpatched vulnerabilities.
Solutions:
| Framework | Approach | A/B Partition | Rollback | Signing | Notes |
|---|---|---|---|---|---|
| SWUpdate | Image or package | Yes | Yes | RSA/CMS | Popular in industrial Linux |
| RAUC | Image-based | Yes | Yes | X.509 | German engineering focus, strict |
| Mender | Image or package | Yes | Yes | RSA | SaaS or self-hosted, nice dashboard |
| Manual dd | Raw image | Manual | Manual | None | Don't do this in production |
A/B update flow:
```mermaid
sequenceDiagram
    participant Device
    participant Server
    Device->>Server: Check for update (authenticated)
    Server->>Device: Signed image available (v2.1)
    Device->>Device: Download to partition B
    Device->>Device: Verify signature
    Device->>Device: Set bootloader to try B
    Device->>Device: Reboot into B
    Device->>Device: Health check passes?
    alt Healthy
        Device->>Device: Confirm B as active
        Device->>Server: Report success
    else Unhealthy
        Device->>Device: Reboot → fallback to A
        Device->>Server: Report failure
    end
```
See Reliability and Updates Section 2 for the A/B partition mechanism in detail.
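The decision logic of the A/B flow can be modelled as a small state machine. This is a toy simulation under simplifying assumptions: the `Device` class, the `ab_update` function, and the injected `verify_signature` and `health_check` callables are all hypothetical, and real frameworks implement the "try once, then fall back" step in the bootloader rather than in application code.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    active: str = "A"
    slots: dict = field(default_factory=lambda: {"A": "v2.0", "B": None})


def ab_update(device, new_version, verify_signature, health_check):
    """Run one A/B update attempt and return the status reported to the server."""
    standby = "B" if device.active == "A" else "A"
    device.slots[standby] = new_version   # download into the inactive slot
    if not verify_signature(new_version):
        device.slots[standby] = None      # discard the unverified image
        return "rejected: bad signature"
    previous = device.active
    device.active = standby               # bootloader tries the new slot once
    if health_check(device):
        return "success"                  # new slot is confirmed as active
    device.active = previous              # reboot falls back to the old slot
    return "failed: rolled back"


device = Device()
print(ab_update(device, "v2.1", lambda v: True, lambda d: True), device.active)
```

The key property is that the running slot is never overwritten, so every failure path ends with the device booting a known-good image.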
Mini Exercise: Threat Model Your Level Display
Consider the level display project from the tutorials:
- List 5 attack vectors (think: physical, network, software)
- For each, rate likelihood (Low/Medium/High) and impact (Low/Medium/High)
- Propose a mitigation for the top 3 risks
- Which item from the hardening checklist is the easiest win?
8. Virtualization and Containers in Embedded
As embedded devices grow more complex — running multiple applications, serving web UIs alongside data collection, managing different dependency versions — the same isolation problems from IT infrastructure appear on embedded hardware.
8.1 Virtual Machines
A hypervisor runs multiple complete operating systems on a single hardware platform:
- Type 1 (bare-metal): KVM, Xen — runs directly on hardware, high performance
- Type 2 (hosted): VirtualBox, QEMU — runs on top of a host OS, easier to set up
Each VM has its own kernel, its own root filesystem, and full isolation. The overhead is significant: hundreds of MB per VM, seconds to boot. In embedded, VMs are used for mixed-criticality systems — running an RTOS for safety-critical control alongside Linux for the HMI on the same hardware.
8.2 Containers
Containers share the host kernel but isolate processes using Linux kernel features:
- Namespaces provide isolation: PID (separate process tree), mount (separate filesystem view), network (own IP address), user (UID mapping)
- cgroups enforce resource limits: CPU time, memory, I/O bandwidth
Containers are much lighter than VMs: they start in milliseconds, add only MB of overhead, and share the host kernel. Docker/Podman are used in development; balena, snap, or custom OCI runtimes are common for embedded deployment.
Comparison
| | Virtual Machine | Container | Native (no isolation) |
|---|---|---|---|
| Isolation | Strong (separate kernel) | Medium (kernel features) | None |
| Overhead | High (full OS per VM) | Low (shared kernel) | None |
| Startup time | Seconds to minutes | Milliseconds to seconds | Instant |
| Own kernel | Yes | No (shares host) | N/A |
| Resource control | Hypervisor-managed | cgroups | Manual |
| Embedded use case | Mixed-criticality (RTOS + Linux) | Edge gateways, multi-app appliances | Simple single-app devices |
When to Use in Embedded
- Single-application appliance (level display, data logger): no containers needed — keep it simple
- Multi-service gateway (data logger + web UI + MQTT broker): containers help isolate and update services independently
- Mixed-criticality (safety-critical control + Linux HMI): hypervisor/VM separation provides the strongest isolation
Tip
Containers require a relatively capable system (~64 MB+ RAM, full Linux kernel with namespace/cgroup support). On very constrained embedded devices, native processes with systemd sandboxing (DynamicUser=, ProtectSystem=, MemoryMax=) are the pragmatic alternative.
Quick Checks
- Name two advantages of MQTT over HTTP for IoT telemetry.
- What is the difference between MQTT QoS 0 and QoS 1?
- What made the Mirai botnet possible? What's the fix?
- Why is a read-only rootfs a security measure (not just reliability)?
- What's the risk of OTA updates without signature verification?
Mini Exercise
Design Challenge
You are deploying 500 temperature sensors in a factory.
- Draw the network architecture (edge, gateway, cloud)
- Choose the physical layer and protocol. Justify.
- List the top 3 security measures you would implement
- How would you handle OTA updates? Which framework would you pick?
Key Takeaways
- Choose the right protocol for your constraints: MQTT for most IoT, CoAP for very constrained devices
- Edge computing reduces bandwidth and cloud costs by 10–100×
- Every embedded device has physical, network, and software attack surfaces
- Real CVEs (Mirai, Heartbleed) show that basic hygiene prevents most attacks
- Harden systematically: minimal rootfs, read-only root, signed firmware, TLS, no defaults
- OTA updates are security-critical — use a proper framework with signing and rollback
- Regulatory requirements are becoming mandatory; design for compliance from the start
Related Tutorials
For hands-on practice, see: Network Security | Data Logger
Regulatory Landscape
Embedded devices increasingly face regulatory requirements:
| Standard | Scope | Key Requirement |
|---|---|---|
| IEC 62443 | Industrial cybersecurity | Security lifecycle, zone/conduit model |
| ETSI EN 303 645 | Consumer IoT | No default passwords, secure updates, vulnerability disclosure |
| EU Cyber Resilience Act | All connected products (EU) | Security by design, vulnerability handling, 5-year update commitment |
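The Cyber Resilience Act in particular is not optional: its main obligations apply from December 2027 and determine what you can legally sell in the EU. IEC 62443 and ETSI EN 303 645 are standards rather than laws, though contracts and certification schemes often reference them.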