IoT, Networking, and Security

Goal: Understand how embedded devices communicate, and how to protect them from real-world threats.

Your level display works perfectly on the bench. Now ship 10,000 units. They need remote monitoring, OTA updates, and must survive hostile networks. One compromised device can become a botnet node attacking others. How do you connect them safely?

1. Embedded Networking

Linux provides the full network stack out of the box — same socket API as desktop Linux, managed by systemd-networkd or NetworkManager.

Physical layer comparison:

Technology	Range	Bandwidth	Power	Use Case
Ethernet	100 m (cable)	100 Mbps+	Medium	Factory, fixed install
WiFi (2.4/5 GHz)	~50 m indoor	50–300 Mbps	High	Consumer, flexible placement
BLE 5.0	~100 m	2 Mbps	Very low	Wearables, beacons, short-range sensors
LoRa	2–15 km	0.3–50 kbps	Very low	Agriculture, remote monitoring
Cat-M1 / NB-IoT	Cellular coverage	100 kbps–1 Mbps	Low–Medium	Wide-area, licensed spectrum

Design rule: Choose the lowest-power, lowest-bandwidth technology that meets your requirements. WiFi is convenient but drains batteries.
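
The design rule can be read as a filter over the comparison table. The sketch below is illustrative only: the numeric limits are rough upper bounds copied from the table, the `power_rank` scale is an assumed ordinal ranking, cellular range is treated as unbounded, and `pick_technology` is a hypothetical helper rather than a library function.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Link:
    name: str
    range_m: float        # rough upper bound taken from the table
    bandwidth_bps: float  # rough upper bound taken from the table
    power_rank: int       # assumed ordinal scale: 0 = very low ... 3 = high

# Approximate figures; treating cellular range as infinite is an assumption
# (real coverage depends on the operator and the site).
LINKS = [
    Link("Ethernet", 100, 100e6, 2),
    Link("WiFi", 50, 300e6, 3),
    Link("BLE 5.0", 100, 2e6, 0),
    Link("LoRa", 15_000, 50e3, 0),
    Link("Cat-M1 / NB-IoT", float("inf"), 1e6, 1),
]

def pick_technology(range_m: float, bandwidth_bps: float) -> Optional[Link]:
    """Return the lowest-power, then lowest-bandwidth, link meeting both needs."""
    candidates = [
        link for link in LINKS
        if link.range_m >= range_m and link.bandwidth_bps >= bandwidth_bps
    ]
    return min(candidates, key=lambda l: (l.power_rank, l.bandwidth_bps), default=None)
```

Under these assumed figures, `pick_technology(5_000, 300)` selects LoRa. The sketch ignores cost, cabling, and regulatory constraints, which often decide the question in practice.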

Information-Theoretic Limits

The Bandwidth column above lists achievable data rates in bit/s, not spectrum width. The theoretical maximum data rate of a channel with a given spectral bandwidth is set by the Shannon-Hartley theorem:

\[C = B \cdot \log_2(1 + \text{SNR})\]

where $C$ is the channel capacity (bits/s), $B$ is the bandwidth (Hz), and SNR is the signal-to-noise ratio (linear, not dB).

Example — LoRa: With $B = 125\text{ kHz}$ and SNR $= -10\text{ dB}$ (= 0.1 linear):

\[C = 125{,}000 \times \log_2(1.1) = 125{,}000 \times 0.137 \approx 17.2\text{ kbit/s}\]

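Practical LoRa data rates at 125 kHz (about 0.3 kbit/s at SF12 up to about 5.5 kbit/s at SF7) stay well below this bound; the 50 kbit/s end of the range in the table needs wider channels or FSK modulation. The extremely low SNR tolerance (LoRa works below 0 dB SNR) explains its long range, but Shannon's law shows the price: very low capacity. Together with airtime and duty-cycle limits, this is why LoRa payloads are typically kept to tens of bytes.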
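
The worked example can be checked numerically. This is a minimal sketch of the Shannon-Hartley formula; the function name `shannon_capacity_bps` and the extra SNR values in the loop are chosen here for illustration.

```python
import math

def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley capacity C = B * log2(1 + SNR), with SNR given in dB."""
    snr_linear = 10 ** (snr_db / 10)   # convert dB to a linear power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# Capacity of a 125 kHz channel at several signal-to-noise ratios.
for snr_db in (-20, -10, 0, 10):
    capacity = shannon_capacity_bps(125_000, snr_db)
    print(f"SNR {snr_db:+d} dB: {capacity / 1000:.1f} kbit/s")
```

At -10 dB the function reproduces the roughly 17.2 kbit/s figure from the example, and at 0 dB the capacity equals the bandwidth, since $\log_2(2) = 1$.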

Error detection in protocols: Most network protocols carry a checksum or CRC to detect transmission errors. CRC-32 (used in Ethernet frames and in formats such as gzip and PNG) uses the polynomial $x^{32} + x^{26} + \cdots + 1$ and detects all burst errors up to 32 bits in length; longer error patterns go undetected with a probability of about $2^{-32}$. TCP and UDP use a weaker 16-bit ones' complement checksum instead, which is one reason link-layer CRCs and application-level integrity checks still matter.
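
A short sketch of CRC-based error detection, using `zlib.crc32` from the Python standard library (it implements the same CRC-32 polynomial as Ethernet). The frame contents and the position of the flipped bit are arbitrary examples.

```python
import zlib

def frame_ok(data: bytes, expected_crc: int) -> bool:
    """Receiver side: recompute the CRC and compare with the transmitted value."""
    return zlib.crc32(data) == expected_crc

frame = b'{"temp_c": 23.5, "ts": 1700000000}'
crc_sent = zlib.crc32(frame)          # sender appends this value to the frame

# Simulate a single bit flip in transit.
corrupted = bytearray(frame)
corrupted[5] ^= 0x01

print("intact frame accepted:   ", frame_ok(frame, crc_sent))
print("corrupted frame accepted:", frame_ok(bytes(corrupted), crc_sent))
```

A CRC only detects accidental corruption. It offers no protection against an attacker, who can simply recompute it after modifying the data.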

Info

Shannon capacity and CRC theory are covered in depth in information theory courses. The key takeaway for embedded engineers: channel capacity is a hard physical limit, and no protocol optimization can exceed it.

2. Protocols

2.1 Transport Layer: TCP, UDP, and QUIC

Application protocols (MQTT, HTTP, CoAP) don't send data directly over the network — they ride on top of transport protocols that handle delivery mechanics:

Protocol	Connection	Reliability	Overhead	Used By
TCP	Connection-oriented (3-way handshake)	Guaranteed delivery, ordered	Higher	HTTP, MQTT, SSH, gRPC
UDP	Connectionless	Best-effort, no ordering guarantee	Lower	CoAP, DNS, video streaming, NTP
QUIC	Connection-oriented (over UDP)	Guaranteed, multiplexed streams	Medium	HTTP/3, modern cloud APIs

Embedded relevance: CoAP uses UDP because the lower overhead suits constrained devices on lossy networks. MQTT uses TCP because reliable delivery matters for telemetry — you don't want to lose sensor readings. QUIC is emerging for edge-to-cloud communication where HTTP/3 is used.

Choosing TCP vs UDP is choosing reliability vs latency — but in practice, the application protocol makes this choice for you. When you pick MQTT, you get TCP. When you pick CoAP, you get UDP.

2.2 Application Protocols

Protocol	Model	Overhead	Best For
MQTT	Pub/sub, broker-mediated	Low (~2 bytes header)	IoT telemetry, sensor data, event streams
CoAP	REST-like, UDP-based	Very low	Constrained devices, request/response over lossy networks
HTTP/REST	Request/response, TCP	High (~500+ bytes headers)	Cloud APIs, web dashboards, non-constrained devices
gRPC	RPC, HTTP/2, Protobuf	Medium (binary)	Service-to-service, typed APIs, streaming
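
To make the "~2 bytes header" figure for MQTT concrete, the sketch below builds an MQTT 3.1.1 PUBLISH packet by hand for QoS 0. It is for illustration only (a real device would use a client library); the function names are chosen here, and the topic and payload are examples.

```python
def encode_remaining_length(n: int) -> bytes:
    """MQTT variable-length integer: 7 data bits per byte, MSB set if more follow."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        out.append(digit | (0x80 if n else 0))
        if not n:
            return bytes(out)

def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build a PUBLISH packet with QoS 0, no DUP flag, no RETAIN flag."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte big-endian topic length, then the topic itself.
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    remaining = len(variable_header) + len(payload)
    fixed_header = bytes([0x30]) + encode_remaining_length(remaining)
    return fixed_header + variable_header + payload

topic = "factory/line3/temperature"
payload = b'{"temp_c": 23.5}'
packet = encode_publish_qos0(topic, payload)
overhead = len(packet) - len(topic) - len(payload)
print(f"packet: {len(packet)} bytes, protocol overhead: {overhead} bytes")
```

For small messages the fixed header is two bytes and the topic length prefix adds two more, so the topic string itself usually dominates the per-message cost. Short topic names matter on constrained links.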

MQTT in 30 seconds:

Sensor ──publish──▶ Broker (mosquitto) ──deliver──▶ Dashboard
  topic: "factory/line3/temperature"
  payload: {"temp_c": 23.5, "ts": 1700000000}

QoS 0: Fire and forget (fastest, may lose messages)
QoS 1: At least once (guaranteed delivery, possible duplicates)
QoS 2: Exactly once (slowest, strongest guarantee)
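
Because QoS 1 may deliver the same message more than once, the receiving application has to tolerate duplicates. One possible approach is sketched below. It assumes each topic carries at most one reading per `ts` value, and the class name `DedupSubscriber` and its `max_keys` bound are hypothetical choices for this example.

```python
import json

class DedupSubscriber:
    """Drops redelivered messages by remembering (topic, ts) keys already seen."""

    def __init__(self, max_keys: int = 10_000):
        self.max_keys = max_keys
        self._seen = {}        # insertion-ordered dict used as a bounded set
        self.accepted = []

    def on_payload(self, topic: str, payload: bytes) -> bool:
        """Return True if the reading was new, False if it was a duplicate."""
        reading = json.loads(payload)
        key = (topic, reading["ts"])
        if key in self._seen:
            return False                      # duplicate delivery, ignore it
        self._seen[key] = True
        if len(self._seen) > self.max_keys:   # forget the oldest key
            self._seen.pop(next(iter(self._seen)))
        self.accepted.append(reading)
        return True

sub = DedupSubscriber()
msg = b'{"temp_c": 23.5, "ts": 1700000000}'
print(sub.on_payload("factory/line3/temperature", msg))  # first delivery
print(sub.on_payload("factory/line3/temperature", msg))  # broker retransmission
```

An alternative is to make the consumer idempotent, for example by writing readings to a database keyed on device and timestamp, so that a duplicate simply overwrites the same row.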

MQTT in Code — Python Example

A minimal publisher that sends sensor data, and a subscriber that receives it:

Publisher (runs on the embedded device):

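import json
import time

import paho.mqtt.client as mqtt

# paho-mqtt 2.x: the callback API version is passed explicitly
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.tls_set()                         # enable TLS; never send plaintext
client.username_pw_set("sensor01", "secretpass")   # placeholder credentials
client.connect("broker.example.com", 8883)
client.loop_start()                      # background thread: PUBACKs, keepalive, reconnects

while True:
    payload = json.dumps({"temp_c": 23.5, "ts": int(time.time())})
    client.publish("factory/line3/temperature", payload, qos=1)
    time.sleep(10)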

Subscriber (runs on the dashboard server):

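import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, reason_code, properties):
    # Subscribe here so the subscription is restored after every reconnect
    client.subscribe("factory/#", qos=1)     # wildcard: all factory topics

def on_message(client, userdata, msg):
    print(f"{msg.topic}: {msg.payload.decode()}")

# paho-mqtt 2.x: the callback API version is passed explicitly
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.tls_set()
client.username_pw_set("dashboard", "secretpass")   # placeholder credentials
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.com", 8883)
client.loop_forever()                    # blocks and processes network traffic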

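Install with pip install paho-mqtt. For local testing, install a broker with sudo apt install mosquitto, or run mosquitto -v in the foreground to see connection logs. A default local broker typically listens on port 1883 without TLS, so on the bench connect to localhost:1883 and skip tls_set(); keep TLS enabled for anything that leaves your desk.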
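
The `factory/#` wildcard in the subscriber follows MQTT's topic filter rules: `+` matches exactly one topic level and `#` matches all remaining levels. The sketch below implements that matching in plain Python to show the semantics. It is a simplified illustration (it ignores the special handling of topics beginning with `$`), and `topic_matches` is a name chosen here.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` matches the MQTT subscription `topic_filter`."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, matches the rest.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False                      # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False                      # literal level differs
    return len(filter_levels) == len(topic_levels)

for f in ("factory/#", "factory/+/temperature", "office/#"):
    print(f, topic_matches(f, "factory/line3/temperature"))
```

Designing the topic hierarchy around these wildcards (site, line, measurement) lets a dashboard subscribe to exactly the slice of data it needs.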

Note

For most embedded sensor applications, MQTT with QoS 1 is the pragmatic default. CoAP is better when you need REST semantics on a constrained MCU.

3. IoT Architecture

The Edge → Gateway → Cloud pattern scales from 1 to 100,000 devices:

graph LR
    subgraph "Edge Devices"
        S1[Sensor Node 1]
        S2[Sensor Node 2]
        S3[Sensor Node N]
    end
    subgraph "Gateway"
        GW[Edge Gateway<br/>Aggregation + Local Rules]
    end
    subgraph "Cloud"
        BROKER[MQTT Broker]
        DB[(Time-Series DB)]
        DASH[Dashboard]
    end
    S1 -->|BLE/LoRa| GW
    S2 -->|BLE/LoRa| GW
    S3 -->|BLE/LoRa| GW
    GW -->|WiFi/4G + MQTT| BROKER
    BROKER --> DB
    DB --> DASH

Edge computing: Process data locally, send summaries. A temperature sensor that sends a reading every second generates 86,400 messages/day. An edge gateway that sends only min/max/average per minute generates 1,440 — a 60× reduction in bandwidth and cloud cost.
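
The per-minute aggregation can be sketched in a few lines. The function name `summarize_per_minute`, the record field names, and the synthetic temperature values are assumptions made for this example.

```python
from statistics import mean

def summarize_per_minute(readings):
    """Collapse (unix_ts, value) samples into one min/max/avg record per minute."""
    buckets = {}
    for ts, value in readings:
        minute_start = ts // 60 * 60          # round down to the start of the minute
        buckets.setdefault(minute_start, []).append(value)
    return [
        {"minute_ts": minute, "min": min(v), "max": max(v), "avg": mean(v)}
        for minute, v in sorted(buckets.items())
    ]

# One synthetic day of 1 Hz samples, starting on a minute boundary.
start = 1_700_000_000 // 60 * 60
day = [(start + s, 20.0 + (s % 60) * 0.1) for s in range(86_400)]

summaries = summarize_per_minute(day)
print(f"{len(day)} raw readings -> {len(summaries)} summaries")
```

Keeping min and max alongside the average preserves short spikes that a plain average would hide, which matters when the dashboard is used for alarm limits.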

Mini Exercise: Protocol Selection

Your level display streams measurement data to a remote dashboard. The display is connected via Ethernet in a factory.

Which protocol would you choose? Why?
What QoS level? Why?
How often should you send data? What's the trade-off?

4. Attack Surfaces

Security in embedded systems starts with understanding where an attacker can interact with your device. Unlike a server locked in a data center, an embedded device is often physically accessible — sitting in a factory, mounted on a wall, or deployed in a public space. This means you must defend against not only network-based attacks (which are common for all connected devices) but also physical attacks (opening the case, connecting to debug ports) and software-level exploits (vulnerable libraries, unsigned firmware). Thinking systematically about these surfaces — before writing any security code — is called threat modeling, and it is the most important security practice in embedded engineering.

Every embedded device has three attack surfaces:

Surface	Examples	Difficulty
Physical	JTAG debug port, UART console, SD card swap	Requires physical access
Network	Open ports, unencrypted traffic, weak authentication	Remote, scalable
Software	CVEs in libraries, unsigned firmware, default credentials	Remote, automatable

Threat modeling basics:

List all interfaces (network ports, debug headers, USB, wireless)
For each: who can access it? What can they do?
Rank by likelihood × impact
Mitigate the top risks first
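
The ranking step above can be sketched as a small scoring table. The threat entries and their ratings below are made-up examples, not an assessment of any real product, and the 1-to-3 numeric scale is an assumption.

```python
SCORE = {"Low": 1, "Medium": 2, "High": 3}

# Example entries only; a real model comes from reviewing your own device.
threats = [
    {"vector": "UART console with root shell", "likelihood": "Medium", "impact": "High"},
    {"vector": "Plaintext MQTT on factory LAN", "likelihood": "High", "impact": "Medium"},
    {"vector": "Default SSH password", "likelihood": "High", "impact": "High"},
    {"vector": "SD card swap", "likelihood": "Low", "impact": "High"},
    {"vector": "Outdated TLS library", "likelihood": "Medium", "impact": "Medium"},
]

def risk(threat: dict) -> int:
    """Risk score = likelihood x impact on the assumed 1..3 scale."""
    return SCORE[threat["likelihood"]] * SCORE[threat["impact"]]

ranked = sorted(threats, key=risk, reverse=True)
for threat in ranked[:3]:
    print(f"{risk(threat)}  {threat['vector']}")
```

The numbers are coarse on purpose. Their job is to force an explicit ordering so that mitigation effort goes to the top of the list first.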

Warning

A device with an open UART console giving root access is not "physically secure" — it takes 30 seconds and a $3 USB-to-serial adapter to compromise.

5. CVE Case Studies

Mirai Botnet (2016)

What happened: Malware scanned the internet for IoT devices (cameras, routers) with default credentials (admin/admin, root/root). It infected roughly 600,000 devices at its peak and launched DDoS attacks reported at up to 1.2 Tbps.

Root cause: Default credentials never changed, telnet open to the internet.

One-line fix: Require unique passwords at first boot; disable telnet.
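
A sketch of what "unique password at first boot" can look like, using only the Python standard library. The iteration count, the password length, and the function names are assumptions for illustration; a production device would also need a safe way to show or deliver the generated password to its owner.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000   # assumed work factor; tune for the target CPU

def provision_password():
    """Generate a random per-device password and the salted hash to store."""
    password = secrets.token_urlsafe(16)      # unique per device, never a shared default
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return password, salt, digest

def verify_password(candidate: str, salt: bytes, digest: bytes) -> bool:
    """Check a login attempt against the stored salt and hash."""
    test = hashlib.pbkdf2_hmac("sha256", candidate.encode(), salt, ITERATIONS)
    return hmac.compare_digest(test, digest)  # constant-time comparison

password, salt, digest = provision_password()
print("login with generated password:", verify_password(password, salt, digest))
print("login with admin:", verify_password("admin", salt, digest))
```

Only the salt and hash are stored on the device, so reading the flash of one unit does not reveal a credential that works on the other 9,999.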

Heartbleed (2014, CVE-2014-0160)

What happened: OpenSSL bug allowed reading up to 64 KB of server memory per request — including private keys, session tokens, passwords.

Root cause: Missing bounds check on TLS heartbeat extension. Two lines of C code.

One-line fix: Update OpenSSL (or use a minimal TLS library like mbedTLS for embedded).

Unsigned Firmware Updates

What happened (generic pattern): Attacker intercepts firmware update, replaces binary, device installs malicious code. Seen in routers, industrial PLCs, medical devices.

Root cause: No cryptographic signature verification on firmware images.

One-line fix: Sign all firmware with Ed25519/RSA; verify signature before flashing.
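
The verify-before-flash control flow is sketched below. Important substitution: the Python standard library has no Ed25519 or RSA verification, so this example uses HMAC-SHA256 with a shared key as a stand-in. That shows the logic, but it is not equivalent to an asymmetric signature, because a key extracted from one device could be used to forge images. The function names and the `flash` callback are hypothetical.

```python
import hashlib
import hmac

def sign_image(image: bytes, key: bytes) -> bytes:
    """Build-server side: compute the authentication tag shipped with the image."""
    return hmac.new(key, image, hashlib.sha256).digest()

def verify_then_flash(image: bytes, tag: bytes, key: bytes, flash) -> bool:
    """Device side: write the image only if its tag verifies."""
    expected = hmac.new(key, image, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False          # reject: nothing is written to flash
    flash(image)
    return True

key = b"demo-key-not-for-production"      # stand-in for real key material
firmware = b"\x7fELF...firmware v2.1"     # placeholder bytes, not a real image
tag = sign_image(firmware, key)

written = []
print("genuine image flashed: ", verify_then_flash(firmware, tag, key, written.append))
print("tampered image flashed:", verify_then_flash(firmware + b"!", tag, key, written.append))
```

With a real asymmetric scheme the device holds only the public key, so the check has the same shape while the signing key never leaves the build server.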

Note

Every one of these attacks exploited a preventable weakness: default credentials, a missing bounds check, or missing signature verification. The lesson: follow the hardening checklist below.

6. Hardening Checklist

The following checklist distills the most impactful security measures for embedded Linux devices. None of these are exotic or expensive — they are basic hygiene that prevents the vast majority of real-world attacks. The CVE case studies in the previous section show that Mirai, Heartbleed, and unsigned firmware attacks all exploited the absence of these basic measures. Apply them systematically, starting from item 1, and your device will be more secure than the majority of deployed IoT products.

#	Rule	How
1	Minimal rootfs	Use Buildroot — include only what you need. Fewer packages = fewer CVEs.
2	Read-only rootfs	Mount root as read-only + overlayfs for runtime data. See Reliability and Updates Section 1 for mechanism.
3	No default passwords	Force unique credential at first boot or use key-only SSH.
4	Signed firmware	Cryptographically sign all update images. Verify before flashing.
5	TLS everywhere	Encrypt all network traffic. No plaintext MQTT, HTTP, or telnet.
6	Disable debug interfaces	Remove or disable JTAG, UART console in production builds.
7	Regular CVE scanning	Monitor NVD/CVE feeds for your kernel version and libraries.
8	Principle of least privilege	Run services as non-root. Use Linux capabilities instead of full root.

Info

Secure boot chain (verified boot from ROM → bootloader → kernel → rootfs) is covered in Boot Architectures Section 4. We won't repeat it here.

7. OTA Updates vs Security

OTA (Over-The-Air) updates are essential for fixing vulnerabilities — but the update mechanism itself is an attack vector.

The tension: Easy updates = potential attack surface. No updates = unpatched vulnerabilities.

Solutions:

Framework	Approach	A/B Partition	Rollback	Signing	Notes
SWUpdate	Image or package	Yes	Yes	RSA/CMS	Popular in industrial Linux
RAUC	Image-based	Yes	Yes	X.509	German engineering focus, strict
Mender	Image or package	Yes	Yes	RSA	SaaS or self-hosted, nice dashboard
Manual dd	Raw image	Manual	Manual	None	Don't do this in production

A/B update flow:

sequenceDiagram
    participant Device
    participant Server
    Device->>Server: Check for update (authenticated)
    Server->>Device: Signed image available (v2.1)
    Device->>Device: Download to partition B
    Device->>Device: Verify signature
    Device->>Device: Set bootloader to try B
    Device->>Device: Reboot into B
    Device->>Device: Health check passes?
    alt Healthy
        Device->>Device: Confirm B as active
        Device->>Server: Report success
    else Unhealthy
        Device->>Device: Reboot → fallback to A
        Device->>Server: Report failure
    end

See Reliability and Updates Section 2 for the A/B partition mechanism in detail.
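
The A/B flow above can be modeled as a small state machine. This is a simulation for reasoning about the logic, not how SWUpdate, RAUC, or Mender are implemented; the `Device` class, the slot names, and the log messages are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Device:
    active_slot: str = "A"                       # slot the bootloader has confirmed
    log: List[str] = field(default_factory=list)

def other_slot(slot: str) -> str:
    return "B" if slot == "A" else "A"

def apply_update(device: Device, signature_valid: bool,
                 health_check: Callable[[], bool]) -> str:
    """Run one update attempt and return the slot that ends up active."""
    target = other_slot(device.active_slot)
    device.log.append(f"downloaded image to {target}")
    if not signature_valid:
        device.log.append("signature invalid: image discarded")
        return device.active_slot                # never boot an unverified image
    device.log.append(f"trial boot into {target}")
    if health_check():
        device.active_slot = target              # confirm the new slot
        device.log.append("report success")
    else:
        device.log.append(f"fallback to {device.active_slot}; report failure")
    return device.active_slot

device = Device()
print(apply_update(device, signature_valid=True, health_check=lambda: False))
print(apply_update(device, signature_valid=True, health_check=lambda: True))
```

The key property is that the confirmed slot changes only after the health check passes, so a power cut or crash at any earlier step leaves the device booting the known-good image.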

Mini Exercise: Threat Model Your Level Display

Consider the level display project from the tutorials:

List 5 attack vectors (think: physical, network, software)
For each, rate likelihood (Low/Medium/High) and impact (Low/Medium/High)
Propose a mitigation for the top 3 risks
Which item from the hardening checklist is the easiest win?

8. Virtualization and Containers in Embedded

As embedded devices grow more complex — running multiple applications, serving web UIs alongside data collection, managing different dependency versions — the same isolation problems from IT infrastructure appear on embedded hardware.

8.1 Virtual Machines

A hypervisor runs multiple complete operating systems on a single hardware platform:

Type 1 (bare-metal): KVM, Xen — runs directly on hardware, high performance
Type 2 (hosted): VirtualBox, QEMU — runs on top of a host OS, easier to set up

Each VM has its own kernel, its own root filesystem, and full isolation. The overhead is significant: hundreds of MB per VM, seconds to boot. In embedded, VMs are used for mixed-criticality systems — running an RTOS for safety-critical control alongside Linux for the HMI on the same hardware.

8.2 Containers

Containers share the host kernel but isolate processes using Linux kernel features:

Namespaces provide isolation: PID (separate process tree), mount (separate filesystem view), network (own IP address), user (UID mapping)
cgroups enforce resource limits: CPU time, memory, I/O bandwidth

Containers are much lighter than VMs: they start in milliseconds, add only MB of overhead, and share the host kernel. Docker/Podman are used in development; balena, snap, or custom OCI runtimes are common for embedded deployment.

Comparison

	Virtual Machine	Container	Native (no isolation)
Isolation	Strong (separate kernel)	Medium (kernel features)	None
Overhead	High (full OS per VM)	Low (shared kernel)	None
Startup time	Seconds to minutes	Milliseconds to seconds	Instant
Own kernel	Yes	No (shares host)	N/A
Resource control	Hypervisor-managed	cgroups	Manual
Embedded use case	Mixed-criticality (RTOS + Linux)	Edge gateways, multi-app appliances	Simple single-app devices

When to Use in Embedded

Single-application appliance (level display, data logger): no containers needed — keep it simple
Multi-service gateway (data logger + web UI + MQTT broker): containers help isolate and update services independently
Mixed-criticality (safety-critical control + Linux HMI): hypervisor/VM separation provides the strongest isolation

Tip

Containers require a relatively capable system (~64 MB+ RAM, full Linux kernel with namespace/cgroup support). On very constrained embedded devices, native processes with systemd sandboxing (DynamicUser=, ProtectSystem=, MemoryMax=) are the pragmatic alternative.

Quick Checks

Name two advantages of MQTT over HTTP for IoT telemetry.
What is the difference between MQTT QoS 0 and QoS 1?
What made the Mirai botnet possible? What's the fix?
Why is a read-only rootfs a security measure (not just reliability)?
What's the risk of OTA updates without signature verification?

Mini Exercise

Design Challenge

You are deploying 500 temperature sensors in a factory.

Draw the network architecture (edge, gateway, cloud)
Choose the physical layer and protocol. Justify.
List the top 3 security measures you would implement
How would you handle OTA updates? Which framework would you pick?

Key Takeaways

Choose the right protocol for your constraints: MQTT for most IoT, CoAP for very constrained devices
Edge computing reduces bandwidth and cloud costs by 10–100×
Every embedded device has physical, network, and software attack surfaces
Real CVEs (Mirai, Heartbleed) show that basic hygiene prevents most attacks
Harden systematically: minimal rootfs, read-only root, signed firmware, TLS, no defaults
OTA updates are security-critical — use a proper framework with signing and rollback
Regulatory requirements are becoming mandatory; design for compliance from the start

Regulatory Landscape

Embedded devices increasingly face regulatory requirements:

Standard	Scope	Key Requirement
IEC 62443	Industrial cybersecurity	Security lifecycle, zone/conduit model
ETSI EN 303 645	Consumer IoT	No default passwords, secure updates, vulnerability disclosure
EU Cyber Resilience Act	All connected products (EU)	Security by design, vulnerability handling, 5-year update commitment

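The Cyber Resilience Act is not optional: its main obligations apply from December 2027 and determine what you can legally sell in the EU. IEC 62443 and ETSI EN 303 645 are standards rather than laws, but they are widely used as the baseline for demonstrating compliance.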