Debugging Embedded Linux

Time: 60 min | Prerequisites: SSH Login | Theory companion: Linux Fundamentals, Section 11


Learning Objectives

By the end of this tutorial you will be able to:

  • Follow a structured debugging flowchart to diagnose boot, service, device, and application failures
  • Use strace to trace system calls and identify permission or path errors
  • Debug a C program interactively with GDB (breakpoints, backtrace, variable inspection)
  • Set up remote debugging with gdbserver for cross-development
  • Run and debug ARM binaries on x86 using QEMU
  • Read kernel logs with dmesg and journalctl at different severity levels
  • Choose the right debugging approach (GDB/SSH, QEMU, JTAG) for a given failure mode

The Debugging Tools Hierarchy

Debugging embedded Linux follows a natural escalation path, from least invasive to most powerful:

  1. printf / logging -- the simplest tool. Add printf statements or structured log output to narrow down where the problem occurs. In kernel code, use dev_err, dev_warn, dev_info, and dev_dbg instead.
  2. strace -- intercepts every system call your program makes. Invaluable for finding permission errors, missing files, or unexpected I/O patterns without modifying the source code.
  3. GDB -- interactive debugging with breakpoints, variable inspection, and call stack examination. Use locally on the Pi, remotely via gdbserver, or with QEMU for hardware-free testing.
  4. perf / ftrace -- profiling and kernel tracing. When the bug is a performance problem (latency spikes, CPU hogging), these tools show where time is actually spent.
  5. JTAG / SWD -- hardware debug interfaces that work even when the OS is not running. Required for bootloader debugging, kernel panics with no serial output, and hard lockups.

The key principle: start simple and escalate only when needed. Most embedded bugs are found with dmesg, strace, and a few printf calls. GDB and JTAG are powerful but add setup overhead -- reserve them for crashes, race conditions, and hardware bring-up.

For the full conceptual framework, see Linux Fundamentals, Section 11.


Debugging Flowchart

When something fails, follow this decision tree:

graph TD
    A[System won't boot?] -->|Yes| B[Check serial console / dmesg]
    A -->|No| C[Service won't start?]
    B --> B1[Kernel panic → check DT / driver]
    B --> B2[Hangs at init → check systemd deps]
    C -->|Yes| D[journalctl -u SERVICE]
    C -->|No| E[Device not detected?]
    D --> D1[Check ExecStart path and permissions]
    E -->|Yes| F[Check dmesg + i2cdetect/lsmod]
    E -->|No| G[App misbehaving?]
    F --> F1[Driver not loaded → check DT overlay]
    F --> F2[Wrong address → check wiring]
    G -->|Yes| H[strace / GDB / log output]

1. Boot and Service Diagnostics

Concept: Most failures are visible in boot logs or service status.

dmesg | tail -n 50
journalctl -b
systemctl status SERVICE_NAME
systemctl list-units --failed

Example — finding a failed service:

$ systemctl list-units --failed
  UNIT                  LOAD   ACTIVE SUB    DESCRIPTION
 data-logger.service   loaded failed failed Data Logger Appliance

$ journalctl -u data-logger.service
-- No entries --
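# ← No log output was captured for this unit, so the process probably never ran. Check the ExecStart path.
#   (If you are not root, rerun with sudo: an unprivileged user may not be allowed to read system unit logs.)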
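
When several units fail at once, the two commands above can be chained. The following is a minimal sketch, assuming your systemctl supports the `--plain` and `--no-legend` flags (they drop the status bullet and the header/footer lines); the function name `failed_units` is made up for this example.

```shell
# failed_units: read `systemctl list-units --failed --plain --no-legend`
# output on stdin and print only the unit names (first column).
failed_units() {
    awk 'NF > 0 { print $1 }'
}

# Example wiring on a real system (not executed here):
#   systemctl list-units --failed --plain --no-legend | failed_units |
#       while read -r unit; do
#           journalctl -b -u "$unit" --no-pager | tail -n 20
#       done
```

Keeping the parser separate from the systemctl call means it can be tried against saved output from a board you cannot reach right now.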


2. Driver and Device Checks

Concept: Drivers expose hardware via /dev and sysfs.

ls /dev
lsmod | grep DRIVER
modinfo DRIVER

Example — driver not loaded:

$ lsmod | grep mcp
# (empty output means the module is not loaded)
$ dmesg | grep mcp
[   12.345] mcp9808: probe failed with error -5
# ← Error -5 is EIO (I/O error). Check wiring and I2C address.
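
Probe failures in dmesg usually carry a negative errno value like the -5 above. As a rough aid, the sketch below maps a handful of common Linux error numbers to their names. The function names `errno_name` and `probe_errors` are hypothetical, the table is deliberately incomplete, and the parser assumes the message contains the text "error -N" as in the example line.

```shell
# errno_name: translate a (negative) kernel error code into its symbolic
# name. Covers only a few codes that often show up in probe failures.
errno_name() {
    case "${1#-}" in
        2)   echo ENOENT ;;
        5)   echo EIO ;;
        12)  echo ENOMEM ;;
        16)  echo EBUSY ;;
        19)  echo ENODEV ;;
        22)  echo EINVAL ;;
        110) echo ETIMEDOUT ;;
        121) echo EREMOTEIO ;;
        517) echo EPROBE_DEFER ;;
        *)   echo "unknown($1)" ;;
    esac
}

# probe_errors: read dmesg text on stdin and print "CODE NAME" for every
# line that contains "error -N".
probe_errors() {
    sed -n 's/.*error \(-[0-9][0-9]*\).*/\1/p' | while read -r code; do
        printf '%s %s\n' "$code" "$(errno_name "$code")"
    done
}
```

On the target this could be used as `dmesg | grep -i probe | probe_errors`.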


3. Process and Resource Monitoring

Concept: Embedded systems often fail due to CPU, memory, or I/O pressure.

top
htop
free -h
iostat -xz 1
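
For logging or alerting, the interactive tools above are less convenient than a single number. The sketch below computes the percentage of RAM in use from /proc/meminfo-style text, assuming the kernel reports both MemTotal and MemAvailable; the function name `mem_used_pct` is made up for this example.

```shell
# mem_used_pct: read /proc/meminfo-style text on stdin and print the
# percentage of RAM in use, based on MemTotal and MemAvailable (both in kB).
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }'
}

# Usage on the target: mem_used_pct < /proc/meminfo
```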

4. Tracing and System Calls

Concept: strace shows what your program actually does.

strace -o trace.txt ./your_app

Example — finding why a program fails to open a device:

$ strace -e openat cat /dev/mcp9808
openat(AT_FDCWD, "/dev/mcp9808", O_RDONLY) = -1 EACCES (Permission denied)
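# ← Quick test: sudo chmod 666 /dev/mcp9808 (not persistent; the node is recreated on reboot or driver reload).
# ← Permanent fix: add a udev rule that sets the node's group and mode.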
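
A full trace file can run to thousands of lines, most of them successful calls. As a sketch, the filter below keeps only calls that returned -1 and prints the syscall name with its errno. It assumes the plain strace line format shown above (no PID prefix, i.e. strace run without -f); the function name `failed_syscalls` is hypothetical.

```shell
# failed_syscalls: read strace output on stdin and print "SYSCALL ERRNO"
# for every call that returned -1.
failed_syscalls() {
    sed -n 's/^\([a-z_0-9]*\)(.*) = -1 \([A-Z0-9]*\) .*/\1 \2/p'
}

# Usage:
#   strace -o trace.txt ./your_app
#   failed_syscalls < trace.txt | sort | uniq -c
```

Not every failed call is a bug: programs routinely probe for optional files, so look for failures on paths your application actually needs.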


5. Network Debugging

Concept: Many embedded apps fail due to network misconfiguration.

ip a
ping 8.8.8.8
tcpdump -i eth0 port 1883
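
Before reaching for tcpdump, it helps to confirm that an interface has an address at all. The sketch below parses the one-line-per-address output of `ip -o -4 addr show`; the function name `iface_addrs` is made up, and the field positions are an assumption based on the usual iproute2 output format.

```shell
# iface_addrs: read `ip -o -4 addr show` output on stdin and print
# "INTERFACE ADDRESS/PREFIX" for every non-loopback IPv4 address.
iface_addrs() {
    awk '$2 != "lo" && $3 == "inet" { print $2, $4 }'
}

# Usage on the target: ip -o -4 addr show | iface_addrs
```

Empty output means no interface has an IPv4 address, which explains a failing ping without any packet capture.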

6. I2C/SPI Debugging

Concept: Bus errors are often electrical or addressing mistakes.

i2cdetect -y 1
ls /dev/spidev*
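
The i2cdetect grid is easy to misread by eye. As a sketch, the filter below lists the addresses that answered the scan. It assumes the usual i2cdetect table layout, where `--` marks an empty address and `UU` marks an address already claimed by a kernel driver (those are skipped here); the function name `i2c_found` is hypothetical.

```shell
# i2c_found: read `i2cdetect -y BUS` output on stdin and print every
# address that responded, one per line, as 0xNN.
i2c_found() {
    awk '/^[0-7]0:/ { for (i = 2; i <= NF; i++)
                          if ($i != "--" && $i != "UU") print "0x" $i }'
}

# Usage on the target: i2cdetect -y 1 | i2c_found
```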

7. A Minimal Debug Checklist

  • Is the device visible in /dev?
  • Does the driver load cleanly (dmesg)?
  • Are permissions correct?
  • Is the service running (systemctl)?
  • Is the process consuming CPU or memory unexpectedly?
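
The first and third checklist items can be scripted. The sketch below is one possible helper, not part of any standard tool; the name `check_node` and its three result strings are made up for this example.

```shell
# check_node: report whether a device node (or any path) exists and is
# readable by the current user. Prints one of: ok, missing, no-permission.
check_node() {
    if [ ! -e "$1" ]; then
        echo missing
    elif [ ! -r "$1" ]; then
        echo no-permission
    else
        echo ok
    fi
}

# Usage on the target: check_node /dev/i2c-1
```

Run it as the same user the service runs as; root can read almost everything, so a check under sudo can hide a permission problem.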

Driver Debugging Checklist

  • Confirm device tree entry or overlay is loaded.
  • Check dmesg for probe errors.
  • Verify bus address (I2C/SPI) is correct.
  • Ensure power and pull-ups are present.
  • Use strace on user-space tools to confirm that the expected I/O calls (open, read, write, ioctl) are made and succeed.

8. GDB Basics

GDB (GNU Debugger) lets you pause a running program, inspect variables, and step through code one line at a time. This section uses the sensor_reader.c program from the ELF tutorial.

8.1 Compile with Debug Symbols

# -g adds debug info, -O0 disables optimization (variables won't be "optimized out")
gcc -g -O0 -o sensor_reader sensor_reader.c

If you don't have the file yet, create sensor_reader.c:

// sensor_reader.c — Read CPU temperature from sysfs
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *fp = fopen("/sys/class/thermal/thermal_zone0/temp", "r");
    if (fp == NULL) {
        perror("Failed to open thermal sensor");
        return 1;
    }

    char buf[16];
    if (fgets(buf, sizeof(buf), fp) == NULL) {
        perror("Failed to read temperature");
        fclose(fp);
        return 1;
    }
    fclose(fp);

    int raw = atoi(buf);
    printf("CPU temperature: %d.%d °C\n", raw / 1000, (raw % 1000) / 100);
    return 0;
}

8.2 GDB Walkthrough

gdb ./sensor_reader

| Command | What It Does |
|---|---|
| break main | Set a breakpoint at the start of main() |
| run | Start the program (stops at breakpoint) |
| next | Execute one line, stepping over function calls |
| step | Execute one line, stepping into function calls |
| print fp | Print the value of variable fp |
| print buf | Print the contents of buf |
| print raw | Print the integer value |
| info locals | Show all local variables |
| backtrace | Show the call stack (where am I?) |
| continue | Resume execution until next breakpoint or exit |
| quit | Exit GDB |
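
The same commands can be replayed without typing them, which is handy over a slow SSH link or in a test script. The sketch below relies on GDB's `-batch` and `-x` options; the function name `gdb_cmds` and the file name `cmds.gdb` are made up for this example.

```shell
# gdb_cmds: print a GDB command file that stops in main, steps over the
# first line (the fopen call), then dumps the locals and the call stack.
gdb_cmds() {
    cat <<'EOF'
break main
run
next
info locals
backtrace
continue
EOF
}

# Usage (on a machine with gdb and the debug build):
#   gdb_cmds > cmds.gdb
#   gdb -batch -x cmds.gdb ./sensor_reader
```

In batch mode GDB exits once the commands have run, so the output can be redirected to a file and compared between builds.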

8.3 Catch a Bug with GDB

Create a buggy version — remove the NULL check so a bad path causes a segfault:

// sensor_reader_buggy.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *fp = fopen("/nonexistent/path", "r");
    // BUG: no NULL check — fp is NULL
    char buf[16];
    fgets(buf, sizeof(buf), fp);  // segfault: dereferencing NULL

    int raw = atoi(buf);
    printf("CPU temperature: %d.%d °C\n", raw / 1000, (raw % 1000) / 100);
    return 0;
}

gcc -g -O0 -o sensor_buggy sensor_reader_buggy.c
gdb ./sensor_buggy
(gdb) run
# Program received signal SIGSEGV, Segmentation fault.
(gdb) backtrace
# Shows exactly which line crashed
(gdb) print fp
# $1 = (FILE *) 0x0    ← NULL pointer!
(gdb) quit

GDB tells you exactly where the crash happened and why (NULL pointer dereference).

Checkpoint 8

You can compile with debug symbols, set breakpoints, step through code, inspect variables, and diagnose a segfault with GDB.


9. Remote Debugging with gdbserver

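In embedded development, you typically compile on your laptop (fast, lots of RAM) while the program runs on the Pi (limited resources). gdbserver bridges the two: it controls the program on the Pi, and the full GDB, with symbols and source, runs on the laptop.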

9.1 Install

# On the Pi
sudo apt install gdbserver

# On your laptop (host)
sudo apt install gdb-multiarch

9.2 Start gdbserver on the Pi

# On the Pi — start the program under gdbserver, listening on port 9000
gdbserver :9000 ./sensor_reader
# Process ./sensor_reader created; pid = 1234
# Listening on port 9000

9.3 Connect from Your Laptop

# On your laptop — you need a copy of the binary (same build)
gdb-multiarch ./sensor_reader
(gdb) target remote <pi-ip>:9000
(gdb) break main
(gdb) continue
# Breakpoint 1, main () at sensor_reader.c:6
(gdb) next
(gdb) print fp
(gdb) continue
(gdb) quit

Replace <pi-ip> with your Pi's IP address (e.g., 192.168.1.42).
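
The connect-and-break steps can be folded into one command line with GDB's `-ex` option. The helper below only prints that command line so it can be inspected before use. It is a sketch: the function name `remote_gdb_cmd` is hypothetical, and the port check is just a guard against typos.

```shell
# remote_gdb_cmd HOST PORT BINARY: print a gdb-multiarch command line that
# connects to gdbserver at HOST:PORT, breaks at main, and continues.
remote_gdb_cmd() {
    host=$1 port=$2 binary=$3
    case "$port" in
        ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "invalid port: $port" >&2
        return 1
    fi
    printf 'gdb-multiarch -ex "target remote %s:%s" -ex "break main" -ex "continue" %s\n' \
        "$host" "$port" "$binary"
}

# Usage: remote_gdb_cmd 192.168.1.42 9000 ./sensor_reader
```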

9.4 Why Remote Debugging?

| Aspect | Local GDB (on Pi) | Remote GDB (gdbserver) |
|---|---|---|
| Compile | On Pi (slow) | On laptop (fast) |
| Debug UI | Terminal only | Can use IDE (VS Code, CLion) |
| Pi resources | GDB uses CPU/RAM | Only gdbserver (lightweight) |
| Workflow | Single machine | Cross-development (industry standard) |

Tip

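VS Code with the "Native Debug" extension or Microsoft's C/C++ extension can connect to gdbserver, giving you a graphical debugging experience with breakpoints, variable watch, and call stack while the program itself runs on your Pi. (The "cortex-debug" extension targets bare-metal Cortex-M microcontrollers through a debug probe rather than Linux gdbserver sessions.)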

Checkpoint 9

You can start gdbserver on the Pi, connect with gdb-multiarch from your laptop, and debug remotely with breakpoints and variable inspection.


10. QEMU for Testing and Debugging

QEMU lets you run ARM binaries on your x86 laptop without a Pi. Combined with GDB, it gives you full debugging without any target hardware.

10.1 User-Mode Emulation

# On your laptop — run an ARM binary on x86
# First, cross-compile (see ELF tutorial Section 5)
arm-linux-gnueabihf-gcc -static -g -O0 -o sensor_reader_arm sensor_reader.c

# Run with QEMU
qemu-arm ./sensor_reader_arm

Note

The sysfs thermal path likely doesn't exist on your laptop, so the program will print an error. This is expected — QEMU emulates the CPU, not the Pi's hardware.

10.2 QEMU + GDB

# Terminal 1: Start under QEMU, waiting for GDB
qemu-arm -g 1234 ./sensor_reader_arm

# Terminal 2: Attach GDB
gdb-multiarch ./sensor_reader_arm
(gdb) target remote :1234
(gdb) break main
(gdb) continue
(gdb) next
(gdb) print fp
(gdb) quit

This is identical to the gdbserver workflow — GDB doesn't care whether the remote is a real Pi, QEMU, or an FPGA.
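
One practical catch: the emulator must match the binary's architecture. A 32-bit ARM build needs qemu-arm, while a 64-bit build (for example, one produced on a 64-bit Pi OS) needs qemu-aarch64. The sketch below picks the emulator from the "Machine:" line of `readelf -h`; the function name `qemu_for_machine` is made up, and only these two architectures are handled.

```shell
# qemu_for_machine: read `readelf -h BINARY` output on stdin and print the
# QEMU user-mode emulator matching the ELF "Machine:" field.
qemu_for_machine() {
    machine=$(sed -n 's/^ *Machine: *//p')
    case "$machine" in
        ARM)     echo qemu-arm ;;
        AArch64) echo qemu-aarch64 ;;
        *)       echo "unsupported: $machine" ;;
    esac
}

# Usage: readelf -h ./sensor_reader_arm | qemu_for_machine
```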

10.3 System Emulation (Overview)

For testing kernel changes or boot sequences, QEMU can emulate an entire ARM system:

# Boot a full ARM Linux in QEMU (example — paths vary)
qemu-system-arm -M vexpress-a9 -kernel zImage \
    -dtb vexpress-v2p-ca9.dtb -initrd rootfs.cpio.gz \
    -append "console=ttyAMA0" -nographic

This boots a complete Linux system — kernel, init, services — all emulated on your laptop. Useful for testing Buildroot images without flashing an SD card.

10.4 Comparison: When to Use Each Approach

| Scenario | Best Approach |
|---|---|
| Normal application debugging on Pi | GDB (local) or gdbserver (remote) |
| No Pi available / CI testing | QEMU user-mode |
| Need to debug with full IDE on laptop | gdbserver (remote) or QEMU + GDB |
| Testing kernel/boot changes | QEMU system emulation |
| Kernel panic, no dmesg output | JTAG (see Section 12) |
| Hardware-specific bug (timing, electrical) | Must use real hardware + oscilloscope |

Checkpoint 10

You can run ARM binaries on x86 with QEMU, attach GDB for debugging, and choose the right approach for different scenarios.


11. Kernel Logging Deep-Dive

Kernel messages are your primary tool for diagnosing driver and boot problems.

11.1 printk and Device Logging

In kernel code, printk writes to the kernel ring buffer:

// In a kernel driver
printk(KERN_INFO "mcp9808: probe successful, temp=%d\n", temp);

// Better: use dev_* functions — automatically include device name
dev_info(&client->dev, "probe successful, temp=%d\n", temp);
dev_err(&client->dev, "failed to read register: %d\n", ret);

| Function | Level | Use For |
|---|---|---|
| dev_err | Error | Failures that prevent operation |
| dev_warn | Warning | Recoverable issues |
| dev_info | Info | Successful initialization, key events |
| dev_dbg | Debug | Verbose tracing (off by default) |
| pr_debug | Debug | Module-level debug (no device context) |

11.2 Reading Kernel Logs

# Human-readable timestamps
dmesg -T

# Only errors and warnings
dmesg --level=err,warn

# Follow in real time (like tail -f for the kernel)
dmesg -w

# Filter by keyword
dmesg | grep -i spi
dmesg | grep -i "probe\|error\|fail"
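
Some minimal dmesg builds (for example in BusyBox-based images) may lack `--level`, but raw output (`dmesg -r` / `--raw`) keeps the numeric priority prefix such as `<3>`. The sketch below filters on that prefix. It assumes the `<N>[timestamp] message` raw format; the function name `kmsg_max_level` is hypothetical.

```shell
# kmsg_max_level MAX: read raw dmesg output on stdin and print messages
# whose level is <= MAX (0 = emerg, 3 = err, 4 = warning, 7 = debug).
kmsg_max_level() {
    awk -v max="$1" '
        match($0, /^<[0-9]+>/) {
            prio = substr($0, 2, RLENGTH - 2) + 0
            # The low three bits carry the severity level.
            if (prio % 8 <= max) print substr($0, RLENGTH + 1)
        }'
}

# Usage on the target: dmesg -r | kmsg_max_level 4
```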

11.3 journalctl for System Logs

# Kernel messages only, current boot (like dmesg). With persistent journal storage
# enabled, add -b -1 to read the previous boot's kernel log.
journalctl -k

# Specific service
journalctl -u data-logger.service

# Last 5 minutes
journalctl --since "5 min ago"

# Only errors and above
journalctl -p err

# Follow in real time
journalctl -f

11.4 Structured Logging

For automated log parsing, use key=value format in your applications:

# In your application or script
echo "ts=$(date +%s) sensor=mcp9808 temp_mC=45200 status=ok" >> /var/log/sensor.log

# Parse with standard tools
grep "status=error" /var/log/sensor.log | awk -F'temp_mC=' '{print $2}' | cut -d' ' -f1

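This mirrors the field-based records that journalctl --output=json emits and the structured output of many production logging frameworks: each value is addressed by name rather than by its position in the line.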
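
To keep writers and parsers consistent, the two halves can be wrapped in small helpers. This is a sketch under simple assumptions: values contain no spaces, and the function names `log_kv` and `kv_get` are made up for this example.

```shell
# log_kv KEY=VALUE...: print one log line with a Unix timestamp prepended.
log_kv() {
    printf 'ts=%s' "$(date +%s)"
    for pair in "$@"; do
        printf ' %s' "$pair"
    done
    printf '\n'
}

# kv_get KEY: read key=value lines on stdin and print the value of KEY
# for every line that contains it.
kv_get() {
    awk -v key="$1" '{
        for (i = 1; i <= NF; i++) {
            n = index($i, "=")
            if (n > 0 && substr($i, 1, n - 1) == key) print substr($i, n + 1)
        }
    }'
}

# Usage:
#   log_kv sensor=mcp9808 temp_mC=45200 status=ok >> /var/log/sensor.log
#   grep "status=error" /var/log/sensor.log | kv_get temp_mC
```

Looking fields up by name means the parser keeps working if a new field is added to the middle of the line.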

Checkpoint 11

You can read kernel logs at different severity levels, filter with dmesg and journalctl, and understand the dev_* logging functions used in drivers.


12. JTAG/SWD Awareness

Note

This section is informational — no hardware setup required. JTAG debugging requires a debug probe (e.g., Segger J-Link, FTDI-based adapter) which is not part of the standard kit.

12.1 What Is JTAG/SWD?

JTAG (Joint Test Action Group) and SWD (Serial Wire Debug) are hardware debug interfaces — physical pins on the processor that let an external tool:

  • Halt the CPU at any point
  • Read/write memory and registers directly
  • Set hardware breakpoints (no software instrumentation needed)
  • Debug without any OS — works from first instruction after reset

12.2 When You Need It

| Symptom | Why GDB/SSH Won't Work | JTAG Can Help |
|---|---|---|
| Board doesn't boot at all | No OS = no SSH, no GDB | Halt at reset vector, step through bootloader |
| Kernel panic with no serial output | dmesg buffer may be lost | Read memory directly, inspect crash state |
| Driver causes hard lockup | CPU is frozen, no response | Halt CPU, inspect registers and call stack |
| Hardware bring-up (new board) | Nothing works yet | Verify CPU runs, test memory, load first code |

12.3 Raspberry Pi 4 JTAG

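The Pi 4 can expose the ARM JTAG signals on GPIO 22-27 (BCM numbering, not header pin numbers). The pins must be switched to their JTAG alternate function first, for example with enable_jtag_gpio=1 in config.txt. OpenOCD (Open On-Chip Debugger) is the software bridge: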

Debug Probe ←→ OpenOCD ←→ GDB
   (USB)        (TCP)     (TCP)

OpenOCD translates GDB commands into JTAG signals. From GDB's perspective, it looks the same as target remote — but the target is raw hardware, not a running OS.

12.4 Decision Table: Which Debug Approach?

| Situation | Start With | Escalate To |
|---|---|---|
| App produces wrong output | printf / logging | GDB |
| App crashes (segfault) | GDB + backtrace | strace (if syscall-related) |
| Service won't start | journalctl + systemctl | strace on ExecStart |
| Device not detected | dmesg + lsmod | I2C/SPI bus scan |
| Driver probe fails | dmesg + dev_err output | GDB on kernel module (advanced) |
| System won't boot | Serial console + dmesg | JTAG |
| Intermittent timing issue | Logging + timestamps | oscilloscope + logic analyzer |

Checkpoint 12

You can explain when JTAG is needed, how it differs from software debugging, and choose the right debugging approach for different failure modes.


When to Escalate

| Symptom | First Tool | Second Tool | Third Tool |
|---|---|---|---|
| App crashes silently | strace | GDB + backtrace | dmesg |
| App produces wrong values | printf / logging | GDB + breakpoints | valgrind |
| Intermittent failures | journalctl --since | GDB + watchpoints | Logic analyzer |
| Hardware not responding | i2cdetect / lsmod | dmesg + strace | Oscilloscope |
| Kernel oops/panic | Serial console, dmesg | journalctl -k -b -1 | JTAG |
| Performance degradation | top, htop | perf stat, iostat | perf record + flamegraph |
| Service won't start | systemctl status | journalctl -u | strace on ExecStart |
| Missing library at runtime | ldd | readelf -d | LD_DEBUG=libs |

Challenge

Create a program debug_challenge.c with three deliberate bugs. Use the tools from this tutorial to find and fix each one:

  1. Bug 1 (strace): The program tries to open /dev/thermal instead of /sys/class/thermal/thermal_zone0/temp — find it with strace.
  2. Bug 2 (GDB): An off-by-one error in a loop — find it with GDB breakpoints and variable inspection.
  3. Bug 3 (dmesg/strace): The program tries to open /dev/mem without root. Find the "Permission denied" (EACCES) return in strace, then check whether dmesg logged anything about the access.

Deliverable: For each bug, document:

  • Which tool you used to find it
  • The exact command you ran
  • What the tool output showed
  • How you fixed it

Summary

| Section | Tool | What You Learned |
|---|---|---|
| 1-7 | dmesg, journalctl, strace, systemctl | System-level diagnostics |
| 8 | GDB | Interactive debugging: breakpoints, backtrace, variable inspection |
| 9 | gdbserver | Remote cross-debugging (laptop → Pi) |
| 10 | QEMU | Run and debug ARM binaries without hardware |
| 11 | dmesg, journalctl | Kernel logging: severity levels, filtering, structured logs |
| 12 | JTAG/OpenOCD | Hardware debugging awareness: when software tools aren't enough |
