PREEMPT_RT: Latency Measurement

Time estimate: ~45 minutes
Prerequisites: SSH Login

Learning Objectives

By the end of this tutorial you will be able to:

Measure scheduling latency using cyclictest
Explain what PREEMPT_RT changes in the Linux kernel
Compare baseline vs real-time kernel latency
Understand why worst-case latency matters more than average

What Makes a System "Real-Time"?

Real-time does not mean fast -- it means predictable. A system that always responds in 10 ms is more real-time than one that usually responds in 1 us but occasionally takes 100 ms.
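
To make that comparison concrete, here is a small sketch in shell and awk. The 10 ms, 1 us and 100 ms figures come from the paragraph above; the sample counts (99 fast responses and one slow one) are an assumption chosen for illustration, and `latency_stats` is a hypothetical helper name.

```shell
# latency_stats: read one latency sample (in us) per line, print "avg max".
# Hypothetical helper for illustration only.
latency_stats() {
    awk '{ sum += $1; if ($1 > max) max = $1 }
         END { printf "%.2f %d\n", sum / NR, max }'
}

# System A: always responds in 10 ms (10000 us).
for i in 1 2 3 4; do echo 10000; done | latency_stats
# prints: 10000.00 10000

# System B (assumed mix): 99 responses of 1 us, one of 100 ms (100000 us).
{ for i in $(seq 1 99); do echo 1; done; echo 100000; } | latency_stats
# prints: 1000.99 100000
```

System B has roughly a tenth of the average latency of system A, yet its worst case is ten times worse. A deadline of, say, 20 ms is always met by A and occasionally missed by B.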

Hard real-time: a missed deadline is a system failure (flight controller, ABS brakes).
Firm real-time: occasional misses degrade output quality (motor control, audio playback).
Soft real-time: the user notices but the system recovers (video streaming, UI refresh).

Standard Linux is optimised for throughput -- maximising total work done per second. Unpredictable latency comes from places where the kernel cannot be preempted: hardware interrupt handlers, spinlock-protected critical sections, and RCU callbacks. PREEMPT_RT, long maintained as an out-of-tree patch set and merged into mainline Linux as of 6.12, makes most of these sections preemptible. Interrupt handlers become kernel threads with tuneable priorities, most spinlocks become sleeping locks, and priority inheritance prevents unbounded priority inversion. The result: worst-case scheduling latency typically drops from milliseconds to tens of microseconds, at the cost of slightly lower overall throughput.

For deeper reading see the Real-Time Systems reference page.

Introduction

Standard Linux is optimized for throughput — it tries to get the most work done overall. But embedded control systems and data acquisition need predictable timing: the guarantee that a task will execute within a known worst-case deadline.

PREEMPT_RT is a kernel patch set that makes Linux more deterministic by:

Making most kernel code preemptible (interruptible by higher-priority tasks)
Converting most spinlocks to sleeping locks with priority inheritance (bounding priority inversion)
Using threaded interrupt handlers (giving you control over interrupt priority)

The trade-off: overall throughput decreases slightly, but worst-case latency improves dramatically — from milliseconds down to tens of microseconds.

1. Install Test Tools

Concept: rt-tests provides standard latency benchmarks for real‑time evaluation.

sudo apt-get install -y rt-tests

2. Baseline Latency

Concept: Measure first, optimize later. You need a reference point.

Run cyclictest:

sudo cyclictest -m -Sp90 -i200 -l10000

Record the value in the Max column for each thread.

Understanding cyclictest Flags

Flag	Meaning
`-m`	Lock memory pages (prevent swapping)
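`-S`	SMP mode: one measurement thread per CPU, each pinned to its own core, all at the same priority
`-p90`	Set thread priority to 90 (high, but below critical kernel threads); giving a priority makes cyclictest use SCHED_FIFO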
`-i200`	Interval of 200 microseconds between wakeups
`-l10000`	Run 10,000 measurement loops
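
The interval and loop count together determine how long the test runs. The following is a quick arithmetic sketch using the values from the command above; the variable names are only illustrative.

```shell
# Values from the command above: -i200 and -l10000.
interval_us=200
loops=10000

# Each loop waits one interval, so the nominal run time is interval * loops.
duration_ms=$(( interval_us * loops / 1000 ))
echo "nominal run time: ${duration_ms} ms"
# prints: nominal run time: 2000 ms
```

A run of about two seconds is enough for a first look, but rare latency spikes are more likely to show up in longer runs (a larger `-l` value).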

Example Output

# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.12 0.08 0.03

T: 0 ( 1842) P:90 I:200 C:  10000 Min:      6 Act:   12 Avg:   11 Max:      87
T: 1 ( 1843) P:90 I:200 C:  10000 Min:      5 Act:   10 Avg:   10 Max:      73
T: 2 ( 1844) P:90 I:200 C:  10000 Min:      6 Act:   11 Avg:   11 Max:      91
T: 3 ( 1845) P:90 I:200 C:  10000 Min:      5 Act:    9 Avg:   10 Max:      68

The Max column is the most important: it shows the worst-case latency observed during the run, in microseconds. On a standard kernel, values of 50-500 us are typical. With PREEMPT_RT, you should typically see max values below 50 us.
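
If you save the output, the overall worst case across all threads can be pulled out with a little awk. This is a minimal sketch, assuming the per-thread lines look like the example output above; `worst_case_max` is a hypothetical helper name, not part of rt-tests.

```shell
# worst_case_max: print the largest Max value across all cyclictest thread lines.
# Reads from the files given as arguments, or from stdin if there are none.
worst_case_max() {
    awk '/^T:/ {
             for (i = 1; i < NF; i++)
                 if ($i == "Max:" && $(i + 1) + 0 > max) max = $(i + 1) + 0
         }
         END { print max + 0 }' "$@"
}

# Sample input: the thread lines from the example output above.
worst_case_max <<'EOF'
T: 0 ( 1842) P:90 I:200 C:  10000 Min:      6 Act:   12 Avg:   11 Max:      87
T: 1 ( 1843) P:90 I:200 C:  10000 Min:      5 Act:   10 Avg:   10 Max:      73
T: 2 ( 1844) P:90 I:200 C:  10000 Min:      6 Act:   11 Avg:   11 Max:      91
T: 3 ( 1845) P:90 I:200 C:  10000 Min:      5 Act:    9 Avg:   10 Max:      68
EOF
# prints: 91
```

The helper searches for the `Max:` label instead of relying on a fixed column number, so it should keep working if the PID field changes width.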

Checkpoint

Your baseline measurement should complete and show max latency values. Record these numbers — you will compare them after installing the RT kernel.

3. Install PREEMPT_RT Kernel

On Raspberry Pi OS, an RT kernel is available as a package:

sudo apt install -y linux-image-rt-arm64

If that package is not found, search for the RT kernel images your distribution provides:

apt-cache search linux-image | grep -- -rt

After installing, reboot:

sudo reboot

Verify the kernel is RT-enabled:

uname -a

You should see PREEMPT_RT or PREEMPT RT in the kernel version string.
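
The same check can be scripted. This is a minimal sketch, assuming the version string contains one of the two spellings mentioned above; `is_rt_kernel` is a hypothetical helper name.

```shell
# is_rt_kernel: succeed if the given version string mentions PREEMPT_RT.
is_rt_kernel() {
    case "$1" in
        *PREEMPT_RT*|*"PREEMPT RT"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel (uname -v prints the kernel version/build string).
if is_rt_kernel "$(uname -v)"; then
    echo "RT kernel detected"
else
    echo "not an RT kernel"
fi
```

A plain `PREEMPT` or `PREEMPT_DYNAMIC` kernel does not match, which is intended: those are preemptible kernels, but not real-time ones.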

Stuck?

If no RT package is available for your kernel version, you may need to build the kernel from source with the PREEMPT_RT patch. This is a longer process — see Buildroot for custom kernel building.

4. Compare Results

Concept: The goal is reduced worst‑case latency, not just better average.

Create a small table:

Kernel	Max Latency (us)
Default
PREEMPT_RT

Checkpoint

After re-running cyclictest on the RT kernel, your max latency should be noticeably lower than the baseline measurement.

What Just Happened?

You measured the difference between a general-purpose kernel and a real-time kernel. The key insight is that average latency is misleading — what matters for embedded systems is the worst case.

Consider a motor control loop running at 1 kHz (1000 us period):

If max latency exceeds 1000 us, the control loop misses a deadline
Missing deadlines causes jerky motion, oscillation, or damage
A standard kernel might hit 500 us max latency: within the deadline at 1 kHz, but half the period is gone before the control code even starts
PREEMPT_RT typically keeps max latency under 50 us, which leaves plenty of margin
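
The margin in this example can be worked out directly. The sketch below uses only the figures from the text (1000 us period, 500 us and 50 us worst-case latency); `budget_used_pct` is a hypothetical helper name.

```shell
# budget_used_pct: percentage of the control period consumed by worst-case latency.
# Arguments: period in us, max latency in us. Uses integer arithmetic.
budget_used_pct() {
    period_us=$1
    max_latency_us=$2
    echo $(( max_latency_us * 100 / period_us ))
}

budget_used_pct 1000 500   # standard kernel figure from the text; prints: 50
budget_used_pct 1000 50    # PREEMPT_RT figure from the text; prints: 5
```

Whatever is left of the period after the wakeup latency has to cover the control computation itself, so a smaller percentage means more room for the actual work.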

Challenges

Challenge 1: Stress Test

Install stress-ng and measure latency under CPU load:

sudo apt install -y stress-ng
stress-ng --cpu 4 &
sudo cyclictest -m -Sp90 -i200 -l10000
kill %1   # stop the background stress-ng job

Compare max latency with and without load.

Challenge 2: Histogram

Run cyclictest with histogram output and plot the distribution:

sudo cyclictest -m -Sp90 -i200 -l100000 -h100 -q > histogram.txt
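
Before plotting, you can get a first number out of the file with awk. This is a minimal sketch, assuming each histogram row has the form `<latency_us> <count_thread0> <count_thread1> ...` and that summary lines start with `#`. The sample rows below are made up for illustration, and `hist_max_bucket` is a hypothetical helper name.

```shell
# hist_max_bucket: print the highest latency bucket (in us) that holds at least
# one sample in any thread column. Reads files given as arguments, or stdin.
hist_max_bucket() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 2 {
             total = 0
             for (i = 2; i <= NF; i++) total += $i
             if (total > 0) max = $1 + 0
         }
         END { print max + 0 }' "$@"
}

# Made-up sample in the assumed format (two threads).
hist_max_bucket <<'EOF'
# Histogram
000005 000120 000098
000006 000310 000402
000007 000002 000000
000008 000000 000000
# Total: 000000432 000000500
EOF
# prints: 7
```

On a real run the call would be `hist_max_bucket histogram.txt`. Samples beyond the `-h` limit are not placed in any bucket, so a result just below that limit is worth double-checking against the Max values.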

Advanced: ftrace for Latency Analysis

Tip

ftrace is the kernel's built-in function tracer. It can trace scheduling events, function calls, and interrupt handlers — helping you find the longest non-preemptible section in your system.

Enable Function Tracer

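# Run these commands from a root shell (for example after `sudo -i`);
# `sudo echo ... > file` fails because the redirection runs as your own user.

# Check available tracers
cat /sys/kernel/debug/tracing/available_tracers

# Enable the function_graph tracer
echo function_graph > /sys/kernel/debug/tracing/current_tracer

# Also record scheduling events alongside the function trace
echo 1 > /sys/kernel/debug/tracing/events/sched/sched_switch/enable
echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable

# Start tracing
echo 1 > /sys/kernel/debug/tracing/tracing_on
sleep 5
echo 0 > /sys/kernel/debug/tracing/tracing_on

# Read the trace
cat /sys/kernel/debug/tracing/trace | head -100

# Restore the default (no-op) tracer when you are done
echo nop > /sys/kernel/debug/tracing/current_tracer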

Look for long gaps between sched_wakeup and sched_switch — these are scheduling delays.

Tip

trace-cmd Visualization

trace-cmd wraps ftrace with a user-friendly interface:

# Install trace-cmd
sudo apt install -y trace-cmd

# Record scheduling events for 10 seconds under load
stress-ng --cpu 4 &
sudo trace-cmd record -e sched_switch -e sched_wakeup sleep 10
kill %1

# Print the recorded scheduling events for CPU 0
sudo trace-cmd report --cpu 0 | head -80

The report shows every context switch with timestamps. Look for the longest gap between a task being woken (sched_wakeup) and actually running (sched_switch) — this is your scheduling latency.
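
Finding that gap by eye is tedious, so here is a rough awk sketch that pairs each wakeup with the next switch to the same task. It assumes report lines of the form `task-pid [cpu] timestamp: event: details`, with `sched_wakeup` details starting with `comm:pid` and `sched_switch` details containing `==> comm:pid`. The real layout varies between trace-cmd versions, and task names containing spaces will not parse. The sample lines are made up, and `wakeup_latency_max` is a hypothetical helper name.

```shell
# wakeup_latency_max: print the longest wakeup-to-run delay and the task it hit.
# Assumes the simplified line layout described in the lead-in.
wakeup_latency_max() {
    awk '{
        ts = ""
        for (i = 1; i < NF; i++)
            if ($i ~ /^[0-9]+\.[0-9]+:$/) {      # timestamp field, e.g. 100.000135:
                ts = $i; sub(/:$/, "", ts); evi = i + 1; break
            }
        if (ts == "") next
        if ($evi == "sched_wakeup:") {
            woke[$(evi + 1)] = ts                # remember when the task was woken
        } else if ($evi == "sched_switch:") {
            for (j = evi + 1; j < NF; j++)
                if ($j == "==>" && ($(j + 1) in woke)) {
                    task = $(j + 1)
                    d = (ts - woke[task]) * 1000000   # seconds to microseconds
                    if (d > max) { max = d; who = task }
                    delete woke[task]
                }
        }
    } END { printf "%.0f us (%s)\n", max, who }' "$@"
}

# Made-up sample lines in the assumed format.
wakeup_latency_max <<'EOF'
<idle>-0 [000] 100.000100: sched_wakeup: cyclictest:1842 [9] CPU:000
<idle>-0 [000] 100.000135: sched_switch: swapper/0:0 [120] R ==> cyclictest:1842 [9]
cyclictest-1842 [000] 100.000300: sched_wakeup: kworker/0:1:55 [120] CPU:000
cyclictest-1842 [000] 100.000312: sched_switch: cyclictest:1842 [9] S ==> kworker/0:1:55 [120]
EOF
# prints: 35 us (cyclictest:1842)
```

On a real recording the call would look like `sudo trace-cmd report | wakeup_latency_max`. Treat the result as a starting point and confirm the longest gap in the raw report.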

Compare results with and without PREEMPT_RT to see the kernel's impact on worst-case scheduling delay.

Deliverable

Measurement	Default Kernel	PREEMPT_RT Kernel
Max latency (us)
Avg latency (us)
Max latency under CPU load (us)

Short analysis: Why does worst-case latency matter more than average for embedded control?
