
Lesson 5: Graphics Stack

Óbuda University — Linux in Embedded Systems


Starting Point: What You Already Know

Every time you use your PC — Ubuntu, Windows, macOS — this happens:

  Browser     Editor      Terminal     File Manager
    │           │           │              │
    └───────────┴───────────┴──────────────┘
             ┌──────▼──────┐
             │   Desktop   │  Ubuntu GNOME / Windows DWM / macOS
             │ Environment │
             │  - arranges windows on screen
             │  - routes keyboard and mouse to the right app
             │  - draws shadows, taskbar, animations
             └──────┬──────┘
             ┌──────▼──────┐
             │   Display   │  kernel graphics driver
             │   Driver    │
             └──────┬──────┘
             ┌──────▼──────┐
             │   Monitor   │  HDMI / laptop panel
             └─────────────┘

This is the full desktop stack — multiple windows, taskbar, animations, drag-and-drop. You use it every day.

Now imagine: your product has one fullscreen app, no file manager, no desktop, no overlapping windows. Do you still need all those layers?


The Embedded Question

  Desktop (what you know):              Embedded (what you're building):

  Firefox  VS Code  Terminal            Your single app
     │        │        │                      │
  ┌──▼────────▼────────▼───┐           ┌──────▼──────┐
  │     Compositor         │           │  DRM/KMS    │  ← direct, no compositor
  └──────────┬─────────────┘           └──────┬──────┘
  ┌──────────▼─────────────┐           ┌──────▼──────┐
  │       DRM/KMS          │           │  Display    │
  └──────────┬─────────────┘           └─────────────┘
  ┌──────────▼─────────────┐
  │       Display          │            Removed: compositor, window manager,
  └────────────────────────┘            desktop environment, login screen

On embedded, you strip away layers until only the essential path remains. The question is: how far can you strip?

The goal of this lecture: understand what each layer does, so you can decide which ones to keep and which to remove.


Today's Map

  • Block 1 (45 min): The display hardware pipeline, three graphics levels (from simplest to desktop), fbdev vs DRM/KMS architecture, GPU stack, display interfaces.
  • Block 2 (45 min): Tearing experiment: display scan-out, tearing mechanism, VSync and page flipping, write-and-fix exercise.

What the Display Hardware Actually Does

Before comparing the three levels, understand the hardware that all of them sit on top of.

Every display system has the same pipeline — from pixels in memory to light on the screen:

  Framebuffer → Plane → CRTC → Encoder → [Bridge] → Connector → Panel

The display controller reads from buffers in memory and scans out pixels row by row, synchronized to the pixel clock. This happens continuously — 60 times per second at 60 Hz.
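That refresh rate fixes the time budget for every layer above it. A quick sanity check in Python (plain arithmetic, nothing hardware-specific):

```python
def frame_budget_ms(refresh_hz):
    # Time between two consecutive scan-outs: everything upstream
    # (rendering, blending, page flip) must finish within this window.
    return 1000.0 / refresh_hz

# At 60 Hz the whole pipeline gets ~16.7 ms per frame; at 50 Hz, 20 ms.
```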

The key question: how does your application tell the display controller which buffer to read?


The Display Hardware Pipeline — In Detail

| Stage | Hardware | What it does |
|---|---|---|
| Framebuffer | Memory buffer(s) | Stores pixel data — one or more buffers in RAM |
| Plane | Pixel mixer | Rotation, scaling, format conversion, layer blending |
| CRTC | Timing generator | Generates pixel clock, HSync, VSync — drives scan-out |
| Encoder | Interface adapter | Physical adaptation — converts to the wire protocol (TMDS, DSI, LVDS) |
| Bridge | Interface transcoder | Converts between display interfaces (e.g., DSI → DPI). Optional — not all paths have one. |
| Connector | Physical port | The socket: HDMI, DSI ribbon, SPI pins |
| Panel / Monitor | Display surface | Emits or reflects light. A panel is just the LCD; a monitor integrates a panel + housing + EDID. |

Pi 4 Examples

  HDMI path:  Framebuffer → Plane → CRTC → HDMI Encoder ──────────────► HDMI Monitor
                                                          (no bridge — direct TMDS)

  DSI path:   Framebuffer → Plane → CRTC → DSI Encoder → TC358762 bridge → 7" LCD Panel
                                                          (DSI → DPI transcoding)

The TC358762 is a bridge chip that converts DSI packets to parallel DPI signals — the LCD panel cannot speak DSI directly. This is common in embedded: the SoC outputs DSI, but the panel expects DPI, LVDS, or eDP, so a bridge chip translates.

This hardware chain exists whether you use fbdev, DRM, or a compositor. The difference is how much the software models it.


Inside the CRTC: Where Planes Become Pixels

The CRTC is the most complex stage. Here is what happens inside:

  Plane 0 (primary)  ──► DMA read ──┐
  Plane 1 (overlay)  ──► DMA read ──┤──► Compositor ──► Sync generator ──► Encoder
  Plane 2 (cursor)   ──► DMA read ──┘    (blend)        (HSync, VSync,
                                                          pixel clock)
| Internal stage | What it does |
|---|---|
| Pixel fetch (DMA) | Reads pixel data from each plane's buffer in memory. The display controller has its own DMA engine — no CPU involvement. |
| Compositor | Blends all active planes together — alpha blending, z-ordering, scaling, color conversion. This is hardware compositing, not software. |
| Sync generator | Produces the timing signals: pixel clock, HSync (end of line), VSync (end of frame). These drive the encoder and ultimately the display. |

Why planes matter: A video player puts the video stream on one plane and subtitles on an overlay plane. The CRTC composites them in hardware — zero CPU work, zero memory copies. Without planes, the CPU would have to alpha-blend every frame.

  Without planes (CPU compositing):     With planes (HW compositing):

  CPU reads video buffer                 Plane 0 → video buffer
  CPU reads subtitle buffer              Plane 1 → subtitle buffer
  CPU blends pixel-by-pixel              CRTC blends in hardware
  CPU writes to display buffer           → zero CPU work per frame
  → CPU busy every frame

Hardware planes are the reason DRM/KMS can display video + UI overlay at 60 FPS on a low-power SoC without breaking a sweat.
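To make that cost concrete, here is the per-pixel work the CPU performs when it composites a subtitle over video in software — the job an overlay plane does for free. A minimal sketch in Python (pure arithmetic, no real buffers):

```python
def blend_pixel(dst, src, alpha):
    # Classic alpha blend: out = src*a + dst*(1 - a), with a in 0..255
    return tuple((s * alpha + d * (255 - alpha)) // 255 for s, d in zip(src, dst))

def blend_row(video_row, subtitle_row, alpha_row):
    # Without overlay planes, the CPU runs this for every row of every frame
    return [blend_pixel(v, s, a)
            for v, s, a in zip(video_row, subtitle_row, alpha_row)]

# 800 x 480 @ 60 FPS = ~23 million blend_pixel calls per second on the CPU.
# With an overlay plane, the CRTC performs the identical blend in hardware.
```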


Three Ways to Talk to This Hardware

Now you know what the display hardware does. Linux gives you three software paths to control it — from simplest to most capable:

| Level | Approach | What it hides | What it gives you |
|---|---|---|---|
| A | Raw framebuffer (fbdev) | Everything — one flat buffer | open(), mmap(), write pixels |
| B | DRM/KMS | Nothing — full pipeline exposed | Planes, CRTC, page flip, VSync |
| C | Full compositor (Wayland/X11) | DRM details — apps just render | Multiple windows, input routing |

You started at Level C (your laptop desktop). Embedded systems work at Level A or B. Let's look at each.


Level A — Framebuffer (fbdev)

fbdev gives you the simplest possible view of this hardware: one flat buffer.

  Your Application
      │  open("/dev/fb0")
      │  mmap() → pointer to pixel memory
      │  write pixels directly
  ┌──────────────────────────────────────────────────┐
  │  fbdev kernel driver                             │
  │                                                  │
  │  ┌──────────┐                                    │
  │  │  Buffer  │ ← your pixels go here              │
  │  └────┬─────┘                                    │
  │       │                                          │
  │       ▼  (everything below is hidden from you)   │
  │  Plane → CRTC → Encoder → Connector → Panel      │
  └──────────────────────────────────────────────────┘

fbdev hides the display pipeline. You get one buffer, one resolution (set at boot or by fbset), no timing control, no page flipping. The driver handles everything internally.

The simplicity is the point. open(), mmap(), write pixels. Done.
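A minimal sketch of that workflow in Python. The ioctl number comes from <linux/fb.h>; the sketch assumes a 32 bpp XRGB8888 framebuffer with no per-row padding (production code should read the true stride via FBIOGET_FSCREENINFO rather than computing it):

```python
import fcntl, mmap, os, struct

FBIOGET_VSCREENINFO = 0x4600  # from <linux/fb.h>

def fb_geometry(raw):
    # First seven u32 fields of struct fb_var_screeninfo:
    # xres, yres, xres_virtual, yres_virtual, xoffset, yoffset, bits_per_pixel
    xres, yres, *_virtual_and_offsets, bpp = struct.unpack("<7I", raw[:28])
    return xres, yres, bpp

def pixel_offset(x, y, stride, bytes_per_pixel):
    # Byte offset of pixel (x, y) in a flat, row-major framebuffer
    return y * stride + x * bytes_per_pixel

if os.path.exists("/dev/fb0"):
    try:
        fd = os.open("/dev/fb0", os.O_RDWR)
        info = bytearray(160)  # large enough for struct fb_var_screeninfo
        fcntl.ioctl(fd, FBIOGET_VSCREENINFO, info)
        xres, yres, bpp = fb_geometry(bytes(info))
        if bpp == 32:
            stride = xres * 4  # assumption: no per-row padding
            fb = mmap.mmap(fd, stride * yres)
            # Paint a 10x10 red square at (50, 50); bytes land as B, G, R, X
            for y in range(50, 60):
                off = pixel_offset(50, y, stride, 4)
                fb[off:off + 40] = b"\x00\x00\xff\x00" * 10
            fb.close()
        os.close(fd)
    except OSError:
        pass  # no permission, or the driver refused: fine for a sketch
```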


fbdev — What You Can and Cannot Do

| Capability | fbdev | Notes |
|---|---|---|
| Write pixels | Yes | mmap() + direct memory writes |
| Read resolution | Yes | ioctl(FBIOGET_VSCREENINFO) |
| Change resolution | Fragile | fbset — not all drivers support it |
| VSync / page flip | No | You write while the display reads → tearing |
| Multiple planes | No | One buffer, one layer |
| Multi-display | No | Each /dev/fbN is independent, no coordination |
| GPU acceleration | No | CPU draws every pixel |

fbdev is excellent for quick experiments and simple displays (OLED, e-ink, small LCDs). For production on modern hardware, prefer DRM/KMS.


Level B — DRM/KMS (Kernel Mode Setting)

DRM/KMS exposes the full hardware pipeline to userspace:

  Your Application
      │  open("/dev/dri/card0")
      │  enumerate connectors → find display
      │  set mode → resolution + timing
      │  allocate dumb buffer → draw pixels
      │  page flip at VBlank → tear-free
  ┌────────────────────────────────────────────────────────────┐
  │  DRM/KMS kernel subsystem                                  │
  │                                                            │
  │  ┌──────────┐  ┌──────────┐  ┌──────────┐                  │
  │  │ Buffer A │  │ Buffer B │  │ Buffer C │  (GEM objects)   │
  │  └────┬─────┘  └────┬─────┘  └────┬─────┘                  │
  │       │             │              │                       │
  │       ▼             ▼              ▼                       │
  │  ┌──────────┐   ┌──────────┐  ┌──────────┐                 │
  │  │ Primary  │   │ Overlay  │  │ Cursor   │  (drm_plane)    │
  │  │  Plane   │   │  Plane   │  │  Plane   │                 │ 
  │  └────┬─────┘   └────┬─────┘  └────┬─────┘                 │
  │       └──────────────┴─────────────┘                       │
  │                      │                                     │
  │                ┌─────▼─────┐                               │
  │                │   CRTC    │  timing + pixel streaming     │
  │                └─────┬─────┘  (drm_crtc)                   │
  │                ┌─────▼─────┐                               │
  │                │  Encoder  │  protocol adaptation          │
  │                └─────┬─────┘  (drm_encoder)                │
  │                ┌─────▼─────┐                               │
  │                │  Bridge   │  interface transcoding (opt.) │
  │                └─────┬─────┘  (drm_bridge)                 │
  │                ┌─────▼─────┐                               │
  │                │ Connector │  physical port                │
  │                └─────┬─────┘  (drm_connector)              │
  └──────────────────────┼─────────────────────────────────────┘
                  ┌──────────────┐
                  │ Panel/Monitor│  (drm_panel)
                  └──────────────┘

You see every stage and its kernel struct. You choose which buffer maps to which plane, when the page flip happens, which connector to use. The hardware pipeline is no longer hidden — DRM models it directly.


fbdev vs DRM/KMS — Architecture Comparison

  fbdev:                              DRM/KMS:
  ┌─────────────┐                     ┌─────────────┐
  │ Application │                     │ Application │
  └──────┬──────┘                     └──────┬──────┘
         │                                   │
    open("/dev/fb0")                    open("/dev/dri/card0")
    mmap()                              libdrm / ioctl
    write pixels                        enumerate, configure, flip
         │                                   │
  ┌──────▼──────┐                     ┌──────▼──────┐
  │  fb driver  │                     │  DRM core   │
  │ (one buffer │                     │ (buffers,   │
  │  one mode   │                     │  planes,    │
  │  hidden HW) │                     │  CRTCs,     │
  └──────┬──────┘                     │  encoders,  │
         │                            │  connectors)│
         ▼                            └──────┬──────┘
    Display HW                               │
                                        Display HW
| Aspect | fbdev | DRM/KMS |
|---|---|---|
| Hardware model | Flat buffer — pipeline hidden | Full pipeline — planes, CRTC, encoder, bridge, connector |
| Buffer management | One buffer, driver-managed | Multiple buffers, app-managed (GEM) |
| Mode setting | fbset (fragile) | drmModeSetCrtc() (reliable) |
| VSync / page flip | Not supported | drmModePageFlip() at VBlank |
| Multiple displays | Separate /dev/fbN, no coordination | Single /dev/dri/card0, coordinated |
| Hardware planes | Not exposed | Primary, overlay, cursor — HW compositing |
| API stability | Deprecated since ~2015 | Current kernel standard |
| Kernel code path | Many fbdev drivers are DRM wrappers now | Native |
| Device node | /dev/fb0 | /dev/dri/card0 |

The takeaway: fbdev pretends the hardware is a flat buffer. DRM/KMS models what the hardware actually is.


DRM Objects on Real Hardware

On a Raspberry Pi 4 with HDMI and DSI connected:

  Framebuffer A ──► Primary Plane 0 ──┐
  Framebuffer B ──► Overlay Plane 0 ──┤
  Framebuffer C ──► Cursor Plane 0 ───┤
                   CRTC 0 ──► HDMI Encoder ───────────────► HDMI-A-1 ──► Monitor
                                               (no bridge — direct TMDS)

  Framebuffer D ──► Primary Plane 1 ──┐
                   CRTC 1 ──► DSI Encoder ──► TC358762 ──► DSI-1 ──► 7" LCD
                                              (bridge)    (connector)

The HDMI path has no bridge — the encoder outputs TMDS directly to the connector. The DSI path has a bridge chip (TC358762) that converts DSI to DPI for the LCD panel.

Inspect on your Pi:

# List all DRM objects
sudo modetest -M vc4    # shows connectors, encoders, CRTCs, planes

# Or with Python
python3 -c "import subprocess; subprocess.run(['modetest', '-M', 'vc4', '-c'])"

Full Software Stack — Without GPU

For CPU-rendered applications (fbdev-style drawing through DRM):

  ┌─────────────────────────────────────────────────────────┐
  │  Your Application (C / Python / SDL2)                   │
  │    draw_pixel(x, y, color)                              │
  ├─────────────────────────────────────────────────────────┤
  │  libdrm (user-space library)                            │
  │    drmModeSetCrtc(), drmModePageFlip()                  │
  │    drmIoctl() → /dev/dri/card0                          │
  ├─────────────────────────────────────────────────────────┤
  │  DRM/KMS Core (kernel)                                  │
  │    mode setting, buffer management, VBlank events       │
  ├─────────────────────────────────────────────────────────┤
  │  GEM (Graphics Execution Manager)                       │
  │    allocates "dumb buffers" in video/system memory      │
  ├─────────────────────────────────────────────────────────┤
  │  Display Controller Hardware (vc4 / v3d on Pi)          │
  │    reads buffer → CRTC → encoder → connector → panel    │
  └─────────────────────────────────────────────────────────┘

No GPU involved. The CPU writes pixels to a dumb buffer. The display controller hardware scans it out. This is what modetest and our DRM/KMS tutorials use.


Full Software Stack — With GPU (OpenGL / Vulkan)

For GPU-accelerated rendering (3D, animations, Qt QML):

  ┌─────────────────────────────────────────────────────────┐
  │  Your Application                                       │
  │    glDrawArrays(), SDL_RenderPresent()                  │
  ├─────────────────────────────────────────────────────────┤
  │  OpenGL ES / Vulkan API                                 │
  ├─────────────────────────────────────────────────────────┤
  │  Mesa (user-space GPU driver)                           │
  │    translates GL calls → GPU commands                   │
  │    manages shader compilation, state tracking           │
  ├─────────────────────────────────────────────────────────┤
  │  GBM (Generic Buffer Manager)                           │
  │    allocates GPU-accessible render targets              │
  ├─────────────────────────────────────────────────────────┤
  │  EGL (platform glue)                                    │
  │    connects GL context to DRM display surface           │
  │    EGL_PLATFORM_GBM → no compositor needed              │
  ├─────────────────────────────────────────────────────────┤
  │  DRM/KMS Core (kernel)                                  │
  │    page flip rendered buffer to display                 │
  ├─────────────────────────────────────────────────────────┤
  │  GPU Hardware (V3D on Pi)        Display Controller     │
  │    executes shaders,              scans out buffer      │
  │    rasterizes triangles           CRTC → encoder → out  │
  └─────────────────────────────────────────────────────────┘

Mesa is the open-source GPU driver stack. On the Pi 4, it uses the V3D driver for the VideoCore VI GPU. Mesa translates OpenGL/Vulkan calls into GPU hardware commands.

EGL is the glue between the rendering API (OpenGL) and the display system (DRM). On embedded Linux without a compositor, EGL binds directly to GBM/DRM — this is what Qt EGLFS and SDL2 KMSDRM use.


Where SDL2 and Qt Fit

SDL2 and Qt are application toolkits — they sit on top of these stacks and choose the right path:

                    SDL2                              Qt
                     │                                 │
          ┌──────────┼──────────┐           ┌──────────┼──────────┐
          │          │          │           │          │          │
      KMSDRM      fbcon     Wayland      EGLFS     Wayland       XCB
      backend     backend   backend      plugin    plugin       plugin
          │          │          │           │          │          │
       DRM/KMS    fbdev    Compositor   EGL+DRM   Compositor     X11
       (direct)  (legacy)  (desktop)    (direct)  (desktop)   (legacy)

For embedded (no compositor):

  • SDL2 → KMSDRM backend → DRM/KMS directly
  • Qt → EGLFS plugin → EGL + DRM directly

For desktop:

  • SDL2 → Wayland backend → compositor → DRM/KMS
  • Qt → Wayland plugin → compositor → DRM/KMS

The application code does not change. The backend/plugin selection decides the display path.

# Force SDL2 to use DRM directly (no compositor)
export SDL_VIDEODRIVER=kmsdrm
./my_sdl2_app

# Force Qt to use EGLFS (no compositor)
export QT_QPA_PLATFORM=eglfs
./my_qt_app

Level C — Full Graphics Stack (Wayland/X11)

A compositor sits between your application and the display:

  ┌───────────────────────────────────────────┐
  │  App 1     App 2     App 3     Cursor     │
  │    │         │         │         │        │
  │    └─────────┴─────────┴─────────┘        │
  │                  │                        │
  │           ┌──────▼──────┐                 │
  │           │  Compositor │ (Weston, Mutter)│
  │           │  - window placement           │
  │           │  - input routing              │
  │           │  - GPU compositing            │
  │           └──────┬──────┘                 │
  │                  │                        │
  │           ┌──────▼──────┐                 │
  │           │  DRM/KMS    │                 │
  │           └──────┬──────┘                 │
  │                  │                        │
  │           ┌──────▼──────┐                 │
  │           │   Display   │                 │
  │           └─────────────┘                 │
  └───────────────────────────────────────────┘

The compositor manages window placement, input routing, and GPU-accelerated compositing. This is desktop Linux.


Full Stack Trade-offs

Pros:

  • Rich UI — multiple windows, drag-and-drop, tooltips, cursor
  • Hardware acceleration — GPU compositing, OpenGL/Vulkan
  • UI toolkits — Qt, GTK, Flutter work out of the box
  • Standard input handling — keyboard, mouse, touch, gestures

Cons:

  • Adds 2-15 seconds to boot time (depending on stack)
  • Consumes 50-200+ MB of RAM
  • Extra buffering layer between app and display
  • More components to configure, update, and debug
  • More failure points — compositor crash = black screen

For a single fullscreen embedded app, the compositor manages windows that will never appear. You pay the full cost for zero benefit.


Comparison Table

| Approach | Boot impact | Memory | CPU overhead | Complexity |
|---|---|---|---|---|
| Raw framebuffer (fbdev) | None | ~1 MB | Minimal | Low |
| DRM/KMS (dumb buffer) | None | ~2-4 MB | Low | Medium |
| Wayland + Weston | +2-5 s | ~50-100 MB | Medium | High |
| X11 + desktop | +5-15 s | ~200+ MB | High | Very high |

The difference between fbdev/DRM and a full compositor is not incremental — it is an order-of-magnitude jump in resource consumption and complexity.

On a 256 MB device with a 10-second boot budget, a compositor consumes half your RAM and half your boot time before your application even starts.


Wayland vs X11

Two generations of the same idea. Under X11, applications send drawing commands to the Xorg server, which renders them and needs a separate compositor for effects like transparency. Under Wayland, applications render into their own buffers and the compositor combines them: fewer components, a simpler protocol. The Key Terms table later in this lesson spells out exactly who does what.


Embedded vs Desktop Mindset

On desktop Linux, the graphics stack is chosen for you. Your distribution ships GNOME or KDE with a Wayland compositor. You never think about it.

On embedded Linux, you choose explicitly. Every component is a decision.

| Resource | Desktop (8 GB RAM, SSD) | Embedded (256 MB RAM, eMMC) |
|---|---|---|
| 100 MB for compositor | 1.25% of RAM | 39% of RAM |
| 5 s for compositor boot | Unnoticeable | 50% of boot budget |
| 20 packages to maintain | Lost in 2000+ packages | 20% of total image |

What is invisible on desktop dominates on embedded. This is why embedded engineers must understand the graphics stack — not just use it.


Four Decision Factors

When choosing your graphics level, evaluate these four factors:

1. Boot time — heavier stacks take longer to initialize. A compositor adds seconds. fbdev/DRM add nothing.

2. Reliability — more components = more failure points. A compositor crash means black screen. Direct DRM means one less thing to break.

3. Maintenance cost — the compositor needs configuration, updates, and debugging. Direct rendering has fewer moving parts to maintain over a 10-year product lifecycle.

4. UI complexity — only use a heavy stack if you actually need its features. Multiple overlapping windows? You need a compositor. Single fullscreen app? You do not.

Start from the lightest option that meets requirements. Move up only when you hit a concrete limitation.


"No GUI" Still Needs Graphics

A common misconception: removing the desktop environment means giving up display output.

Wrong. Most embedded Linux products with displays run without a desktop but still draw to screen.

Key distinction:

| Concept | What it means |
|---|---|
| Desktop GUI | Window manager, taskbar, file manager, system tray |
| Display output | Application renders directly to hardware |

Removing the desktop removes window management — not the ability to put pixels on a screen. The vast majority of embedded displays (kiosks, HMIs, dashboards, digital signage) have no desktop environment at all.


Common Headless Display Patterns

Real products that render to display without a desktop:

  • PIL/Pillow -> fbi -> framebuffer — industrial panels, point-of-sale terminals. Generate an image in Python, push it to /dev/fb0.

  • OpenCV -> framebuffer — machine vision HMIs. Process camera frames, render results directly to display.

  • DRM dumb buffer — kiosks, digital signage, transportation displays. Allocate a buffer, draw pixels, page flip.

  • Custom fb driver — LED matrices, e-ink displays, segment LCDs. Write a minimal kernel driver that exposes /dev/fb0 for non-standard display hardware.

  • SDL2 + DRM backend — games, simulators, status dashboards. SDL2 can render directly via DRM/KMS without any compositor.

All of these produce display output. None of them need a window manager.


Decision Flowchart

  Need display output?
         |
    +----+----+
    No        Yes
    |         |
  [Done]   Multiple windows needed?
              |
         +----+----+
         Yes       No
         |         |
    [Wayland/   Need HW acceleration (GPU)?
     X11]         |
             +----+----+
             Yes       No
             |         |
         [DRM/KMS   Quick prototype / simple display?
          + GPU]       |
                  +----+----+
                  Yes       No
                  |         |
             [fbdev]   [DRM/KMS
                        dumb buffer]

Follow this flowchart from top to bottom. Most embedded products land on DRM/KMS (dumb buffer) or fbdev. Only products with genuine multi-window needs should reach for a compositor.
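The same decision logic, transcribed into a small Python function (the function name and return strings are ours, purely illustrative):

```python
def choose_graphics_stack(need_display, multi_window, need_gpu, quick_prototype):
    """Direct transcription of the flowchart above, top to bottom."""
    if not need_display:
        return "none"
    if multi_window:
        return "Wayland/X11"
    if need_gpu:
        return "DRM/KMS + GPU"
    if quick_prototype:
        return "fbdev"
    return "DRM/KMS dumb buffer"

# A production kiosk: display yes, one fullscreen window, CPU rendering
print(choose_graphics_stack(True, False, False, False))  # DRM/KMS dumb buffer
```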


From Software to Wire: The Physical Display Pipeline

The graphics stack (fbdev/DRM/KMS/compositor) is the software side. Below it, the SoC's display controller pushes pixels over a physical interface to the panel:

  Application ──► DRM/KMS ──► Display Controller ──► Physical Interface ──► Panel
                                    (SoC HW)
                              ┌─────────────────────────────────────────────────┐
                              │              Which interface?                   │
                              │                                                 │
                              │  HDMI:  TMDS encoding → 3 data + 1 clk pair     │
                              │  DSI:   D-PHY packets → 2 data + 1 clk lane     │
                              │  SPI:   CPU-driven → 1 data line (no GPU!)      │
                              └─────────────────────────────────────────────────┘

HDMI and DSI are GPU-driven — the display controller reads from the DRM buffer and clocks pixels out automatically. SPI is CPU-driven — your code (or DMA) must push every pixel through the SPI bus.


Physical Interface Bandwidth on the Pi

| Interface | Bandwidth | Max resolution | GPU-driven? | Cable |
|---|---|---|---|---|
| HDMI 2.0 | 18 Gbit/s | 4K @ 60 FPS | Yes | Micro-HDMI |
| MIPI DSI (2-lane) | ~2 Gbit/s | 800×480 @ 60 FPS | Yes | 15-pin FPC ribbon |
| SPI | ~32 Mbit/s | 320×240 @ 25 FPS | No (CPU) | GPIO wires |

Why does DSI use so much less bandwidth than HDMI? Smaller resolution. The 7" DSI panel (800×480) needs ~553 Mbit/s. A 4K HDMI monitor (3840×2160) needs ~12 Gbit/s. The interface matches the panel.

Quick bandwidth formula:

  BW = Width × Height × BitsPerPixel × FPS × overhead
  800 × 480 × 24 × 60 × 1.2 ≈ 664 Mbit/s  (DSI, 7" panel)
  1920 × 1080 × 24 × 60 × 1.5 ≈ 4.5 Gbit/s  (HDMI, 1080p; blanking + TMDS 8b/10b overhead)
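The same arithmetic as a reusable helper. The 1.2 and 1.5 overhead factors are rough estimates (blanking intervals, plus 8b/10b TMDS encoding on HDMI), not exact interface specifications:

```python
def display_bandwidth(width, height, bpp=24, fps=60, overhead=1.2):
    # Pixel data rate in bit/s, scaled by blanking/encoding overhead
    return width * height * bpp * fps * overhead

dsi = display_bandwidth(800, 480)                    # ~664 Mbit/s, 7" DSI panel
hdmi = display_bandwidth(1920, 1080, overhead=1.5)   # ~4.5 Gbit/s, 1080p HDMI
```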

Theory: Camera and Display Interfaces — D-PHY signaling, CSI-2, DSI packets, EDID, bandwidth math


UI Toolkit: Qt vs SDL2

You've chosen DRM/KMS — now pick your application-level toolkit.

The kernel display path decides how pixels reach the screen. The toolkit decides how your application produces those pixels.

| | Qt + EGLFS | SDL2 + KMS/DRM |
|---|---|---|
| What it is | Full UI framework, renders via EGL directly on KMS | Minimal render loop, you draw everything |
| Runtime footprint | ~30-80 MB | ~2-5 MB |
| GPU required? | Yes (EGL/OpenGL) | Optional |
| Best for | Dashboards, menus, touch HMIs | Gauges, data viz, custom rendering |
| UI components | Widgets, QML, animations built-in | None — bring your own |
| Cross-SoC portability | Excellent | Good (but UI is custom) |

Qt + EGLFS = invest upfront in framework, get layout/touch/animations for free. SDL2 + KMS/DRM = minimal footprint, maximum control, build UI yourself.


The Hybrid Sweet Spot

Many production HMIs combine both approaches:

  ┌─────────────────────────────────┐
  │  Qt Quick (QML)                 │
  │  ┌───────┐ ┌───────┐ ┌───────┐  │
  │  │ Menu  │ │Status │ │ Nav   │  │  ← QML handles UI chrome
  │  └───────┘ └───────┘ └───────┘  │
  │  ┌───────────────────────────┐  │
  │  │  Custom OpenGL/Vulkan     │  │  ← GPU renders gauges,
  │  │  render area              │  │     waveforms, 3D views
  │  └───────────────────────────┘  │
  └─────────────────────────────────┘
  • QML for menus, status bars, touch navigation — saves development time
  • OpenGL/Vulkan scene node for real-time gauges and data visualization — full GPU control
  • Both run in a single process on EGLFS — no compositor needed

For the labs: start with SDL2 (smallest footprint, teaches the hardware path). Move to Qt + EGLFS for the dashboard project.


Pitfall 1 — fbdev Compatibility Shim on DRM Systems

Modern kernels may expose /dev/fb0 as a compatibility layer over a DRM driver. This looks like fbdev but does not behave identically.

  Your app thinks:       Reality:
  ┌────────────┐        ┌────────────┐
  │ /dev/fb0   │        │ /dev/fb0   │  (compat shim)
  │  (fbdev)   │        │     |      │
  └─────┬──────┘        │  DRM/KMS   │  (actual driver)
        |               │     |      │
     Hardware           │  Hardware  │
                        └────────────┘

Page flipping, mode setting, and buffer management behave differently through the shim. Double buffering may not work. Mode changes may be ignored.

Rule: If the kernel uses a DRM driver, use the DRM API directly. Do not rely on the fbdev compatibility layer for anything beyond quick tests.


Pitfall 2 — Pixel Format and Stride Mismatch

Display hardware expects pixels in a specific format. Your renderer may produce a different one.

| Format | Bytes/pixel | Layout |
|---|---|---|
| RGB565 | 2 | 5 red, 6 green, 5 blue |
| RGB888 | 3 | 8 red, 8 green, 8 blue |
| ARGB8888 | 4 | 8 alpha, 8 red, 8 green, 8 blue |
| BGR888 | 3 | 8 blue, 8 green, 8 red (swapped) |

Stride (bytes per row) may include padding for alignment. A 800-pixel-wide RGB888 display might have stride = 2400 or stride = 2432 (padded to 64-byte boundary).

If you assume the wrong format or stride, the image appears garbled, color-shifted, or diagonally skewed. Always query the actual format and stride from the driver — never hardcode them.
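Querying is the real fix, but it helps to see what the formats look like in memory. A small sketch (the byte orders shown are common little-endian layouts; verify against the format your driver actually reports):

```python
import struct

def pack_rgb565(r, g, b):
    # Truncate 8-bit channels into 5 red, 6 green, 5 blue bits
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def row_offset(y, stride):
    # Always index rows by the driver-reported stride, never by
    # width * bytes_per_pixel: an 800 px RGB888 row may occupy 2432 bytes.
    return y * stride

# Pure red in three formats, as the bytes land in memory:
red_565 = struct.pack("<H", pack_rgb565(255, 0, 0))  # b'\x00\xf8'
red_rgb = bytes([255, 0, 0])                         # RGB888: R, G, B
red_bgr = bytes([0, 0, 255])                         # BGR888: same red, swapped
```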


Pitfall 3 — Adding a Compositor When Not Needed

Every additional layer is a potential failure point:

  Without compositor:          With compositor:
  ┌──────────┐                ┌──────────┐
  │   App    │                │   App    │
  └────┬─────┘                └────┬─────┘
       |                           |
  ┌────▼─────┐                ┌────▼─────┐
  │ DRM/KMS  │                │ Wayland  │  ← can crash
  └────┬─────┘                └────┬─────┘
       |                      ┌────▼─────┐
  ┌────▼─────┐                │ Weston   │  ← can crash
  │ Display  │                └────┬─────┘
  └──────────┘                ┌────▼─────┐
                              │ DRM/KMS  │
                              └────┬─────┘
                              ┌────▼─────┐
                              │ Display  │
                              └──────────┘

Single fullscreen app + compositor = longer boot time + more complexity + more failure modes, with zero functional benefit. The compositor manages windows that will never appear.

Start minimal. Add complexity only when you hit a concrete limitation.


Pitfall 4 — No Fallback if Display Init Fails

If the display is unplugged, the cable is damaged, or the driver probe fails at boot, what happens to your application?

Bad design: Application blocks on display init, never starts, product appears dead.

Good design: Display is treated as an optional output, not a hard dependency.

  App starts
      |
      +---> Try to open display
      |         |
      |    +----+----+
      |    OK        FAIL
      |    |         |
      |    Render    Log warning, continue without display
      |    to        (network, logging, control still work)
      |    display
      |
      +---> Core logic runs regardless

Design the display as one of several outputs. The product should still function (logging, network, control) even if the screen is missing.
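In code, the good design is just a guarded open. A minimal sketch in Python (device path and log message are illustrative):

```python
import logging, os

def open_display(dev="/dev/fb0"):
    # Treat the display as an optional output: return None instead of raising
    try:
        return os.open(dev, os.O_RDWR)
    except OSError as err:
        logging.warning("display unavailable (%s); continuing headless", err)
        return None

fb = open_display()
if fb is not None:
    os.close(fb)  # render loop would go here
# Core logic (network, logging, control) runs regardless of fb
```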


Understanding the Full Desktop Stack

Before we dismiss the compositor approach for embedded, let's understand what it actually does — this helps you recognize when you truly need it and when you don't.


Key Terms — What Is What?

These terms are often confused. Here's exactly what each one is:

| Name | What it IS | What it DOES | What it does NOT do |
|---|---|---|---|
| Wayland | A protocol (not software) | Defines how apps talk to the compositor — buffer sharing, input events, window surfaces | Does not draw anything. Does not manage windows. It's a specification, like HTTP. |
| Weston | A compositor (reference implementation) | Implements the Wayland protocol. Combines app buffers, routes input, outputs to DRM. Minimal, used in embedded. | Not a toolkit. Not a desktop environment. No taskbar, no app launcher. |
| Mutter | A compositor (GNOME's) | Same job as Weston but part of GNOME. Adds desktop features: workspaces, overview, animations. | Not standalone — needs GNOME Shell. Too heavy for embedded. |
| Sway | A compositor (tiling) | Wayland compositor inspired by i3. Tiling window layout, keyboard-driven. | No embedded profile. Desktop-focused. |
| KWin | A compositor (KDE's) | KDE Plasma's compositor. Rich effects, desktop integration. | Very heavy. Not for embedded. |
| Xorg | A display server (X11) | The X11 server. Receives draw commands from apps, renders to screen, routes input. | Does not composite by itself — needs a separate compositor (picom, compton) for transparency. |

Wayland Is a Protocol, Not a Program

This is the most common misconception. You don't "install Wayland" — you install a compositor that speaks the Wayland protocol.

  "I use Wayland" actually means:

  ┌─────────────────────────────────────────────────────┐
  │                                                     │
  │   App (GTK/Qt)  ──── Wayland protocol ────  Weston  │
  │                                                     │
  │   The protocol         ← this is "Wayland"          │
  │   The compositor       ← this is "Weston"           │
  │                                                     │
  └─────────────────────────────────────────────────────┘

  Like saying "I use HTTP" — you mean you use a browser (Chrome)
  that speaks HTTP to a server (Nginx). HTTP is the protocol.
  Wayland is the protocol. Weston/Mutter/Sway is the "browser."

Why this matters for embedded: If someone says "use Wayland on the Pi," the real question is: which compositor? Weston is the lightweight choice. Mutter (GNOME) would be far too heavy.


Weston — The Embedded Compositor

Weston is the reference implementation of the Wayland protocol, maintained by the same team. It's designed to be minimal:

What Weston does:

  • Accepts connections from Wayland client apps
  • Receives their rendered buffers (shared GPU memory)
  • Composites all buffers into the final screen image
  • Routes input events (touch, keyboard, mouse) to the focused app
  • Outputs the composited image via DRM/KMS
  • Handles display hotplug (HDMI connected/disconnected)

What Weston does NOT do:

  • No taskbar, no app launcher, no system tray
  • No window decorations (no title bars, close buttons)
  • No file manager, no settings panel
  • No login screen (use a separate program for that)

Weston provides the plumbing — apps appear on screen and receive input. Everything else (UI, layout, interaction) is the application's responsibility.

Embedded use: Weston's "kiosk shell" plugin runs a single app fullscreen with no chrome — essentially a Wayland-speaking DRM wrapper. Some products use this instead of direct DRM access when they want Wayland protocol compatibility (e.g., for Flutter or Chromium).
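As a concrete sketch, a minimal weston.ini for the kiosk shell might look like this (the file path is the one commonly used in Yocto images; the connector name HDMI-A-1 and the mode are board-specific assumptions — check your device's connector names):

```ini
# /etc/xdg/weston/weston.ini -- kiosk sketch (path varies by distro)
[core]
shell=kiosk-shell.so        # single fullscreen app, no desktop chrome

[output]
name=HDMI-A-1               # your connector name may differ
mode=1920x1080
```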


Xorg — The Legacy Display Server

Xorg is the canonical implementation of the X11 protocol: a single large process (~500K lines of code).

What Xorg does:

  • Owns the display — apps cannot draw directly, they ask Xorg to draw for them
  • Manages a shared 2D canvas (the "root window")
  • Routes keyboard/mouse events to the focused window
  • Provides network transparency — apps can run on one machine, display on another (ssh -X)
  • Loads input drivers (keyboard, mouse, touchpad) and display drivers

What Xorg does NOT do:

  • Does not decide where windows go — that's the window manager (a separate process)
  • Does not composite (blend) overlapping windows by default — needs a compositor (picom, compton)
  • Does not provide a desktop environment — that's GNOME/KDE/XFCE running on top

Why X11 is declining: The "apps can't draw directly" design adds latency. The "any app can read any window" design is a security hole. The shared-canvas model doesn't work well with GPUs. Wayland fixes all three by letting apps render to their own buffers.


Compositor vs Window Manager vs Desktop Environment

These three concepts are often conflated. They are different layers:

  ┌───────────────────────────────────────────────┐
  │  Desktop Environment  (GNOME, KDE, XFCE)      │  ← the "experience"
  │  Taskbar, app launcher, file manager,         │     (apps + config + theme)
  │  settings, notifications, lock screen         │
  │                                               │
  │  ┌─────────────────────────────────────────┐  │
  │  │  Window Manager  (Mutter, KWin, i3)     │  │  ← window placement
  │  │  Position, size, stacking, focus,       │  │     rules and policy
  │  │  keyboard shortcuts, tiling/floating    │  │
  │  │                                         │  │
  │  │  ┌───────────────────────────────────┐  │  │
  │  │  │  Compositor  (built-in or picom)  │  │  │  ← pixel blending
  │  │  │  Transparency, shadows, blur,     │  │  │     the "how" of
  │  │  │  animations, buffer management    │  │  │     putting it on screen
  │  │  └───────────────────────────────────┘  │  │
  │  └─────────────────────────────────────────┘  │
  └───────────────────────────────────────────────┘

On Wayland: the compositor and window manager are the same process (Mutter, Sway, Weston). You can't mix and match.

On X11: they're separate. You can run i3 (tiling WM) + picom (compositor) on Xorg. Or Openbox (floating WM) with no compositor at all.

On embedded: you skip all three. Your app talks to DRM directly.


Putting It All Together — Who Uses What?

  Product / Use case     Graphics approach                        Why
  Your laptop (Ubuntu)   Mutter (Wayland compositor) + GNOME      Multiple apps, desktop experience
  Raspberry Pi Desktop   Wayfire (Wayland) + RPi Desktop          Full desktop for education
  Automotive HMI         Weston kiosk + Qt EGLFS                  Single app, Wayland protocol for IVI
  Industrial panel       Qt EGLFS on DRM                          Single app, no compositor overhead
  Our course labs        SDL2 on DRM/KMS                          Minimal, teaches the hardware path
  Our Qt launcher        Qt EGLFS on DRM                          Rich UI without a compositor
  Digital signage        DRM dumb buffer                          Static content, minimal CPU
  ATM / POS terminal     Weston kiosk or DRM direct               Security + single app

Notice: even commercial products that "use Wayland" often use Weston's kiosk shell — which is essentially a thin layer over DRM that adds Wayland protocol compatibility for the app framework.


X11 Architecture (Legacy Desktop)

X11 (1987) uses a client-server model where the display server owns the screen:

  ┌────────┐  ┌────────┐  ┌────────┐
  │ App 1  │  │ App 2  │  │ App 3  │   ← X11 clients
  └───┬────┘  └───┬────┘  └───┬────┘
      │           │           │          X11 protocol
      └───────────┴───────────┘          (network-transparent)
           ┌──────▼──────┐
           │  X Server   │              ← owns the screen
           │  (Xorg)     │
           │  ┌────────┐ │
           │  │ Window │ │              ← decides where windows go
           │  │ Manager│ │
           │  └────────┘ │
           │  ┌────────┐ │
           │  │ Compos-│ │              ← blends overlapping windows
           │  │ itor   │ │
           │  └────────┘ │
           └──────┬──────┘
           ┌──────▼──────┐
           │  DRM/KMS    │              ← hardware
           └─────────────┘

Key idea: Apps don't touch the display — they send draw commands to the X Server, which renders on their behalf. The Window Manager is a separate process that tells the X Server where to position each window.


Wayland Architecture (Modern Desktop)

Wayland (2012) merges the server, window manager, and compositor into one process:

  ┌────────┐  ┌────────┐  ┌────────┐
  │ App 1  │  │ App 2  │  │ App 3  │   ← Wayland clients
  └───┬────┘  └───┬────┘  └───┬────┘
      │           │           │
      │    Each app renders   │         Apps render to their OWN
      │    to a buffer (GPU)  │         buffer — not shared
      │           │           │
      └───────────┴───────────┘          Wayland protocol
           ┌──────▼──────┐
           │  Compositor │              ← ONE process does everything:
           │  (Weston,   │                 window placement
           │   Mutter,   │                 input routing
           │   Sway)     │                 GPU compositing
           └──────┬──────┘                  output to display
           ┌──────▼──────┐
           │  DRM/KMS    │
           └─────────────┘

Key difference from X11: Apps render to their own buffers (not through the server). The compositor only combines the finished buffers into the final screen image. This is simpler, more secure (apps can't snoop on each other's pixels), and lower latency.


What Each Layer Does

Window Manager
  Role:      decides window position, size, decorations (title bar, close button), stacking order
  Examples:  GNOME Shell, KWin, i3, Sway
  Skip it?:  yes, if you have one fullscreen app

Compositor
  Role:      blends multiple app buffers into one image (transparency, shadows, animations), outputs to the display
  Examples:  Mutter, Weston, picom
  Skip it?:  yes, if you have one fullscreen app

Display Server
  Role:      routes input events (keyboard, mouse) to the correct app, manages shared display access
  Examples:  Xorg (X11), or built into the compositor (Wayland)
  Skip it?:  yes, if you handle input yourself (evdev/SDL2)

Toolkit
  Role:      draws widgets (buttons, text, lists), handles layout
  Examples:  GTK, Qt, Flutter
  Skip it?:  optional; you can draw pixels directly

On embedded: you typically skip the first three layers entirely. Your single app opens DRM directly, renders with SDL2 or Qt EGLFS, and reads input from /dev/input/ or SDL2's event system.
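The "read input from /dev/input/ yourself" part is mostly struct parsing. A sketch of decoding struct input_event on 64-bit Linux (the device path and the EV_KEY filter are illustrative):

```python
import struct

# struct input_event from <linux/input.h> on 64-bit Linux:
#   struct timeval time; __u16 type; __u16 code; __s32 value;
EVENT_FMT = 'llHHi'                        # timeval (2x long) + type + code + value
EVENT_SIZE = struct.calcsize(EVENT_FMT)    # 24 bytes on 64-bit

def parse_event(raw):
    """Decode one raw input_event into a dict."""
    sec, usec, etype, code, value = struct.unpack(EVENT_FMT, raw)
    return {'time': sec + usec / 1e6, 'type': etype, 'code': code, 'value': value}

# Reading loop (needs a real device and membership in the 'input' group):
# with open('/dev/input/event0', 'rb') as dev:
#     while True:
#         ev = parse_event(dev.read(EVENT_SIZE))
#         if ev['type'] == 1:              # EV_KEY: key or button press/release
#             print(ev['code'], ev['value'])
```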


X11 vs Wayland — Quick Comparison

                        X11                               Wayland
  Age                   1987 (nearly 40 years)            2012 (~13 years)
  Architecture          client sends draw commands,       client renders its own buffer,
                        server renders                    compositor combines
  Network transparency  built in (forward over SSH)       not built in (use PipeWire/RDP)
  Security              any app can read any window       apps isolated by default
                        (screen capture is trivial)
  Tearing               common (needs compositor hacks)   solved by design
  Code complexity       ~500K lines (Xorg)                ~50K lines (Weston)
  Embedded use          mostly legacy                     Weston has an embedded profile

For this course: we skip both X11 and Wayland. Our apps use DRM/KMS directly (SDL2, Qt EGLFS). But knowing the layers helps when you debug a desktop system or explain to a manager why the kiosk doesn't need a desktop environment.


Qt EGLFS — The Embedded Shortcut

Qt's EGLFS platform plugin lets you run a Qt application fullscreen on DRM/KMS without any compositor:

  Desktop Qt:                    Embedded Qt (EGLFS):
  ┌─────────────┐               ┌─────────────┐
  │  Qt App     │               │  Qt App     │
  └──────┬──────┘               └──────┬──────┘
         │                             │
  ┌──────▼──────┐               ┌──────▼──────┐
  │  Wayland /  │               │  EGL + DRM  │  ← direct GPU
  │  X11        │               │  (no compositor)
  └──────┬──────┘               └──────┬──────┘
  ┌──────▼──────┐               ┌──────▼──────┐
  │ Compositor  │               │  Display    │
  └──────┬──────┘               └─────────────┘
  ┌──────▼──────┐
  │  DRM/KMS    │
  └──────┬──────┘
  ┌──────▼──────┐
  │  Display    │
  └─────────────┘

  3 extra layers                 0 extra layers

EGLFS = "EGL Full Screen." It gives you all of Qt's UI power (QML, touch, animations) with the performance of direct DRM access. This is what the Qt App Launcher uses.
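Selecting EGLFS is done per-process through environment variables; a typical launch might look like this (QT_QPA_PLATFORM is Qt's standard platform selector; the JSON config path and myapp are assumptions, and the KMS config is only needed to override the default DRM device or mode):

```shell
# Run a Qt app fullscreen on DRM/KMS with no compositor in between.
export QT_QPA_PLATFORM=eglfs
export QT_QPA_EGLFS_KMS_CONFIG=/etc/qt-kms.json   # optional: pick card/mode
# ./myapp                                          # launch the application
```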


FAQ — Common Student Questions

Q: Why can't I run two SDL2 apps at the same time? Because both try to open DRM and become "DRM master" — only one process can control the display at a time. This is by design. The Qt launcher solves this by releasing DRM master before spawning a child app, and reclaiming it after. A compositor would also solve it, but at the cost we discussed.

Q: Why does my app work in SSH but show a black screen on the Pi? Graphics apps need access to /dev/dri/card0 (DRM) and possibly /dev/fb0. Over SSH, you're on a different TTY. Make sure the app runs on the correct VT, or use SDL_VIDEODRIVER=kmsdrm to force DRM mode. Also check permissions: the user needs to be in the video group.

Q: Can I use OpenGL without a compositor? Yes. EGL can bind directly to a DRM device (EGL_PLATFORM_GBM). This is what SDL2's KMSDRM backend and Qt's EGLFS do. You get full GPU acceleration without any windowing system.

Q: Why is the Pi's display upside down / rotated? The display panel's physical scanning direction may not match the expected orientation. Fix with display_rotate=2 in config.txt (fbdev) or video=DSI-1:panel_orientation=upside_down (DRM). KMS also supports rotation via the rotation plane property.

Q: Why does my framebuffer app work but the colors are wrong? Pixel format mismatch. The display might expect BGR888 but you're writing RGB888 (red and blue swapped). Always query the format with ioctl(FBIOGET_VSCREENINFO) or check the DRM plane's format list. Common formats: ARGB8888, XRGB8888, RGB565.
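Two common format fixes, sketched in Python (the helper names are mine): packing RGB565, and swapping the red/blue channels of an XRGB8888 pixel.

```python
def pack_rgb565(r, g, b):
    # 5 bits red, 6 bits green, 5 bits blue -> one 16-bit pixel
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def swap_red_blue(pixel):
    # XRGB8888 -> XBGR8888: keep X and G, exchange the R and B bytes
    return (pixel & 0xFF00FF00) | ((pixel & 0xFF) << 16) | ((pixel >> 16) & 0xFF)
```

If "red shows up as blue," running every pixel through swap_red_blue (or, better, writing the correct format in the first place) fixes it.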

Q: What's the difference between a window manager and a compositor? A window manager decides where windows go (position, size, stacking). A compositor blends all windows into the final image (handles transparency, shadows, animations). On Wayland, these are the same process. On X11, they can be separate (e.g., i3 window manager + picom compositor).

Q: Do I need a GPU for embedded graphics? Not necessarily. DRM "dumb buffers" are CPU-rendered. SDL2 can software-render via DRM. For simple UIs (status displays, dashboards), CPU rendering is fast enough. You need a GPU when: rendering complex 3D (OpenGL/Vulkan), running Qt QML with animations, or compositing multiple layers at high frame rates.


Block 1 Summary

Framebuffer (fbdev): simple, direct, legacy. Best for quick prototypes and simple displays. Deprecated — new drivers target DRM.

DRM/KMS: modern, hardware-aware, tear-free. The right default for embedded products. Works without GPU using dumb buffers.

Full stack (Wayland/X11): powerful, heavy, complex. Only justified when you need multiple windows or rich desktop-style UI. Wayland is simpler and more secure than X11. Both are overkill for single-app embedded.

Qt EGLFS: the embedded sweet spot when you need rich UI — full Qt power with zero compositor overhead.

Most embedded products: single-app fullscreen pipeline using DRM/KMS or framebuffer. No compositor, no window manager, no desktop.

Decision principle: start from the simplest option that meets your requirements. Move up only when you hit a concrete limitation that the simpler approach cannot solve.


Block 2

Experiment: "Write Pixels Faster Than Refresh"


The Question

What happens when your application writes pixels to the framebuffer faster than the display can show them?

The framebuffer is shared memory: your application writes to it, and the display controller reads from it — simultaneously, without coordination.

If the application writes faster than the display scans, the display will read partially updated data.

Let's understand why, and then see it happen.


Display Scan-Out Explained

The display controller reads the framebuffer line by line, top to bottom, at a fixed rate:

  Framebuffer memory            Display panel
  ┌──────────────────┐          ┌──────────────────┐
  │ Row 0            │ -------> │ Row 0            │  <- scan position
  │ Row 1            │          │ Row 1            │
  │ Row 2            │          │                  │
  │ Row 3            │          │                  │
  │ ...              │          │                  │
  │ Row 479          │          │                  │
  └──────────────────┘          └──────────────────┘

  At 60 Hz: full scan every 16.7 ms
  The controller reads ~29 rows per millisecond (for 480 rows)
  After row 479, it returns to row 0 (VBlank interval)

The scan-out is a continuous, periodic process driven by the display hardware clock. It does not wait for your application. It does not check if the buffer is "ready."
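The numbers above come from simple arithmetic:

```python
# Scan-out rate for a 480-row panel refreshing at 60 Hz.
rows = 480
refresh_hz = 60

frame_ms = 1000 / refresh_hz        # 16.67 ms per full scan
rows_per_ms = rows / frame_ms       # 28.8 rows every millisecond

print(f'{frame_ms:.2f} ms per frame, ~{rows_per_ms:.0f} rows/ms')
```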


The Tearing Mechanism

When the app overwrites the buffer while the display is reading it, you see parts of two different frames:

  Frame N              Frame N+1
  ┌──────────────┐    ┌──────────────┐
  │ AAAAAAAAAAAA │    │ BBBBBBBBBBBB │
  │ AAAAAAAAAAAA │    │ BBBBBBBBBBBB │
  │ AAAAAAAAAAAA │    │ BBBBBBBBBBBB │
  │ AAAAAAAAAAAA │    │ BBBBBBBBBBBB │
  └──────────────┘    └──────────────┘

  What you SEE (scan-out catches the switch mid-frame):
  ┌──────────────┐
  │ AAAAAAAAAAAA │  <- scanned before app started writing
  │ AAAAAAAAAAAA │
  │ BBBBBBBBBBBB │  <- scanned after app wrote these rows (TEAR LINE)
  │ BBBBBBBBBBBB │
  └──────────────┘

The horizontal boundary where two frames meet is the tear line. Its position moves because the write speed and scan-out speed are not synchronized.


VSync and Page Flipping

Solution: do not write to the buffer the display is currently reading.

Double buffering with page flipping:

  Back buffer (app draws here)     Front buffer (display reads here)
  ┌──────────────────┐             ┌───────────────────┐
  │  Frame N+1       │             │  Frame N          │
  │  (being drawn)   │             │  (being displayed)│
  └──────────────────┘             └───────────────────┘
            |                                |
            |     At VBlank: SWAP            |
            +----------->--------------------+
            pointers swap, display now reads Frame N+1

  VBlank = the brief interval between the last row and the first row
           of the next scan. Safe moment to switch buffers.

The app writes to the back buffer. When drawing is complete, it requests a page flip at VBlank. The display switches to the new buffer only between frames. Every displayed frame is complete — no tearing.

DRM/KMS supports this natively. fbdev does not.
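Without page flips, a common fbdev mitigation is to compose the whole frame off-screen and copy it into the mapped framebuffer in one bulk write. This shrinks (but does not close) the window in which scan-out can catch a half-written frame; only a real flip at VBlank eliminates it. A sketch, assuming an 800x480 XRGB8888 framebuffer:

```python
# fbdev mitigation sketch: build the frame off-screen, copy it in one write.
import struct

WIDTH, HEIGHT = 800, 480            # adjust to your panel

def solid_frame(color):
    # one full XRGB8888 frame (little-endian) as a single bytes object
    return struct.pack('<I', color) * (WIDTH * HEIGHT)

frame = solid_frame(0x00FF0000)     # red
# mm.seek(0); mm.write(frame)       # one bulk copy instead of 480 row writes
```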


Experiment Setup

Write solid colors to the framebuffer as fast as possible and observe the display:

# framebuffer_flood.py - write solid colors as fast as possible,
# with NO synchronization to the display refresh
import mmap, struct, os

FB_W, FB_H, BPP = 800, 480, 4       # adjust to your resolution; XRGB8888 assumed

fb = os.open('/dev/fb0', os.O_RDWR)
mm = mmap.mmap(fb, FB_W * FB_H * BPP)

colors = [0x00FF0000, 0x0000FF00, 0x000000FF]  # red, green, blue

while True:
    for color in colors:
        row = struct.pack('<I', color) * FB_W   # one scanline of solid color
        mm.seek(0)
        for y in range(FB_H):
            mm.write(row)

This script writes red, then green, then blue — as fast as the CPU can go, with no synchronization to the display refresh.

Run it on a device with a connected display and look at the screen.


What to Observe

When you run framebuffer_flood.py, you will see:

  • Horizontal tear lines where two colors meet mid-screen
  • The tear position moves — sometimes near the top, sometimes near the bottom
  • On fast CPUs, you may see multiple tear lines (three colors visible at once)

  What you expect:       What you see:
  ┌──────────────┐      ┌──────────────┐
  │ RRRRRRRRRRRR │      │ RRRRRRRRRRRR │
  │ RRRRRRRRRRRR │      │ RRRRRRRRRRRR │
  │ RRRRRRRRRRRR │      │ GGGGGGGGGGGG │  <- tear
  │ RRRRRRRRRRRR │      │ GGGGGGGGGGGG │
  └──────────────┘      └──────────────┘

This is tearing — the display reads the buffer while the application is writing to it. The tear line appears wherever the scan-out position and the write position cross.

This is exactly why DRM/KMS with page flipping exists.


The Fix — DRM Page Flip

The proper solution uses double buffering with VSync through DRM/KMS:

Step 1: Allocate two dumb buffers (front and back)

Step 2: Draw to the back buffer (the display is not reading it)

Step 3: Request atomic page flip with DRM_MODE_PAGE_FLIP_EVENT

Step 4: Wait for VBlank event (kernel signals when flip completes)

Step 5: Swap front/back buffer pointers

  Time -->
  |  Draw to back   |  Flip  |  Draw to back   |  Flip  |
  |  buffer         |  at    |  buffer         |  at    |
  |  (invisible)    | VBlank |  (invisible)    | VBlank |
                    ^                          ^
                    |                          |
              Display switches           Display switches
              to new buffer              to new buffer

Result: every frame is fully drawn before the display reads it. No tearing. No timing hacks.
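The buffer bookkeeping in steps 1-5 can be sketched as follows; the actual DRM calls (drmModePageFlip and the VBlank event wait) are replaced by a labeled stub:

```python
# Front/back buffer bookkeeping for the flip loop. Hardware calls are stubs.
class DoubleBuffer:
    def __init__(self, buf_a, buf_b):
        self.front, self.back = buf_a, buf_b   # front = currently on screen

    def draw(self, render):
        render(self.back)                      # draw only to the invisible buffer

    def flip(self):
        # Stub for: request DRM_MODE_PAGE_FLIP_EVENT, then wait for the
        # kernel's VBlank event before touching the (old) front buffer again.
        self.front, self.back = self.back, self.front

db = DoubleBuffer(bytearray(16), bytearray(16))
db.draw(lambda b: b.__setitem__(0, 0xAA))      # frame N+1 goes to the back buffer
db.flip()                                      # at VBlank: back becomes front
```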


Fallback Pitfall — sleep() Is Not VSync

Some teams try to "fix" tearing by throttling write speed:

# BAD: timing-based "fix"
while True:
    draw_frame()
    time.sleep(0.016)  # ~60 fps

This is fragile and will fail because:

  • CPU speed varies (thermal throttling, load changes)
  • sleep() precision is ~1-10 ms on Linux (not exact)
  • System load affects scheduling — your process may not wake on time
  • Display refresh rate may not be exactly 60 Hz
  • The sleep duration and scan-out timing drift relative to each other

The proper fix is VSync synchronization through DRM, where the kernel signals the exact VBlank moment. Sleep hacks create the illusion of working on a quiet system and break under real-world conditions.
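A quick calculation shows how fast the sleep loop drifts against the display, even with an ideal 16 ms sleep and zero draw time:

```python
# How quickly a sleep(0.016) loop drifts against a 60 Hz display.
refresh_period_ms = 1000 / 60                          # 16.667 ms per frame
loop_period_ms = 16.0                                  # the sleep-based "fix"

drift_per_frame = refresh_period_ms - loop_period_ms   # 0.667 ms per frame
frames_per_slip = refresh_period_ms / drift_per_frame  # 25 frames

print(f'phase slips one full frame every {frames_per_slip:.0f} frames '
      f'(~{frames_per_slip * loop_period_ms / 1000:.2f} s)')
# Every ~0.4 s the write phase crosses a refresh boundary: a visible glitch.
```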


Quick Checks

Before shipping a product with display output, answer these questions:

1. Is your display path deterministic at boot? Do you know exactly when the display initializes and what appears first? Or does it depend on service startup order?

2. Do you control mode setting explicitly? Are resolution, refresh rate, and pixel format set by your code? Or do you hope the defaults are correct?

3. Can the app recover if the display disconnects and reconnects? Hot-plug events happen — cable gets bumped, connector oxidizes, display power-cycles. Does your app handle this, or does it crash?

4. Does startup still meet your boot budget? After adding the display stack, is boot time still within the product requirement? Measure it — do not assume.


Mini Exercise

Given:

  • Single fullscreen UI application
  • Boot time requirement: < 10 seconds
  • Remote update capability required
  • Product lifecycle: 5+ years

Task: Select your graphics stack and justify your choice in 5 lines. Consider:

  • Boot impact of your chosen approach
  • Memory usage on a 256 MB system
  • Maintenance cost over the product lifecycle
  • What happens when the display cable is unplugged

Write your answer before looking at the next slide. There is no single correct answer, but there are answers that ignore constraints.


Key Takeaways

  • Framebuffer is simple but legacy — good for prototyping and simple displays, but deprecated and lacking VSync support.

  • DRM/KMS is the modern low-level choice — hardware-aware, tear-free, and the current kernel standard. Use dumb buffers when you do not need GPU acceleration.

  • Full stacks are powerful but heavy — only justified when you genuinely need multiple windows or rich desktop-style UI.

  • Embedded systems use single-app fullscreen pipelines — no compositor, no window manager, no desktop. This is the norm, not the exception.

  • Tearing is solved by VSync + page flipping, not by sleep hacks — DRM provides proper synchronization; time.sleep() provides false confidence.


Hands-On Next

Put this theory into practice with the following tutorials:

Framebuffer Basics — draw pixels directly to /dev/fb0, understand pixel formats and stride, render shapes and text from Python.

OLED Framebuffer Driver — write a kernel framebuffer driver for the SSD1306 OLED over I2C. Implements fb_info, fb_ops, and deferred I/O.

Pong on Framebuffer — build a user-space game that opens /dev/fbN, queries resolution via ioctl, and draws with mmap(). Works on both OLED and BUSE displays.

DRM/KMS Test — use the modern graphics API: enumerate connectors, set display modes, allocate dumb buffers, perform tear-free page flips.

Display Applications — create interactive applications with OpenCV and evdev for touch/button input, rendering directly to the display without a compositor.