Ball Position Detection (Camera + OpenCV)

Time estimate: ~60 minutes Prerequisites: Camera Pipeline

Warning

Make sure the camera works before starting — run rpicam-still -o test.jpg and verify you get an image.

Learning Objectives

By the end of this tutorial you will be able to:

Detect a high-contrast object (ball) using thresholding and contour detection
Track the ball's centroid position in real time from a live camera stream
Measure and compare FPS with different pipeline stages (blur, morphology, circularity)
Explain the tradeoff: simpler pipelines run faster but break under poor conditions
Apply morphological operations and circularity filtering when shadows/noise require it

Thresholding: The Simplest Vision Pipeline

Color-based detection (HSV filtering) works well for colored objects, but it requires careful calibration for each lighting condition. A simpler and more robust approach for embedded vision is contrast-based thresholding: use a dark object on a light surface (or vice versa), convert to grayscale, and apply a binary threshold.

Camera frame → Grayscale → Threshold → Contours → Centroid (x, y)

This is the same pipeline used in industrial line-following robots, quality inspection systems, and pick-and-place machines. The key requirement is high contrast between the object and the background — which is a design choice you control.

The centroid is calculated from image moments: $c_x = M_{10}/M_{00}$, $c_y = M_{01}/M_{00}$, where $M_{00}$ is the contour area and $M_{10}$, $M_{01}$ are the first-order moments. OpenCV's cv2.moments() computes these efficiently.

Why Shadows Break Simple Thresholding (and How to Fix It)

A binary threshold classifies every pixel as foreground or background based solely on its brightness. This works perfectly when the ball and surface have clearly separated intensity values. But in practice, shadows create regions with intermediate brightness — darker than the surface but lighter than a dark ball — and the threshold misclassifies them.

Consider a dark ball on a white plate. The ball casts a shadow that is also dark. After thresholding, both the ball and its shadow become white blobs. The "largest contour" is now the ball+shadow combined, and its centroid is shifted toward the shadow — introducing a systematic position error that changes with lighting angle.

The image processing pipeline handles this with two additional stages:

1. Morphological operations clean up the binary image using structuring elements (small shapes, typically circles or rectangles):

Erosion shrinks white regions by removing pixels at the edges. Small noise blobs disappear entirely, and thin connections (like the bridge between a ball and its shadow) are severed.
Dilation expands white regions, restoring the ball to its original size after erosion shrank it.
Opening (erosion → dilation) removes small noise and thin protrusions while preserving larger objects. This is the key operation for separating a ball from its shadow.
Closing (dilation → erosion) fills small holes inside objects.

Mathematically, for a binary image $A$ and structuring element $B$:

$$\text{Opening}(A, B) = (A \ominus B) \oplus B$$

where $\ominus$ is erosion and $\oplus$ is dilation. The structuring element size determines the minimum feature size that survives — features smaller than $B$ are removed.

2. Circularity filtering rejects contours that are not round. A perfect circle has circularity $C = 1.0$, defined as:

$$C = \frac{4\pi \cdot \text{Area}}{\text{Perimeter}^2}$$

A shadow blob is elongated ($C \approx 0.3$–$0.5$), while the ball is round ($C > 0.7$). Filtering by circularity eliminates shadows, edges, and other non-ball artifacts even when morphology alone is insufficient.

The full robust pipeline becomes:

Grayscale → Blur → Threshold → Morphological opening → Contours
    → Filter by area → Filter by circularity → Centroid

Introduction

For the 2D ball balancing project, we need to know where the ball is on the plate — every frame, in real time. The simplest and most reliable approach: use a white ball on a black plate (or black ball on a white plate) and detect it with grayscale thresholding. No color calibration needed.

This tutorial walks through ball detection step by step, from a single image to a live stream with centroid tracking and visualization. The same technique works for line detection later (a dark line on a light surface is the same thresholding problem).

Setup

White ball (e.g., ping pong ball) on a dark/black surface — or —
Dark ball on a white surface

Mount the camera above the plate, looking straight down. The background should be as uniform as possible.

1. Install Tools

sudo apt-get update
sudo apt-get install -y python3-picamera2 python3-opencv python3-pygame

2. Capture and Inspect a Single Frame

Concept: Before writing a detection pipeline, look at what the camera actually sees. This script captures one frame and displays the raw image alongside its grayscale version on the connected display.

python3 - <<'PY'
import cv2, numpy as np, time
from picamera2 import Picamera2
import pygame

picam2 = Picamera2()
config = picam2.create_preview_configuration(main={"size": (640, 480), "format": "RGB888"})
picam2.configure(config)
picam2.start()
time.sleep(1)  # let auto-exposure settle

frame = picam2.capture_array()
picam2.stop()

gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
print(f"Frame shape: {frame.shape}")
print(f"Grayscale min={gray.min()}, max={gray.max()}, mean={gray.mean():.0f}")

# Show on display: raw (left) | grayscale (right)
pygame.init()
info = pygame.display.Info()
sw, sh = info.current_w, info.current_h
screen = pygame.display.set_mode((sw, sh), pygame.FULLSCREEN)
pygame.mouse.set_visible(False)
font = pygame.font.SysFont("monospace", 20)
screen.fill((0, 0, 0))

half_w = sw // 2
for i, (img, label) in enumerate([
    (frame, "Raw"),
    (cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB), "Grayscale"),
]):
    h, w = img.shape[:2]
    scale = min(half_w / w, sh / h) * 0.9
    resized = cv2.resize(img, (int(w * scale), int(h * scale)))
    surf = pygame.surfarray.make_surface(np.transpose(resized, (1, 0, 2)))
    x = i * half_w + (half_w - resized.shape[1]) // 2
    y = (sh - resized.shape[0]) // 2
    screen.blit(surf, (x, y))
    screen.blit(font.render(label, True, (255, 255, 0)), (i * half_w + 10, 10))

stats = f"min={gray.min()} max={gray.max()} mean={gray.mean():.0f}"
screen.blit(font.render(stats, True, (0, 255, 0)), (10, sh - 30))
pygame.display.flip()

# Wait for keypress or 15 seconds
t = time.time()
running = True
while running and (time.time() - t) < 15:
    for ev in pygame.event.get():
        if ev.type in (pygame.KEYDOWN, pygame.QUIT):
            running = False
pygame.quit()
PY

Checkpoint

The display shows the raw camera image on the left and the grayscale version on the right, with pixel value stats at the bottom. The ball should appear as a bright blob on a dark background (or dark on light). Press any key to close.

Stuck?

Image is too dark/bright — adjust lighting or let the auto-exposure settle longer (increase the sleep)
Ball is hard to see — increase the contrast between ball and surface. A white ping pong ball on black cardboard works well.

3. Find the Right Threshold

Concept: A binary threshold turns the grayscale image into black and white — ball pixels become white (255), background becomes black (0). This script shows 4 different threshold values and Otsu's automatic result side by side on the display, so you can compare them.

python3 - <<'PY'
import cv2, numpy as np, time
from picamera2 import Picamera2
import pygame

# Capture a fresh frame
picam2 = Picamera2()
config = picam2.create_preview_configuration(main={"size": (640, 480), "format": "RGB888"})
picam2.configure(config)
picam2.start()
time.sleep(1)
frame = picam2.capture_array()
picam2.stop()

gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)

# Compute thresholds
results = []
for val in [50, 100, 150, 200]:
    _, binary = cv2.threshold(gray, val, 255, cv2.THRESH_BINARY)
    pct = np.count_nonzero(binary) / binary.size * 100
    results.append((binary, f"Thresh={val} ({pct:.0f}%)"))
    print(f"Threshold {val}: {pct:.1f}% white pixels")

otsu_val, binary_otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
pct = np.count_nonzero(binary_otsu) / binary_otsu.size * 100
results.append((binary_otsu, f"Otsu={otsu_val:.0f} ({pct:.0f}%)"))
print(f"Otsu automatic threshold: {otsu_val:.0f} ({pct:.1f}% white)")

# Display: original + 5 thresholds in a 2x3 grid
pygame.init()
info = pygame.display.Info()
sw, sh = info.current_w, info.current_h
screen = pygame.display.set_mode((sw, sh), pygame.FULLSCREEN)
pygame.mouse.set_visible(False)
font = pygame.font.SysFont("monospace", 16)
screen.fill((0, 0, 0))

panels = [(frame, "Original")] + results
cols, rows = 3, 2
cell_w, cell_h = sw // cols, sh // rows

for idx, (img, label) in enumerate(panels):
    col, row = idx % cols, idx // cols
    if len(img.shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    h, w = img.shape[:2]
    scale = min(cell_w / w, cell_h / h) * 0.85
    resized = cv2.resize(img, (int(w * scale), int(h * scale)))
    surf = pygame.surfarray.make_surface(np.transpose(resized, (1, 0, 2)))
    x = col * cell_w + (cell_w - resized.shape[1]) // 2
    y = row * cell_h + (cell_h - resized.shape[0]) // 2
    screen.blit(surf, (x, y))
    screen.blit(font.render(label, True, (255, 255, 0)), (col * cell_w + 6, row * cell_h + 4))

pygame.display.flip()

t = time.time()
running = True
while running and (time.time() - t) < 20:
    for ev in pygame.event.get():
        if ev.type in (pygame.KEYDOWN, pygame.QUIT):
            running = False
pygame.quit()
PY

Look at the display. The best threshold is the one where: - The ball is a solid white blob - The background is solid black - No noise (stray white pixels) in the background

Info

Otsu's method automatically finds the optimal threshold by maximizing the variance between the two classes (foreground and background). For high-contrast scenes it works very well and removes the need to manually tune the threshold value.

Tip

If you're detecting a dark ball on a light surface, use cv2.THRESH_BINARY_INV instead of cv2.THRESH_BINARY — this inverts the result so the ball becomes white.

4. Detect the Ball Contour and Centroid

Concept: Once you have a clean binary image, find the contours (edges of white regions). The largest contour is the ball. Its centroid gives you the (x, y) position. This script shows the full pipeline on the display: grayscale → threshold → detection with green contour and red centroid.

python3 - <<'PY'
import cv2, numpy as np, time
from picamera2 import Picamera2
import pygame

# Capture a fresh frame
picam2 = Picamera2()
config = picam2.create_preview_configuration(main={"size": (640, 480), "format": "RGB888"})
picam2.configure(config)
picam2.start()
time.sleep(1)
frame = picam2.capture_array()
picam2.stop()

# Process
gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
blurred = cv2.GaussianBlur(gray, (9, 9), 0)
_, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Draw detection on a copy
result = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
cx, cy, area = -1, -1, 0
if contours:
    ball = max(contours, key=cv2.contourArea)
    area = cv2.contourArea(ball)
    M = cv2.moments(ball)
    if M["m00"] > 0:
        cx = int(M["m10"] / M["m00"])
        cy = int(M["m01"] / M["m00"])
        cv2.drawContours(result, [ball], -1, (0, 255, 0), 2)
        cv2.circle(result, (cx, cy), 8, (0, 0, 255), -1)
        cv2.putText(result, f"({cx}, {cy})", (cx + 12, cy - 12),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        print(f"Ball found: centroid=({cx}, {cy}), area={area:.0f} px")

# Display: 2x2 grid — raw | grayscale | threshold | detection
pygame.init()
info = pygame.display.Info()
sw, sh = info.current_w, info.current_h
screen = pygame.display.set_mode((sw, sh), pygame.FULLSCREEN)
pygame.mouse.set_visible(False)
font = pygame.font.SysFont("monospace", 18)
screen.fill((0, 0, 0))

half_w, half_h = sw // 2, sh // 2
panels = [
    (frame, "Raw"),
    (cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB), "Grayscale"),
    (cv2.cvtColor(binary, cv2.COLOR_GRAY2RGB), "Threshold (Otsu)"),
    (cv2.cvtColor(result, cv2.COLOR_BGR2RGB), "Detection"),
]
positions = [(0, 0), (half_w, 0), (0, half_h), (half_w, half_h)]

for (img, label), (px, py) in zip(panels, positions):
    h, w = img.shape[:2]
    scale = min(half_w / w, half_h / h) * 0.9
    resized = cv2.resize(img, (int(w * scale), int(h * scale)))
    surf = pygame.surfarray.make_surface(np.transpose(resized, (1, 0, 2)))
    x = px + (half_w - resized.shape[1]) // 2
    y = py + (half_h - resized.shape[0]) // 2
    screen.blit(surf, (x, y))
    screen.blit(font.render(label, True, (255, 255, 0)), (px + 8, py + 4))

if cx >= 0:
    msg = f"Ball: ({cx}, {cy})  area={area:.0f} px"
else:
    msg = "No ball detected — check contrast / try THRESH_BINARY_INV"
screen.blit(font.render(msg, True, (0, 255, 0)), (10, sh - 30))
pygame.display.flip()

t = time.time()
running = True
while running and (time.time() - t) < 15:
    for ev in pygame.event.get():
        if ev.type in (pygame.KEYDOWN, pygame.QUIT):
            running = False
pygame.quit()
PY

Checkpoint

The display shows a 2×2 grid: raw camera, grayscale, binary threshold, and detection result with green contour and red centroid dot. The ball's (x, y) coordinates are shown at the bottom. Press any key to close.

5. Robust Detection: Morphology and Circularity (Optional)

Concept: With a high-contrast setup (white ball on matte black), simple thresholding works cleanly. But with glossy surfaces, side lighting, or shadows, thresholding picks up artifacts. Two techniques clean this up: morphological opening removes thin noise and shadow bridges, and circularity filtering keeps only round contours.

Warning

If your setup uses a matte black surface with a white ball and even lighting, you can skip this section — Section 6 lets you toggle these stages on/off to see for yourself whether they help. Come back here if you need to understand how they work.

This script shows the effect of each stage so you can see how morphology cleans up the binary image.

python3 - <<'PY'
import cv2, numpy as np, time, math
from picamera2 import Picamera2
import pygame

picam2 = Picamera2()
config = picam2.create_preview_configuration(main={"size": (640, 480), "format": "RGB888"})
picam2.configure(config)
picam2.start()
time.sleep(1)
frame = picam2.capture_array()
picam2.stop()

gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
blurred = cv2.GaussianBlur(gray, (9, 9), 0)

# Threshold (use BINARY_INV for dark ball on light surface)
_, binary_raw = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Morphological opening: erode then dilate — removes small noise and shadow bridges
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
binary_clean = cv2.morphologyEx(binary_raw, cv2.MORPH_OPEN, kernel)

# Find contours on cleaned image and filter by circularity
contours, _ = cv2.findContours(binary_clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

result = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
found = False
for c in sorted(contours, key=cv2.contourArea, reverse=True):
    area = cv2.contourArea(c)
    if area < 300:
        continue
    perimeter = cv2.arcLength(c, True)
    if perimeter == 0:
        continue
    circularity = 4 * math.pi * area / (perimeter * perimeter)

    # Draw all contours with their circularity value
    color = (0, 255, 0) if circularity > 0.6 else (0, 0, 255)
    cv2.drawContours(result, [c], -1, color, 2)
    M = cv2.moments(c)
    if M["m00"] > 0:
        cx, cy = int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"])
        cv2.putText(result, f"C={circularity:.2f}", (cx - 30, cy - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
        if circularity > 0.6 and not found:
            cv2.circle(result, (cx, cy), 8, (255, 0, 0), -1)
            print(f"Ball: ({cx},{cy}) area={area:.0f} circ={circularity:.2f}")
            found = True

if not found:
    print("No round contour found")

# Display: 2x2 — raw | raw threshold | after morphology | detection with circularity
pygame.init()
info = pygame.display.Info()
sw, sh = info.current_w, info.current_h
screen = pygame.display.set_mode((sw, sh), pygame.FULLSCREEN)
pygame.mouse.set_visible(False)
font = pygame.font.SysFont("monospace", 16)
screen.fill((0, 0, 0))

half_w, half_h = sw // 2, sh // 2
panels = [
    (frame, "Raw"),
    (cv2.cvtColor(binary_raw, cv2.COLOR_GRAY2RGB), "Threshold (raw)"),
    (cv2.cvtColor(binary_clean, cv2.COLOR_GRAY2RGB), "After morphology"),
    (cv2.cvtColor(result, cv2.COLOR_BGR2RGB), "Circularity filter"),
]
positions = [(0, 0), (half_w, 0), (0, half_h), (half_w, half_h)]

for (img, label), (px, py) in zip(panels, positions):
    h, w = img.shape[:2]
    scale = min(half_w / w, half_h / h) * 0.9
    resized = cv2.resize(img, (int(w * scale), int(h * scale)))
    surf = pygame.surfarray.make_surface(np.transpose(resized, (1, 0, 2)))
    x = px + (half_w - resized.shape[1]) // 2
    y = py + (half_h - resized.shape[0]) // 2
    screen.blit(surf, (x, y))
    screen.blit(font.render(label, True, (255, 255, 0)), (px + 8, py + 4))

screen.blit(font.render("Green=round (C>0.6)  Red=not round  Blue dot=ball",
            True, (0, 255, 0)), (10, sh - 26))
pygame.display.flip()

t = time.time()
running = True
while running and (time.time() - t) < 20:
    for ev in pygame.event.get():
        if ev.type in (pygame.KEYDOWN, pygame.QUIT):
            running = False
pygame.quit()
PY

Checkpoint

Compare the top-right (raw threshold) with the bottom-left (after morphology). Shadow blobs and thin noise should be gone after morphological opening. In the bottom-right, green contours are round (circularity > 0.6) and red contours are not — only the ball should be green.

Stuck?

Shadow still connected to ball — increase the kernel size from (15, 15) to (21, 21). Larger kernels remove thicker bridges but may also shrink the ball too much.
Ball disappears after morphology — the kernel is too large relative to the ball. Reduce it or move the camera closer so the ball appears larger in the frame.
Bright ball on dark surface? — change THRESH_BINARY_INV to THRESH_BINARY.

6. Live Detection with FPS Measurement

Concept: Run the detection pipeline in a loop with toggleable stages so you can see the FPS impact of each processing step. With a high-contrast setup (white ball on matte black surface), the minimal pipeline — threshold + largest contour — is fastest. Morphology and circularity filtering add robustness at the cost of frame rate.

Create ball_detect.py:

import cv2, numpy as np, time, math
from picamera2 import Picamera2
import pygame

# ── Camera ──
picam2 = Picamera2()
config = picam2.create_preview_configuration(main={"size": (320, 240), "format": "RGB888"})
picam2.configure(config)
picam2.start()

# ── Display ──
pygame.init()
info = pygame.display.Info()
screen_w, screen_h = info.current_w, info.current_h
screen = pygame.display.set_mode((screen_w, screen_h), pygame.FULLSCREEN)
pygame.mouse.set_visible(False)
font = pygame.font.SysFont("monospace", 16)

# Detection parameters (toggle with keyboard)
INVERT = False          # i: toggle for dark ball on light surface
MIN_AREA = 300          # minimum contour area (pixels)
MIN_CIRCULARITY = 0.6   # 1.0 = perfect circle
USE_BLUR = True         # b: toggle Gaussian blur
USE_MORPH = False       # m: toggle morphological opening (off by default)
USE_CIRC = False        # c: toggle circularity filter (off by default)
MORPH_SIZE = 11         # +/-: adjust morphological kernel size

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (MORPH_SIZE, MORPH_SIZE))

# FPS tracking
frame_times = []
FPS_WINDOW = 30         # rolling average over N frames

def to_surface(img):
    """Convert an OpenCV image (BGR or gray) to a pygame surface."""
    if len(img.shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    elif img.shape[2] == 3:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    return pygame.surfarray.make_surface(np.transpose(img, (1, 0, 2)))

running = True
while running:
    t0 = time.time()

    # ── Capture ──
    frame_rgb = picam2.capture_array()
    frame_vis = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2BGR)

    # ── Threshold ──
    gray = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2GRAY)
    if USE_BLUR:
        gray = cv2.GaussianBlur(gray, (9, 9), 0)
    thresh_type = cv2.THRESH_BINARY_INV if INVERT else cv2.THRESH_BINARY
    _, binary_raw = cv2.threshold(gray, 0, 255, thresh_type + cv2.THRESH_OTSU)

    # ── Morphological opening (optional — for shadow removal) ──
    if USE_MORPH:
        binary_clean = cv2.morphologyEx(binary_raw, cv2.MORPH_OPEN, kernel)
    else:
        binary_clean = binary_raw

    # ── Contours ──
    contours, _ = cv2.findContours(binary_clean, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)

    cx, cy, area, circ = -1, -1, 0, 0.0
    for c in sorted(contours, key=cv2.contourArea, reverse=True):
        a = cv2.contourArea(c)
        if a < MIN_AREA:
            continue

        # Circularity filter (optional)
        if USE_CIRC:
            p = cv2.arcLength(c, True)
            if p == 0:
                continue
            circularity = 4 * math.pi * a / (p * p)
            color = (0, 255, 0) if circularity > MIN_CIRCULARITY else (0, 0, 255)
            cv2.drawContours(frame_vis, [c], -1, color, 2)
            if circularity <= MIN_CIRCULARITY:
                continue
        else:
            circularity = -1.0
            cv2.drawContours(frame_vis, [c], -1, (0, 255, 0), 2)

        if cx < 0:
            M = cv2.moments(c)
            if M["m00"] > 0:
                cx = int(M["m10"] / M["m00"])
                cy = int(M["m01"] / M["m00"])
                area, circ = a, circularity
                cv2.circle(frame_vis, (cx, cy), 6, (255, 0, 0), -1)
        if not USE_CIRC:
            break  # without circularity filter, just take the largest

    dt_ms = (time.time() - t0) * 1000
    frame_times.append(dt_ms)
    if len(frame_times) > FPS_WINDOW:
        frame_times.pop(0)
    avg_ms = sum(frame_times) / len(frame_times)
    fps = 1000 / avg_ms if avg_ms > 0 else 0

    # ── Display: 2x2 grid ──
    screen.fill((0, 0, 0))
    half_w, half_h = screen_w // 2, screen_h // 2

    morph_label = f"Morphology k={MORPH_SIZE}" if USE_MORPH else "Morphology OFF"
    panels = [
        (frame_rgb, "Raw"),
        (binary_raw, "Threshold"),
        (binary_clean, morph_label),
        (frame_vis, "Detection"),
    ]
    positions = [(0, 0), (half_w, 0), (0, half_h), (half_w, half_h)]

    for (img, label), (px, py) in zip(panels, positions):
        h, w = img.shape[:2]
        scale = min(half_w / w, half_h / h) * 0.9
        resized = cv2.resize(img, (int(w * scale), int(h * scale)))
        surf = to_surface(resized)
        x = px + (half_w - resized.shape[1]) // 2
        y = py + (half_h - resized.shape[0]) // 2
        screen.blit(surf, (x, y))
        screen.blit(font.render(label, True, (255, 255, 0)), (px + 8, py + 4))

    # ── Overlay: FPS + pipeline state + position ──
    pipeline = f"[blur:{'ON' if USE_BLUR else'OFF'}  morph:{'ON' if USE_MORPH else'OFF'}  circ:{'ON' if USE_CIRC else'OFF'}]"
    fps_text = f"{avg_ms:.1f} ms/frame  {fps:.1f} FPS  {pipeline}"
    screen.blit(font.render(fps_text, True, (0, 255, 0)), (10, screen_h - 48))

    if cx >= 0:
        circ_str = f" circ={circ:.2f}" if circ >= 0 else ""
        info_text = f"Ball: ({cx},{cy}) area={area:.0f}{circ_str}"
    else:
        info_text = "No ball  (i=invert b=blur m=morph c=circ +/-=kernel q=quit)"
    screen.blit(font.render(info_text, True, (0, 255, 0)), (10, screen_h - 26))

    pygame.display.flip()

    # ── Events ──
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        if event.type == pygame.KEYDOWN:
            if event.key == pygame.K_q:
                running = False
            elif event.key == pygame.K_i:
                INVERT = not INVERT
                print(f"Invert: {INVERT}")
            elif event.key == pygame.K_b:
                USE_BLUR = not USE_BLUR
                frame_times.clear()
                print(f"Blur: {USE_BLUR}")
            elif event.key == pygame.K_m:
                USE_MORPH = not USE_MORPH
                frame_times.clear()
                print(f"Morphology: {USE_MORPH}")
            elif event.key == pygame.K_c:
                USE_CIRC = not USE_CIRC
                frame_times.clear()
                print(f"Circularity: {USE_CIRC}")
            elif event.key in (pygame.K_PLUS, pygame.K_EQUALS, pygame.K_KP_PLUS):
                MORPH_SIZE = min(41, MORPH_SIZE + 2)
                kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                                   (MORPH_SIZE, MORPH_SIZE))
                print(f"Kernel: {MORPH_SIZE}")
            elif event.key in (pygame.K_MINUS, pygame.K_KP_MINUS):
                MORPH_SIZE = max(3, MORPH_SIZE - 2)
                kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                                   (MORPH_SIZE, MORPH_SIZE))
                print(f"Kernel: {MORPH_SIZE}")

picam2.stop()
pygame.quit()

Run it:

python3 ball_detect.py

Controls: - q — quit - i — toggle invert (bright vs dark ball) - b — toggle blur on/off - m — toggle morphological opening on/off - c — toggle circularity filter on/off - +/- — adjust morphological kernel size

Understanding the FPS impact

Toggle each stage on/off and watch the FPS change. Typical results on a Pi 4 at 320×240:

Pipeline	Stages	FPS (approx)
Minimal	threshold + largest contour	~25-30
+ blur	threshold + blur + largest contour	~20-25
+ morphology	threshold + blur + morph + largest	~15-20
+ circularity	threshold + blur + morph + circ filter	~12-18

With a high-contrast setup (white ball on matte black), the minimal pipeline works perfectly — Otsu threshold cleanly separates ball from background with no shadows to worry about. Each additional stage costs frame rate but adds robustness against shadows, reflections, or uneven lighting.

Design decision: For ball balancing, higher FPS means faster control loop updates and better stability. If your setup is clean (matte surface, even lighting), skip morphology and circularity — they cost ~30-40% FPS for no benefit. Add them only when the environment demands it (glossy surface, side lighting, shadows).

Checkpoint

The display shows a 2×2 grid with FPS counter and pipeline state at the bottom. Toggle stages with b, m, c and note the FPS change. With all stages OFF except threshold, you should get the highest FPS. With all ON, the lowest. Move the ball — the centroid should track it in all configurations.

Stuck?

Shadow detected as ball — press m to enable morphology, then + to increase kernel size
Ball not detected — press i to toggle invert mode. Also try - if the kernel is too large.
Multiple green contours — press c to enable circularity filtering. Adjust MIN_CIRCULARITY in the code if needed.

7. Measure Detection Accuracy and Throughput

Concept: For ball balancing, two things matter: accuracy (centroid stability when the ball is stationary) and throughput (how many frames per second the pipeline processes). This script benchmarks both, with and without morphology, so you can make an informed tradeoff.

python3 - <<'PY'
import cv2, numpy as np, time, math
from picamera2 import Picamera2

picam2 = Picamera2()
config = picam2.create_preview_configuration(main={"size": (320, 240), "format": "RGB888"})
picam2.configure(config)
picam2.start()
time.sleep(1)

N = 100
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))

def run_pipeline(label, use_blur=True, use_morph=False):
    positions = []
    t0 = time.time()
    for _ in range(N):
        frame = picam2.capture_array()
        gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
        if use_blur:
            gray = cv2.GaussianBlur(gray, (9, 9), 0)
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        if use_morph:
            binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            ball = max(contours, key=cv2.contourArea)
            M = cv2.moments(ball)
            if M["m00"] > 200:
                positions.append((M["m10"]/M["m00"], M["m01"]/M["m00"]))
    elapsed = time.time() - t0
    fps = N / elapsed
    avg_ms = elapsed / N * 1000

    if len(positions) >= 10:
        xs = np.array([p[0] for p in positions])
        ys = np.array([p[1] for p in positions])
        jitter = np.sqrt((xs-xs.mean())**2 + (ys-ys.mean())**2).max()
        print(f"  {label}:")
        print(f"    {avg_ms:.1f} ms/frame  {fps:.1f} FPS")
        print(f"    X std={xs.std():.2f}px  Y std={ys.std():.2f}px  jitter={jitter:.2f}px")
        print(f"    Detections: {len(positions)}/{N}")
    else:
        print(f"  {label}: {avg_ms:.1f} ms/frame  {fps:.1f} FPS  (too few detections)")

print(f"Benchmarking {N} frames each — keep ball stationary\n")
run_pipeline("Minimal (threshold only)",       use_blur=False, use_morph=False)
run_pipeline("+ blur",                         use_blur=True,  use_morph=False)
run_pipeline("+ blur + morphology",            use_blur=True,  use_morph=True)

picam2.stop()
PY

Checkpoint

Compare FPS and accuracy across the three pipeline configurations. Typical results:

Pipeline	FPS	Centroid std	Jitter
Threshold only	~28	~0.5 px	~1.5 px
+ blur	~22	~0.3 px	~1.0 px
+ blur + morph	~17	~0.3 px	~1.0 px

With a high-contrast setup, blur improves accuracy (smoother edges) but morphology adds no benefit — the threshold is already clean. On a glossy surface with shadows, morphology makes a significant difference in detection reliability.

The tradeoff for ball balancing

A PID controller updates once per camera frame. Higher FPS = faster control loop = better disturbance rejection. At 320×240:

28 FPS (minimal) = 36 ms control period
17 FPS (full pipeline) = 59 ms control period

That's a 64% difference in control bandwidth. For a fast-moving ball, the minimal pipeline may keep up where the full pipeline can't. Choose the simplest pipeline that works for your setup — more processing is not always better.

What Just Happened?

You built a ball position detection pipeline with configurable complexity:

Camera → Grayscale → [Blur] → Threshold (Otsu) → [Morphology] → Contours
    → [Circularity filter] → Centroid (x, y)

Stages in brackets are optional — with a high-contrast setup they add cost without benefit. You measured the FPS impact of each stage and can now make an informed decision: use the simplest pipeline that works for your environment.

This is the vision front-end for the 2D Plate Balancing project. The centroid output feeds directly into the PID controller as the measured position. Higher FPS = faster control updates = better stability.

The same thresholding technique works for line detection — a dark line on a light surface produces the same kind of binary image. Instead of finding the largest blob, you find the line's centroid at the bottom of the frame to steer a robot.

Challenges

Challenge 1: Adaptive Threshold

Replace the global threshold with cv2.adaptiveThreshold() which handles uneven lighting better:

binary = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 31, 10)

Compare the results with the global Otsu threshold under different lighting conditions.

Challenge 2: Minimum Enclosing Circle

Instead of using the contour centroid, fit a minimum enclosing circle to get both position and radius (which gives you the ball's apparent size):

(cx, cy), radius = cv2.minEnclosingCircle(ball)

How does the radius change as the ball moves closer to the edge of the frame?

Challenge 3: Line Detection Preview

Place a dark strip of tape on a light surface. Modify the detection to find the tape's centroid in the bottom third of the frame only (crop the image before thresholding). This is the basis for a line-following robot.

Deliverable

ball_detect.py — working live detection with toggleable pipeline stages and FPS display
Benchmark results: FPS and centroid accuracy for each pipeline configuration
Brief explanation: Which pipeline stages does your setup need and why? What is the FPS cost of each?

Course Overview | Previous: ← Camera Pipeline | Next: 2D Plate Balancing →