Tag Archives: Pi 5

AI on the Edge LESSON 27: Track Objects of Interest in OpenCV Using Contours

June 25, 2026 admin

Hey everyone, Paul McWhorter here from TopTechBoy.com. Welcome back to our channel, where we learn to build real, intelligent systems on edge hardware. Grab yourself a nice hot cup of coffee or a cold glass of iced tea, because today we are taking a massive leap forward in our computer vision journey.

Up until now, we have learned how to configure our cameras, calculate frame rates smoothly, and isolate specific objects based on color using the HSV color space. We built beautiful masks and composite images that show only our target color. But let’s be honest with ourselves: a mask is just a collection of white pixels on a black screen. The computer doesn’t actually know where the object is, how big it is, or how to follow it if it moves.

In this lesson, we are going to fix that. We are going to teach the machine to look at our mask, isolate the single biggest shape of interest, ignore the background noise, and draw a real-time bounding tracking box around it. This is true object tracking.

The Core Concept: What is a Contour?

Think of a contour as a mathematical boundary line. When OpenCV looks at a binary mask (where your target object is white and everything else is black), a contour is the continuous line that traces the outer edge of that white shape.

The beauty of contours is that they turn a chaotic cloud of thousands of isolated pixels into structured, manageable vector shapes. Once OpenCV finds these shapes, it can calculate their physical properties, such as their area, perimeter, and exact center.
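
To make those properties concrete, here is a minimal sketch in plain Python, with no camera and no OpenCV required. It treats a contour as an ordered list of (x, y) corner points and computes the enclosed area with the shoelace formula, plus the perimeter. The helper names and the sample square are made up for illustration; in the real program, cv2.contourArea and cv2.arcLength do this work for you.

```python
import math

def polygon_area(points):
    """Area enclosed by an ordered list of (x, y) points (shoelace formula)."""
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the shape
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def polygon_perimeter(points):
    """Sum of the straight-line distances between consecutive points."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

# Hypothetical contour: the four corner pixels of a 40x40 block of white pixels
square = [(20, 30), (20, 69), (59, 69), (59, 30)]

print("Area:", polygon_area(square))
print("Perimeter:", polygon_perimeter(square))
```

Notice that the corner pixels of a 40x40 block are only 39 pixels apart, so the area measured through the corner points comes out a little smaller than the raw count of white pixels. For tracking, that small difference does not matter; we only need a consistent way to compare shapes.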

The Three Steps to Algorithmic Object Tracking

To turn a raw camera frame into a fully tracked target, our script follows a strict three-part engineering pipeline inside our main execution loop:

1. Extracting Every Boundary

First, we pass our binary mask into OpenCV’s contour detection engine, cv2.findContours. We configure it to use external retrieval (cv2.RETR_EXTERNAL), meaning it will ignore any holes inside the object and only trace the outermost boundary. It returns every contour it finds in the frame.

2. Hunting for the Largest Target

In the real world, your camera view is never perfectly clean. Even with an excellent HSV color mask, you will get random speckles, reflections, or background noise showing up as tiny white dots on your mask. If we tried to track everything, our program would lose its mind. To solve this, we use Python’s built-in max() function, with cv2.contourArea as the key, to scan our list of contours and extract the single largest one by area.

3. Setting an Area Noise Floor

Even after finding the largest contour, what happens if your object completely leaves the camera view? The largest remaining “object” might be a tiny speck of static noise on the edge of the screen. To prevent our tracking box from jumping around erratically, we establish a strict threshold, a noise floor. If the area of the largest contour isn’t big enough to confidently be our target, we ignore it completely.

Drawing the Bounding Box

Once we have successfully isolated our valid, large contour, we don’t just want to draw a messy, squiggly line around it. We want clean coordinates that an automation system or a robotic pan-tilt kit could actually use to follow the target.

We pass our largest contour into a bounding rectangle function. OpenCV automatically calculates the exact mathematical limits of that shape and returns four precise numbers:

- X: The horizontal pixel coordinate of the box’s left edge.
- Y: The vertical pixel coordinate of the box’s top edge.
- W: The total width of the object in pixels.
- H: The total height of the object in pixels.

With those four dimensions locked down, we use a standard drawing function to overlay a crisp, green rectangle directly onto our live color camera feed. Now, as you move your object around the room, the box follows it dynamically, tracking its position in real time at high frame rates.
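
If you later want to feed those four numbers to something like a pan-tilt kit, a common first step is to turn the box into a center point and measure how far that point sits from the middle of the frame. The sketch below is one possible way to do that; the function name box_to_error and the -1 to +1 scaling are assumptions of mine, not something we built in this lesson.

```python
def box_to_error(x, y, w, h, frame_w=1280, frame_h=720):
    """Convert a bounding box into its center and a normalized offset.

    The offset is 0 when the box is centered in the frame, and approaches
    -1 or +1 as the box center nears a frame edge.
    """
    cx = x + w / 2
    cy = y + h / 2
    err_x = (cx - frame_w / 2) / (frame_w / 2)
    err_y = (cy - frame_h / 2) / (frame_h / 2)
    return (cx, cy), (err_x, err_y)

# A box sitting dead center in a 1280x720 frame
print(box_to_error(600, 320, 80, 80))

# A box in the upper-left region of the frame
print(box_to_error(0, 0, 128, 72))
```

A negative horizontal error means the target is left of center and a negative vertical error means it is above center, since pixel rows count downward from the top of the image.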

Note that you will have to tune the LC and UC parameters (the lower and upper HSV bounds passed to cv2.inRange) for your object of interest, as we showed last week.
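
As a reminder of what those two bounds do, here is a small NumPy sketch of the test that cv2.inRange applies: a pixel turns white in the mask only if all three of its HSV channels fall between the lower and upper bound, inclusive. The in_range helper and the three sample pixels are made up for illustration. Keep in mind that for 8-bit images OpenCV stores hue on a 0 to 179 scale.

```python
import numpy as np

LC = (25, 100, 100)  # lower HSV bound, same values as the lesson code
UC = (32, 255, 255)  # upper HSV bound

def in_range(hsv, lower, upper):
    """Return a 0/255 mask: 255 where every channel is within [lower, upper]."""
    lower = np.array(lower, dtype=np.uint8)
    upper = np.array(upper, dtype=np.uint8)
    inside = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# Three hypothetical HSV pixels laid out as a 1x3 image:
# in range, saturation too low, hue too high
hsv = np.array([[[28, 200, 200], [28, 50, 200], [40, 200, 200]]], dtype=np.uint8)

print(in_range(hsv, LC, UC).tolist())  # [[255, 0, 0]]
```

If your object is missing from the mask, hover the mouse over it in the Camera window, read the reported Hue, Sat, and Val, and widen LC and UC until those values fall inside the range.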

import cv2
import time
from picamera2 import Picamera2
from fusion_hat.pwm import PWM
piCam = Picamera2()
W=1280
H=720
tStart = time.time()
fps = 0

redPin = 5
greenPin = 6
bluePin = 7
redLED = PWM(redPin)
greenLED = PWM(greenPin)
blueLED = PWM(bluePin)

RES = (W,H)
piCam.preview_configuration.main.size = RES
piCam.preview_configuration.main.format = "RGB888"
piCam.preview_configuration.controls.FrameRate=60
piCam.preview_configuration.align()
piCam.configure("preview")
piCam.start()

textLowerLeft = (int(W*.01),int(H*.06))
fontFace = cv2.FONT_HERSHEY_SIMPLEX
fontThickness = int(W/425)
fontScale = H*.0015
fontColor = (0,0,255)
xPos = 0
textLowerLeft1 = (int(W*.01),int(H*.06)*2)
textLowerLeft2 = (int(W*.01),int(H*.06)*3)
yPos = 0
valR = 0
valG = 0
valB = 0

Hue = 0
Sat = 0
Val = 0

LC = (25,100,100)
UC = (32,255,255)

frame = None
def mouseAction(event, x, y, flags, param):
    global frame, xPos, yPos, Hue, Sat, Val
    if event == 0:
        xPos = x
        yPos = y
        if frame is not None:
            valB, valG, valR = frame[y,x]
            redLED.pulse_width_percent(int(valR/255*100))
            greenLED.pulse_width_percent(int(valG/255*100/2))
            blueLED.pulse_width_percent(int(valB/255*100/4))
            frameHSV = cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)
            Hue, Sat, Val =frameHSV[y,x]
cv2.namedWindow('Camera',cv2.WINDOW_GUI_NORMAL)
cv2.moveWindow('Camera',0,65)
cv2.resizeWindow('Camera',W,H)

cv2.namedWindow('Mask',cv2.WINDOW_GUI_NORMAL)
cv2.moveWindow('Mask',W,65)
cv2.resizeWindow('Mask',int(W/2),int(H/2))

cv2.namedWindow('Composite',cv2.WINDOW_GUI_NORMAL)
cv2.moveWindow('Composite',W,65+int(H/2)+25)
cv2.resizeWindow('Composite',int(W/2),int(H/2))

cv2.setMouseCallback('Camera',mouseAction)

while True:
    deltaT = time.time() - tStart
    tStart=time.time()
    fps = fps*.95 + (1/deltaT)*.05
    frame= piCam.capture_array()
    frame=cv2.flip(frame,-1)
    
    frameHSV = cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)
    mask=cv2.inRange(frameHSV,LC,UC)
    composite = cv2.bitwise_and(frame, frame, mask=mask)
    
    # Step 1: trace the outer boundary of every white region in the mask
    contours, _ =cv2.findContours(mask,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        #cv2.drawContours(frame,contours,-1,(255,0,0),3)
        # Step 2: keep only the contour with the largest area
        largestContour = max(contours, key = cv2.contourArea)
        area = cv2.contourArea(largestContour)
        # Step 3: noise floor, ignore anything with an area of 150 or less
        if area>150:
            #cv2.drawContours(frame,[largestContour],-1,(255,0,0),3)
            x, y, w, h = cv2.boundingRect(largestContour)
            cv2.rectangle(frame, (x,y),(x+w,y+h),(0,255,0),3)
    myText = "FPS: "+str(round(fps,1))
    cv2.putText(frame,myText,textLowerLeft,fontFace,fontScale,fontColor,fontThickness)
    
    text1 = "Mouse Pos: "+str((xPos,yPos))
    text2 = "Pixel Color: "+str((Hue,Sat,Val))
    cv2.putText(frame,text1,textLowerLeft1,fontFace,fontScale,fontColor,fontThickness)    
    cv2.putText(frame,text2,textLowerLeft2,fontFace,fontScale,fontColor,fontThickness)    
    cv2.imshow("Camera", frame)
    cv2.imshow("Composite",composite)
    cv2.imshow("Mask",mask)

    if cv2.waitKey(1)==ord('q'):
        break
cv2.destroyAllWindows()
redLED.pulse_width_percent(0)
greenLED.pulse_width_percent(0)
blueLED.pulse_width_percent(0)
print('Program Terminated')

AI On the Edge, Raspberry Pi

AI on the Edge LESSON 24: Processing Mouse Events in OpenCV on Pi 5

June 12, 2026 admin

Welcome back, everyone! In our last lesson, we learned how to use matrix slicing to hardcode a Region of Interest (ROI) into our frames. That was a great static approach, but today we are taking interactivity to a whole new level.

In this lesson, you are going to learn how to catch Mouse Events inside your OpenCV windows. Instead of guess-and-checking coordinates in your code, you will be able to click anywhere on your live video stream to instantly grab the precise (x, y) pixel coordinates and read the exact color value of the pixel right under your mouse pointer. This is the foundational mechanic you need to build interactive, point-and-click AI applications.

The Core Concept: Mouse Callbacks and Global Frames

To listen for mouse clicks or movement, OpenCV uses what is called a Callback Function. You tell OpenCV: “Hey, keep an eye on this specific window. If the user does anything with the mouse inside it, instantly jump over to my custom function and tell me what happened.”

We set this up using:

cv2.setMouseCallback('Camera', mouseAction)
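
To see what OpenCV hands to your function, here is a minimal sketch you can run without a camera or a window. It defines a callback with the same five-parameter signature and then calls it by hand, the way OpenCV would. The integer codes mirror the documented cv2.EVENT_* constants (for example, 0 is cv2.EVENT_MOUSEMOVE); the EVENT_NAMES table and the event_log list are my own illustration.

```python
# These values mirror the cv2.EVENT_* constants
EVENT_MOUSEMOVE = 0
EVENT_LBUTTONDOWN = 1
EVENT_RBUTTONDOWN = 2
EVENT_LBUTTONUP = 4
EVENT_RBUTTONUP = 5

EVENT_NAMES = {
    EVENT_MOUSEMOVE: "move",
    EVENT_LBUTTONDOWN: "left down",
    EVENT_RBUTTONDOWN: "right down",
    EVENT_LBUTTONUP: "left up",
    EVENT_RBUTTONUP: "right up",
}

event_log = []

def mouseAction(event, x, y, flags, param):
    """Same signature OpenCV expects: event code, position, flags, user data."""
    name = EVENT_NAMES.get(event, "other")
    event_log.append((name, x, y))

# Simulate what OpenCV does when the user drags across the window
mouseAction(EVENT_LBUTTONDOWN, 100, 50, 0, None)
mouseAction(EVENT_MOUSEMOVE, 200, 120, 0, None)
mouseAction(EVENT_LBUTTONUP, 300, 200, 0, None)

for entry in event_log:
    print(entry)
```

In a real program, comparing against the named constants, such as cv2.EVENT_LBUTTONDOWN, reads better than comparing against bare numbers.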

The [y, x] Matrix Inversion Trap

There is a massive mathematical trap that catches almost every beginner when they start mapping mouse clicks to image matrices:

- OpenCV Mouse Coordinates: When you move your mouse, OpenCV tracks position using standard Cartesian geometry: (x, y), where x is the column (horizontal distance from the left) and y is the row (vertical distance from the top).
- NumPy Array Coordinates: When you plug those numbers into your image array to inspect a pixel, NumPy expects matrix indexing: [row, column].

Because rows correspond to the height (y) and columns correspond to the width (x), you must always invert the coordinates when accessing the frame array:

frame[y, x]

If you try to pass frame[x, y], your program will either crash with an “index out of bounds” error or return data from the completely wrong part of the image!
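
Both failure modes are easy to reproduce on a blank NumPy array the same shape as our frame. This is a small sketch, not part of the lesson program; the marker pixel and the sample coordinates are arbitrary values chosen for the demonstration.

```python
import numpy as np

H, W = 720, 1280
frame = np.zeros((H, W, 3), dtype=np.uint8)  # rows first, then columns

# Mark one pixel at mouse position x=300, y=100
x, y = 300, 100
frame[y, x] = (10, 20, 30)

print("Correct  frame[y, x]:", frame[y, x].tolist())  # the marked pixel
print("Swapped  frame[x, y]:", frame[x, y].tolist())  # a different pixel, no error

# Far to the right, the swapped index runs past the 720 rows and crashes
x, y = 1000, 100
try:
    frame[x, y]
    crashed = False
except IndexError:
    crashed = True
print("Swapped index crashed:", crashed)
```

The silent case is the dangerous one: for small x the swapped index is still valid, so the program keeps running and quietly reports the wrong pixel.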

The Python Code Developed in This Lesson

Here is the complete, streamlined script we built during today’s tutorial. Copy this code into your workspace on your Raspberry Pi 5, fire it up, and watch your terminal output as you click around the video window.

We first developed this program as a simple example of processing mouse events. It prints each detected event, along with the cursor position and the pixel value at that position:

import cv2
import time
from picamera2 import Picamera2
piCam = Picamera2()
W=1280
H=720
tStart = time.time()
fps = 0
RES = (W,H)
piCam.preview_configuration.main.size = RES
piCam.preview_configuration.main.format = "RGB888"
piCam.preview_configuration.controls.FrameRate=60
piCam.preview_configuration.align()
piCam.configure("preview")
piCam.start()

textLowerLeft = (int(W*.01),int(H*.05))
fontFace = cv2.FONT_HERSHEY_SIMPLEX
fontThickness = int(W/425)
fontScale = H*.0015
fontColor = (0,0,255)
frame = None
def mouseAction(event, x, y, flags, param):
    global frame
    if frame is not None:
        print("Event: ",event, (x,y), frame[y,x])

cv2.namedWindow('Camera')
cv2.setMouseCallback('Camera',mouseAction)

while True:
    deltaT = time.time() - tStart
    tStart=time.time()
    fps = fps*.95 + (1/deltaT)*.05
    frame= piCam.capture_array()
    frame=cv2.flip(frame,-1)
    myText = "FPS: "+str(round(fps,1))
    cv2.putText(frame,myText,textLowerLeft,fontFace,fontScale,fontColor,fontThickness)
    cv2.imshow("Camera", frame)
    cv2.moveWindow("Camera",0,60)
    if cv2.waitKey(1)==ord('q'):
        break
cv2.destroyAllWindows()
print('Program Terminated')

To make the program more useful, we developed this code, which monitors the position of the mouse cursor and reports the color of the pixel the mouse points at. The values are printed as labels on the OpenCV frame:

import cv2
import time
from picamera2 import Picamera2
piCam = Picamera2()
W=1280
H=720
tStart = time.time()
fps = 0
RES = (W,H)
piCam.preview_configuration.main.size = RES
piCam.preview_configuration.main.format = "RGB888"
piCam.preview_configuration.controls.FrameRate=60
piCam.preview_configuration.align()
piCam.configure("preview")
piCam.start()

textLowerLeft = (int(W*.01),int(H*.06))
fontFace = cv2.FONT_HERSHEY_SIMPLEX
fontThickness = int(W/425)
fontScale = H*.0015
fontColor = (0,0,255)
xPos = 0
textLowerLeft1 = (int(W*.01),int(H*.06)*2)
textLowerLeft2 = (int(W*.01),int(H*.06)*3)
yPos = 0
valR = 0
valG = 0
valB = 0
frame = None
def mouseAction(event, x, y, flags, param):
    global frame, xPos, yPos, valR, valG, valB
    if event == 0:
        xPos = x
        yPos = y
        if frame is not None:
            valB, valG, valR = frame[y,x]

cv2.namedWindow('Camera')
cv2.setMouseCallback('Camera',mouseAction)

while True:
    deltaT = time.time() - tStart
    tStart=time.time()
    fps = fps*.95 + (1/deltaT)*.05
    frame= piCam.capture_array()
    frame=cv2.flip(frame,-1)
    myText = "FPS: "+str(round(fps,1))
    cv2.putText(frame,myText,textLowerLeft,fontFace,fontScale,fontColor,fontThickness)
    
    text1 = "Mouse Pos: "+str((xPos,yPos))
    text2 = "Pixel Color: "+str((valR,valG,valB))
    cv2.putText(frame,text1,textLowerLeft1,fontFace,fontScale,fontColor,fontThickness)    
    cv2.putText(frame,text2,textLowerLeft2,fontFace,fontScale,fontColor,fontThickness)    
    cv2.imshow("Camera", frame)
    cv2.moveWindow("Camera",0,60)
    if cv2.waitKey(1)==ord('q'):
        break
cv2.destroyAllWindows()
print('Program Terminated')

We can now take the project to the next level by setting the LED color to the color pointed at by the cursor in the OpenCV window. We will be using the standard circuit we have used in the earlier lessons.

Fusion Hat Circuit Diagram — This is the circuit we will use moving forward in the class

This is the code we developed to set the LED color based on the color of the pixel under the cursor in the OpenCV window.

import cv2
import time
from picamera2 import Picamera2
from fusion_hat.pwm import PWM
piCam = Picamera2()
W=1280
H=720
tStart = time.time()
fps = 0

redPin = 5
greenPin = 6
bluePin = 7
redLED = PWM(redPin)
greenLED = PWM(greenPin)
blueLED = PWM(bluePin)

RES = (W,H)
piCam.preview_configuration.main.size = RES
piCam.preview_configuration.main.format = "RGB888"
piCam.preview_configuration.controls.FrameRate=60
piCam.preview_configuration.align()
piCam.configure("preview")
piCam.start()

textLowerLeft = (int(W*.01),int(H*.06))
fontFace = cv2.FONT_HERSHEY_SIMPLEX
fontThickness = int(W/425)
fontScale = H*.0015
fontColor = (0,0,255)
xPos = 0
textLowerLeft1 = (int(W*.01),int(H*.06)*2)
textLowerLeft2 = (int(W*.01),int(H*.06)*3)
yPos = 0
valR = 0
valG = 0
valB = 0
frame = None
def mouseAction(event, x, y, flags, param):
    global frame, xPos, yPos, valR, valG, valB
    if event == 0:
        xPos = x
        yPos = y
        if frame is not None:
            valB, valG, valR = frame[y,x]
            redLED.pulse_width_percent(int(valR/255*100))
            greenLED.pulse_width_percent(int(valG/255*100/2))
            blueLED.pulse_width_percent(int(valB/255*100/4))

cv2.namedWindow('Camera')
cv2.setMouseCallback('Camera',mouseAction)

while True:
    deltaT = time.time() - tStart
    tStart=time.time()
    fps = fps*.95 + (1/deltaT)*.05
    frame= piCam.capture_array()
    frame=cv2.flip(frame,-1)
    myText = "FPS: "+str(round(fps,1))
    cv2.putText(frame,myText,textLowerLeft,fontFace,fontScale,fontColor,fontThickness)
    
    text1 = "Mouse Pos: "+str((xPos,yPos))
    text2 = "Pixel Color: "+str((valR,valG,valB))
    cv2.putText(frame,text1,textLowerLeft1,fontFace,fontScale,fontColor,fontThickness)    
    cv2.putText(frame,text2,textLowerLeft2,fontFace,fontScale,fontColor,fontThickness)    
    cv2.imshow("Camera", frame)
    cv2.moveWindow("Camera",0,60)
    if cv2.waitKey(1)==ord('q'):
        break
cv2.destroyAllWindows()
redLED.pulse_width_percent(0)
greenLED.pulse_width_percent(0)
blueLED.pulse_width_percent(0)
print('Program Terminated')
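
The three pulse_width_percent lines in the callback are where a pixel value becomes an LED brightness. Pulled out on its own, the arithmetic looks like the sketch below. The function name pixel_to_duty is hypothetical; the formulas are copied from the listing. The listing scales green by 1/2 and blue by 1/4. A plausible reason is to balance how bright the three LED channels appear, but that is my reading and you should adjust the factors for your own LED.

```python
def pixel_to_duty(valB, valG, valR):
    """Map an 8-bit BGR pixel to (red, green, blue) PWM duty percentages."""
    red = int(valR / 255 * 100)         # full range: 0 to 100 percent
    green = int(valG / 255 * 100 / 2)   # capped at 50 percent
    blue = int(valB / 255 * 100 / 4)    # capped at 25 percent
    return red, green, blue

print(pixel_to_duty(255, 255, 255))  # pure white pixel
print(pixel_to_duty(0, 128, 255))    # an orange-ish pixel
print(pixel_to_duty(0, 0, 0))        # black pixel, LED off
```

Note the argument order: the frame stores pixels as blue, green, red, which is why the callback unpacks frame[y,x] into valB, valG, valR.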

Homework Assignment

Alright, it’s time to put this knowledge to work. Start by creating a text display under the FPS on the frame that shows the pixel location the mouse is pointing at and the RGB value at that position. Then, your homework assignment is to turn this simple reporting tool into an interactive, dynamic ROI selector:

- Start with your clean 1280×720 live camera stream.
- Modify your mouseAction callback function to look for specific mouse clicks.
- The Target Mechanic: When you Left-Click on the video window, store those specific coordinates as your upper-left corner. When you release the click, store those coordinates as your lower-right corner. As you are selecting, draw a live box outline over your ROI.
- Using those two dynamic coordinate sets, use matrix slicing to pull a clean Region of Interest (ROI) out of the frame and instantly display it in a completely separate, standalone window called “Target ROI”.
- Safety Requirement: Make sure your code can handle the two corners in any order without crashing (e.g., if a user releases the button higher or further left than where they pressed it, write the conditional logic to sort the indices properly before slicing).
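
As a hint for the safety requirement only, and not a full solution, here is one way the index sorting could look. The helper name sort_corners and the sample points are hypothetical; the mouse handling, the live box, and the extra window are still yours to write.

```python
import numpy as np

def sort_corners(p1, p2):
    """Return (left, top, right, bottom) no matter which corner came first."""
    (x1, y1), (x2, y2) = p1, p2
    return min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)

# Stand-in for a camera frame: 720 rows, 1280 columns, 3 color channels
frame = np.zeros((720, 1280, 3), dtype=np.uint8)

# The user pressed at the lower right and released at the upper left
left, top, right, bottom = sort_corners((900, 500), (400, 200))

# Rows (y) first, then columns (x)
roi = frame[top:bottom, left:right]
print(roi.shape)  # (300, 500, 3)
```

One more case worth handling yourself: if the press and release land on the same row or column, the slice is empty, so check the ROI size before you try to display it.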

Get your black coffee ready, write your logic step-by-step from scratch, and do not copy code you can’t explain. Post your homework solution video on YouTube and drop a link in the comments section below so I can see who is running with the big dogs!

OpenCV

AI on the Edge: Install and Run YOLO Object Detection on the Raspberry Pi 5

December 30, 2025 admin

In today’s lesson we will see just how far we can push things on the Raspberry Pi 5. I will show you how to install YOLO11 on the Pi, and then a simple program that runs YOLO11 under Python and OpenCV. The objective in today’s lesson is to see if the Pi 5, without a Hailo accelerator hat, has sufficient power to do useful object detection. We will not use an accelerator hat, but the work is computationally intensive, so you must use active cooling. The cooling fan we are using is sufficient to do the job, is low cost, and has a thin form factor that allows other hats to still fit on the Raspberry Pi 5. You can pick up the fan I am using HERE. Also, we are using an 8GB Pi 5. If you already have a Pi 5, it will probably work. The Pi 5 we are using is available HERE. These applications are power hungry, so make sure you are using an official Pi power supply.

In this lesson, I assume you are already familiar with the Pi 5. Note that we are using Bookworm OS. Not all the dependencies work yet on Trixie, so I strongly recommend starting by flashing a fresh Bookworm SD card.

YOLO11 is a powerful AI object detection model that runs well on the Raspberry Pi 5. The commands below set up OpenCV, MediaPipe, and YOLO11:

# 1. Configure X11 (manual steps required)
sudo raspi-config
# → Go to Advanced Options → X11 → Enable X11
# → Finish and reboot when prompted

# 5. Update system and install OpenCV
sudo apt update
sudo apt full-upgrade -y
sudo apt install python3-opencv -y

# Verify OpenCV
python3 -c "import cv2; print('OpenCV version:', cv2.__version__)"
# Expected output: something like "OpenCV version: 4.6.0" or higher

# 6. Install MediaPipe
pip install mediapipe --break-system-packages

# Verify MediaPipe
python3 -c "import mediapipe as mp; print('MediaPipe version:', mp.__version__)"

# 7. Create and activate virtual environment for YOLO11 (Ultralytics)
python3 -m venv --system-site-packages YOLO
source YOLO/bin/activate

# You are now inside the (YOLO) virtual environment
# Install Ultralytics YOLO11 inside it
pip install "numpy<2.0" ultralytics

# Now create a Pi friendly YOLO11 model
yolo export model=yolo11n.pt format=ncnn

# Optional: Verify YOLO installation
python -c "from ultralytics import YOLO; print('Ultralytics YOLO ready')"

# When finished working with YOLO, you can deactivate with:
# deactivate

# Now open Thonny and point it to the virtual environment you
# just created. Open Tools > Options, select the 'Interpreter' tab, then click the
# '...' button next to the Python executable and navigate from your home directory
# to YOLO, to bin, and then select python

Now you should be set up to use YOLO11 on the Raspberry Pi 5!

We will start with this program, a simple OpenCV program that grabs a frame and shows it:

import cv2
from picamera2 import Picamera2
import time
piCam = Picamera2()
W=1280
H=720
RES = (W,H)
piCam.preview_configuration.main.size = RES
piCam.preview_configuration.main.format = "RGB888"
piCam.preview_configuration.controls.FrameRate=60
piCam.preview_configuration.align()
piCam.configure("preview")
piCam.start()
fps=0

tStart=time.time()
while True:
    frame= piCam.capture_array()
    #frame=cv2.flip(frame,-1)
    deltaT=time.time()-tStart
    tStart=time.time()
    fps= fps*.9 + .1/deltaT
    cv2.putText(frame, "FPS: "+str(round(fps,1)), (int(W*.01), int(H*.075)), 
            cv2.FONT_HERSHEY_SIMPLEX, H*.002, (0, 0, 255), 2)
    cv2.imshow("Camera", frame)
    cv2.moveWindow("Camera",100,100)
    if cv2.waitKey(1)==ord('q'):
        break
cv2.destroyAllWindows()
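
The FPS readout in the loop above is smoothed with a simple low-pass filter: each new reading contributes only 10 percent, and the previous estimate keeps the other 90 percent. The sketch below isolates that one line so you can watch it settle. The function name smooth_fps and the alpha parameter are my own labels, and the steady 30 frames per second is just a test input.

```python
def smooth_fps(fps, deltaT, alpha=0.1):
    """Blend the newest reading (1/deltaT) into the running estimate."""
    return fps * (1 - alpha) + alpha / deltaT

fps = 0
history = []
for _ in range(100):
    fps = smooth_fps(fps, 1 / 30)  # pretend every frame took 1/30 of a second
    history.append(fps)

print("After 1 frame:", round(history[0], 1))
print("After 10 frames:", round(history[9], 1))
print("After 100 frames:", round(history[99], 1))
```

Because the estimate starts at 0, the number on screen climbs for the first few dozen frames before it settles. A larger alpha settles faster but lets more frame-to-frame jitter through.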

In the video, we show how to use YOLO11 object detection in this simple program.

import cv2
from ultralytics import YOLO
from picamera2 import Picamera2
import time
piCam = Picamera2()
W=1280
H=720
RES = (W,H)
piCam.preview_configuration.main.size = RES
piCam.preview_configuration.main.format = "RGB888"
piCam.preview_configuration.controls.FrameRate=60
piCam.preview_configuration.align()
piCam.configure("preview")
piCam.start()
# Load the exported NCNN model (replace with your model path)
model = YOLO("/home/pjm/yolo11n_ncnn_model", task = 'detect')
fps=0
tStart=time.time()
# Main loop: capture a frame, run detection, and draw the results
while True:
    frame= piCam.capture_array()
    results = model(frame, conf=0.25, verbose=False)
    frame = results[0].plot()  # Returns a copy of this frame with the detected boxes and labels drawn on it
    deltaT=time.time()-tStart
    tStart=time.time()
    fps= fps*.8 + .2/deltaT
    cv2.putText(frame, "FPS: "+str(round(fps,1)), (int(W*.01), int(H*.075)), 
            cv2.FONT_HERSHEY_SIMPLEX, H*.002, (0, 0, 255), 2)
    
    # Display the result
    cv2.imshow("YOLO11 Detection", frame)

    # Exit on 'q' key
    if cv2.waitKey(1) == ord('q'):
        break

# Cleanup
cv2.destroyAllWindows()
import gc
gc.collect()  # Force garbage collection to reclaim memory/connections
# Print before exiting; os._exit() stops the process immediately without flushing output
print("Program Terminated", flush=True)
# Make extra sure all processes are killed
import os
os._exit(0)

Technology Tutorials

Making The World a Better Place One High Tech Project at a Time. Enjoy!