Tag Archives: Pi 5

AI on the Edge LESSON 27: Track Objects of Interest in OpenCV Using Contours

Hey everyone, Paul McWhorter here from TopTechBoy.com. Welcome back to our channel, where we learn to build real, intelligent systems on edge hardware. Grab yourself a nice hot cup of coffee or a cold glass of iced tea, because today we are taking a massive leap forward in our computer vision journey.

Up until now, we have learned how to configure our cameras, calculate frame rates smoothly, and isolate specific objects based on color using the HSV color space. We built beautiful masks and composite images that show only our target color. But let’s be honest with ourselves: a mask is just a collection of white pixels on a black screen. The computer doesn’t actually know where the object is, how big it is, or how to follow it if it moves.

In this lesson, we are going to fix that. We are going to teach the machine to look at our mask, isolate the single biggest shape of interest, ignore the background noise, and draw a real-time bounding tracking box around it. This is true object tracking.

The Core Concept: What is a Contour?

Think of a contour as a mathematical boundary line. When OpenCV looks at a binary mask (where your target object is white and everything else is black), a contour is the continuous line that traces the outer edge of that white shape.

The beauty of contours is that they turn a chaotic cloud of thousands of isolated pixels into structured, manageable vector shapes. Once OpenCV finds these shapes, it can calculate their physical properties, such as their area, perimeter, and exact center.

The Three Steps to Algorithmic Object Tracking

To turn a raw camera frame into a fully tracked target, our script follows a strict three-part engineering pipeline inside our main execution loop:

1. Extracting Every Boundary

First, we pass our binary mask into OpenCV’s contour detection engine. We configure it to use external retrieval, meaning it will ignore any hollow holes inside the object and only trace the outermost boundary. It returns a list of every single contour it finds in the frame.

2. Hunting for the Largest Target

In the real world, your camera view is never perfectly clean. Even with an excellent HSV color mask, you will get random speckles, reflections, or background noise showing up as tiny white dots on your mask. If we tried to track everything, our program would lose its mind. To solve this, we use a Python maximization function to scan our list of contours and extract the absolute largest one based on its physical area.

3. Setting an Area Noise Floor

Even after finding the largest contour, what happens if your object completely leaves the camera view? The largest remaining “object” might be a tiny, single-pixel speck of static noise on the edge of the screen. To prevent our tracking box from jumping around erratically, we establish a strict threshold called a noise floor. If the area of the largest contour isn’t big enough to confidently be our target, we ignore it completely.

Drawing the Bounding Box

Once we have successfully isolated our valid, large contour, we don’t just want to draw a messy, squiggly line around it. We want clean coordinates that an automation system or a robotic pan-tilt kit could actually use to follow the target.

We pass our largest contour into a bounding rectangle function. OpenCV automatically calculates the exact mathematical limits of that shape and returns four precise numbers:

    • X: The horizontal starting pixel coordinate of the object.

    • Y: The vertical starting pixel coordinate of the object.

    • W: The total width of the object in pixels.

    • H: The total height of the object in pixels.

With those four dimensions locked down, we use a standard drawing function to overlay a crisp, green rectangle directly onto our live color camera feed. Now, as you move your object around the room, the box follows it dynamically, tracking its position in real time at high frame rates.

Note you will have to tune the LC and UC parameters for your object of interest, as we showed last week.

 

AI on the Edge LESSON 24: Processing Mouse Events in OpenCV on Pi 5

Welcome back, everyone! In our last lesson, we learned how to use matrix slicing to hardcode a Region of Interest (ROI) into our frames. That was a great static approach, but today we are taking interactivity to a whole new level.

In this lesson, you are going to learn how to catch Mouse Events inside your OpenCV windows. Instead of guess-and-checking coordinates in your code, you will be able to click anywhere on your live video stream to instantly grab the precise (x, y) pixel coordinates and read the exact color value of the pixel right under your mouse pointer. This is the foundational mechanic you need to build interactive, point-and-click AI applications.

The Core Concept: Mouse Callbacks and Global Frames

To listen for mouse clicks or movement, OpenCV uses what is called a Callback Function. You tell OpenCV: “Hey, keep an eye on this specific window. If the user does anything with the mouse inside it, instantly jump over to my custom function and tell me what happened.”

We set this up using:

cv2.setMouseCallback('Camera', mouseAction)

The [y, x] Matrix Inversion Trap

There is a massive mathematical trap that catches almost every beginner when they start mapping mouse clicks to image matrices:

  • OpenCV Mouse Coordinates: When you move your mouse, OpenCV tracks position using standard Cartesian geometry: (x, y), where x is the column (horizontal distance from the left) and y is the row (vertical distance from the top).

  • NumPy Array Coordinates: When you plug those numbers into your image array to inspect a pixel, NumPy expects matrix indexing: [row, column].

Because rows correspond to the height (y) and columns correspond to the width (x), you must always invert the coordinates when accessing the frame array, writing frame[y, x] rather than frame[x, y].

If you try to pass frame[x, y], your program will either crash with an “index out of bounds” error or return data from the completely wrong part of the image!
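A quick way to see the trap is with a blank NumPy array the same shape as a 1280×720 frame. The coordinates below are arbitrary example values.

```python
import numpy as np

# A 1280x720 frame is stored as 720 rows by 1280 columns by 3 color channels.
frame = np.zeros((720, 1280, 3), dtype=np.uint8)

x, y = 1000, 200           # mouse position: x is the column, y is the row
frame[y, x] = (255, 0, 0)  # mark that pixel (pure blue in BGR order)

pixel = frame[y, x]        # correct: row first, then column
print("pixel at cursor:", pixel)

try:
    frame[x, y]            # wrong order: row 1000 does not exist in a 720-row image
    swapped_ok = True
except IndexError:
    swapped_ok = False
print("swapped index worked:", swapped_ok)
```

Note that the swapped index only raises an error when x is larger than the frame height. For smaller values it silently returns the wrong pixel, which is harder to spot.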

The Python Code Developed in This Lesson

Here is the complete, streamlined script we built during today’s tutorial. Copy this code into your workspace on your Raspberry Pi 5, fire it up, and watch your terminal output as you click around the video window.

We first developed this program as a simple example of processing mouse clicks. It prints each detected event:

To make the program more useful, we developed this code, which monitors the position of the mouse cursor and reports the color of the pixel the mouse points at. The values are printed as labels on the OpenCV frame:

We can now take the project to the next level by setting the LED color to the color pointed at by the cursor in the OpenCV window. We will be using the standard circuit we have used in the earlier lessons.

Fusion Hat Circuit Diagram
This is the circuit we will use moving forward in the class

This is the code we developed to set the LED color based on the pixel position of the cursor in the OpenCV window.
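The hardware calls depend on the Fusion Hat library and are not sketched here. The hardware-independent part is converting the pixel under the cursor into one brightness value per LED channel. The helper below assumes each channel is driven with a 0 to 100 percent PWM duty cycle, which is an assumption for this example rather than something the lesson states, and the function name is hypothetical.

```python
def bgr_to_duty_cycles(pixel):
    """Map one BGR pixel (0-255 per channel) to (red, green, blue) duty cycles in percent.

    Assumes the LED driver takes 0-100 percent per channel; adjust the scaling
    if your library expects 0-255 or 0.0-1.0 instead.
    """
    b, g, r = (int(c) for c in pixel)  # OpenCV pixels arrive in B, G, R order
    return tuple(round(c * 100 / 255, 1) for c in (r, g, b))


# Example: an orange-ish pixel as OpenCV would report it (B=0, G=128, R=255).
print(bgr_to_duty_cycles((0, 128, 255)))
```

Whatever call your LED library provides for setting each channel would then receive these three values.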

Homework Assignment

 

Alright, it’s time to put this knowledge to work. First, create a text display under the FPS on the frame that shows the RGB value at the pixel the mouse is pointing at, along with the pixel location. Then turn this simple reporting tool into an interactive, dynamic ROI selector:

  1. Start with your clean 1280×720 live camera stream.

  2. Modify your mouseAction callback function to look for specific mouse clicks.

  3. The Target Mechanic: When you left-click on the video window, store those coordinates as your upper-left corner. When you release the button, store those coordinates as your lower-right corner. While you are selecting, draw a live box outline over your ROI.

  4. Using those two dynamic coordinate sets, use matrix slicing to pull a clean Region of Interest (ROI) out of the frame and instantly display it in a completely separate, standalone window called “Target ROI”.

  5. Safety Requirement: Make sure your code can handle the two corners in any order without crashing (e.g., if a user releases the button higher or further left than where they first clicked, write the conditional logic to sort the indices properly before slicing).

Get your black coffee ready, write your logic step-by-step from scratch, and do not copy code you can’t explain. Post your homework solution video on YouTube and drop a link in the comments section below so I can see who is running with the big dogs!

AI on the Edge: Install and Run YOLO Object Detection on the Raspberry Pi 5

In today’s lesson we will see just how far we can push things on the Raspberry Pi 5. I will show you how to install YOLO11 on the Pi, and then a simple program that runs YOLO11 under Python and OpenCV. The objective is to see whether the Pi 5, without a Hailo accelerator hat, has sufficient power to do useful object detection. We will not use an accelerator hat, but the work is computationally intensive, so you must use active cooling. This is the low-cost cooling fan we are using. It is sufficient to do the job, and its thin form factor allows other hats to still fit on the Raspberry Pi 5. You can pick up the fan I am using HERE. Also, we are using an 8GB Pi 5. If you already have a Pi 5, it will probably work. The Pi 5 we are using is available HERE. These applications are power hungry, so make sure you are using an official Pi power supply.

In this lesson, I assume you are already familiar with the Pi 5. Note that we are using Bookworm OS. Not all the dependencies work yet on Trixie, so I strongly recommend starting by flashing a fresh Bookworm SD card.

YOLO11 is a powerful AI object detection model that runs well on the Raspberry Pi 5. The model below:

Now you should be set up to use YOLO11 on the Raspberry Pi 5!

We will start with this program, which is a simple grab-a-frame, show-a-frame OpenCV program:

In the video, we show how to use YOLO11 object detection in this simple program.