AI on the Edge LESSON 20: Resizing, Moving, Converting and Tiling Video frames in OpenCV

Welcome back to the AI on the Edge class series! In this lesson, we are diving deep into some of the most critical foundational skills you need when working with video streams on edge devices: Resizing, Moving, Converting, and Tiling video frames using OpenCV.

When you are developing real-world AI applications on the edge, you rarely just display a single camera feed. You often need to manipulate frames to feed them into your AI models, look at grayscale versions for edge detection, or arrange multiple windows on your desktop neatly so you can monitor your data visually.

If you want to follow along exactly as we do in the video, make sure you have your Raspberry Pi 5 set up with your Camera Module.

What We Cover in This Lesson

  • Fixed FPS Estimation: We continue using our robust low-pass filter formula to track smooth, non-jittery frames-per-second data directly on the video frame.

  • Creating Named Windows: Understanding how cv2.namedWindow() combined with cv2.WINDOW_GUI_NORMAL gives you absolute programmatic control over the placement of your displays.

  • Resizing & Moving Windows: How to accurately position multiple OpenCV windows on your screen using specific coordinates, while accounting for the operating system taskbar and the title bars and borders the window manager adds around each window.

  • Frame Manipulation: Using cv2.resize() to scale down video frames and cv2.cvtColor() to transform the color space from BGR to grayscale.

  • Window Tiling: Arranging a main camera view, a scaled-down color view, and a scaled-down grayscale view in a perfect grid layout on your desktop.
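To make the tiling idea concrete, here is a minimal sketch of the coordinate math, not the code from the video. The layout (main view on the left, two half-size views stacked on the right), the 30-pixel title bar allowance, and the window names are all assumptions for illustration; measure the real margins on your own desktop.

```python
def tile_layout(main_w, main_h, scale=0.5, origin=(0, 0), title_bar=30, gap=0):
    """Return {window_name: (x, y, width, height)} for a three-window layout.

    Assumed layout: main view top-left, small color view to its right,
    small gray view directly below the small color view.
    title_bar is a guess at the height the desktop adds above each window.
    """
    x0, y0 = origin
    small_w = int(main_w * scale)
    small_h = int(main_h * scale)
    right_x = x0 + main_w + gap  # column where both small views start
    return {
        "Main": (x0, y0, main_w, main_h),
        "Small Color": (right_x, y0, small_w, small_h),
        # Shift down by the small view's height plus the title bar above the next window
        "Small Gray": (right_x, y0 + small_h + title_bar + gap, small_w, small_h),
    }


layout = tile_layout(1280, 720, scale=0.5, origin=(0, 0), title_bar=30)
for name, (x, y, w, h) in layout.items():
    print(name, "position:", (x, y), "size:", (w, h))
```

With OpenCV, each entry would typically be applied by calling cv2.namedWindow(name, cv2.WINDOW_GUI_NORMAL), then cv2.resizeWindow(name, w, h) and cv2.moveWindow(name, x, y). The small frames themselves would come from cv2.resize(frame, (w, h)) and cv2.cvtColor(small, cv2.COLOR_BGR2GRAY).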

The Complete Lesson 20 Code

Below is the complete Python code we developed during this lesson. It sets up your hardware camera stream, calculates running performance metrics, processes three distinct variations of the video feed, and tiles them cleanly on your screen.

 

AI on the Edge LESSON 19: Create a Bouncing Box in OpenCV On Raspberry Pi

Hey everyone, Paul McWhorter here!

Welcome back to the AI on the Edge series! In today’s lesson, we’re going to have some fun and take our first real steps into computer vision animation.

We’re going to create a colorful box that bounces around the screen like an old-school screensaver, while displaying a live FPS counter so we can see how well our Raspberry Pi is handling the workload.

Even though it looks simple, this project teaches you several foundational skills you’ll use again and again in computer vision:

  • Working with coordinates and drawing shapes in OpenCV
  • Creating smooth real-time animation
  • Detecting boundaries and reversing direction
  • Calculating and displaying live FPS

These are the same techniques you’ll build on later when we start doing object tracking, collision detection, and more advanced AI vision projects.
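If you want to see the bounce logic on its own before adding the camera, here is a small sketch under a few assumptions. The function name step_box, the starting values, and the clamping step are illustrative choices, not necessarily what we type in the video.

```python
def step_box(x, y, dx, dy, box_w, box_h, frame_w, frame_h):
    """Advance the box by one frame and reverse direction at the frame edges.

    (x, y) is the top-left corner of the box; (dx, dy) is pixels moved per frame.
    """
    x += dx
    y += dy
    # Left or right edge hit: flip horizontal direction and pull the box back inside
    if x < 0 or x + box_w > frame_w:
        dx = -dx
        x = max(0, min(x, frame_w - box_w))
    # Top or bottom edge hit: flip vertical direction and pull the box back inside
    if y < 0 or y + box_h > frame_h:
        dy = -dy
        y = max(0, min(y, frame_h - box_h))
    return x, y, dx, dy


# Simulate 2000 frames at an assumed 1280x720 resolution with a 120x80 box
x, y, dx, dy = 100, 100, 7, 5
in_bounds = True
for _ in range(2000):
    x, y, dx, dy = step_box(x, y, dx, dy, 120, 80, 1280, 720)
    in_bounds = in_bounds and 0 <= x <= 1280 - 120 and 0 <= y <= 720 - 80
print("box stayed on screen:", in_bounds)
```

In the live version, the returned corner would feed a filled cv2.rectangle() call (thickness of -1) on each captured frame.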


What You Learned in This Lesson

  • How to draw filled rectangles on a live video stream
  • How to move objects smoothly frame by frame
  • How to make objects “bounce” realistically off screen edges
  • A clean method for calculating and displaying FPS
  • Using variables to easily control size, position, speed, and color

This bouncing box may look basic, but once you understand how to do this, you can create all kinds of animated graphics that interact with what the camera sees.


Pro Tip: After you get it working, play around with the speed, box size, and colors. Try making multiple bouncing boxes with different speeds and directions — it’s a great way to experiment!


Ready for more? In the next lesson, we’re going to kick things up a notch and start working with multiple objects and more complex interactions.

Keep building, keep learning, and I’ll see you in the next video!

Paul McWhorter

For your convenience, this is the code we developed in the video.

 

AI on the Edge LESSON 18: Display Frames Per Second (FPS) on OpenCV Video Window

In today’s lesson, we add a clean, real-time Frames Per Second (FPS) counter directly onto our live OpenCV video window. Displaying FPS on screen is an essential tool for anyone working with camera-based AI projects on the Raspberry Pi. It gives you immediate feedback on your actual processing performance, helps with optimization, and makes your projects look more professional and polished.

In this lesson, we configure the Picamera2 library to run at 1280×720 resolution with a target of 60 frames per second. We then implement a smoothed FPS calculation using a weighted rolling average, which prevents the displayed value from jumping around wildly. Finally, we overlay the FPS text in the lower-left corner of the video frame using OpenCV’s putText() function, with font size and thickness that scale appropriately with the resolution.
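The smoothing step can be sketched in a few lines. This is one common form of a weighted rolling average (a simple low-pass filter); the 0.9 weight and the function name smooth_fps are assumptions for illustration, so tune the weight to taste.

```python
def smooth_fps(previous, dt, weight=0.9):
    """Blend the previous smoothed FPS with the instantaneous FPS for this frame.

    previous: last smoothed FPS value
    dt: seconds elapsed since the previous frame
    weight: how much to trust the old value (closer to 1 means smoother but slower to react)
    """
    if dt <= 0:
        return previous  # guard against a zero or negative time step
    instant = 1.0 / dt
    return weight * previous + (1.0 - weight) * instant


# Simulated usage: a steady 60 fps stream, starting the filter at zero
fps = 0.0
for _ in range(200):
    fps = smooth_fps(fps, 1.0 / 60.0)
print("smoothed fps:", round(fps, 1))
```

In a real loop, dt would come from the difference between two time.perf_counter() readings taken once per frame, and the rounded value would be drawn on the frame with cv2.putText().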

This technique forms an important foundation for future lessons, as we will continue adding more information and graphics directly onto the live video stream. Understanding how to efficiently display performance metrics is key to developing responsive and practical edge AI applications.

This is the code we develop in this lesson:

 

AI on the Edge LESSON 17: Decorating and Annotating Video Frames in OpenCV

Welcome to AI on the Edge – Lesson 17: Decorating and Annotating Video Frames in OpenCV. In this lesson we take our live video stream from the Raspberry Pi camera and start making it really useful and professional-looking. Now that we can grab frames and display them, it’s time to learn how to draw directly on top of those frames. We’re talking rectangles, lines, arrows, circles, and crisp text overlays — all the visual elements you’ll need when you start adding real AI like face detection or object recognition.
You’ll see exactly how to use OpenCV’s drawing functions to create clean, scalable annotations that look great whether you’re running at 320×180 for maximum speed or higher resolutions like 1280×720. We cover how to control line thickness, use filled shapes, position text properly, and most importantly, how to make all your drawings scale automatically with your chosen resolution so everything stays nicely proportioned.
By the end of this lesson you’ll have the skills to draw bounding boxes around detected objects, add confidence scores, label people or items, and draw tracking lines: basically anything you need to show what your AI is seeing. This is one of those foundational skills that you’ll use over and over again in your computer vision projects.
As always, I encourage you to type the code along with me in the video, then start playing with colors, sizes, positions, and messages. Change things around, break it, and make it your own. That’s the best way to really learn this stuff.
So fire up your Raspberry Pi 5, grab that camera, and let’s start turning raw video frames into clear, informative, and great-looking annotated output!
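One way to keep drawings proportioned across resolutions is to store positions as fractions of the frame and scale thickness and font size from a reference width. The sketch below assumes a 1280-pixel reference width and uses made-up helper names (scale_style, rel_point, rel_box); it shows the idea rather than the exact code from the video.

```python
def scale_style(frame_w, ref_w=1280, base_thickness=2, base_font_scale=1.0):
    """Scale line thickness and font size relative to an assumed reference width."""
    s = frame_w / ref_w
    thickness = max(1, round(base_thickness * s))  # never let a line vanish
    font_scale = base_font_scale * s
    return thickness, font_scale


def rel_point(fx, fy, frame_w, frame_h):
    """Convert a fractional position (0.0 to 1.0) into pixel coordinates."""
    return int(fx * frame_w), int(fy * frame_h)


def rel_box(top_left, bottom_right, frame_w, frame_h):
    """Convert a box given in fractions of the frame into two pixel corners."""
    return rel_point(*top_left, frame_w, frame_h), rel_point(*bottom_right, frame_w, frame_h)


for w, h in [(320, 180), (1280, 720)]:
    thickness, font_scale = scale_style(w)
    corners = rel_box((0.25, 0.25), (0.75, 0.75), w, h)
    print((w, h), "thickness:", thickness, "font scale:", font_scale, "box:", corners)
```

The two corners would go straight into cv2.rectangle(), and the thickness and font scale into cv2.putText(), so the same fractional layout works at any resolution you choose.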

 

AI on the Edge LESSON 16: Control Pan/Tilt Camera Position Using Voice Commands

In AI on the Edge Lesson 16, we take a big step forward by combining voice recognition with physical motion. In this project, you will build a voice-controlled pan/tilt camera system. Using simple spoken commands such as “right,” “left,” “up,” “down,” and “quit,” you can move the Raspberry Pi camera in real time. This lesson brings together the Fusion HAT+ servo control, the Speech-to-Text (STT) capabilities we explored earlier, live video streaming with picamera2 and OpenCV, and multithreading to keep everything running smoothly.
The hardware setup is straightforward. We connect two servos to the Fusion HAT+ — one for pan (horizontal movement) on pin 2 and one for tilt (vertical movement) on pin 3. The Raspberry Pi Camera is mounted on a pan/tilt mechanism so it can physically follow your voice commands. We start the camera at a neutral position (pan = 0°, tilt = -20°) and define step sizes so the movement feels responsive but controlled.
The Python code uses two main threads: one for continuous voice listening and another for displaying the live video feed. In the listening thread, we create an STT object and continuously wait for voice input. When a command is recognized, we adjust the pan or tilt angle accordingly and immediately send the new position to the appropriate servo. The main loop captures frames from the Pi Camera, flips them for correct orientation, displays them in an OpenCV window, and checks for the ‘q’ key to exit gracefully.
This project demonstrates several important concepts working together: real-time voice command processing, servo motor control, camera streaming with picamera2 at 1280×720 resolution and 60 fps, and proper use of threading so that listening and video display do not block each other. You will also notice how we use global variables carefully to share the current pan and tilt positions between the threads.
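The command-handling pattern can be tried without any hardware. In the sketch below, a queue of words stands in for the speech-to-text results, and the servo call is left as a comment. The ±90° limits, the 10° step sizes, the sign convention for each direction, and the use of a lock around the shared state are all assumptions for illustration, not the Fusion HAT+ API or the exact code from the video.

```python
import queue
import threading

PAN_LIMITS = (-90, 90)   # assumed servo range in degrees
TILT_LIMITS = (-90, 90)  # assumed servo range in degrees


def clamp(value, low, high):
    return max(low, min(high, value))


def apply_command(command, pan, tilt, x_delta=10, y_delta=10):
    """Return the new (pan, tilt) after one spoken command; unknown words change nothing."""
    if command == "right":
        pan += x_delta
    elif command == "left":
        pan -= x_delta
    elif command == "up":
        tilt += y_delta
    elif command == "down":
        tilt -= y_delta
    return clamp(pan, *PAN_LIMITS), clamp(tilt, *TILT_LIMITS)


# Shared state, starting at the neutral position from the lesson
state = {"pan": 0, "tilt": -20, "running": True}
state_lock = threading.Lock()


def listener(commands):
    """Listening thread: consume commands until 'quit' is heard."""
    while True:
        command = commands.get()  # stand-in for waiting on speech-to-text
        with state_lock:
            if command == "quit":
                state["running"] = False
                return
            state["pan"], state["tilt"] = apply_command(command, state["pan"], state["tilt"])
        # On real hardware, the new angles would be sent to the pan and tilt servos here


commands = queue.Queue()
for word in ["right", "right", "up", "left", "quit"]:
    commands.put(word)

worker = threading.Thread(target=listener, args=(commands,), daemon=True)
worker.start()
worker.join(timeout=2)
print(state)
```

The main thread would normally run the video loop while the listener works, checking state["running"] each frame so that saying "quit" also stops the display.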
By the end of this lesson, you will have a working voice-controlled camera that you can point anywhere you want just by talking to it. This is an excellent foundation for more advanced projects such as voice-controlled object tracking, security cameras, or interactive AI assistants that can both see and move.
The complete code is provided below, along with explanations of the key sections. Feel free to experiment with different step sizes (xDelta and yDelta), starting angles, or even add new voice commands once you are comfortable with the basic version.
This is the code developed in the video lesson:

 

Making The World a Better Place One High Tech Project at a Time. Enjoy!