OK guys, you spoke, and I listened. You have been asking for a lesson on how to do object detection on a Pi 5 using YOLO and an IP camera. Well, you are about to get what you asked for. We will make this work, or we will DIE TRYING. Never fear: once you watch the video, you will both understand the material and be able to do it on your own.

First, I am assuming you watched our previous lesson, where I showed you how to do the basic install and setup of YOLO. If not, never fear, I have the commands below. NOTE: This tutorial is geared towards the Bookworm OS. I strongly suggest you start with a fresh Bookworm SD card, as there are many dependencies, and things are most likely to work if you start exactly where I am starting . . . with a fresh OS. These are the commands I shared last week to get YOLO up and working (just open a terminal and paste these commands one at a time):
# 1. Configure X11 (manual steps required)
sudo raspi-config
# → Go to Advanced Options → X11 → Enable X11
# → Finish and reboot when prompted

# 2. Update system and install OpenCV
sudo apt update
sudo apt full-upgrade -y
sudo apt install python3-opencv -y

# Verify OpenCV
python3 -c "import cv2; print('OpenCV version:', cv2.__version__)"
# Expected output: something like "OpenCV version: 4.6.0" or higher

# 3. Install MediaPipe
pip install mediapipe --break-system-packages

# Verify MediaPipe
python3 -c "import mediapipe as mp; print('MediaPipe version:', mp.__version__)"

# 4. Create and activate a virtual environment for YOLO11 (Ultralytics)
python3 -m venv --system-site-packages YOLO
source YOLO/bin/activate
# You are now inside the (YOLO) virtual environment

# Install Ultralytics YOLO11 inside it
pip install "numpy<2.0" ultralytics

# Now create a Pi-friendly YOLO11 model
yolo export model=yolo11n.pt format=ncnn

# Optional: Verify YOLO installation
python -c "from ultralytics import YOLO; print('Ultralytics YOLO ready')"

# When finished working with YOLO, you can deactivate with:
# deactivate

# Now open Thonny, and you need to point Thonny to the virtual environment
# you just created. Open Tools → Options, select the 'Interpreter' tab, then
# click the Python executable box, select '...', and navigate from your home
# directory to YOLO, then bin, and select python
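Once Thonny is pointed at the YOLO virtual environment, a quick way to confirm you are on the right interpreter is to check that the key packages actually resolve. This little sanity-check sketch is my addition, not part of the lesson code; run it in Thonny, and if the interpreter path ends in YOLO/bin/python and every package reports OK, you are set up correctly:

```python
import importlib.util
import sys

def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    return importlib.util.find_spec(name) is not None

# If Thonny is using the YOLO venv, the path should end in YOLO/bin/python
print("Interpreter:", sys.executable)

# All three packages should report OK inside the venv
for pkg in ("cv2", "ultralytics", "numpy"):
    print(pkg, "OK" if module_available(pkg) else "MISSING")
```

If cv2 shows MISSING, you likely forgot the --system-site-packages flag when creating the venv; if ultralytics shows MISSING, the pip install step above did not run inside the activated venv.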
Now, I will explain this code and help you configure it for your cameras, but first you will need to open up Thonny and paste in the following code as a starting point. IMPORTANT: as mentioned above, you need to set the interpreter in Thonny to the virtual environment you set up in the process above. If this is not familiar to you, go back and watch last week's lesson (click Previous at the bottom of this post). Without further ado, here is the code we will work with today:
import cv2
from ultralytics import YOLO
#import secret
import threading
import time

# Display resolution for the annotated window
W = 1280
H = 720

RTSP_URL = "rtsp://user:password@192.168.88.44:554/cam/realmonitor?channel=1&subtype=0"
#RTSP_URL = secret.RTSP_URL1

# Load the exported NCNN model (replace with your model path)
model = YOLO("/home/pjm/yolo11n_ncnn_model/", task="detect")

lock = threading.Lock()
running = True
ipFrame = None  # latest frame from the camera; filled in by the grabber thread

def frameGrabber(url):
    global ipFrame, running
    cap = cv2.VideoCapture(url, cv2.CAP_FFMPEG)
    cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)  # keep only the newest frame buffered
    cap.set(cv2.CAP_PROP_POS_FRAMES, 0)  # reset frame position
    while running:
        ret, frame = cap.read()
        if ret:
            with lock:
                #frame = cv2.resize(frame, (W, H))
                ipFrame = frame.copy()
    cap.release()
    print("Thread Terminated")

thread = threading.Thread(target=frameGrabber, args=(RTSP_URL,), daemon=True)
thread.start()

time.sleep(2)  # give the grabber thread a moment to deliver the first frame
tStart = time.time()
fps = 0

while True:
    # Smooth the FPS estimate with a simple low-pass filter
    deltaT = time.time() - tStart
    fps = fps * .9 + .1 / deltaT
    tStart = time.time()
    with lock:
        if ipFrame is None:
            continue  # no frame from the camera yet
        ipFrameShow = ipFrame.copy()
    results = model(ipFrameShow, conf=0.25, verbose=False)[0]  # conf: confidence threshold; adjust as needed
    # Annotate the frame with detections (boxes, labels, scores)
    annotatedFrame = results.plot()
    annotatedFrame = cv2.resize(annotatedFrame, (W, H))
    cv2.putText(annotatedFrame, "FPS: " + str(round(fps, 1)),
                (int(W * .01), int(H * .075)),
                cv2.FONT_HERSHEY_SIMPLEX, H * .002, (0, 0, 255), 3)
    cv2.imshow("IP Camera", annotatedFrame)
    #cv2.moveWindow("IP Camera",100,100)
    if cv2.waitKey(1) == ord('q'):
        break

running = False
thread.join()  # wait for the grabber thread to fully exit
time.sleep(1)
cv2.destroyAllWindows()
import gc
gc.collect()  # Force garbage collection to reclaim memory/connections
print("Program Terminated")
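One note on the commented-out secret lines in the code above: hard-coding your camera password in the main script is risky if you ever share your code, which is why the script hints at keeping the URL in a separate secret.py file and importing it. Also, if your password contains characters like '@' or ':', they will break the RTSP URL unless they are percent-encoded. Here is a small helper sketch of that idea; build_rtsp_url is my own hypothetical name, and the default path is the Amcrest/Dahua-style path from the URL above, so adjust it for your camera brand:

```python
from urllib.parse import quote

def build_rtsp_url(user, password, host, port=554,
                   path="cam/realmonitor?channel=1&subtype=0"):
    """Assemble an RTSP URL, percent-encoding the credentials so characters
    like '@' or ':' in the password don't break the URL."""
    return (f"rtsp://{quote(user, safe='')}:{quote(password, safe='')}"
            f"@{host}:{port}/{path}")

# Example: a password containing '@' gets encoded as %40
print(build_rtsp_url("admin", "p@ss", "192.168.88.44"))
# → rtsp://admin:p%40ss@192.168.88.44:554/cam/realmonitor?channel=1&subtype=0
```

You could put a line like RTSP_URL1 = build_rtsp_url(...) in secret.py, then uncomment the two secret lines in the main script to use it.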
The video explains everything, please watch it!