Tag Archives: Edge AI

NVIDIA Jetson Orin Nano: Secret to Running Ollama on the GPU

One of the biggest frustrations with the new JetPack 7.2 release is finding out that a standard installation of Ollama, the gold standard for running local LLMs, completely ignores your powerful NVIDIA GPU and defaults to the CPU.

In this lesson, we aren’t just going to fix that; we are going to measure the “truth” behind the performance. We will use data to see exactly how much gain we get from the GPU and where the hardware starts to hit the thermal throttling wall.

The Problem: The “Canned” Installation

When you run a standard Ollama install on the Jetson Orin Nano, the system doesn’t automatically recognize the integrated GPU (iGPU). If you open your NVIDIA Power GUI (jtop), you will see your CPU cores pegged at 100% while the GPU sits idle. This leads to slow response times and a disappointing experience.

Let’s start with the standard ‘Canned’ installation. The good news is that it is very simple:

To see exactly how your system is performing, run Ollama in verbose mode:

At this point you will have Ollama running a simple LLM locally on your Jetson Orin Nano. This is a huge step forward, but we now want to dig deeper and actually see how well this simple model is performing. The first thing we do is run the Jetson Power GUI, hidden behind the NVIDIA icon in the upper right of the menu bar.

Pay close attention to the Prompt Eval Rate and Eval Rate (tokens per second). These are our baseline numbers.
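If you would rather compute these two numbers yourself than read them off the screen, here is a minimal sketch. It assumes you talk to Ollama through its REST API (a POST to /api/generate with streaming turned off) instead of the command line, and that the reply carries the prompt_eval_count, prompt_eval_duration, eval_count, and eval_duration fields, with durations in nanoseconds. The helper names and the sample numbers are made up for illustration; they are not measurements from the Jetson.

```python
def tokens_per_second(count, duration_ns):
    """Convert a token count and a duration in nanoseconds to tokens/s."""
    if duration_ns <= 0:
        return 0.0
    return count / (duration_ns / 1e9)


def summarize(response):
    """Pull the two baseline rates out of a decoded /api/generate reply."""
    return {
        "prompt_eval_rate": tokens_per_second(
            response["prompt_eval_count"], response["prompt_eval_duration"]
        ),
        "eval_rate": tokens_per_second(
            response["eval_count"], response["eval_duration"]
        ),
    }


# Illustrative values only (NOT real benchmark data).
sample = {
    "prompt_eval_count": 20,
    "prompt_eval_duration": 500_000_000,   # 0.5 s in nanoseconds
    "eval_count": 100,
    "eval_duration": 10_000_000_000,       # 10 s in nanoseconds
}

rates = summarize(sample)
print(f"Prompt eval rate: {rates['prompt_eval_rate']:.1f} t/s")
print(f"Eval rate:        {rates['eval_rate']:.1f} t/s")
```

Run the same prompt before and after the GPU fix and you have a like-for-like pair of numbers to compare.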

The “Secret Sauce” Solution

To force Ollama to use the Jetson’s CUDA cores, we have to manually override the system service configuration.

Step 1: Install the Nano Editor

Before we can edit system files, we need a reliable text editor. If you don’t have it yet, run this command:

Step 2: Create the Service Override

We need to tell the Ollama service exactly where to look for the GPU libraries. Use nano to open the following file:

Step 3: Add the Configuration

Copy and paste the following block into that file. This is the “Secret Sauce” that enables the iGPU and points the system to the correct CUDA backend:

Note: Save the file by pressing Ctrl+O, then Enter, and then press Ctrl+X to exit.

Step 4: Reboot

For the changes to take effect, we will reboot.

Benchmarking the Results

Once you have the GPU engaged, the real work begins. In the video, we look at a side-by-side comparison of performance across different Jetson Power Modes (10W, 15W, and MaxN).

Power Level | Prompt Eval Rate (t/s) | Eval Rate (t/s) | Throttling Observed?
CPU         | [Your Data]            | [Your Data]     | Yes/No
10W         | [Your Data]            | [Your Data]     | Yes/No
15W         | [Your Data]            | [Your Data]     | Yes/No
MaxN        | [Your Data]            | [Your Data]     | Yes/No

As we discovered, moving to the GPU provides a boost, but it also increases the heat signature. Watch the full video to see the charts and understand which power level provides the best “sweet spot” for stable, long-term AI performance on your Jetson Orin Nano. This is an important first step: getting the heavy lifting moved down to the GPU. In future videos we will explore how to get the work done well on the GPU.

 

AI on the Edge LESSON 25: Create Region of Interest (ROI) in OpenCV Using the Mouse

Well, hello there! I’m absolutely delighted you could join me today. If you’ve been following along with our journey into AI on the Edge, you know that we are getting closer and closer to building some truly powerful, real-world computer vision applications. But before we can get to the fancy AI stuff, we have to master the fundamentals. Today, we’re tackling something that is going to make your projects look—and feel—a whole lot more professional: creating a Region of Interest (ROI) using the mouse.

Why Do We Need an ROI?

Think about it. When you’re processing a video feed, you’re usually wasting a ton of compute power looking at things that don’t matter. Maybe you’re tracking a ball on a table, but your camera is seeing the whole room. Why process the walls and the ceiling when you only care about the table? By defining an ROI, we tell our code: “Ignore everything else. Only look here.” It saves processing time, it reduces noise, and it makes your AI much more accurate.

Interacting with OpenCV

In this lesson, we’re going to step beyond simple static code. I’m going to show you how to use OpenCV’s callback functions to make your program “live.” We’ll use the mouse to click and drag a rectangle directly on the video feed to define our ROI in real-time. It’s interactive, it’s intuitive, and it’s a vital skill for anyone building real-world vision systems.

The Code

Now, I’ve put a lot of work into making this code clean and easy to follow. You’ll see exactly how we capture those mouse events—cv2.EVENT_LBUTTONDOWN, cv2.EVENT_MOUSEMOVE, and cv2.EVENT_LBUTTONUP—to create that bounding box dynamically.
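The lesson's own code is shown in the video. What follows is only a minimal sketch of the same idea, not the code from the lesson. It assumes a camera that OpenCV can open with cv2.VideoCapture(0), and the names normalize_rect, on_mouse, and state are made up for illustration. The helper turns a drag in any direction into a clean (x, y, width, height) box that stays inside the frame.

```python
def normalize_rect(p0, p1, width, height):
    """Return (x, y, w, h) for a drag from p0 to p1, clamped to the frame."""
    left, right = sorted((p0[0], p1[0]))
    top, bottom = sorted((p0[1], p1[1]))
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    return left, top, max(0, right - left), max(0, bottom - top)


def main():
    import cv2  # imported here so normalize_rect can be reused without OpenCV

    state = {"start": None, "end": None, "dragging": False}

    def on_mouse(event, x, y, flags, param):
        # Press starts the box, move stretches it, release freezes it.
        if event == cv2.EVENT_LBUTTONDOWN:
            state.update(start=(x, y), end=(x, y), dragging=True)
        elif event == cv2.EVENT_MOUSEMOVE and state["dragging"]:
            state["end"] = (x, y)
        elif event == cv2.EVENT_LBUTTONUP and state["dragging"]:
            state["end"] = (x, y)
            state["dragging"] = False

    cam = cv2.VideoCapture(0)  # assumption: the camera is device index 0
    cv2.namedWindow("frame")
    cv2.setMouseCallback("frame", on_mouse)

    while True:
        ok, frame = cam.read()
        if not ok:
            break
        if state["start"] is not None:
            h, w = frame.shape[:2]
            x, y, rw, rh = normalize_rect(state["start"], state["end"], w, h)
            if rw > 0 and rh > 0 and not state["dragging"]:
                # Copy so the rectangle drawn below does not bleed into the ROI.
                cv2.imshow("roi", frame[y:y + rh, x:x + rw].copy())
            cv2.rectangle(frame, (x, y), (x + rw, y + rh), (0, 255, 0), 2)
        cv2.imshow("frame", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cam.release()
    cv2.destroyAllWindows()

# Call main() on a machine with a camera and a display attached.
```

Keeping the rectangle math in its own small function means you can check it with plain numbers before you ever plug in a camera.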

Putting It to the Test

I want you to take this code, run it on your Jetson, and play around with it. Try defining different regions. Notice how the frame rate stays steady because we aren’t bogging down the CPU with unnecessary pixels. This is the “Edge” part of “AI on the Edge”—making smart, efficient decisions right where the data is being captured.

I can’t wait to see what you build with this. As always, keep those questions coming, stay curious, and most importantly—don’t get discouraged! We’re doing hard things, and you are doing a great job.

I’ll see you in the next lesson!

What questions do you have about implementing ROI in your own computer vision projects? Post them in the comments on the video! Thanks for learning.

We will be using the circuit used in the earlier lessons:

Fusion Hat Circuit Diagram
This is the circuit we will use moving forward in the class

 

Edge AI on the NVIDIA Jetson Orin Nano: You are Running With the Big Dogs Now!

Welcome back. If you are watching this, you’re ready to stop playing with toys and start building real-world AI. Today, we are looking at the NVIDIA Jetson Orin Nano. Let’s get one thing straight: this is not a Raspberry Pi.

Under the hood, you are working with an Ampere-architecture GPU featuring 1,024 CUDA cores and 32 Tensor cores. You have a 6-core ARM Cortex-A78AE v8.2 64-bit CPU. Depending on how you configure your power mode, you are looking at anywhere from 20 to 40 TOPS of AI performance. This is raw, unadulterated horsepower that can process multi-stream video pipelines in real-time. In the 15W mode, you are managing a delicate balance of thermals and throughput; in the 25W mode, you are pushing the limits of the silicon itself. But this power comes with a price. You have been playing in an amusement park, but now, you’re going skydiving. The guardrails are gone.

The Skydiving Mindset: In the Pi or Arduino world, everything is ‘turn-key.’ You follow the recipe, you get the cake. It’s safe. It’s predictable. But when you are dealing with 40 TOPS of compute, the environment is fundamentally different. There are no guardrails here. If you don’t do the work, if you don’t check your own gear, you hit the ground.

There is a fundamental shift in responsibility when you move from consumer hobbyist boards to professional embedded silicon. You aren’t just a user anymore; you are an architect. If you’re looking for a guaranteed result because you clicked a link, go back to the Pi. If you’re looking to master high-performance silicon, welcome to the deep end. We are ‘Running with the Big Dogs’ now.

The Infrastructure Tax: Let’s start with the cost of entry. If you are trying to develop on an Orin using a Virtual Machine or a dual-boot setup on your Windows gaming laptop, stop. Just stop. You are setting yourself up for a failure that has nothing to do with the board and everything to do with your infrastructure.

I’ll give you a horror story. I tried to dual-boot my main workstation to make it ‘easier’ to access the Ubuntu environment needed for the SDK Manager. I triggered a BitLocker conflict. It didn’t just break the bootloader; it effectively bricked my NVMe drive so thoroughly that I had to dump the drive, buy a replacement, and reload my entire backup image from scratch.

That is the ‘Big Dog’ tax. Professionals don’t risk their primary workstation for a development tool. You build a dedicated, stand-alone Ubuntu machine. That is the cost of entry. If you can’t commit to a clean Linux environment, you aren’t ready for this hardware. The SDK Manager requires low-level USB access and partition control that hypervisors simply cannot handle reliably. You want to play with the big silicon? You bring the right infrastructure.

The Illusion of Instructions: You’ve probably heard people complain that my instructions didn’t work. Or they get angry at NVIDIA because the latest JetPack caused a kernel panic. I want to tell you the truth: You aren’t following instructions; you’re following suggestions.

Look at JetPack 7.2. Thousands of people followed the official documentation to the letter, and for half of them, it failed. The ‘Super Mode’ didn’t show up. And in the frantic attempt to force it to appear, many of them bricked their boards. When you brick an Orin—and you will—you don’t get a ‘reset’ button. You get a terminal, a flashing USB cable, and the SDK Manager.

When you’re flying a jet, you don’t blame the manual when the engine flames out. You check the instrumentation. The Jetson is your instrumentation. If it says ‘Over-Current,’ you don’t get mad at the manufacturer; you analyze your power budget. You are pushing hardware to its thermal and electrical limits. You are choosing your destiny with every power-mode configuration you change. This isn’t a software update; it’s a battlefield.

The Oracle of Delphi: Now, let’s talk about the NVIDIA forums. Think of those forums as the Oracle of Delphi. You do not walk into that house and demand service. If you post, ‘I followed the instructions and it broke, what a goat rodeo, you guys released a broken OS,’ you are done. You will be ignored, and you will lose all professional credibility.

Here is the 12-Hour Rule: Before you post, you spend 12 hours of deep-dive, log-file-reading, self-inflicted pain on your own. You read the dmesg output. You check your logs in /var/log/syslog. You look at jtop and you watch the power rails. If you can’t describe exactly what is happening, you aren’t ready for help.

When you do post, you provide a reproduction script. You provide data. You treat those engineers with the respect they deserve. And when they respond? You shut up and listen. They are the pilot; you are the co-pilot. You do not touch the controls. You follow their lead, you execute their tests, and you report the results. Any frustration you express makes you look like a hobbyist who doesn’t understand the complexity of what they are touching. You are a guest in their house. Earn your stay.

Log-Driven Development: If your terminal isn’t covered in log outputs, you aren’t debugging; you’re guessing. Guessing is for hobbyists. Engineers measure. In the Pi world, you just write code and it works. On the Jetson, you have to think like an architect. Is your code saturating the memory bandwidth? Is your model actually hitting the Tensor cores? If you treat the Orin like a general-purpose PC, you are wasting the most powerful tool on your desk. You have to learn the power envelope. You have to learn the thermal limitations. You are driving a Ferrari in first gear if you don’t understand what’s happening under the hood.

The Verdict: So, here is my promise to you. You will brick it. You will want to throw it against the wall. But the moment you decide to solve the problem instead of blaming the manufacturer, that is the exact moment you stop being a hobbyist and start being an engineer. You want to run with the Big Dogs? Then stop whining about the guardrails and start learning how to read the logs. See you in the next lesson.

So the question for you now is, are you really ready to Run with the Big Dogs? Are you ready to jump into the deep end of the pool, or do you want to return to the wading pond?

AI on the Edge LESSON 22: Understanding Pictures and Video Frames as a Data Structure

Hey guys, Paul McWhorter here with TopTechBoy.com, and today we are diving into the heart of computer vision. We’ve been playing around with getting images from the camera, but have you ever stopped to actually look at what a picture is when it’s inside your computer’s memory?

If you want to be a master of AI on the Edge, you have to stop thinking about images as “pictures” and start seeing them as what they really are: a massive, organized grid of numbers.

What is a Picture, Really?

In this lesson, we are peeling back the curtain on how OpenCV and Python handle video frames. When we call piCam.capture_array(), we aren’t just taking a snapshot; we are pulling a data array into memory.

Think of it like a giant spreadsheet where every single cell is a pixel.

  • Dimensions: Your image has a height and a width, which correspond to the number of rows and columns in that array. It is important to remember that the row designator comes first, then the column: [R, C].

  • The Depth (The RGB Channels): It’s not just a flat 2D grid! Each “cell” in that grid is actually a little sub-array containing three values: Red, Green, and Blue. That is why we call it a 3D data structure.
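To make the spreadsheet picture concrete, here is a small sketch that builds a blank frame with NumPy instead of grabbing one from the camera. The 480 by 640 size is an assumption chosen for illustration; a real frame from capture_array() would have whatever resolution the camera is configured for.

```python
import numpy as np

# A stand-in for a captured frame: 480 rows, 640 columns, 3 color values.
height, width = 480, 640
frame = np.zeros((height, width, 3), dtype=np.uint8)

# The shape reads rows first, then columns, then channels.
rows, cols, channels = frame.shape
print(frame.shape)  # (480, 640, 3)

# Indexing follows the same order: row first, then column.
row, col = 100, 200
pixel = frame[row, col]  # a little sub-array holding that pixel's 3 values
print(pixel)
```

Notice that a 640-wide image reports 640 as its second number, not its first. That row-then-column ordering trips up almost everyone at least once.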

Manipulating Data, Not Just Pixels

The magic happens when you realize you can reach into that array and change those numbers directly.

In the code we developed today, we aren’t just displaying video; we are performing data science on video frames. We explored how to:

  1. Access individual pixels: By referencing specific coordinates in our frame array, we can pull out the color data for a single spot.

  2. Draw shapes by modifying arrays: Notice how we don’t need a “draw square” function to put a box on the screen? We simply tell a slice of that array to equal [0, 0, 255]. We are literally changing the color values of those pixels to solid red.

  3. Regions of Interest (ROI): This is critical for AI. You don’t always need to look at the whole frame. We learned how to “slice” the array to isolate a Region of Interest. By carving out a smaller piece of that memory, we can perform operations—like converting to grayscale—on just that section, which saves a massive amount of processing power.
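The three ideas above can be sketched on a synthetic frame. This is an illustration under stated assumptions, not the code from the video: the frame is a blank NumPy array, the channel order is assumed to be OpenCV's usual blue, green, red (which is why [0, 0, 255] comes out red), and the grayscale step uses a plain weighted sum as a stand-in for OpenCV's own conversion function.

```python
import numpy as np

frame = np.zeros((480, 640, 3), dtype=np.uint8)

# 1. Access an individual pixel: row 150, column 350.
before = frame[150, 350].copy()

# 2. Draw a box by assigning to a slice: rows 100-199, columns 300-399.
#    Assumption: channel order is blue, green, red, so this is solid red.
frame[100:200, 300:400] = [0, 0, 255]

# 3. Slice out a Region of Interest. This is a view into the same memory,
#    so no pixel data is copied.
roi = frame[100:200, 300:400]

# Grayscale on just the ROI: weighted sum of the blue, green, red values.
weights = np.array([0.114, 0.587, 0.299])
gray_roi = (roi @ weights).astype(np.uint8)

print(before, frame[150, 350])
print(roi.shape, gray_roi.shape)
```

The grayscale result has only 100 x 100 values to compute instead of 480 x 640, which is where the savings come from.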

Why Does This Matter?

If you want to build a robot that recognizes objects or tracks faces, you need to understand this structure. AI models don’t “see” a cat; they see a mathematical representation of that cat’s pixel values. By learning how to slice, manipulate, and convert these arrays, you are learning the fundamental language of machine learning.

We are building the foundation here, folks. Once you get comfortable with how to manipulate these arrays, we are going to start doing some really cool stuff with image processing and filtering.

Dive into that code, change those array values, and see what happens when you mess with the dimensions! Don’t just run it—experiment with it.

I’ll see you guys in the next lesson!

In this lesson we developed the following code:

 

AI on the Edge LESSON 7: Homework Solution for Dimmable LED

In Lesson 6, I gave you a homework challenge: build a dimmable LED using a potentiometer. In today’s Lesson 7, we go through the solution together step-by-step.

This lesson is all about taking analog input from a potentiometer and converting it into smooth PWM output to control the brightness of an LED. It’s a very practical project because it teaches you how to read real-world analog values and turn them into useful control signals — skills we’ll use again and again as we build smarter AI-powered projects.

In the video, I walk you through the complete working code. You’ll see how we read the potentiometer value (0 to 4095), convert that raw number into a proper brightness percentage using a bit of math (with a nice logarithmic curve so the brightness feels natural to the human eye), and then send that value to the LED using PWM. The result is a very smooth, responsive dimmer that feels professional.
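The exact math used in the video is not reproduced here. As a minimal sketch of the mapping step only, the function below assumes an exponential response curve with a base of 100, which is one common way to make equal turns of the knob look like equal steps in brightness. The function name, the base, and the clamping are all assumptions, and the hardware read and PWM write are left out because they depend on your board's library.

```python
def pot_to_brightness(raw, raw_max=4095, base=100.0):
    """Map a raw ADC reading (0..raw_max) to a brightness percent (0..100).

    Uses an exponential curve so the low end of the knob changes brightness
    gently and the high end changes it quickly.
    """
    raw = min(max(raw, 0), raw_max)          # clamp out-of-range readings
    fraction = raw / raw_max                 # 0.0 .. 1.0
    return 100.0 * (base ** fraction - 1.0) / (base - 1.0)


for reading in (0, 1024, 2048, 3072, 4095):
    print(reading, round(pot_to_brightness(reading), 1))
```

Try changing base to see how the feel of the dimmer changes: a larger base makes the curve steeper, so more of the knob's travel is spent at low brightness.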

Even though this seems like a simple project, it’s actually an important stepping stone. Understanding how to read sensors and smoothly control outputs is fundamental to building real AI on the Edge systems — whether you’re controlling motors, adjusting screen brightness, or varying the speed of a robot based on sensor input.

By the end of this lesson, you should have a solid understanding of how to combine the ADC (Analog to Digital Converter) with PWM output, and more importantly, how to think about mapping real-world inputs to useful outputs.

So if you did the homework, great job! If you got stuck, don’t worry — we go through the full solution together. And as always, I strongly encourage you to take the code and make it your own. Try changing the response curve, add multiple LEDs with different colors, or combine it with things we’ve learned in earlier lessons.

This is the kind of foundational hardware skill that will serve you well as we continue moving deeper into the AI on the Edge class. You’re doing great — keep going!

We are still using the schematic from our earlier project.

Fusion Hat Circuit Diagram
This is the circuit we will use moving forward in the class

In this lesson, this is the code which we came up with: