Tag Archives: AI on the Edge

No Cloud. No Internet. No Problem. Two Commands for Local LLM on Jetson Orin Nano

June 7, 2026 admin

Hey guys, welcome back to the channel. Paul McWhorter here from TopTechBoy.com. Today, we aren’t just messing around with simple circuits or basic scripts—we are going to take that NVIDIA Jetson Orin Nano we rescued from the brink of destruction in the last video, and we are going to turn it into a completely sovereign, local thinking machine.

I don’t know about you, but I am tired of Big Tech telling me I need a credit card, a monthly subscription, and a constant high-speed internet connection just to make an AI model reply to a prompt. Today, we are going to do it completely naked. We are going to cut the cord, pull the ethernet, and run cutting-edge Large Language Models entirely on the local physical silicon of your Jetson Orin Nano.

And we are going to do it in exactly two commands. One to build the engine room, and one to fire up the mind.

Let’s get started.

The Hardware Architecture

Before we drop the code into the terminal, let’s understand exactly what we are building today. We are dealing with three core components working together in a unified system.

The Model (The Fuel): This is your raw neural network file (like Google Gemma or Meta Llama). It contains the weights, vocabulary, and potential intelligence. On its own, it’s just a massive, inert file sitting on your storage drive.
Ollama (The Engine Room): This is the heavy lifter. Ollama is a local execution framework that takes that raw model file, loads it into the Jetson’s unified memory, and runs it on the CUDA cores. It handles the brutal mathematical calculations required to generate tokens.
The Terminal Chat (The Dashboard): This is your interface. It provides the clean command-line text box for you to type your prompts and prints the model’s responses back to you in real time.
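If you would rather talk to the engine room from a script than from the terminal chat, here is a minimal sketch of how that can look. It assumes the Ollama service is running on its default local address (port 11434) and that the model has already been downloaded. The helper names `build_payload` and `ask` are made up for this illustration.

```python
import json
import urllib.request

# Ollama's background service listens on this local address by default.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt):
    """Build the JSON body for a single, non-streamed generation request."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def ask(model, prompt, url=OLLAMA_URL):
    """Send the prompt to the local Ollama service and return the reply text."""
    request = urllib.request.Request(
        url,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]
```

Calling `ask("gemma3:1b", "Why is the sky blue?")` on the Jetson should return the model's answer as a string. The request only goes to localhost, so nothing leaves the board.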

The Two-Command Installation

Go ahead and fire up your Jetson Orin Nano, open a fresh terminal window, and get ready to type. Remember: copying and pasting makes you weak. Type these out like a real engineer so your hands learn the muscle memory.

Command 1: Install the Ollama Engine

This command fetches the official automated bootstrapper script from Ollama and executes it locally to configure the background system service on your host OS.

curl -fsSL https://ollama.com/install.sh | sh

Command 2: Fire Up the Local Model

Once the installation script finishes, your engine room is live. Now, tell Ollama to pull down the optimized 1-billion parameter Google Gemma model and launch an interactive local dialog loop instantly:

ollama run gemma3:1b

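The moment you hit enter, your Jetson will download the model weights directly to your local drive, load them straight into the Jetson’s unified memory (the Orin Nano has no separate VRAM; the CPU and GPU share the same RAM), and drop you into a clean prompt box. The install script and this first download are the only steps that need a network connection. Once the weights are on your drive, you can pull the ethernet cable and the model keeps running. Type a question, hit enter, and watch your local silicon generate answers with zero cloud dependencies.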
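Before you cut the cord, it is worth confirming the weights really are on your drive. A small sketch, assuming the Ollama service is on its default local port: its `/api/tags` endpoint lists the models stored locally. The function names here are hypothetical.

```python
import json
import urllib.request

# Default local address of the Ollama service; lists models already on disk.
TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_json):
    """Pull the model names out of the JSON text returned by /api/tags."""
    return [entry["name"] for entry in json.loads(tags_json).get("models", [])]


def installed_models(url=TAGS_URL):
    """Ask the local Ollama service which models it has stored."""
    with urllib.request.urlopen(url) as reply:
        return model_names(reply.read())
```

If `"gemma3:1b"` shows up in `installed_models()`, the model should load without a network connection.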

Choosing the Right Mind for Your Machine

The beautiful part about setting up Ollama is that you aren’t locked into just one model. Different models have different parameter sizes and strengths. On the 8GB Jetson Orin Nano, you want to balance model size against your available hardware headroom to keep your generation speeds crisp.

Here are the verified, hardware-accelerated local models you can experiment with right out of the box:

Launch Command	Model Family	Size / Parameter Count	Best Used For
`ollama run gemma3:1b`	Google Gemma 3	1 Billion	Ultra-fast responses, light footprint
`ollama run llama3.2:1b`	Meta Llama 3.2	1 Billion	High-efficiency conversational loops
`ollama run phi4-mini:3.8b`	Microsoft Phi-4 Mini	3.8 Billion	Heavy reasoning and coding logic
`ollama run qwen3:4b`	Alibaba Qwen 3	4 Billion	Structured data and multilingual logic
`ollama run qwen3.5:4b`	Alibaba Qwen 3.5	4 Billion	Advanced context processing
`ollama run gemma3:4b`	Google Gemma 3	4 Billion	Maximum analytical depth on Orin Nano

⚠️ Paul’s Engineering Note on Headroom

The 1B (1-Billion parameter) models are incredibly light and will run at lightning speed on the Orin Nano. If you want to push the machine harder for more complex reasoning, step up to the 3.8B or 4B models. Just keep an eye on your system resources—running a 4B model pushes close to the limits of the Orin Nano’s 8GB unified memory architecture, especially if you are running a heavy graphical desktop environment in the background!
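To put rough numbers on that headroom, here is a back-of-the-envelope sketch. It assumes the weights are stored at about 4 bits each, which is only an assumption; the quantization of the model you actually pull may differ. It also counts the weights alone, so the context cache and runtime overhead come on top. The function names are made up for this example.

```python
def weight_gib(params_billion, bits_per_weight):
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def mem_available_gib(meminfo_text):
    """Read MemAvailable (reported in kB) out of /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) / 2**20
    raise ValueError("MemAvailable not found")
```

On the Jetson you could compare `weight_gib(4, 4)` against `mem_available_gib(open("/proc/meminfo").read())` before launching a 4B model. If the two numbers are close, shut down the desktop environment first or stay with a 1B model.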

To exit out of any active terminal chat session and return to your standard command prompt, simply type:

/exit

Homework Assignment

Alright, you have the hardware running, you have the engine installed, and you know how to switch out the minds of your machine. Now it’s time for your homework.

I want you to install both the gemma3:1b model and the heavier gemma3:4b model on your Jetson Orin Nano. Run them both through a test sequence: ask them to write a simple Python script, and then ask them a complex logic riddle.

I want you to observe the difference in quality of thought versus speed of generation. Is the 4-billion parameter model smart enough to justify the extra computation time on your hardware, or does the 1-billion parameter model give you the snappy responsiveness you need for a real-time edge application?
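If you want to put a number on "speed of generation", Ollama can help. Running a model with the `--verbose` flag prints timing statistics after each answer, and a non-streamed reply from the local `/api/generate` endpoint includes `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). A minimal sketch built on those two fields; the function names are hypothetical.

```python
def tokens_per_second(result):
    """Generation speed from a finished /api/generate reply (duration is in ns)."""
    return result["eval_count"] * 1e9 / result["eval_duration"]


def fastest(results):
    """Given {model_name: reply_dict}, return the name of the quickest model."""
    return max(results, key=lambda name: tokens_per_second(results[name]))
```

Feed it one reply per model for the same prompt, then weigh the speed difference against the quality of the two answers.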

Leave a comment down under the video showing your results, tell me which model you prefer running natively on your bench, and I will see you guys in the next lesson!

AI On the Edge, Tutorial

AI on the Edge LESSON 14: Control LED Color With Voice Commands on Raspberry Pi 5

May 12, 2026 admin

In Lesson 14 of AI on the Edge, we’re doing something really fun and powerful — we’re building a voice-controlled RGB LED that listens to you, changes colors on command, and even talks back with some personality! This is true edge AI running 100% locally on your Raspberry Pi with the Fusion HAT. No cloud, no internet, just fast, private, and responsive voice interaction right on your desk.

You simply speak a color (red, green, blue, cyan, magenta, yellow, off, or even quit) and the RGB LED instantly springs to life with beautiful color. But that’s not all. Every time you give a command, the system replies with a fun, playful spoken response using the Piper text-to-speech engine. It turns your Raspberry Pi into a charming little LED companion that feels alive and interactive.

In this lesson, you’ll learn how to combine local Speech-to-Text with the STT library and natural-sounding Text-to-Speech with Piper. You’ll master PWM control of a full-color RGB LED through the Fusion HAT, and you’ll see how to use Python threading plus a queue to keep the voice listening running smoothly in the background without ever locking up your main program. The code is clean, well-structured, and includes proper startup greetings, graceful shutdown, and excellent resource cleanup, which is exactly the kind of solid practice we love in this series.

What makes this project extra special is how it brings everything together. You get real-time voice recognition, instant hardware response, and spoken feedback, all happening locally on the edge. It’s fast, it’s private, and it’s incredibly satisfying to watch that LED light up exactly as you command while your Pi chats back at you.

Go ahead and watch the full Lesson 14 video, grab the complete code from the description, and build this project step by step with me. Once you have it running, I want you to play with it! Add new colors, create your own funny responses, or start thinking about how you could combine this voice control with sensors or other hardware in future projects.

This is the kind of hands-on, creative AI application that makes learning so exciting. You’re not just watching — you’re building real, useful skills that put you in the driver’s seat with artificial intelligence.

Fire up that Raspberry Pi, get your Fusion HAT ready, and let’s make some colors shine while the Pi talks back. I can’t wait to see what you create with this one!

Happy building, everyone — I’ll see you in the next lesson!

This is the schematic we are using for the project:

Fusion Hat Circuit Diagram — This is the circuit we will use moving forward in the class

This is the code we developed in the video:

from fusion_hat.pwm import PWM
from fusion_hat.stt import STT
import threading
from queue import Queue
from time import sleep
from fusion_hat.tts import Piper
tts = Piper()
tts.set_model('en_US-kristin-medium')
msg = 'Speak your favorite color, and your wish is my command'
tts.say(msg,stream=False)

redPin = 5
greenPin = 6
bluePin = 7

redLED= PWM(redPin)
greenLED=PWM(greenPin)
blueLED=PWM(bluePin)

rVal=0
gVal=0
bVal=0

stt = STT('en-us')
running=True
colorQ = Queue()

def getColor():
    print("Input Thread is Running")
    global running
    while running:
        print('What color: red, green, blue, cyan, magenta, yellow, off, quit')
        myColor = stt.listen(stream=False)
        myColor=myColor.strip()
        if myColor == 'quit':
            running = False
            msg = 'So Sorry to See you go, Please Call Me Again Soon'
            tts.say(msg, stream=False)
            break
        colorQ.put(myColor)
    print("Thread is Terminated")
colorThread= threading.Thread(target=getColor,daemon=True)
colorThread.start()
print("Main Program is Started")
try:
    while running:
        if not colorQ.empty():
            myColor = colorQ.get()
            print('Color: ',myColor)
            if myColor == 'off':
                rVal = 0
                gVal = 0
                bVal = 0
                msg = 'Please bring back your beautiful colors!'
                tts.say(msg, stream=False)
            if myColor == 'red' or myColor=='read':
                rVal = 100
                gVal = 0
                bVal = 0
                msg = 'I can not get you out of my head, so I will turn it red'
                tts.say(msg, stream=False)
            if myColor == 'green':
                rVal = 0
                gVal = 100
                bVal = 0
                msg = 'Because I want you to be seen, I will turn it green'
                tts.say(msg, stream=False)
            if myColor == 'blue':
                rVal = 0
                gVal = 0
                bVal = 100
                msg = 'Because I love you, I will turn it blue'
                tts.say(msg, stream=False)
            if myColor == 'cyan':
                rVal = 0
                gVal = 100
                bVal = 25
                msg = 'You turn my world cyan and bright, what a beautiful sight!'
                tts.say(msg, stream=False)
            if myColor == 'magenta':
                rVal = 100
                gVal = 0
                bVal = 100
                msg = 'You Light my Magenta Fire, You are my burning Desire'
                tts.say(msg, stream=False)
            if myColor == 'yellow':
                rVal = 100
                gVal = 25
                bVal = 0
                msg = 'You are such a handsome fellow, I will turn it yellow'
                tts.say(msg, stream=False)          
            myColor = 'null'
            redLED.pulse_width_percent(rVal)
            greenLED.pulse_width_percent(gVal)
            blueLED.pulse_width_percent(bVal)
        # Rest briefly so the main loop does not spin at full CPU while the queue is empty
        sleep(0.05)
except KeyboardInterrupt:
    running = False
finally:
    # Runs after 'quit' and after Ctrl-C alike: turn the LED off and release the PWM channels
    redLED.pulse_width_percent(0)
    greenLED.pulse_width_percent(0)
    blueLED.pulse_width_percent(0)
    redLED.enable(False)
    greenLED.enable(False)
    blueLED.enable(False)
    print("LEDs are Released")
    print("Program is Terminated")
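If you want to study the thread-plus-queue pattern without any hardware on the bench, here is a stripped-down sketch. It is not the code from the video. The speech thread is replaced by a stand-in that feeds a list of words, a `threading.Event` takes the place of the global `running` flag, and the chain of `if` statements becomes a dictionary lookup. The duty-cycle values are copied from the listing above; every other name is made up for this example.

```python
import threading
from queue import Queue, Empty

# Duty cycles in percent for (red, green, blue), copied from the listing above.
COLORS = {
    "off": (0, 0, 0),
    "red": (100, 0, 0),
    "read": (100, 0, 0),  # common mishearing of "red"
    "green": (0, 100, 0),
    "blue": (0, 0, 100),
    "cyan": (0, 100, 25),
    "magenta": (100, 0, 100),
    "yellow": (100, 25, 0),
}


def listener(words, color_q, stop):
    """Stand-in for the speech thread: queues words until it hears 'quit'."""
    for word in words:
        if word == "quit":
            break
        color_q.put(word)
    stop.set()  # tell the main loop that no more words are coming


def run(words):
    """Drain the queue on the main thread; return every (r, g, b) applied."""
    color_q, stop, applied = Queue(), threading.Event(), []
    worker = threading.Thread(target=listener, args=(words, color_q, stop), daemon=True)
    worker.start()
    # Keep going until the listener has stopped AND the queue is empty.
    while not (stop.is_set() and color_q.empty()):
        try:
            word = color_q.get(timeout=0.05)  # blocks briefly instead of spinning
        except Empty:
            continue
        if word in COLORS:  # unrecognized words are ignored
            applied.append(COLORS[word])
    return applied
```

Calling `run(["red", "purple", "blue", "quit", "green"])` applies red and blue, skips the unknown word, and stops at "quit". On the real hardware, the `applied.append(...)` line is where the three `pulse_width_percent` calls would go.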

Making The World a Better Place One High Tech Project at a Time. Enjoy!