Thinking of doing object recognition on a Pi with the AI Hat or the OAK-D and want to do it simply?


Michael Wimble

Feb 15, 2026, 3:44:45 AM (8 days ago) Feb 15
to hbrob...@googlegroups.com

Yet another "I did it so you don't have to" repo is available.

https://github.com/wimblerobotics/sigyn_ai

This will eventually support the Pi with AI Hat, OAK-D, and Jetson Orin Nano deployments.

The way it works now:

  • Take a bunch of pictures with and without objects you want to recognize.
  • Upload the pictures to your RoboFlow account. Don't have one? Make a free one.
  • Use RoboFlow to annotate the images. If you choose objects that it already knows about, it will likely annotate them for you as you go along, and you just have to agree. Otherwise, define some object classes and draw some boxes.
  • Build the appropriate version to download. For NVIDIA and Pi, you want your images to be 640x640 (stretched). For the OAK-D, best performance is with images at 416x416. I've only used one of the YOLO models. Use YOLOv5 for the OAK-D (although if you get the latest depthai and use Docker to run ros2/kilted, you might be able to use later YOLO models). I also add:
    • Preprocessing: auto-adjust contrast using adaptive equalization.
    • Augmentations: rotation between -10 and +10 degrees, bounding box blur up to 2.5px, bounding box motion blur: length 100px, angle 0 degrees, frames 1.

    You also want to split your image set into train/validate/test. Use about 80% of the images for training and 10% each for validation and testing. It's good to have images that don't contain the objects of interest as well.

    This gives the recognizer extra ability to recognize not-so-perfect objects.

  • # Download from RoboFlow
    python src/utils/roboflow_download.py --project FCC4 --version 4 --format yolov8
    
  • Create a copy of the config file, update it for your needs.
  • # Train with config file
    python src/training/train.py --config configs/training/can_detector_pihat.yaml
  • # Export for Pi 5 + Hailo-8
    python src/export/export.py --model models/checkpoints/can_detector_pihat_v1/weights/best.pt --device pi5_hailo8
    
    # Export for OAK-D
    python src/export/export.py --model models/checkpoints/can_detector_pihat_v1/weights/best.pt -
  • # Deploy to specific camera
    python src/deployment/deploy.py --model can_detector_pihat_v1 --target sigyn --camera gripper_cam
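RoboFlow does the 80/10/10 split described above in its UI, but if you ever need to split a local image directory the same way, a minimal sketch looks like this (a hypothetical helper, not part of sigyn_ai; adjust the extension glob to your files):

```python
import random
import shutil
from pathlib import Path

def split_dataset(src: str, dst: str, seed: int = 0) -> dict:
    """Copy images from src into dst/train, dst/valid, dst/test (80/10/10)."""
    images = sorted(Path(src).glob("*.jpg"))
    random.Random(seed).shuffle(images)  # fixed seed for a reproducible split
    n_train = int(len(images) * 0.8)
    n_valid = int(len(images) * 0.1)
    buckets = {
        "train": images[:n_train],
        "valid": images[n_train:n_train + n_valid],
        "test": images[n_train + n_valid:],
    }
    for name, files in buckets.items():
        out = Path(dst) / name
        out.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, out / f.name)
    return {name: len(files) for name, files in buckets.items()}
```

Remember that each split also needs its matching YOLO label files; this sketch only moves the images.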

This assumes you need to remote deploy. Skip the last step if not. To remote deploy, set up ssh on both machines. Use ssh-copy-id so that you can issue commands without needing passwords. Run the script. You may need to adjust the deployment script as it probably assumes my directory structure.
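I haven't published the internals of the deploy script here, but the remote-deploy idea boils down to scp plus ssh once ssh-copy-id has installed your key. A sketch (hostnames and paths are examples, and dry_run defaults to True so nothing is copied by accident):

```python
import subprocess

def deploy(model_dir: str, host: str, remote_dir: str, dry_run: bool = True):
    """Copy exported model artifacts to the robot over ssh.

    Assumes `ssh-copy-id <host>` has already been run, so no password
    prompts. With dry_run=True, returns the commands instead of running them.
    """
    cmds = [
        ["scp", "-r", model_dir, f"{host}:{remote_dir}"],
        ["ssh", host, f"ls {remote_dir}"],  # sanity check the copy landed
    ]
    if dry_run:
        return [" ".join(c) for c in cmds]
    for c in cmds:
        subprocess.run(c, check=True)
    return []
```

Example: `deploy("models/checkpoints/can_detector_pihat_v1", "sigyn", "/home/ros/models", dry_run=False)`.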

I've only done this with the OAK-D on my Sigyn robot so far. I'll be testing and fixing the Pi and NVIDIA scripts soon.

Each device has a lot of tricky details to get working. This was the best effort between Claude and myself to simplify things. I think the script detects whether you have an appropriate GPU and uses it. I have an older 2060, which I'm about to upgrade to a 3060. With my 2060, training on about 130 images took about 2 or 3 minutes to train and deploy.
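On the GPU-detection point: I haven't dug into exactly how the training script decides, but the standard PyTorch idiom is something like this (a sketch, not the repo's actual code; falls back to CPU if torch or CUDA isn't available):

```python
def pick_training_device() -> str:
    """Return "cuda" when PyTorch can see a usable GPU, else "cpu"."""
    try:
        import torch  # only needed at training time
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"
```

CPU training still works on a ~130-image set; it's just a lot slower than the 2–3 minutes I saw on the 2060.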

You probably have questions. Good for you. Glad to see you're paying attention. I may or may not have answers. We can negotiate for consultation. I need a better robot arm, or home-made chocolate chip cookies using the Nestlé recipe WITH NO MODIFICATIONS. I know your mother had some secret changes to the recipe. Good! Keep it a secret, still.

Don't forget to STAR my repos. Christmas is coming and I want Santa to know how good I've been.

Pito Salas

Feb 15, 2026, 6:28:16 AM (7 days ago) Feb 15
to hbrob...@googlegroups.com
Hi Michael,

That’s a great contribution. The code, packaging, and documentation look nice! Can you share the Claude.md files you use in this or other developments? (I’m assuming you had an AI coding assistant here and there.)

Best,

Pito Salas
Boston Robot Hackers &&
Computer Science Faculty, Brandeis University




--
You received this message because you are subscribed to the Google Groups "HomeBrew Robotics Club" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hbrobotics+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/hbrobotics/ac868033-cc7e-48ec-8f1f-e36e2b78a6a9%40gmail.com.

Michael Wimble

Feb 15, 2026, 5:57:32 PM (7 days ago) Feb 15
to 'Pito Salas' via HomeBrew Robotics Club

I'll paste the initial prompt below. It's only mildly interesting, in that there were dozens of interactions after that to get the repo built. And, as nearly always happens, I started on one AI, Claude Sonnet 4.5 in this case, and had to switch to another when it got stuck. This has been a frustrating project. Neither Claude nor GPT-5.3-Codex really understands YOLO; they have even less understanding of the limitations of the OAK-D camera, and they really struggle with the Pi AI Hat. Each AI camera has limitations on which models it accepts (e.g., Yolo26), which file formats it accepts (e.g., onnx), and the Pi AI Hat has issues with how large a model will fit on the device. I spent all day today (Sunday) trying to get ChatGPT to produce an output that would work on the Pi. You also have to figure out how many frames per second you need to get out of the AI for it to be useful; my fast-moving robot needs a good 10+ FPS.
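On the FPS question, a quick way to sanity-check whether a deployed model meets a frame budget is just to time the inference loop. A sketch, with a sleep standing in for a real model call:

```python
import time

def measure_fps(infer, n_frames: int = 50) -> float:
    """Call `infer` n_frames times and return the achieved frames per second."""
    start = time.perf_counter()
    for _ in range(n_frames):
        infer()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed

# Stand-in for a real model call; ~20 ms/frame means a ~50 FPS ceiling.
fps = measure_fps(lambda: time.sleep(0.02))
needs = 10.0  # my robot needs a good 10+ FPS
print(f"{fps:.1f} FPS, budget {'met' if fps >= needs else 'MISSED'}")
```

Swap the lambda for your actual detector's single-frame inference (including pre/post-processing) to get a realistic number on the target device.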


And, GPT-5.3 doesn't play nice with VS. It kept giving me a script to run; I'd run it, it wouldn't work, and it said I didn't run the script it gave me. We had a dozen exchanges where I kept saying that the string it pasted was not the string it described. I'd paste its string back to it and it would say, "Yes, that's my mistake. I won't do that again," and it did it again, over and over.

The point being, this took dozens of interactions to get the AI to help me do this. The AI has never, even once, produced something usable to start with, even with long and carefully crafted prompts (I've watched dozens of YouTube videos trying to tell me their magical formula for writing prompts). I have a long rant on Facebook about how I actually got red-faced mad shouting at the AI a few days back. It's incredibly irritating when the AI wants to act like a person when, in fact, it's a depressed, paranoid, Hitchhiker's Guide to the Galaxy robot that thinks it has an IQ the size of a planet when, in fact, it's an almost-correct statistical anomaly that hasn't even a glimmer of the ability to actually think.

I really don't understand how people are producing interesting software with these AI "junior programmers." When I look at the output of AI, it has almost never been correct. It's close, but it has always taken an experienced eye to fix it.

So, for your amusement:

Look in /home/ros/sigyn_ws/src/Sigyn/.github/SIGYN_ARCHITECTURE_ONBOARDING.md for background information about the Sigyn robot.

I think I want to create a new git repo. Let's call it sigyn_ai. It needs to include documentation and scripts to help me build new AI models.

Currently, I've been creating annotated images on RoboFlow and using those to create two different AI models. One here, to create the bits I need to generate a CokeZero soda can detector to run on the Pi 5 with the AI Hat. The other is the same except it needs to run on the OAK-D camera. I believe they use different underlying YOLO models. I'm also going to want to create an object recognizer to run on an NVIDIA Jetson Orin Nano to do the same. In the future I'm also going to want to:

- Create object recognizers with more classes, probably household objects that I'd want to manipulate: clothing, food, light switches, power outlets. Things that a personal assistive robot will want to recognize and manipulate.

- Maybe create a semantic segmentation recognizer. I'm thinking that I might want to recognize clutter on the floor, safe pathways on the floor, table surfaces, glass panes, doorways and such.

I'm not sure what else I may need yet as Sigyn gets more capabilities.

So I think I want a repo that deals with all of the AI models for all of the devices. I don't know yet how RoboFlow can help me, but I have some knowledge about how just using YOLO can do things. In the Sigyn repo, there is a can_do_challenge package that I just finished today that can go from one room in the house to another, find a can of soda, and bring it back to me. This is just the beginning. I expect to replace the simple arm and gripper on Sigyn in the future so that it can pick up socks on the floor, clean a toilet, check the house for unexpected changes, etc.

I don't want to have to become an expert in "yet another thing" any more than necessary. Building an assistive robot all by myself is a huge chore. I've had to relearn electronics, design and build custom PC boards, learn behavior trees, learn ROS2 at an expert level, and so on. I'm a well-seasoned and pretty competent computer scientist and developer, but I need to reduce how much expertise I need to develop to do the next thing.

So this task is to design and build a repo to manage all the AI needs I have now and will have over the next couple of years, hinted at by the above. I want to be able to come here after not having looked at this repo for some time, so I've forgotten much of it, and find relatively simple instructions for building whatever AI models I've already built and deployed. If you look at the bluetooth package in Sigyn, you can discover that I have two joystick buttons that send a message requesting the OAK-D or the Pi camera to take a picture. Those pictures get put into a directory and I copy them to RoboFlow. I currently self-annotate them because it seemed like the Pi AI wanted bounding boxes, not polygons, but maybe I could have gotten around that. RoboFlow could annotate to generate polygons, saving me a lot of time. But then I just create a dataset version there and download those images back here.

Currently I have an old NVIDIA 2060 GPU to help do training, though I might use the free Google Colab services for training in the future. Or I'd use the RoboFlow training if I knew how to do so and get something I could deploy on Sigyn. So you see, I'm looking for a document that describes how I build a training set, where and how I do the training, with maybe a couple of options, and how I can run a small set of scripts to create artifacts to deploy to the Pi, OAK-D, and Orin devices.

The training data we used here for the Pi is currently the most current set. The data I used for the OAK-D is relatively old; I have a few more images that would help a lot. Eventually I plan on having a lot more training data.

With all of that, I want your help to set up this project as a git repo. Can you advise and help? Ask questions.

Thomas Messerschmidt

Feb 15, 2026, 6:27:59 PM (7 days ago) Feb 15
to hbrob...@googlegroups.com
GPT 5.3? Don’t you mean 5.2?

I’ve been having less than stellar results with 5.2 compared to 5.1. 


Thomas Messerschmidt

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

Contact me directly or through LinkedIn:   





Michael Wimble

Feb 15, 2026, 7:59:43 PM (7 days ago) Feb 15
to hbrob...@googlegroups.com
No, I was using GPT-5.3-Codex

Thomas Messerschmidt

Feb 15, 2026, 10:48:59 PM (7 days ago) Feb 15
to hbrob...@googlegroups.com
Oh. OK.


Thomas Messerschmidt



