Alfie build thread


Alan Timm

Apr 15, 2025, 11:38:01 PM
to RSSC-List
Hey there!

I'm getting closer to (re)assembling Alfie.  The 12V 20A buck converter is working well, although I think it's time to shorten a whole bunch of cables so that everything fits (it doesn't quite yet).

Also I've fallen into a bit of a rabbit hole wrt on-board processing.  I rage-quit my indiedroid nova SBC and have moved on to the Radxa Rock 5C with 16gb ram.

There are some compelling options for on-device speech synthesis, speech recognition, and even large/small language models?!  It's crazy that you can run these on a Raspberry Pi-sized device.
I think the Qwen models are capable of tool use too, and you can run several combinations of these on an 8GB SBC, with the whole stack fitting with room to spare on a 16GB device.

Here's a sample of libritts_r_medium voice 4 (there are 903 voices available in total), linked in the message.

PXL_20250416_005108390.jpg
assistant.mp3

Gmail

Apr 18, 2025, 10:18:33 PM
to Alan Timm, RSSC-List
Alan,

ChatGPT 4o says, 

 “I can identify and classify thousands of object types in uploaded photos. Common categories include:

  • People (faces, age/gender estimates, activities)
  • Animals (species, breeds)
  • Plants (trees, flowers, leaves)
  • Food (types of dishes, ingredients)
  • Text (printed/handwritten, languages)
  • Vehicles (cars, planes, bikes)
  • Buildings (types, landmarks)
  • Everyday objects (furniture, tools, electronics)
  • Clothing (styles, colors, accessories)
  • Signs and labels (road signs, logos, warnings)”

Can you recommend a similar (free) on-device image classification model? I mean something more like chatgpt and less like YOLO. I am ok if it requires a standard or even a gaming laptop with a high end GPU. 


Thomas

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

-

Need a great hardworking engineer? I am currently looking for a new job opportunity in robotics and/ or AI. 

Contact me directly or through LinkedIn:   


On Apr 15, 2025, at 8:38 PM, Alan Timm <gest...@gmail.com> wrote:

on-device

Chris Albertson

Apr 18, 2025, 11:21:22 PM
to Gmail, gestalt73, RSSC-list

On Apr 18, 2025, at 7:18 PM, Gmail <thomas...@gmail.com> wrote:

Can you recommend a similar (free) on-device image classification model? I mean something more like chatgpt and less like YOLO. I am ok if it requires a standard or even a gaming laptop with a high end GPU. 

What you need is not so much a powerful GPU, but one with huge amounts of VRAM.  The models that can identify all those things are huge, many billions of parameters, and it really has to be VRAM that the GPU can access.  Even a “small” 20-billion-parameter model will require about 20GB of VRAM, not to be found on a notebook PC GPU.  Possibly an Apple Mac could work because of its unified RAM model.
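As a back-of-the-envelope check (weights only, ignoring the KV cache and activations), the VRAM needed is roughly parameters times bytes per parameter, and the bytes per parameter depend on the quantization:

def vram_gb(params_billion, bits_per_param):
    # rough estimate: model weights only, no KV cache or activations
    return params_billion * 1e9 * (bits_per_param / 8) / 1024**3

for bits in (16, 8, 4):
    print(f"20B model @ {bits}-bit: ~{vram_gb(20, bits):.0f} GB")
# ~37 GB at 16-bit, ~19 GB at 8-bit, ~9 GB at 4-bit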

But if you are after efficiency, a YOLO-like model that is trained on the images you care about is the best, as it can run on a Raspberry Pi with one of those “AI chips” attached.

OK, but if you want a publicly available open-source LLM, go to Hugging Face and search for one.

Alan Timm

Apr 19, 2025, 1:18:36 PM
to RSSC-List
Hey Thomas,

Good morning!

There are a couple of ways to answer your question, and it all depends on how much iron you're willing to throw at the problem.

My current rabbit hole involves running these models on an rk3588 sbc with 16gb of ram, so this 3.8B llava-phi3 model caught my eye:

It's generating text responses at about 6 tokens per second, but I haven't tried the image capabilities yet.   It's taking up about 9GB of ram at the moment

as well as this rkllama library that purports to run models on the rk3588 npu:

I'm not sure if/how much faster that will be than taking up all 8 cpu cores. I'll probably take a closer look this week.

But...  There's probably a near future when I need to add in an nvidia jetson board for some more GPU horsepower, in which case you might be looking at the 16gb orin nx and carrier board:


You could probably start with that llava-phi3 model and work your way up from there.
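If you want to script it instead of using the interactive prompt, something like this should work (a rough sketch, assuming you have Ollama running and the ollama Python package installed; the image path is just a placeholder):

import ollama

resp = ollama.chat(
    model="llava-phi3",
    messages=[{
        "role": "user",
        "content": "What is in the image?",
        "images": ["/path/to/some/photo.jpg"],  # placeholder path
    }],
)
print(resp["message"]["content"])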

Alan

Alan Timm

Apr 19, 2025, 1:28:41 PM
to RSSC-List
Here's the result of passing in the attached image and asking "What's in the image?" on my Radxa Rock 5C (15GB RAM, 8-core SBC @ 1.8GHz).
The round trip time was almost 2 minutes.  So not fast, but maybe useful?

>>> what is in /home/alfie/Pictures/homeoffice.jpg
Added image '/home/alfie/Pictures/homeoffice.jpg'
The image shows an old school desktop computer setup with a yellow plastic chair in front of it. The laptop
screen displays "03:58" and the mouse is black. There are two mugs next to the keyboard - one is green and
the other is white. On the desk, there is also a potted plant with green leaves.

total duration:       1m57.419420595s
load duration:        4.535755612s
prompt eval count:    716 token(s)
prompt eval duration: 1m38.395394584s
prompt eval rate:     7.28 tokens/s
eval count:           73 token(s)
eval duration:        14.425655452s
eval rate:            5.06 tokens/s
homeoffice.jpg

Chris Albertson

Apr 19, 2025, 2:55:05 PM
to gestalt73, RSSC-list
Here is the problem, or really the choice you have.

(1) You can use LLM-based technology and, after two minutes, get a written paragraph that nicely describes the image.  You would then have to process the paragraph to extract information.  This is good because it shows the model can accept just about anything you show it. Or…

(2) You can run a version of YOLO and it will return a list of objects with bounding box coordinates, but it will only see objects that it is trained to see.  But it runs on modest hardware.  I was able to get 30 frames per second on a Linux PC, which means YOLO was able to process live video in real time (my test data was a Hollywood action film).  The objects and the boxes were stored in a database-like list that could be queried.
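For reference, the YOLO side is only a few lines these days (a sketch using the ultralytics package and a stock COCO-trained model, not anything custom-trained):

from ultralytics import YOLO

model = YOLO("yolov8n.pt")      # small pretrained model, ~80 COCO classes
results = model("frame.jpg")    # a frame grabbed from the video stream
for box in results[0].boxes:
    name = model.names[int(box.cls)]
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(name, float(box.conf), (x1, y1, x2, y2))  # class, confidence, bounding box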

I think what you do depends on what the task is.  A navigation task needs the (x, y) coordinates of each object and can't wait 2 minutes.  By “navigation” I mean not just rolling on wheels but an arm grasping an object.

But perhaps the robot's job is to answer questions like “Robbie, did UPS deliver my package? Is it on the porch?”  Then the LLM would be ideal.  But to open the door and pick up that box, you need more classic vision, photogrammetry, not AI.

It is interesting to see how Tesla handles this.  The cameras run at about 30 FPS and the data is sent to about 5 different models, each run independently, in parallel. Each model turns the image frames into data.  This may be the solution for robots.  Don't choose.  The correct answer is “all of the above”.


Jim DiNunzio

Apr 19, 2025, 3:04:12 PM
to Alan Timm, RSSC-List
I got a nice Easter present after a 4-month wait.  I definitely want to try out a vision model like that with this board running 67 TOPS at a max of 25 watts. After that, I'll figure out a robot to wield it.
Jim

--
20250419_113901.jpg

Alan Timm

Apr 19, 2025, 9:02:40 PM
to RSSC-List
Hey Chris & Thomas,
   Yep, it all depends on what problem(s) you're trying to solve, how fast you need the feedback, and ultimately where you want the processing to occur.  Usually you optimize for two and put up with whatever's left for the third.

For Alfie, I'll host a handful of these SLM models on the SBC for a bit to see if there's any practical use for them.  So far Piper TTS is faster than real time, with < 1 second latency to first utterance.  I'll check out faster-whisper next.
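For the ASR side, the faster-whisper API looks simple enough (a sketch from their docs, untested on the Rock 5C yet; the model size and compute type are just guesses for an SBC):

from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
segments, info = model.transcribe("utterance.wav")
for seg in segments:
    print(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text}")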

Hey Jim,
   Ohmygosh, you got one?!?!?  Nice!  There's a software unlock to update all the jetson nano and orin boards to super status with a corresponding increase in power use and TOPs.

For Alfie, after I complete systems integration and get the ROS scaffolding up, it'll be time for "operation: hi five!": training a neural model for him that gives high fives whenever someone holds up their hand the right way.  That'll tell me a lot more about what type of processing power I need to have on board, and I have the Orin and carrier board on a wishlist.  It'll connect to the Rock 5C over 1Gb Ethernet and will be nestled on the base under the battery.

Alan

Gmail

Apr 21, 2025, 4:07:27 PM
to Alan Timm, RSSC-List
Hey Alan,

Thanks for this and your other replies. When I get a few minutes (Hours? Days?) I will attempt to download, install, configure, and try out that model sometime later this week. 

Did you say that you found it took about 2 minutes for analysis of a photo? 

I’m going to be running this on my gaming laptop with its 4070 gpu. 

  • Intel® Core™ i9-14900HX Processor
  • NVIDIA® GeForce RTX™ 4070 Laptop GPU,
  • 8GB GDDR6
  • 32GB DDR5 Memory
  • 1TB NVMe PCIe SSD
  • 16" 16:10 FHD+ (1920x1200), 144Hz
  • Thunderbolt™ 4



Thanks again! Wish me luck 🍀! 


Thomas

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

-

Need a great hardworking engineer? I am currently looking for a new job opportunity in robotics and/ or AI. 

Contact me directly or through LinkedIn:   



Alan Timm

Apr 21, 2025, 8:59:01 PM
to RSSC-List
Running on an RTX 4070?  That 3.8B vision model will run A LOT faster.  It took two minutes on my Raspberry Pi-class board.  :-)

Alan

Alan Timm

Apr 21, 2025, 9:05:38 PM
to RSSC-List
Alfie can shrug now!

The tales that I could tell (and probably will next month) about what I ran into while getting this to work.

The initialization procedure uses one of the three opto switches from the delta printer to detect the max-down position, then travels 390mm to the top position.

That shrug at the top?  That's a flourish.  Totally unnecessary, but that's why I go up to 390mm and not 400mm.  You have to leave a little bit of room for the occasional shrug.  :-)

screenshot_21042025_180100.jpg

Gmail

Apr 21, 2025, 9:20:19 PM
to Alan Timm, RSSC-List
Well, I don’t know. 🤷🏻 
😆



Thomas

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

-

Need a great hardworking engineer? I am currently looking for a new job opportunity in robotics and/ or AI. 

Contact me directly or through LinkedIn:   



Gmail

Apr 21, 2025, 9:24:49 PM
to Alan Timm, RSSC-List
I hope “a LOT faster” means under 15 seconds. 



Thomas








-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

-

Need a great hardworking engineer? I am currently looking for a new job opportunity in robotics and/ or AI. 

Contact me directly or through LinkedIn:   



Alan Timm

Apr 21, 2025, 11:33:46 PM
to RSSC-List
I don't know what the performance difference is between a laptop RTX 4070 and a desktop RTX 3090...

But on my desktop RTX 3090 it was *a bit* faster...
like ~0.5 seconds total.

  ❯❯ /home/alansrobotlab : ollama run llava-phi3 --verbose
>>> what is in /home/alansrobotlab/Pictures/homeoffice.jpg
Added image '/home/alansrobotlab/Pictures/homeoffice.jpg'
1. A wooden desk with a laptop on it, next to two black coffee mugs and a plant. The time displayed on the laptop
screen is 03:58. There is also a yellow plastic chair with wheels tucked underneath the desk.
2. A window in the room that shows a view of trees outside.

total duration:       463.085074ms
load duration:        14.85164ms
prompt eval count:    589 token(s)
prompt eval duration: 23.863499ms
prompt eval rate:     24682.05 tokens/s
eval count:           76 token(s)
eval duration:        417.033728ms
eval rate:            182.24 tokens/s

Gmail

Apr 22, 2025, 12:36:53 AM
to Alan Timm, RSSC-List
Well, for basic world knowledge, < 3 seconds would be fine. For real-time robot navigation (navigating through a home by camera alone is one of my goals/use cases), 0.5 seconds might be a bit too slow.

Assuming 1.5 MPH, a robot would go a bit more than two feet in a second. I suppose then that the robot would have to stop every so often and check its path. I have been doing a lot of experimentation with uploading videos to ChatGPT 4o using the API. ChatGPT 4o has gotten a lot better over the last few months. They are teasing us about version 5. I can't wait!

OpenAI has also released (in beta) “live vision” video and audio analysis. I have been using it for the last several months, and I find it to be laggy. It falls behind as much as 30 seconds after only three or four minutes of use. Also, OpenAI limits me to using it for about 15 minutes a day. Still, it's truly amazing for interactive conversations. All of a sudden, your robot is no longer blind. You want your robot to have conversations similar to the sci-fi robots Robby, C-3PO, Johnny 5, or Rosie? This is your answer!  BUT I have tried it for robot navigation, and unfortunately I found that this feature is not yet ready for prime time. 😆



Thomas

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

-

Need a great hardworking engineer? I am currently looking for a new job opportunity in robotics and/ or AI. 

Contact me directly or through LinkedIn:   



Alan Timm

May 15, 2025, 11:30:47 PM
to RSSC-List
For those of you who attended last weekend's meeting, you heard Alfie's voice. Piper TTS with voice 65 (out of 906) runs faster than real time.  He sounds pretty good, and natural-ish, for on-device generation.

More recently NVIDIA quietly released a new ASR (automatic speech recognition) model called Parakeet v2 0.6B.  It also runs much faster than real time and outperforms Whisper in both speed and accuracy.

The default 16 bit model transcribes speech at twice real-time (3.6 seconds for 7 seconds of speech)

There's also an onnx-asr project and converted onnx model that transcribes speech at 4 times real time (1.5 seconds for 7 seconds of speech).

I'll still need a speech detector, maybe a wake word detector, and a diarization model, but I'm amazed at how well all this works on a Raspberry Pi 5-class SBC.
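For anyone who wants to try Piper, the quickest way I know is to pipe text to the CLI from Python (a sketch, not my exact setup; the model filename follows Piper's usual naming, so adjust it to whatever voice you download):

import subprocess

def say(text, wav_path="out.wav"):
    subprocess.run(
        ["piper", "--model", "en_US-libritts_r-medium.onnx",
         "--speaker", "65", "--output_file", wav_path],
        input=text.encode("utf-8"),
        check=True,
    )

say("This is Alfie.")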

Alan
this_is_alfie.wav

Alan Timm

Jun 5, 2025, 5:32:40 PM
to RSSC-List
I've made a lot of progress with on-device functionality for Alfie.  Here's a quick demo of Silero speech detection and NVIDIA Parakeet ASR on the Radxa Rock 5C.

We'll talk a lot more about it next next weekend!
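The speech detection piece is just Silero VAD over the mic audio. A minimal sketch of how that part works (this uses the torch.hub entry point from their README; the wav filename is a placeholder):

import torch

model, utils = torch.hub.load("snakers4/silero-vad", "silero_vad")
get_speech_timestamps, _, read_audio, _, _ = utils

wav = read_audio("mic_capture.wav", sampling_rate=16000)  # placeholder capture file
segments = get_speech_timestamps(wav, model, sampling_rate=16000)
print(segments)  # start/end sample indices to hand off to the ASR model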

screenshot_05062025_143051.jpg

Alan Timm

Jun 17, 2025, 12:09:21 AM
to RSSC-List
Following Brian's lead I've started syncing my work with a github repository:

Among other things that keeps a copy of the code safe in case I do something dumb, which is known to happen.  :-)

Also I think Jim makes a convincing argument for using a wakeword.

Hey Jim, what was that shiny new wakeword program you're using?  


Alan

Jim DiNunzio

Jun 17, 2025, 2:35:44 AM
to Alan Timm, RSSC-List

Hi Alan,

 

I’m using Porcupine Wake Word by Picovoice. It runs locally on your machine and is free for non-commercial projects. You can create one wake word per month. Sign up and click the non-commercial option, and agree not to aspire to make any money with it (at least while using their tech!).

 

https://picovoice.ai/platform/porcupine/

https://picovoice.ai/docs/quick-start/porcupine-python/

 

You can see my example code utilizing two wake words:

This is a simple test which only requires pvporcupine, pyaudio, and the wake word .ppn file you get from Picovoice:

https://github.com/jimdinunzio/big-orange/blob/Python-3.9/python/tests/test_porcupine_wake_word.py
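The core loop boils down to something like this (a from-memory sketch, so check the linked test file for the real thing; the access key and .ppn path are placeholders you get from the Picovoice console):

import struct
import pvporcupine
import pyaudio

porcupine = pvporcupine.create(
    access_key="YOUR_ACCESS_KEY",          # from the Picovoice console
    keyword_paths=["my_wake_word.ppn"],    # your custom wake word file
)

pa = pyaudio.PyAudio()
stream = pa.open(rate=porcupine.sample_rate, channels=1, format=pyaudio.paInt16,
                 input=True, frames_per_buffer=porcupine.frame_length)

while True:
    pcm = struct.unpack_from("h" * porcupine.frame_length,
                             stream.read(porcupine.frame_length))
    if porcupine.process(pcm) >= 0:   # returns the keyword index, -1 if nothing heard
        print("wake word detected")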

 

As a career software guy, I’m a big fan of github and development records. All Big Orange code (and my other projects’ code) has been on github since 2020.

 

https://github.com/jimdinunzio/big-orange/

 

Jim


Sergei G

Jun 17, 2025, 11:27:41 AM
to Alan Timm, RSSC-List, j...@dinunzio.com
One of the overlooked useful features of GitHub is the ability to create and edit formatted README.md files right from the browser.

You can organize your notes and share your finds with the world (well, with the Club ;-)) - for free. It is probably the most durable/reliable storage of documentation and code at the moment.

Here is my frequently updated collection: https://github.com/slgrobotics/robots_bringup/tree/main


Best Regards,
-- Sergei



Thomas Messerschmidt

Jun 17, 2025, 8:23:33 PM
to j...@dinunzio.com, Alan Timm, RSSC-List
Thanks for sharing the links Jim. 


Thomas




Alan Timm

Jun 23, 2025, 12:08:32 AM
to RSSC-List
Alfie's next upgrade:  the Jetson Orin NX 16GB.

It's about the size of two Pis stacked on top of each other, and is capable of 100 TOPS with this carrier board.
It'll fit perfectly in the base once I move the buck converter.

Right out of the box it's generating LLM tokens at twice the speed of the Rock 5C with Ollama, which seems... a little slow.
I expected it to be A LOT faster.  18 vs 40 tokens per second isn't bad, but not really impressive for dedicated GPU hardware.

Plus there was a press release stating that these boards could generate tokens so much faster, but they don't say HOW.
I suspect they're using tensorrt-llm to run the models, so that's what I've been working on this weekend.

screenshot_22062025_205909.jpg

Nathan Lewis

Jun 23, 2025, 9:47:42 AM
to RSSC-list
That's awesome! Does that board have a way to connect to the camera inputs on the Orin module?

- Nathan

Alan Timm

Jun 23, 2025, 10:45:10 PM
to RSSC-List
Hey Nate! 

I took a closer look at the carrier board and the expansion board, and there are no camera inputs.  :-(

That's kinda a bummer that they didn't make the cut.

They very recently released a slightly larger version of the carrier board that supports the full Super MAXN modes.  This one includes 4x CSI camera connectors.

Alan

Alan Timm

Jun 24, 2025, 12:22:10 AM
to RSSC-List
Oof, what an adventure, here's how to run accelerated llms on the jetson (and also on nvidia gpus)

tldr; 
mlc_llm chat HF://mlc-ai/Qwen3-1.7B-q4f16_1-MLC --device cuda
or...
mlc_llm serve HF://mlc-ai/Qwen3-1.7B-q4f16_1-MLC --device cuda

Firstly, in order to maintain sanity with the fast pace of changes in all the AI stuff, there's a meta-package called jetson-containers that dockerizes most of the things you'd want to do on the board.  Super handy if you're running Jetson hardware.

Secondly, from their press release I figured out that they're running LLMs under MLC-LLM, which compiles language models into something that can run much faster than Ollama or Hugging Face Transformers.

So here's the final stats for Qwen3-0.6B 4bit quantized:
  • Radxa Rock 5C (Ollama):        18 tokens per second
  • Jetson Orin NX 16GB (Ollama):  37 tokens per second
  • Jetson Orin NX 16GB (mlc-llm): 98 tokens per second
That's pretty good.

And here's a few more stats on the Orin for different versions of the model under mlc:
  • Qwen3-0.6B: 98 tokens per second
  • Qwen3-1.7B: 50 tokens per second
  • Qwen3-4B:   22 tokens per second
  • Qwen3-8B:   13 tokens per second

Alan Timm

Jun 24, 2025, 11:39:15 PM
to RSSC-List
Tonight I benchmarked a handful of qwen3 models on my rtx 3090 and on my jetson using ollama and mlc in the background while working on other things.

I'd say that the performance improvement makes moving your SLMs to mlc_llm worthwhile.
I didn't expect there to be diminishing returns on larger models vs smaller models.  That's interesting.
The current game plan is to host the model using mlc_llm serve, then interact with it using langgraph etc.
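Since mlc_llm serve exposes an OpenAI-compatible endpoint, the client side should be as simple as pointing any OpenAI-style client (or langgraph's OpenAI wrapper) at it. A sketch, assuming the default localhost port; use whatever model id the server prints at startup:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="HF://mlc-ai/Qwen3-1.7B-q4f16_1-MLC",  # model id as served
    messages=[{"role": "user", "content": "write a haiku about my third favorite mini slinkie"}],
)
print(resp.choices[0].message.content)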


Proompt:   write a haiku about my third favorite mini slinkie

(Average of 3 runs)
(ollama models are unsloth Q4_0 quantized)
(mlc models are q4f16_1 quantized)

screenshot_24062025_203032.jpg

screenshot_24062025_202959.jpg

Alan Timm

Jul 5, 2025, 8:29:23 PM
to RSSC-List
Ok, so this weekend is the weekend I integrate the new jetson orin nx into Alfie.  (I was getting frustrated bouncing back and forth between the two systems.)

Here's a quick shot of just how small the Jetson modules are (they're only 70mm wide), and another shot of the naked module with the carrier board.  This fits exactly where the previous stack of the Radxa Rock 5C and the 12V buck converter was.
Now I just need to design a new cubbyhole for the new 30A buck converter.

There's just enough IO for everything except for one thing -- there's no GPIO on the board.  Luckily I have a few spare GPIOs on the waveshare driver board so I'll move over the shoulder limit switch to that.
  • USBA - oakd lite depth camera
  • USBA - waveshareboard 1 comms
  • USBC - Seeedstudio mic array
  • USB2.0 header - waveshareboard 2 comms
  • Debug Serial - host communications over usb pigtail
  • Serial header - closed loop stepper ttl serial comms at 115,000baud

On the LLM front... after spending an inordinate amount of time optimizing the qwen3 0.6b model for speed I remembered one of the first things that Seth said and...  the 0.6b model isn't very useful. So I've moved on to the qwen3 1.7b model and am getting ~50 tokens per second with it.

screenshot_05072025_171323.jpg

screenshot_05072025_171357.jpg

Alan Timm

Oct 10, 2025, 3:48:05 PM
to RSSC-List
Oof, it's been a minute, hasn't it.
With Dani's help Alfie has been reassembled and he's been online consistently for the past week or two.

Here's a look at the new electronics bay with that Jetson stuffed in.  It's a tight fit but it works.
screenshot_10102025_124202.jpg

I also reverse engineered the bottom plate and added in vents to help keep everything cool.
screenshot_10102025_124535.jpg

screenshot_10102025_124349.jpg

Alan Timm

Oct 12, 2025, 4:04:50 PM
to RSSC-List
Hey guys, quick update since I wasn't able to stick around for show'n'tell this time.

The code for the waveshare general driver boards is about 80% complete.
VSCode + PlatformIO + FreeRTOS + MicroROS is kinda awesome once you get the hang of it.

At this point I have the boards:
  • generating full 100hz status packets including diagnostic info
  • capturing critical serial bus servo stats for up to 10 servos
  • passing along 9 axis imu info
  • accepting 100hz command packets for the same
I just posted an update to the repo with all of the changes I've been working on.
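On the host side, listening to those streams is plain rclpy. A quick sketch for the IMU feed (the topic name is a guess; the real names are in the repo):

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Imu

class ImuListener(Node):
    def __init__(self):
        super().__init__("imu_listener")
        self.create_subscription(Imu, "/alfie/imu", self.on_imu, 10)  # placeholder topic

    def on_imu(self, msg):
        self.get_logger().info(f"yaw rate: {msg.angular_velocity.z:.3f} rad/s")

rclpy.init()
rclpy.spin(ImuListener())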

And I know I'm repeating myself but "GET YOURSELF SOME GITHUB COPILOT!"
Even the free plan is incredibly useful.

I've been partnering with copilot for everything from code refactors to helping me to understand why my freertos + microros solution wasn't generating update messages at 100hz like I thought it should.
It's like having an infinitely patient subject matter expert looking over your shoulder to jump in and offer advice, explanations, and help when you need it.

Alan

screenshot_12102025_125703.jpg

Alan Timm

Oct 15, 2025, 10:54:03 PM
to RSSC-List
Ok guys and gals I just gotta say...  Pair programming with Github Copilot is MAGICAL!

I've covered more ground in the past few days than I would have been able to over the next month, and that's IF I could have maintained focus long enough to deliver.
(That's questionable, I seem to have the attention span of a hyperactive ferret.)

The general driver board FreeRTOS + micro-ROS firmware is feature complete, and the UI I've developed to program and test the servos is now complete and working perfectly.

Now that this tool is done I can program the offsets for each of the servos for their home position and hard minimum and maximum limits.

Then to the fun stuff.  :-)


screenshot_15102025_194305.jpg

Alan Timm

Oct 24, 2025, 9:04:59 PM
to RSSC-List
Ok, so houston?  we have a little problem...
(Thanks Dani for the video!)

So good news: he's been assembled and the framework code is all done and works 100% but...

But... we have an unforeseen consequence of some of my hardware design choices.
He got himself some crippling jigglies when executing any type of turn.  You can see it at a couple of points in the video.
The faster the turn, the harder he tries to jiggle himself over.  That's... not ideal.

I've tried everything and it appears to be a problem with the skid steer configuration while driving the outside wheels on both sides in combination with a tall bot.

So...  I'm going to try switching to mecanum this weekend and hope for the best.


Wish me luck!

Alan

Chris Albertson

Oct 25, 2025, 3:58:28 PM
to Alan Timm, RSSC-List

On Oct 24, 2025, at 6:04 PM, Alan Timm <gest...@gmail.com> wrote:


I've tried everything and it appears to be a problem with the skid steer configuration while driving the outside wheels on both sides in combination with a tall bot.

Yes, steering with four fixed wheels is geometrically impossible; two of the wheels will have to slide sideways.  Mecanum wheels work, but then the robot can only run on a hard, level surface.  And all four-wheel platforms have the same problem: unless the floor is dead-perfect flat, only three wheels will be touching the floor at any time, unless there is compliance in the structure.  But compliance in such a tall structure means wobbles.  The ideal solution is what cars do, steerable wheels and suspension, but that is complex.  The cheap solution is unpowered casters for the rear wheels.

I think this is all OK if the goal is research into how to control a pair of arms and hands.

In the long run this robot will need a much more robust base with larger wheels.




Alan Timm

Oct 25, 2025, 8:35:48 PM
to RSSC-List
Yep you're not wrong, I was actually ok with the wheel slip for now, it was just the jigglies that caught me off guard.

The mecanum wheels came in today, so I'm going to swap them in and move all 4 motors to 60rpm encoder gearmotors while I'm waiting for the 100rpm ones to arrive next month.

I'm about 50% done designing a Pi Pico mecanum driver board in KiCad with a pair of TB6612FNG 12V dual H-bridge drivers.  That'll replace one of my Waveshare general driver boards and let the Pico do all the twist-message calculations for all 4 wheels at 500Hz, and maybe even take the IMU readings from the other board into account.
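The math the Pico will be running is just the standard mecanum inverse kinematics. A sketch (the wheel radius and the lx/ly half-dimensions below are placeholders, not Alfie's actual measurements):

def twist_to_wheel_speeds(vx, vy, wz, lx=0.10, ly=0.12, wheel_radius=0.04):
    """Return (front_left, front_right, rear_left, rear_right) wheel speeds in rad/s."""
    k = lx + ly
    return (
        (vx - vy - k * wz) / wheel_radius,  # front left
        (vx + vy + k * wz) / wheel_radius,  # front right
        (vx + vy - k * wz) / wheel_radius,  # rear left
        (vx - vy + k * wz) / wheel_radius,  # rear right
    )

# e.g. strafe right at 0.2 m/s while spinning at 0.5 rad/s:
print(twist_to_wheel_speeds(0.0, -0.2, 0.5))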

Believe it or not, I'm actually buddying up with Claude to get the PCB design best practices right for a 4-layer board.


screenshot_25102025_173327.jpg

Alan Timm

Oct 26, 2025, 1:11:20 PM
to RSSC-List
Ok, I'm not sure why it worked, but it did, and now alfie can turn!

After a little bit of hardware finessing alfie has a new pair of shoes.

So now motors with encoders all around, with pid loops and a few other tricks for a pure closed loop velocity drive system.
The base now accepts standard twist messages and translates that into the required velocities for all four wheels at 100hz.

I'll post an updated video later this afternoon with all the fun things he can do now with his new wheels.



screenshot_26102025_100458.jpg

Chris Albertson

Oct 26, 2025, 6:21:47 PM
to Alan Timm, RSSC-List

On Oct 26, 2025, at 10:11 AM, Alan Timm <gest...@gmail.com> wrote:

The base now accepts standard twist messages and translates that into the required velocities for all four wheels at 100hz.


With an omnidirectional base you can do even better.  The usual “twist” messages only have X as a velocity; now you can populate the Y field as well.  Not only that: in addition to “twist”, the robot should accept “cmd_pose” messages with the Z (or “yaw”) position populated.

I built a four-legged robot and found I could fully populate both cmd_vel and cmd_pose, although a dog-bot only has a few cm of vertical travel (from squatting or stretching its legs).

It might seem like just a dumb trick, but with cmd_pose you can do things like spin about the z-axis while driving a straight line, and for a kitchen robot this would be useful.  Let's say the sink is across the room from the stove and the robot wants to move from the sink to the stove.  Most efficient is to drive in a straight line while doing a 180-degree spin; I think this is what humans do.  An older differential-drive robot would have to drive a complex path with two arcs, almost as bad as parallel parking a car.

So I think not only did you solve the skidding problem, but you may have seriously simplified motion planning.

OK, maybe not simplified, because now there are infinitely many ways to do every move. I watched my real dog, and she sometimes decides to place the center of rotation between her back legs and sometimes to the left or right of her front shoulders.  Rarely does she spin around her center of gravity unless she is moving fast.

Being tall, your 'bot might want to minimize accelerations in X or Y while minimizing drive time.  If so, it will do a lot of those spin-while-driving-straight moves.

Holonomic bases are a lot like walking, so what you learn will transfer well.





Alan Timm

Oct 26, 2025, 10:20:28 PM
to RSSC-List
Aw yeah baby!  Alfie's new pair of shoes fit him just fine!

Here's a quick video update showing the capabilities of the new mecanum drivetrain.  Hey Chris, I'll take a look at that.  More control options are always better.  I was excited to get the twist messages working last night.  twist + pose sounds even more fun.  :-)

Right now the back wheels are controlled by one board and the front wheels by the other.  Both boards accept velocity commands, which are executed closed-loop by the boards using encoders.
Then a higher-level node sends real-time commands to each of the boards for now.  Eventually I'll finish that Pico driver board, and then the Pico can handle all the calcs at the same time.

Now I'm reworking the back mechanics, moving away from the Stepper motor to a gearmotor with encoder so I can have more strength and control over the shoulder position.

Thomas Messerschmidt

Oct 26, 2025, 11:26:29 PM
to Alan Timm, RSSC-List
Looks great Alan! Very smooth. 


Thomas Messerschmidt

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

Contact me directly or through LinkedIn:   





Allan Lopez

Oct 27, 2025, 2:25:00 AM
to RSSC-List
Hi everyone,

I am interested in joining the club, but I live far away in Los Angeles. Are you having meetings in person on Saturdays?
I am a beginner and am hoping to learn to build robots!

Thank you,
Allan Lopez

Alan Timm

Oct 27, 2025, 7:07:49 PM
to RSSC-List
Hey Allan,

Alan here.  Nice to meet ya!

We meet the Second Saturday of (almost) every month at Cal State Long Beach.
We also have a really nice hybrid meeting setup, so you're always welcome to join over zoom if you can't make it in person.
Keep an eye out on the forums as well as our website https://rssc.org for details on our upcoming meetups.

See you next month!

Alan

Alan Timm

Oct 29, 2025, 11:02:36 PM
to RSSC-List
I was taking a closer look at the soft gripper variant of XLeRobot, and it got me thinking about what I could use to update the hands for Alfie.
  • soft compliant gripper
  • hand camera (I thought they were on their way out, then I saw them again on the figure robot and xlerobot)
  • force sensors on gripper to estimate grip force (the servos are highly geared, so current draw can't be used)

Here's where I'm exploring the concepts in Onshape (while I'm redesigning the arms and doing a bunch of stuff other than starting on "operation high five").

The compliant grippers are printed in TPU95, so I faithfully recreated their gripper finger design to see how well it works.  They print in TPU, then use grip tape.
There are these adorable 640x480 color camera modules that I'm thinking of placing directly in the middle of the back of the hand, then using a pair of those cheap force sensors to estimate grip strength, if I can get them to work with compliant fingers.

(And I'm trying really hard not to be distracted by that adorable AmazingHand design by Pollen Robotics.)

screenshot_29102025_195654.jpg

Chris Albertson

Oct 29, 2025, 11:58:21 PM
to Alan Timm, RSSC-List
My experiments with two-finger grippers told me they fail because they only make two points of contact, and the object being grasped will rotate.  You need at least three contact points.     In theory, if one finger is curved, it could touch the object at two points, but that only works in special cases.  You can not beat the Yale Hands.

Another experiment I did says that, yes, you CAN measure current to detect force.   For this to work, you need some compliance in the system, and it seems that with the TPU, you have this.   I tried mounting the servo in rubber grommets.   What you need is for the motor to continue to move as it presses harder and not come to a quick and hard stop.   Any rubber in the system does this.       Current works really well as a force sensor.   A typical servo moves in air with very little current, and then as it meets resistance, the current climbs until the little PCB in the servo burns up.   (Yes, you will burn up several servos testing this.  Usually the MOSFETs act like a fuse and blow.) But the rubber in the system makes the current ramp up slower, so the software has time to react.

Failing that, resistive force sensing might work.  I bought a few and was going to use them as ground force sensors in the robot’s feet.   https://shop.ncoa.org/best-medical-alert-systems-nb-3
You do not have to place them over the fingers; they should be embedded in the rubber part.

BTW, I met a person a couple of weeks ago who is using this sensor in a running shoe insole to measure foot contact when people run.   Using this data, they make custom insoles.  It is just a voltage divider.
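The math really is just the divider equation. A sketch with placeholder values (3.3V supply, 10k fixed resistor):

VCC = 3.3        # supply voltage
R_FIXED = 10000  # fixed resistor to ground, ohms

def fsr_resistance(v_out):
    # Vout = VCC * R_FIXED / (R_FIXED + R_fsr), solved for R_fsr
    return R_FIXED * (VCC - v_out) / v_out

print(fsr_resistance(1.1))  # ~20k ohms; resistance drops as force on the FSR goes up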


I decided to keep it simple: Make a hole, more like a tunnel, in the rubber part and put a light through it.   When the rubber bends, the tunnel bends and blocks the light.




Alan Timm

Nov 2, 2025, 12:10:31 PM
to RSSC-List
Ok, let's talk about hands.

While I keep finding excuses to work on hardware instead of, you know, actually getting Alfie to do anything, I'm looking at what his hands should be.

tldr; I'm seriously considering adapting the Pollen Robotics AmazingHand design for Alfie while incorporating a wrist camera and force feedback.
In theory you'd have better grip and capability, but I don't know that the RL training and data capture for it is ready to do it right.

The current grippers would work ok, as long as i added grip tape or something.
screenshot_02112025_085633.jpg

I've started looking at doing a compliant TPU-style gripper like XLeRobot's, which would also work pretty well.
screenshot_02112025_085529.jpg

But...  Man those AmazingHands look nice.  I hacked a rough draft of a smaller version with 3 shorter fingers and a smaller palm, and it looks like it would be a good fit for Alfie.
screenshot_02112025_085819.jpg


Alfie v2 arm draft with 3:1 reduction and 3 finger amazing hand
screenshot_02112025_085856.jpg

Chris Albertson

Nov 2, 2025, 2:06:03 PM
to Alan Timm, RSSC-List


On Nov 2, 2025, at 9:10 AM, Alan Timm <gest...@gmail.com> wrote:

Ok, let's talk about hands.



These are the best, I’ve found.     I actually built two others and I can eliminate both.

1) A pincher hand.  The problem with two fingers is that the object rotates unless your software knows how to find the object’s center of mass and can place the contact points so that a line between them passes through the center of mass.   Without this, the object will rotate when it is lifted.

2) A humanoid hand.   The “Brunel hand” was open source and is pretty good and looks good. It was designed as a prosthetic device, but it is hard to control unless you have a human brain.  Amputees seem to be able to use it, but I’ve yet to see a robot do as well with it.

The above are the limiting cases, the simplest and the most human-like.  Neither works well for robots.    I think what is needed is a middle ground, and I think the folks at Yale have a family of mid-ground hands.   You can pick one, and then they have options.  It is all open source and 3D printable.





With two points of contact, what you must calculate is rotational torque.  Ideally, this is zero (with zero moment-arm length).   Then you estimate the force and the friction and see if they are good enough to get lift and stop rotation.    Pinchers are bad because the answer is “no” so many times.    But if the task is just software development, then you can lift Styrofoam blocks with pinchers.      The hard tasks are (1) empty soda cans versus full cans, getting the hand to do both, (2) picking a dime up off the table, a classically hard task, and (3) pouring milk from a jug, because you need to rotate a jug that is VERY off-center.

Yale's hands are compliant; they wrap around the object and create a geometric lock.  They don’t depend on friction.    They work like human hands even if they do not look human.

Alan Timm

Nov 2, 2025, 11:08:45 PM
to RSSC-List
Aw man, so many updates this weekend.  Thanks again Dani for your help on Friday!  Replacing the stepper drive for the back with a gearmotor solution is working out pretty well.

screenshot_02112025_200610.jpgscreenshot_02112025_200746.jpg

Among other things I worked out the remaining oopsies with servo control, and whipped up a quick left/right arm slave program to test everything out.  I need to tune the acceleration and max speed values but so far so good.

Alan Timm

Nov 9, 2025, 4:07:44 PM
to RSSC-List
Quick update.  The 110rpm mecanum drive motors are installed, and after updating the control PIDs and doing some tuning exploration with Claude I think we're at a good starting point.

He scoots faster now, but because of the increased speed there's increased slippage on the flooring.  And for some reason he's not rotating around the center of the base anymore, which I think is new.
All 4 wheels are independently controlled with closed-loop PID controllers and active feedback from 22 PPR Hall-effect encoders. They accept velocity values in meters per second.
They in turn receive their orders from a higher controller that accepts twist commands and converts them to individual wheel speeds.
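For reference, the tick-to-velocity math each wheel loop runs is simple. A sketch with placeholder numbers (the gear ratio and wheel diameter are made up; it also assumes single-edge counting, so multiply the PPR by 4 if you count quadrature edges):

import math

ENCODER_PPR = 22       # pulses per motor revolution
GEAR_RATIO = 56.0      # placeholder motor-rev : wheel-rev ratio
WHEEL_DIAMETER = 0.08  # meters, placeholder

def wheel_speed_mps(ticks, dt):
    wheel_revs = ticks / (ENCODER_PPR * GEAR_RATIO)
    return wheel_revs * math.pi * WHEEL_DIAMETER / dt

print(wheel_speed_mps(ticks=23, dt=0.01))  # ~0.47 m/s, roughly what a 110rpm wheel gives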

Alan Timm

Nov 29, 2025, 12:25:45 AM
to RSSC-List
So it's been a busy week.  After seeing that all the cool kids are programming their humanoids with vr headsets, a quick impulse buy later... and after a few hours of coding (thanks Dani!) I have the head and main camera hooked up.

The first step is to integrate the headset so alfie looks where you look and you can see what alfie sees.

The XLeRobot folks have a lot of code ready that makes this integration a lot easier than it would have been otherwise.

screenshot_28112025_212247.jpg