” I can identify and classify thousands of object types in uploaded photos, common categories include:
- People (faces, age/gender estimates, activities)
- Animals (species, breeds)
- Plants (trees, flowers, leaves)
- Food (types of dishes, ingredients)
- Text (printed/handwritten, languages)
- Vehicles (cars, planes, bikes)
- Buildings (types, landmarks)
- Everyday objects (furniture, tools, electronics)
- Clothing (styles, colors, accessories)
- Signs and labels (road signs, logos, warnings)”
On Apr 18, 2025, at 7:18 PM, Gmail <thomas...@gmail.com> wrote:Alan,ChatGPT 4o says,” I can identify and classify thousands of object types in uploaded photos, common categories include:
- People (faces, age/gender estimates, activities)
- Animals (species, breeds)
- Plants (trees, flowers, leaves)
- Food (types of dishes, ingredients)
- Text (printed/handwritten, languages)
- Vehicles (cars, planes, bikes)
- Buildings (types, landmarks)
- Everyday objects (furniture, tools, electronics)
- Clothing (styles, colors, accessories)
- Signs and labels (road signs, logos, warnings)”
Can you recommend a similar (free) on-device image classification model? I mean something more like chatgpt and less like YOLO. I am ok if it requires a standard or even a gaming laptop with a high end GPU.
![]() | |
On Apr 19, 2025, at 10:28 AM, Alan Timm <gest...@gmail.com> wrote:
Here's the result of passing in the attached image and asking "What's in the image?" on my Radxa Rock 5C, 15GB ram 8 core sbc @ 1.8Ghz
The round trip time was almost 2 minutes. So not fast, but maybe useful?
>>> what is in /home/alfie/Pictures/homeoffice.jpg
Added image '/home/alfie/Pictures/homeoffice.jpg'
The image shows an old school desktop computer setup with a yellow plastic chair in front of it. The laptop
screen displays "03:58" and the mouse is black. There are two mugs next to the keyboard - one is green and
the other is white. On the desk, there is also a potted plant with green leaves.
total duration: 1m57.419420595s
load duration: 4.535755612s
prompt eval count: 716 token(s)
prompt eval duration: 1m38.395394584s
prompt eval rate: 7.28 tokens/s
eval count: 73 token(s)
eval duration: 14.425655452s
eval rate: 5.06 tokens/s
<homeoffice.jpg>
--
You received this message because you are subscribed to the Google Groups "RSSC-List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rssc-list+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/rssc-list/9afb46ba-07e8-49fc-a4f2-56cfe9083706n%40googlegroups.com.
<homeoffice.jpg>
Hey there!I'm getting closer to (re)assembling alfie. The 12 20a buck converter is working well, although I think it's time to shorten a whole bunch of cables so that everything fits (it doesn't right yet).Also I've fallen into a bit of a rabbit hole wrt on-board processing. I rage-quit my indiedroid nova SBC and have moved on to the Radxa Rock 5C with 16gb ram.There are some compelling options for on-device speech synthesis, speech recognition?!, and large/small language models?! It's crazy that you can run these on a raspberry pi sized device.
- piper-tts is streaming natural sounding speech with about a 1 second delay
- faster-whisper for faster than real-time speech recognition
- qwen2.5 models in 0.5b, 1.5b, 3b variants for ai agents
- deepseek-r1:1.5b reasoning model through ollama
I think? the qwen models are capable of tool use, but you can run several combinations of these on an 8gb ram sbc, and the whole stack with room to spare on a 16gb device.Here's a sample of libretts_r_medium voice 4 (there's 903 total voices available) linked in the message.
Hi Alan,
I’m using Porcupine Wake Word by Pico Voice. It runs locally on your machine and is free for non-commercial projects. You can create one wake word per month. Sign up and click the non-commercial option, and agree not to aspire to make any money with it (at least while using their tech!)
https://picovoice.ai/platform/porcupine/
https://picovoice.ai/docs/quick-start/porcupine-python/
You can see my example code utilizing two wake words:
This is a simple test which only requires pvporcupine and pyaudio and your wake word ppn file you get from picovoice:
https://github.com/jimdinunzio/big-orange/blob/Python-3.9/python/tests/test_porcupine_wake_word.py
As a career software guy, I’m a big fan of github and development records. All Big Orange code (and my other projects’ code) has been on github since 2020.
https://github.com/jimdinunzio/big-orange/
Jim
--
You received this message because you are subscribed to the Google Groups "RSSC-List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rssc-list+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/rssc-list/a71f83b4-787a-4c19-9dc3-081276560793n%40googlegroups.com.
--You received this message because you are subscribed to the Google Groups "RSSC-List" group.To unsubscribe from this group and stop receiving emails from it, send an email to rssc-list+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/rssc-list/f324092d-6dc3-4f06-9268-09c19daa8611n%40googlegroups.com.