https://www.kunalganglani.com/blog/self-hosted-voice-assistant-home-assistant-2026-guide
So it seems you can now do the basic structured commands (the kind Alexa
used to use) on a Pi 4 with 4GB if you don't mind it taking a few
seconds, or on a low-end CPU with 16GB to be comfortably quick. I think it's
OpenAI's Whisper speech-to-text that's made the lower-end hardware
viable - this part used to need the sort of hardware that now makes the
local LLM possible (unless I misunderstood when I last looked into it!).
I might have to give this a go...