Hi all,
I've been working on Elsabot again over the last few months. That effort has included upgrading her to use a Jetson AGX Orin. (I grabbed one when they went on sale last Christmas.) I've replaced the cloud-based STT and TTS with local processing, and have started integrating an LLM for chat and control.
I started with chat functionality since it was most likely to amuse the grandkids (which it did), and am now starting to work on robot control.
For an amusing chat demo, see this YouTube video: https://youtube.com/shorts/3bK86ZycUxE?feature=share. (It’s amazing how much sass you get from the model when you add “you are sassy and sarcastic” to the system prompt.)
The video demonstrates the STT/TTS/wake-word integration, and also shows the model making tool calls to request the current time and to request a video frame along with a VLM analysis of that frame. It is using the Gemma4 26B model for both chat and VLM.
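To make the tool-call mechanism concrete, here is a rough, self-contained sketch (not the actual Elsabot code; the tool name, schema style, and dispatcher are assumptions) of how a "current time" tool could be declared in the common OpenAI-style function-calling format and dispatched when the model requests it:

```python
import datetime
import json

# Hypothetical tool schema in the OpenAI-style function-calling format.
# The model is shown this description and may reply with a tool-call
# request naming "get_current_time" with (empty) JSON arguments.
GET_TIME_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Return the current local time as an ISO-8601 string.",
        "parameters": {"type": "object", "properties": {}},
    },
}

def get_current_time() -> str:
    """Tool implementation: return the current local time."""
    return datetime.datetime.now().isoformat(timespec="seconds")

# Map tool names to implementations so a model's tool-call request
# (name + JSON-encoded arguments) can be dispatched.
TOOLS = {"get_current_time": get_current_time}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool with the model-supplied arguments and
    return its result as a string to feed back into the chat."""
    args = json.loads(arguments_json or "{}")
    return str(TOOLS[name](**args))
```

The result string is then appended to the conversation as a tool message, so the model can phrase its spoken answer around it.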
For the LLM integration I created a behavior tree node that implements a model session, which allows the AI functionality to be connected to the overall robot functionality. The node also supports implementing the model's tool calls as subtrees; that mechanism will be leveraged to implement navigation and other functionality.
If interested, see https://github.com/rshorton for the various repos (elsabot_robot, elsabot_bt, elsabot_speech_input, elsabot_audio_output, jetson_support).
Scott