Is it just me, or are you a bit crazy, too?


Michael Wimble

Feb 16, 2026, 12:06:37 AM
to hbrob...@googlegroups.com, rssc...@googlegroups.com
Subject: Multi-Robot ROS Infrastructure - Or How I Learned to Stop Worrying and Automate Everything
Alternate title: Multi-everything ROS for Mere Mortals

Hi everyone,

I wanted to share a project I'm working on because I suspect many of you face the same frustrations I do. If you're managing even ONE robot with multiple computers, you know the pain. If you're thinking about building a SECOND robot that shares some components with the first... well, let me tell you about my nightmare.

THE PROBLEM(S)

I have a robot named Sigyn. She has multiple computers:
- An AMD desktop running navigation and planning
- A Raspberry Pi 5 handling gripper vision and grasper control
- Three custom boards with Teensy 4.1 MCUs managing sensors and controllers
- An OAK-D camera running on-device AI, including object detection, and publishing color and depth images

Each computer needs different ROS packages. Each has different installed libraries. Each needs to know about all the OTHERS for networking. And here's where it gets messy:

Every time I add a computer or change an IP address, I'm manually updating:
- /etc/hosts on ALL machines (so they can find each other)
- ~/.ssh/config on ALL machines (so I can ssh between them)
- The package lists for each machine (what should even BE on that Pi?)
- Environment variables (RMW_IMPLEMENTATION, DDS XML profile configuration, workspace paths, etc.)
- My own sanity (rapidly depleting)

And God forbid I rebuild a machine from scratch. I have... notes? Somewhere? Did I write down what packages go on the vision Pi? Was it this version of OpenCV or that one?

Now I'm building a second robot - Titania - and I want to reuse some Sigyn packages but not all. Some computers will run the same code. Some won't. The combinatorial explosion of "what goes where" is making my head hurt.

THE BREAKING POINT

I realized I was spending too much time on CONFIGURATION and synchronization instead of ROBOTICS. I was afraid to change anything because I'd have to remember to update it everywhere. I had the same IP addresses hardcoded in multiple places. My bash aliases were inconsistent across machines. Half my packages were in one giant repo and I couldn't figure out which ones were actually dependencies of which.

Enough. There has to be a better way.

THE VISION

I'm building a new management system (working name: Sigyn2, because creative naming is hard). The core idea is simple: YAML configuration files that describe EVERYTHING, and automation that makes it all happen.

Here's what I'm working toward, roughly in this order:

PHASE 1: NETWORK SANITY
- Define all robots and their computers in one place (robots.yaml)
- Define all network info in one place (network.yaml)
- Have a script automatically update /etc/hosts and ~/.ssh/config everywhere
- Never manually edit hosts files again (this alone would be worth it)
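To make Phase 1 concrete, here's a minimal sketch of the hosts-file generation. The hostnames and IPs below are made up for illustration, and in the real system the dict would be loaded from network.yaml rather than hardcoded — but the core trick is rendering a marker-delimited block that a script can safely splice into /etc/hosts on every machine:

```python
# Sketch of Phase 1: one source of truth for the fleet's hosts,
# rendered as a managed block for /etc/hosts. All names/IPs are
# placeholders; the real data would come from network.yaml.
NETWORK = {
    "sigyn": {
        "amd_main":   "192.168.1.10",
        "gripper_pi": "192.168.1.11",
    },
    "titania": {
        "main":       "192.168.1.20",
    },
}

def render_hosts_block(network: dict) -> str:
    """Render a marker-delimited block suitable for splicing into /etc/hosts."""
    lines = ["# --- BEGIN robot-fleet (generated, do not edit) ---"]
    for robot, computers in sorted(network.items()):
        for name, ip in sorted(computers.items()):
            lines.append(f"{ip}\t{robot}-{name}")
    lines.append("# --- END robot-fleet ---")
    return "\n".join(lines)

print(render_hosts_block(NETWORK))
```

The BEGIN/END markers matter: the update script can delete and rewrite only its own block, leaving whatever else lives in /etc/hosts untouched.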

PHASE 2: PACKAGE ORGANIZATION
- Split my monolithic repo into separate packages (description, navigation, vision, etc.)
- Define which packages go on which computer types (packages.yaml)
- Use vcstool (.repos files) to manage the multi-repo madness
- Each machine pulls only what IT needs, not everything
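For anyone who hasn't used vcstool: a .repos file is just YAML listing which repos make up a workspace, and `vcs import src < machine.repos` pulls them all. The repo names and URL below are placeholders, not my actual repos, but the format itself is vcstool's real one:

```yaml
repositories:
  sigyn_description:
    type: git
    url: https://github.com/example/sigyn_description.git
    version: main
  sigyn_navigation:
    type: git
    url: https://github.com/example/sigyn_navigation.git
    version: main
```

The plan is to generate a per-machine .repos file from packages.yaml, so the vision Pi's workspace only ever contains vision packages.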

PHASE 3: ROLE-BASED CONFIGURATION
- Define computer types (amd_x86, pi_5, jetson_orin_nano) with their capabilities
- Each robot component has a ROLE (main_controller, gripper_vision, ai_processor)
- Automatically generate environment files per-machine
- Bash aliases that make sense for each role (navigation computer gets nav aliases, vision computer gets vision aliases)
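Here's a rough sketch of what a role definition might look like — the schema, package names, and aliases are all hypothetical at this point, but it shows the shape of the idea:

```yaml
# Hypothetical roles.yaml sketch: a role bundles hardware type,
# the repos that belong on it, and the aliases that make sense there.
roles:
  main_controller:
    hardware: amd_x86
    repos: [sigyn_description, sigyn_navigation]
    aliases:
      nav: "ros2 launch sigyn_navigation nav.launch.py"
  gripper_vision:
    hardware: pi_5
    repos: [sigyn_description, sigyn_vision]
    aliases:
      cam: "ros2 launch sigyn_vision gripper_camera.launch.py"
```

From this, the automation would emit a per-machine environment file and a per-machine bash aliases file, so SSHing anywhere gives you exactly the shortcuts that role needs.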

PHASE 4: DEPENDENCY TRACKING
- Know what system packages each computer type needs
- Track which ROS packages depend on what
- Version tracking so I know when configurations drift
- "Did I update Sigyn but forget Titania?" becomes detectable
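A minimal sketch of how the drift detection could work — the package names and versions below are invented, but the idea is that each machine's manifest reduces to a short fingerprint, so "did I forget Titania?" becomes a string comparison instead of an archaeology project:

```python
# Sketch of Phase 4: fingerprint what a machine SHOULD have vs. what
# it reports. Package names/versions here are hypothetical examples.
import hashlib

def manifest_fingerprint(packages: dict) -> str:
    """Stable short hash over package->version pairs, order-independent."""
    canonical = "\n".join(f"{name}={ver}" for name, ver in sorted(packages.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

expected = {"ros-jazzy-navigation2": "1.3.1", "opencv": "4.9.0"}
reported = {"ros-jazzy-navigation2": "1.3.1", "opencv": "4.8.0"}  # drifted

if manifest_fingerprint(expected) != manifest_fingerprint(reported):
    drifted = sorted(k for k in expected if expected.get(k) != reported.get(k))
    print(f"DRIFT on: {drifted}")
```

Sorting before hashing is the important bit: two machines that agree on content always agree on fingerprint, regardless of install order.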

PHASE 5: MULTI-ROBOT SUPPORT
- Each robot gets its own ROS_DOMAIN_ID (so they don't interfere)
- But they can SHARE package code
- Titania can use sigyn_description if she has the same base
- But titania_navigation can be totally different
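Roughly what I'm picturing for robots.yaml — the domain IDs and repo lists are made up for illustration, but this is the shape of "share the base, not the nav stack":

```yaml
# Hypothetical robots.yaml sketch: per-robot DDS domain IDs plus
# per-robot repo lists that can overlap where it makes sense.
robots:
  sigyn:
    ros_domain_id: 17
    repos: [sigyn_description, sigyn_navigation]
  titania:
    ros_domain_id: 42
    repos: [sigyn_description, titania_navigation]  # shares the base, not the nav stack
```

Distinct ROS_DOMAIN_ID values keep the two robots' DDS traffic from interfering, while the shared sigyn_description repo means one URDF fix lands on both.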

THE PAIN I'M SOLVING

This is really about reducing cognitive load. I'm tired of:
- Remembering which packages go where
- Manually keeping files in sync across machines
- Being afraid to change things because I'll forget to update something
- Starting a new robot and thinking "oh god, here we go again"
- Rebuilding a machine and spending two days getting it back to working state
- Having different workspace paths on different machines because I wasn't consistent
- SSH'ing to a machine and having none of my usual aliases available
- Wondering why the robot isn't working, only to discover one computer is on the wrong ROS_DOMAIN_ID

I want to run ONE COMMAND on a new computer and have it:
- Know what robot it's part of
- Know its role in that robot
- Pull the right packages
- Configure its network
- Set up proper environment variables
- Give me the right aliases
- Just... WORK

CURRENT STATE

I'm early in this journey. I've got the YAML configurations designed and the core automation script working for network/aliases. I'm in the middle of splitting up my monolithic repo. I just successfully migrated my first package (the robot URDF) to a standalone repo, and watching vcstool pull it into a fresh workspace and build it was... beautiful. It actually worked.

The GitHub repo won't be public for a few weeks - I have a LOT of repo splitting and cleanup to do. But I wanted to share this now because:

1. Maybe you're facing these same problems
2. Maybe you've already solved them (PLEASE TELL ME)
3. Maybe we can learn from each other's approaches
4. I need to know if I'm solving the wrong problems

THE QUESTION

Am I overthinking this? Is there a standard ROS way to do multi-robot, multi-computer configuration management that I just haven't found? Or are we all suffering in silence, manually editing hosts files like animals?

I'd love to hear:
- How you manage multi-computer robots
- How you keep configurations in sync
- How you decide what packages go where
- Whether you think I'm solving real problems or just creating new ones
- If you've seen tools that already do this

Thanks for reading my robot infrastructure therapy session. If nothing else, writing this down has helped me realize I'm NOT crazy for wanting this. I’m crazy for completely different reasons.

Building robots is hard enough. Infrastructure shouldn't be.

Have I mentioned lately that “Everything about robots is hard” (TM) ?

- Mike

P.S. - If you're wondering "why not Docker?" - I tried. I really did. But getting hardware passthrough, USB devices, GPU access, real-time performance, and ROS networking all working across multiple containers on multiple machines was its own special hell. This approach keeps things running native but adds the management layer I need. I haven't given up; for now, Docker sits in my 11th circle of hell.

P.P.S. - Yes, I know about Ansible and other config management tools (psst, I don’t really, but it sounds better if I say I do). But they're designed for server fleets, not robots. I need something that understands ROS workspaces, package dependencies, and the specific insanity of robot systems.
