"hosts" file and ROS2


camp .

Oct 11, 2023, 11:31:43 AM
to HomeBrew Robotics Club
    Last night at the ROS Discussion Group (and the previous week), we went around the horn on my ROS2 nodes and topics coming through on one computer but not on another. We began to nitpick the "hosts" file at Mike Wimble's suggestion.

    Back when I did network administration (if memory serves) you had to have the IP address and name of the computer in the "hosts" file for it to resolve by name. Apparently, with IPv6, this is no longer the case.

    Moreover, specifying the IP address seems to have caused the problem in this case: removing all the IP/name entries from the "hosts" file and rebooting all the systems (to clear the cache) seems to have resolved it.
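
For anyone auditing a hosts file the same way, here is a minimal Python sketch (not from the discussion; the sample file contents and the `feather6.lan` alias are made up for illustration) that lists the entries pinning a name to a non-loopback address. Those are the entries that can go stale when DHCP hands out a different address.

```python
import ipaddress

# Hypothetical hosts-file contents, used only for illustration.
SAMPLE = """\
127.0.0.1   localhost
127.0.1.1   workstation
# robot entries added by hand
192.168.0.178  feather6 feather6.lan
ff02::1     ip6-allnodes
"""

def pinned_entries(hosts_text):
    """Return [(ip, name), ...] for names pinned to a non-loopback, non-multicast address."""
    pinned = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        ip_text, *names = line.split()
        try:
            ip = ipaddress.ip_address(ip_text)
        except ValueError:
            continue                            # not an address line; skip it
        if ip.is_loopback or ip.is_multicast:
            continue                            # standard system entries
        pinned.extend((ip_text, name) for name in names)
    return pinned

if __name__ == "__main__":
    for ip_text, name in pinned_entries(SAMPLE):
        print(f"{name} is pinned to {ip_text}")
```

To check a real machine, pass the contents of /etc/hosts instead of `SAMPLE` and compare each pinned address with what the device currently reports.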

Thanks,
Camp

Mark Johnston

Oct 11, 2023, 5:35:26 PM
to HomeBrew Robotics Club
Ok, good info.

What I am aware of that is new to ROS2 is that it has its own full discovery of all nodes on the same ROS domain. So all the nodes you want to 'see' each other have the same ROS_DOMAIN_ID, and they will all discover each other and then be able to communicate over ROS topics.

I was not aware that the hosts file could foul that up.  


Mark

Mark Rose

Oct 11, 2023, 6:00:20 PM
to hbrob...@googlegroups.com
Are you using DHCP or static IP addresses for your equipment? If you're using DHCP, the problem you may have had with /etc/hosts is that you configured a name-to-IP mapping that was correct at one point, but became wrong later. On my home router, for example, I usually get the same IP address every time I connect my laptop, but that is not guaranteed. It's possible you had old IP addresses in your /etc/hosts.

The details of ROS2/DDS auto-discovery depend on the underlying DDS implementation, of course. We use RTI, rather than the default Fast DDS, so YMMV. RTI uses UDP broadcast, by default, to auto-discover things within the same ROS domain. Hostnames are not used at all in that case, so I'm kind of surprised that a bad /etc/hosts file messed this up.
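
When discovery fails between machines, it can help to know which UDP ports a given ROS domain uses so a firewall or packet capture can be checked. The sketch below is an illustration, not something from this thread: it applies the default port-mapping constants from the DDS-RTPS specification (PB=7400, DG=250, PG=2, offsets 0/10/1/11). Individual DDS vendors allow these to be overridden, so treat the results as defaults only. The function names are hypothetical.

```python
import os

# Default port-mapping constants from the DDS-RTPS specification.
PB, DG, PG = 7400, 250, 2          # port base, domain gain, participant gain
D0, D1, D2, D3 = 0, 10, 1, 11      # per-traffic-type offsets

def current_domain_id(environ=os.environ):
    """Return ROS_DOMAIN_ID as an int; ROS 2 uses 0 when it is unset."""
    return int(environ.get("ROS_DOMAIN_ID") or "0")

def rtps_ports(domain_id, participant_id=0):
    """Return the default UDP ports for a domain and participant index."""
    base = PB + DG * domain_id
    return {
        "discovery_multicast": base + D0,
        "discovery_unicast": base + D1 + PG * participant_id,
        "user_multicast": base + D2,
        "user_unicast": base + D3 + PG * participant_id,
    }

if __name__ == "__main__":
    domain = current_domain_id()
    for kind, port in rtps_ports(domain).items():
        print(f"domain {domain}: {kind} -> UDP {port}")
```

Two machines with different ROS_DOMAIN_ID values end up on different port ranges, which is one reason they never see each other.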

Mark

--
You received this message because you are subscribed to the Google Groups "HomeBrew Robotics Club" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hbrobotics+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hbrobotics/c0c39a5b-daeb-43c6-88b1-fb521574d521n%40googlegroups.com.


camp .

Oct 11, 2023, 6:02:27 PM
to hbrob...@googlegroups.com
Thanks, Mark,
    I do use DHCP but the IP addresses were correct at the time.

Thanks,
Camp

Ralph Hipps

Oct 12, 2023, 2:49:30 PM
to HomeBrew Robotics Club
DHCP will give you whatever IP address it has handy, no guarantees of repeatability.

That's old news, back from the IPv4 days, not new in IPv6.

Steve " 'dillo" Okay

Oct 13, 2023, 2:00:40 PM
to HomeBrew Robotics Club
I think it depends on what your default host resolution scheme is. The reason it used to work (and probably breaks now) is that a populated /etc/hosts will short-circuit all other address<->name resolution mechanisms. Since systemd came on the scene, there's been a proliferation of ways to set up and run address assignment & hostname resolution. This also hasn't been helped by the fact that Ubuntu seems to change the default scheme for general network management with every new LTS.
Remember, most systems/daemons are looking for *an* answer and often aren't smart enough to tell a right one from a wrong one.
That's for a human to debug :)
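
On glibc-based Linux systems, the order in which name sources are consulted is set by the `hosts:` line in /etc/nsswitch.conf, which is where the "short-circuit" behavior comes from when `files` is listed first. The following is a small Python sketch (an illustration added here; the sample line is a typical one, not taken from anyone's machine in this thread) that extracts that order.

```python
import re

# A typical hosts line, used as a stand-in for the real file.
SAMPLE_NSSWITCH = """\
passwd:         files systemd
hosts:          files mdns4_minimal [NOTFOUND=return] dns
"""

def hosts_lookup_order(nsswitch_text):
    """Return the name-resolution sources in the order they are consulted."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove action items such as [NOTFOUND=return]; keep source names.
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []

if __name__ == "__main__":
    print(hosts_lookup_order(SAMPLE_NSSWITCH))
```

If `files` comes first, any matching /etc/hosts entry wins before mDNS or DNS is ever asked, whether or not the entry is still correct.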

'dillo

Steve " 'dillo" Okay

Oct 13, 2023, 2:14:58 PM
to HomeBrew Robotics Club
On Wednesday, October 11, 2023 at 3:00:20 PM UTC-7 Mark Rose wrote:
> Are you using DHCP or static IP addresses for your equipment? If you're using DHCP, the problem you may have had with /etc/hosts is that you configured a name-to-IP mapping that was correct at one point, but became wrong later. On my home router, for example, I usually get the same IP address every time I connect my laptop, but that is not guaranteed. It's possible you had old IP addresses in your /etc/hosts.

Just to confusculate things even further, it's possible to reserve IP addresses for a given MAC address in your DHCP server config file OR to specify that there's an infinite lifetime for addresses handed out by the server. This is a common tactic to make sure that things like a printer/file server/scanner/other device always get the same address, while laptops & desktops, which don't need fixed addresses, get random ones.

Tenacity has a small LAN onboard because I've moved from Arduino & rosserial to Teensy w/ Ethernet over the past year or so. I've disabled all the host resolution on these systems except for the Ethernet on the diagnostic port. That only comes up when I need to plug it into a switch to update/install packages or plug my laptop into the robot. This means that all the hosts come up ASAP without stepping through whether to consult systemd-resolved or netplan or dnsmasq or whatever. I flip the switch and the robot is up and running in a minute or two, tops.

'dillo

Chris Albertson

Oct 16, 2023, 2:58:09 PM
to hbrob...@googlegroups.com
In 2023, no one should be editing the /etc/hosts file. The problem is that if you have “n” computers on your network, you need to keep n files all in sync every time you make a change. And WORSE, if you use DHCP to assign IP addresses, the DHCP server has to have its “reservations list” in sync with all the hosts files. It is labor-intensive and error-prone.

The best practice for a small, local network is to use DHCP to distribute IP addresses and DNS to resolve them. This is easy to set up with most home routers.

Marco Walther

Oct 16, 2023, 3:15:13 PM
to hbrob...@googlegroups.com, Chris Albertson
On 10/16/23 11:57, Chris Albertson wrote:
> In 2023, no one should be editing the /etc/hosts file.    The problem is
> that if you have “n” computers on your networks you need to keep n files
> all in sync every time you make a change.  and WORSE, if you use DHCP to
> assign IP addresses the DHCP server has to have its “reservations list”
> in sync with all the hosts files.   It is labor-intensive and error-prone.

So true;-)

>
> The best practice for a small, local network is to use DHCP to
> distribute IP addresses and DNS to resolve them.   This is easy to set
> up with most home routers.

DHCP yes;-) DNS (at least the normal DNS), not so much. I prefer
ZeroConf/avahi, which is distributed but behaves very similarly to DNS
when you try to resolve names.

avahi 1034 1 0 07:48 ? 00:00:04 avahi-daemon: running [feather6.local]
avahi 1041 1034 0 07:48 ? 00:00:00 avahi-daemon: chroot helper

exists on all Linux systems (and the protocol is also supported by Apple
and, I hope, Windows ;-). Basically, each host publishes its own name-to-IP
translation, regardless of 'where the IP came from'. That helps when you
switch APs (or, like many of my robot Pis, switch between WiFi client and
AP mode as needed).


A log from a remote HomeAssistant Pi/container

Welcome to the Home Assistant command line.

System information
IPv4 addresses for end0: 192.168.0.155/24
IPv6 addresses for end0: fdd7:7eaf:bc14:0:ead2:ed33:910b:ae91/64, 2002::501a:5f5f:ecdc:742f/64, fe80::4dfe:23c5:48d0:2e8f/64
IPv4 addresses for wlan0:

OS Version: Home Assistant OS 11.0
Home Assistant Core: 2023.10.3

Home Assistant URL: http://homeassistant.local:8123
Observer URL: http://homeassistant.local:4357
[core-ssh ~]$ nslookup feather6.local
Server: 127.0.0.11
Address: 127.0.0.11#53

Name: feather6.local
Address: 192.168.0.178

[core-ssh ~]$ dig feather6.local

; <<>> DiG 9.18.13 <<>> feather6.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26281
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: fa70ef624363c681 (echoed)
;; QUESTION SECTION:
;feather6.local. IN A

;; ANSWER SECTION:
feather6.local. 3 IN A 192.168.0.178

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
;; WHEN: Mon Oct 16 12:10:49 PDT 2023
;; MSG SIZE rcvd: 71

[core-ssh ~]$ nslookup google.com
Server: 127.0.0.11
Address: 127.0.0.11#53

Non-authoritative answer:
Name: google.com
Address: 142.251.46.238
Name: google.com
Address: 2607:f8b0:4005:813::200e
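
Note that `dig` and `nslookup` talk to a DNS server directly, which is why `dig` warns about a leaked mDNS query above. Applications normally resolve names through the system resolver instead, which consults /etc/hosts, mDNS, and DNS in whatever order the machine is configured for. A minimal Python sketch of that kind of lookup follows (added for illustration; the `resolve` helper is a made-up name and `feather6.local` is simply the host from the log above).

```python
import socket

def resolve(name):
    """Return the sorted, de-duplicated addresses the system resolver gives for name."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []                       # name could not be resolved
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    for host in ("feather6.local", "localhost"):
        print(host, "->", resolve(host) or "no answer")
```

An empty result for a `.local` name on one machine but not another usually points at the resolver configuration on that machine rather than at the network.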


-- Marco


David Murphy

Oct 16, 2023, 8:51:43 PM
to hbrob...@googlegroups.com, Chris Albertson
On this topic of avahi-daemon,
I have behavior on my network that perhaps someone knows how to resolve?

It seems to be fairly routine that (Linux) devices on the network - whether wired or wireless - will end up having an incrementing digit attached to the end of their hostname.
This occurs over time.
For example, a host on startup might have a hostname of
Rpi3cam1.local
But at some random point in the future will be 
Rpi3cam1-4.local

Executing
service avahi-daemon restart
will set this back to the ‘proper’ name.

I’ve attempted to research this on Google/Stack Overflow etc. a number of times.
Consensus seems to be:
- it’s a bug in avahi
- It’s a race condition where the daemon incorrectly thinks the original host name is in use and appends an integer to get a unique name
- and there is no reliable fix.

If someone knows otherwise it would be helpful. It’s a pain in the ass when I have to search for what suffix it got to, login, restart the daemon and go back to kick/restart all the things that lost contact expecting the original host name.
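
This does not fix the underlying problem, but a watchdog script could at least detect the rename. Below is a minimal Python sketch (an illustration only; `split_conflict_suffix` is a made-up helper) that recognizes a trailing `-N` on the host part of an mDNS name and recovers the base name. It assumes the original hostname does not itself end in a hyphen followed by digits.

```python
import re

# Matches a host label that ends in "-<digits>", e.g. "Rpi3cam1-4".
_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<n>\d+)$")

def split_conflict_suffix(mdns_name):
    """Return (base_name, n) if the host label ends in -n, else (mdns_name, None)."""
    host, dot, domain = mdns_name.partition(".")
    match = _SUFFIX.match(host)
    if not match:
        return mdns_name, None
    return match.group("base") + dot + domain, int(match.group("n"))

if __name__ == "__main__":
    for name in ("Rpi3cam1.local", "Rpi3cam1-4.local"):
        print(name, "->", split_conflict_suffix(name))
```

A periodic job could compare the currently advertised name against the expected one and restart the daemon only when a suffix appears.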



Mark Johnston

Oct 17, 2023, 3:19:22 PM
to hbrob...@googlegroups.com
Yeah, Dillo. The constantly changing methods kind of piss me off. I learn one, then the next, and then I'm no longer sure whether the one I'm changing with console commands and manual edits will take without a ridiculous 2 or 3 tries with reboots. Silly and irritating, frankly.


Alan Federman

Oct 17, 2023, 6:58:04 PM
to hbrob...@googlegroups.com, Mark Johnston
In ROS 1 I have a two-liner at the end of my .bashrc that sets ROS_IP to my workstation's IP address. I do the same thing on the robot, but in the /etc startup scripts.

The symptom of a misconfigured DHCP setup is being able to list topics but not echo them.

This seemed to be a typical issue in Kinetic, but not so much since Noetic.
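
The actual two-liner is not shown in the post. As one possible way to find the address to put in ROS_IP (a ROS 1 variable; ROS 2 does not use it), here is a hedged Python sketch. It relies on the fact that connecting a UDP socket sends no packets but makes the kernel pick the outgoing interface. The probe address 192.0.2.1 is a reserved documentation address chosen for illustration; on an isolated LAN with no default route, pass the address of another machine on that LAN instead.

```python
import socket

def primary_ipv4(probe_addr="192.0.2.1"):
    """Return the local IPv4 address the kernel would use to reach probe_addr."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((probe_addr, 9))   # UDP connect only selects a route
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"              # no usable route; fall back to loopback
    finally:
        sock.close()

if __name__ == "__main__":
    # Prints a line suitable for eval in a shell startup script.
    print(f"export ROS_IP={primary_ipv4()}")
```

Computing the address at login avoids hard-coding one that DHCP may later change.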

Rafael Skodlar

Oct 18, 2023, 12:05:51 AM
to hbrob...@googlegroups.com, Mark Johnston
One way to distribute hosts and other configuration files is with
etcd (https://etcd.io), a well-supported way to distribute key/value
data.
Alternatively, a bash script using scp or rsync could be created to
distribute hosts and other configuration files.
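
The same idea can be sketched in Python rather than bash. This is an illustration only: the file name `hosts.master`, the `root` login, and the host list are hypothetical, and it assumes rsync over SSH with key-based login already working. It defaults to a dry run that only prints the commands.

```python
import subprocess

def rsync_command(local_path, host, remote_path, user="root"):
    """Build the rsync argument list for copying one file to one host."""
    return ["rsync", "--archive", "--compress",
            local_path, f"{user}@{host}:{remote_path}"]

def push(local_path, hosts, remote_path, dry_run=True):
    """Copy local_path to every host; with dry_run, print the commands only."""
    commands = [rsync_command(local_path, host, remote_path) for host in hosts]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)   # raises if rsync fails
    return commands

if __name__ == "__main__":
    # Hypothetical master file and host list.
    push("hosts.master", ["feather6.local"], "/etc/hosts")
```

Keeping one master copy and pushing it out addresses the "n files in sync" problem raised earlier, though a single DNS or mDNS setup avoids the copies altogether.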

My LAN depends on a local DNS server, which is a virtual machine running
dnsmasq on a small NUC computer.
dnsmasq has all kinds of options to configure it for DNS, DHCP (range
or static based on MAC address), time server, TFTP, boot server, etc. By
default it uses the /etc/hosts file for its service. Single point
for management or failure ;-)

It's strongly recommended to sync all local systems to one time source
so that individual log files tell the truth between them.
A boot server is a good way to boot and install an OS on a new machine,
test a different OS, or troubleshoot a screwed-up system.
As always, read the man pages:
man 8 dnsmasq
DNS and other networking stuff used to work absolutely great in Unix
until we were forced to connect pathetic Windows and Mac things to
the networks. Gone are the days of managing file distribution with
NIS. So there!

Rafael

Steve " 'dillo" Okay

Oct 18, 2023, 5:47:01 PM
to HomeBrew Robotics Club
On Monday, October 16, 2023 at 1:58:09 PM UTC-5 Chris Albertson wrote:
> In 2023, no one should be editing the /etc/hosts file. The problem is that if you have “n” computers on your network, you need to keep n files all in sync every time you make a change. And WORSE, if you use DHCP to assign IP addresses, the DHCP server has to have its “reservations list” in sync with all the hosts files. It is labor-intensive and error-prone.
>
> The best practice for a small, local network is to use DHCP to distribute IP addresses and DNS to resolve them. This is easy to set up with most home routers.

This presumes that your robot only has one network interface and it's connected to a home/office router as a client and that's really the only network infra and activity.
For many mobile robots, external connectivity to the outside world is important, but only when they're back home "on shore".
(There are delivery robots that report telemetry data and video over 5G, but that tends to be a separate interface with its own config)

As I mentioned in a previous posting, Tenacity has its own internal LAN, and the addresses for the couple of hosts there are hard-coded for latency and redundancy.
Classic static addressing guarantees that everybody finds everybody inside and the robot doesn't hang waiting on outside connectivity.

I should also mention that by default Tenacity runs in Host-AP mode and I connect to it either via an Ethernet port on the back of the rover or via the AP.
When I need it to be able to access the outside world, I attach a WiFi dongle to a USB port and that shuts off the AP and joins a local WLAN as a DHCP client.
I have some scripts and nmcli rules that control the handoff between the two modes.

(Speaking of it being 2023: I have a separate rant about why you should kill your serial connections on your robot wherever possible).

Anyway, that's how I'm doing networking on my main robot these days.
'dillo