For some reason it doesn't like the ".local" extension. So I took that off the command and it seemed to work. Any ideas why it doesn't like ".local"?
Also, when I ssh to a node such as p1, I can't do a passwordless ssh to any other p node. Any thoughts?
(I usually just NFS export /home from the "controller" to the "p" nodes, but I'm having problems with that too; it may be related to the ".local" issue.)
Hi,

> For some reason it doesn't like the ".local" extension. So I took that off the command and it seemed to work. Any ideas why it doesn't like ".local"?

If you can't log into them using "ssh p...@p1.local" then I would guess they haven't been powered on long enough to boot up properly, or they haven't joined the bridge correctly.

Depending on what the Pi Zero is doing whilst booting, it can take longer than a minute to boot up, so it might be worth waiting a little longer before trying to copy the keys. If you've waited long enough, I'd advise checking that the network interfaces exist ("ifconfig -a" should show the ethpi1 interface) and that they've been added to the bridge correctly ("brctl show" should list br0 with eth0/ethpi1). Once they've booted up you should be able to ping them ("ping -c1 p1.local"), and once pingable you should be able to access them via SSH a few seconds later ("ssh p...@p1.local"), and similarly for the other Pi Zeros. Once you can SSH in with "ssh p...@p1.local" you should be able to copy the keys over.
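As a rough sketch of the "wait until pingable" step above, something like the following could be used on the controller. The helper name wait_for_node, the retry count and the 2 second delay are my own assumptions, not part of the Cluster HAT software.

```shell
# wait_for_node is a hypothetical helper: it sends one ping at a time
# (the same "ping -c1" check as above) until the host answers or the
# attempt limit is reached.
wait_for_node() {
    host="$1"
    attempts="${2:-30}"   # assumed default: 30 tries, about a minute in total
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ping -c1 "$host" >/dev/null 2>&1; then
            echo "$host is reachable"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "$host did not respond after $attempts attempts" >&2
    return 1
}
```

After checking "ifconfig -a" and "brctl show" by hand, it could be called as: for n in p1 p2 p3 p4; do wait_for_node "$n.local"; done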
> Also, when I ssh to a node such as p1, I can't do a passwordless ssh to any other p node. Any thoughts?

This is to be expected, as you'd need to create a new key pair on p1 and then copy its public key to controller/p2/p3/p4, and repeat for the other Pi Zeros.
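A minimal sketch of that key setup, to be run on each Pi Zero in turn. It assumes the "pi" user, an RSA key at the default path, and that the hosts are reachable as controller.local and p1.local to p4.local; the function names peers_for and push_key are made up for illustration.

```shell
# peers_for prints every host in the cluster except the one named in $1.
# The host list is an assumption based on the names used in this thread.
peers_for() {
    self="$1"
    for host in controller p1 p2 p3 p4; do
        [ "$host" = "$self" ] || echo "$host"
    done
}

# push_key creates a key pair if none exists yet, then copies the public
# half to each peer. ssh-copy-id asks for each peer's password once.
push_key() {
    key="$HOME/.ssh/id_rsa"
    [ -f "$key" ] || ssh-keygen -t rsa -N "" -f "$key"
    for host in $(peers_for "$(hostname)"); do
        ssh-copy-id -i "$key.pub" "pi@$host.local"
    done
}
```

Running push_key on p1 would then cover controller, p2, p3 and p4; the private key never leaves p1.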
> (I usually just NFS export /home from the "controller" to the "p" nodes, but I'm having problems with that too; it may be related to the ".local" issue.)

You didn't say what your problem was, but when I was looking at nfsroot on the Pi Zeros I found rpcbind wasn't always starting on boot (https://github.com/dagon666/rpcbindplumber links to some of the issues), so that might be something to check.
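One way to check rpcbind after a boot, assuming the image uses systemd (the thread doesn't say which release is installed, so treat that as an assumption). The helper name check_rpcbind is hypothetical.

```shell
# check_rpcbind reports whether the rpcbind service is currently active.
# It only inspects the service; it does not start or enable anything.
check_rpcbind() {
    if systemctl is-active --quiet rpcbind; then
        echo "rpcbind is running"
        return 0
    fi
    echo "rpcbind is not running; try: sudo systemctl start rpcbind" >&2
    return 1
}
```

Calling check_rpcbind on a Pi Zero right after it boots should show whether this is what is getting in the way of the NFS mount.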
I didn't add anything to the images on the zeros so I don't know why it takes so long for the bridging to come up. I'm not sure what's causing the issue. Does anyone have any suggestions?
Hi,

> I didn't add anything to the images on the zeros so I don't know why it takes so long for the bridging to come up. I'm not sure what's causing the issue. Does anyone have any suggestions?
I'm not sure how you can contact the Pi Zero using just "ssh pi@p4", as the Controller doesn't have this information in the standard setup. The only thing I can think of is that your local router/gateway is handing out the IP via DHCP and adding the hostname to the local DNS server (and possibly populating /etc/resolv.conf with a "search" value to use). But that still leaves the question of why p4.local works but not the others, unless it has maybe run out of IP addresses on the DHCP side of things?

Can you try the Cluster HAT using a local KB/mouse/monitor (no ethernet on the controller) to see if, after waiting a minute from powering on p1-p4, you can "ping -c1 p1.local" .. "ping -c1 p4.local"? If that works then I'd look at your router/gateway settings/manual.

Chris.
This was indeed the problem. My home router was screwing up the DHCP on the cluster. So I pulled the network cable and the system booted just fine. I can then plug the cable back in to access the Internet. If I bring down a node and then bring it back up, I have to unplug and then re-plug the network cable. Sort of a pain, but not a showstopper. Any way I can fix this?