Is there a recent up-to-date tutorial?


Ben Heininger

Apr 12, 2022, 7:59:52 AM
to ClusterHAT
I just bought a Pi 4B 4GB to go with my ClusterHAT 2.5 and 4 Pi Zeros.
I've been waiting a long time and saving so that I could get this up and running.
Now I have run into so many brick walls it's not funny.
I have tried about 10 different tutorials to get it up and running. They are all out of date and no longer helpful: either they don't work, or when they fail there is no "If this doesn't work, do this!"
I'm using CNAT and I can SSH into the Pi 4B and the Zeros.
Can someone please point me to a recent up-to-date tutorial?
I want to learn k3s/k8s, Slurm, Docker, Rancher and anything else that I can use this for.

Tom Szolyga

Apr 12, 2022, 3:12:02 PM
to ClusterHAT
I am interested in it as well.

Andy Piper

Apr 12, 2022, 3:14:01 PM
to clust...@googlegroups.com
This is not a single page / post guide, because it depends on another tutorial, but here’s a guide I wrote about a year ago. I followed “The Missing Cluster HAT tutorial”, documented any changes I had to make as I went along, and then called out the setup for Slurm and Munge as well.



— 
Andy Piper | Kingston upon Thames, London (UK)
links: https://andypiper.me  | twitter: @andypiper 



Ben Heininger

Apr 13, 2022, 8:05:48 PM
to ClusterHAT
Thank you Andy,
The Missing Cluster HAT tutorial didn't work for me either, and unfortunately I have already attempted yours too and ran into problems.
It failed at the "setup storage and NFS per tutorial" step: as soon as I added the network storage to /etc/fstab, my Pi wouldn't boot.
I had to edit the SD card on my PC to rescue and restore it.
Once restored, I tried to fix it but it kept failing until I gave up and moved on.
It then failed at the Munge and Slurm setup. I can't remember the details as it was a couple of weeks ago now.

So far I have attempted: 
  1. The Missing Cluster HAT tutorial -  Davin L.
  2. Building a compact Pi cluster -  Andy Piper
  3. arkade - The Open Source Kubernetes Marketplace -  Alex Ellis alexellis
  4. I built a Raspberry Pi SUPER COMPUTER!! // ft. Kubernetes (k3s cluster w/ Rancher) - Network Chuck
  5. There are others, but I am at work and can't find the links that I tried.
I'm thinking that if I use cbridge instead of cnat then tutorials like Network Chuck's might work, since the nodes in those setups get their IP addresses from the router instead of the Pi 4B.
Is some of it not working because I am using original Pi Zeros instead of Pi Zero 2s?
Is it the cnat instead of cbridge?
Is it updated packages that no longer work with the tutorial's instructions?
I know one of them (I can't remember the site) was telling me to install a version of Slurm or Munge that no longer exists. I think it was Slurm-Heroit or something like that.

The other problem for a beginner is that every tutorial I have seen so far doesn't explain what the commands are doing or how to troubleshoot them if something goes wrong.
The majority of them are "Here, run these commands in a terminal and ta-da, it works!"
It might have worked when they wrote it, and it might have worked specifically for them, but when everyone else follows it and it doesn't work, and you then spend 8 hours googling the errors to find a resolution, it's very frustrating and off-putting. (Which is sometimes the problem with Linux anyway: everything becomes deprecated too quickly.)

On the bright side, I found some great heatsinks for the Pi Zeros on AliExpress, and they fit perfectly on the Zeros mounted to the ClusterHAT.
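
For anyone hitting the same "won't boot after editing /etc/fstab" problem described above: one common cause (an assumption here, since the thread never shows the actual fstab line) is an added mount without the nofail option, which can stall or fail the boot when the drive or NFS server is not available yet. A minimal Python sketch that flags such entries follows. The sample fstab text, the ESSENTIAL set and the function name are made up for illustration.

```python
# Mount points that are expected to be required at boot (illustrative choice).
ESSENTIAL = {"/", "/boot", "/proc"}


def mounts_without_nofail(fstab_text):
    """Return mount points of non-essential fstab entries lacking 'nofail'."""
    flagged = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point = fields[1]
        options = fields[3].split(",")
        if mount_point not in ESSENTIAL and "nofail" not in options:
            flagged.append(mount_point)
    return flagged


# Hypothetical fstab contents; only the server address and export path
# appear in the thread, everything else is an assumption.
sample_fstab = """\
proc                           /proc           proc  defaults                 0 0
PARTUUID=dbc542f5-01           /boot           vfat  defaults                 0 2
PARTUUID=dbc542f5-02           /               ext4  defaults,noatime         0 1
172.19.181.254:/media/Storage  /media/Storage  nfs   defaults                 0 0
172.19.181.254:/media/Other    /media/Other    nfs   defaults,nofail,_netdev  0 0
"""

print(mounts_without_nofail(sample_fstab))
```

Running this against a copy of the file before rebooting shows which new entries might be worth retrying with nofail (and, for network shares, _netdev) added to the options column.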


Ben Heininger

Apr 21, 2022, 6:28:29 AM
to ClusterHAT
OK,
I thought I had discovered the problem because my SD card died. However, I also discovered that the /boot/cmdline.txt file keeps getting ".old"ed and replaced with a default cmdline.txt.
For some reason the Pi 4 is replacing it at boot. I noticed this happened last night too but thought that was the corrupt SD card. Now I'm not sure the SD card is actually corrupt, although it is over 6 years old and is my old GoPro card.
So what do I do now?

Chris Burton

Apr 21, 2022, 7:22:03 AM
to ClusterHAT
Hi, 
> I thought I had discovered the problem because my SD card died. However, I also discovered that the /boot/cmdline.txt file keeps getting ".old"ed and replaced with a default cmdline.txt.
> For some reason the Pi 4 is replacing it at boot. I noticed this happened last night too but thought that was the corrupt SD card. Now I'm not sure the SD card is actually corrupt, although it is over 6 years old and is my old GoPro card.
> So what do I do now?

Which image (filename) are you using and what does the cmdline.txt look like?

Chris. 

Ben Heininger

Apr 21, 2022, 8:23:18 AM
to ClusterHAT
The file contains the following information:
console=serial0,115200 console=tty1 root=PARTUUID=dbc542f5-02 rootfstype=ext4 fsck.repair=yes rootwait quiet splash plymouth.ignore-serial-consoles

Ben Heininger

Apr 21, 2022, 8:24:28 AM
to ClusterHAT
The cmdline.old contains:
console=serial0,115200 console=tty1 root=PARTUUID=ae083906-02 rootfstype=ext4 fsck.repair=yes rootwait quiet splash plymouth.ignore-serial-consoles init=/usr/sbin/reconfig-clusterctrl cbridge
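
To make the difference between the two files quoted in this thread easier to see, here is a small Python sketch that splits each kernel command line into tokens and reports what is unique to each. The function names are illustrative; the two strings are copied from the posts above.

```python
def parse_cmdline(text):
    """Split a kernel command line into its whitespace-separated tokens."""
    return text.split()


def diff_cmdline(current, old):
    """Return the tokens that appear in only one of the two command lines."""
    cur, prev = parse_cmdline(current), parse_cmdline(old)
    return {
        "only_in_current": [t for t in cur if t not in prev],
        "only_in_old": [t for t in prev if t not in cur],
    }


# Contents of cmdline.txt as posted in the thread.
current = (
    "console=serial0,115200 console=tty1 root=PARTUUID=dbc542f5-02 "
    "rootfstype=ext4 fsck.repair=yes rootwait quiet splash "
    "plymouth.ignore-serial-consoles"
)

# Contents of cmdline.old as posted in the thread.
old = (
    "console=serial0,115200 console=tty1 root=PARTUUID=ae083906-02 "
    "rootfstype=ext4 fsck.repair=yes rootwait quiet splash "
    "plymouth.ignore-serial-consoles init=/usr/sbin/reconfig-clusterctrl cbridge"
)

result = diff_cmdline(current, old)
for side, tokens in result.items():
    print(side, tokens)
```

Two things differ: the old file carries the extra init=/usr/sbin/reconfig-clusterctrl cbridge tokens, and the two files point root= at different PARTUUIDs. If the card's partition ID no longer matches the value in cmdline.old, that mismatch could explain a failure to boot after renaming the old file back, but that is an inference and not something confirmed in the thread.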

Ben Heininger

Apr 21, 2022, 8:33:15 AM
to ClusterHAT
Is this something to do with the new way RPi is forcing people to change the default password for better security?

Ben Heininger

Apr 21, 2022, 8:39:56 AM
to ClusterHAT
And if I rename the new cmdline.txt and replace it with the original (renaming cmdline.old -> cmdline.txt), the Pi won't boot. It sits at a black screen with a flashing cursor and doesn't progress.

Ben Heininger

Apr 21, 2022, 9:09:54 AM
to ClusterHAT
This ".old"ing of cmdline.txt also occurred on the Pi Zero images, and I was unable to SSH into them either.

Chris Burton

Apr 21, 2022, 11:23:17 AM
to ClusterHAT
Hi,
> Is this something to do with the new way RPi is forcing people to change the default password for better security?

I don't know which image file you're using, but none of the current ClusterCTRL images are designed to work with the fiddling Raspberry Pi Imager does, so I wouldn't advise making any changes with it until the new 2022-04-04 based images are released.

Until that's released you'll need to set things up the old way: to enable SSH, create an empty "ssh" or "ssh.txt" file in the boot partition, and to enable wifi, add the settings to wpa_supplicant.conf etc. in the boot partition.

Chris. 
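
For reference, the "old way" described above can be scripted. This is a minimal Python sketch, assuming the SD card's boot partition is already mounted somewhere on the PC; the function name, SSID, passphrase and country code are placeholders, and the wpa_supplicant.conf layout shown is the commonly used minimal form rather than anything specific to the ClusterCTRL images.

```python
from pathlib import Path
import tempfile


def prepare_boot_partition(boot_dir, ssid, psk, country="GB"):
    """Create the empty 'ssh' marker file and a basic wpa_supplicant.conf."""
    boot = Path(boot_dir)

    # An empty file named "ssh" in the boot partition enables the SSH server.
    (boot / "ssh").touch()

    # Minimal wifi settings; adjust the country code for your region.
    conf = (
        "ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\n"
        "update_config=1\n"
        f"country={country}\n"
        "\n"
        "network={\n"
        f'    ssid="{ssid}"\n'
        f'    psk="{psk}"\n'
        "}\n"
    )
    (boot / "wpa_supplicant.conf").write_text(conf)
    return boot


# Demo against a temporary directory standing in for the mounted boot
# partition; on a real card, pass the actual mount point instead.
with tempfile.TemporaryDirectory() as fake_boot:
    prepare_boot_partition(fake_boot, "MyNetwork", "my-passphrase")
    print(sorted(p.name for p in Path(fake_boot).iterdir()))
```

The wifi file is only needed on a board that actually joins a wireless network; for wired setups the empty ssh file alone is enough.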

Ben Heininger

Apr 21, 2022, 7:29:51 PM
to ClusterHAT
Hi,
I tried all of the following from the testing branch.
https://dist1.8086.net/clusterctrl/testing/
2021-10-30-4-ClusterCTRL-arm64-CBRIDGE.zip    2021-11-22 21:11    1.1G    
2021-10-30-4-ClusterCTRL-arm64-CNAT.zip    2021-11-22 21:19    1.1G
2021-10-30-4-ClusterCTRL-armhf-full-CBRIDGE.zip    2021-11-22 21:18    2.2G    
2021-10-30-4-ClusterCTRL-armhf-lite-p1.zip    2021-11-22 21:01    488M    
2021-10-30-4-ClusterCTRL-armhf-lite-p2.zip    2021-11-22 21:01    488M    
2021-10-30-4-ClusterCTRL-armhf-lite-p3.zip    2021-11-22 21:02    488M    
2021-10-30-4-ClusterCTRL-armhf-lite-p4.zip    2021-11-22 21:01    488M

Ben Heininger

Apr 21, 2022, 9:12:46 PM
to ClusterHAT
Is it the imager that is causing the problem, or is it the fact that I have updated the firmware on the Pi that is causing it to ".old" the cmdline.txt file?
The file is there and OK before I put the card in the Pi, because I have been manually creating the /boot/SSH file; however, the .old file is created at boot.

Ian Goldsmith

Apr 22, 2022, 2:24:04 AM
to clust...@googlegroups.com
I'm afraid I don't know the answers, but it sounds like you’ve already identified step 1: try a new SD card!


Chris Burton

Apr 22, 2022, 3:44:36 AM
to ClusterHAT
Hi,
> Is it the imager that is causing the problem, or is it the fact that I have updated the firmware on the Pi that is causing it to ".old" the cmdline.txt file?
> The file is there and OK before I put the card in the Pi, because I have been manually creating the /boot/SSH file; however, the .old file is created at boot.

If nothing is set in the "Image customization options" section then the imager should work OK.

I'm still testing the (36!) images for the next release which will support configuration using the imager.

Chris.

Ben Heininger

Apr 22, 2022, 7:15:27 AM
to ClusterHAT
Thanks Ian, I have already purchased and tested a new SD card and got the same issue with the files being renamed to .old.
Thank you Chris (and everyone else) for your assistance.
I am currently testing the original download of "2021-10-30-4-ClusterCTRL-arm64-CNAT", which I previously had working, although I couldn't get the software (K3s, Slurm, etc.) working on it.
This time I am burning the image using balenaEtcher instead of the RPi Imager...
and it's still broken. (Obviously it must be the new firmware, or it talks to RPi somewhere/somehow and says "Hey, let's ruin this person's day!")
I'll wait for the new images. (Thank you Chris)

Ronnie Williams

Feb 3, 2024, 7:00:09 AM
to ClusterHAT
Likewise! I used the "Missing tutorial..." and I'm stuck at adding shared USB storage. On mounting, sudo mount -a returns the message: mount.nfs: access denied by server while mounting 172.19.181.254:/media/Storage
I'm just doing a basic cluster; I just want to see anything run! So frustrating.
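
On the "access denied by server" message above: that error usually means the server's /etc/exports does not list the client (or the export was not reloaded after editing), though the thread does not show the actual exports file, so treat this as a guess. A rough Python sketch for checking whether a given client address is covered by one exports line follows. The client address 172.19.181.1 and both example lines are assumptions for illustration; only the server address and path come from the error message.

```python
import ipaddress


def client_allowed(exports_line, client_ip):
    """Check whether client_ip matches an IP or CIDR client spec on one
    /etc/exports line. Hostnames and wildcard patterns other than a bare
    '*' are not handled in this sketch."""
    fields = exports_line.split()
    ip = ipaddress.ip_address(client_ip)
    for spec in fields[1:]:
        host = spec.split("(", 1)[0]  # strip the "(rw,sync,...)" options
        if host in ("", "*"):
            return True  # exported to everyone
        try:
            network = ipaddress.ip_network(host, strict=False)
        except ValueError:
            continue  # a hostname or pattern; skipped here
        if ip in network:
            return True
    return False


# Hypothetical exports entries for the path in the error message.
covers_cluster = "/media/Storage 172.19.181.0/24(rw,sync,no_subtree_check)"
covers_lan_only = "/media/Storage 192.168.1.0/24(rw,sync,no_subtree_check)"

print(client_allowed(covers_cluster, "172.19.181.1"))
print(client_allowed(covers_lan_only, "172.19.181.1"))
```

If the node's address is not covered, adding the right network to the export on the controller and running sudo exportfs -ra there is the usual next step before retrying sudo mount -a on the node.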