An announcement for those people who have been using, or want to use, the Clemson r7525 nodes.
We are having continuing and worsening problems with the BlueField-2 (BF2) NICs in those nodes. Many of them are too old to run current firmware, which causes problems for people trying to use the latest version of the DOCA software framework. A partial initialization of a BF2 card also leaves the card unusable for subsequent experiments, because the BIOS and OS suffer lengthy timeouts as they attempt to initialize it. Ultimately, the nodes wind up being taken out of service until we perform a long (45+ minute) reset operation.
As a result, we will likely need to retire some or all of the BF2 cards (but not the nodes themselves). Whether we remove or replace the BF2s (which provide 100Gb interfaces even when the SmartNIC capabilities are not in use) will depend on how they are being used.
If you are a user of these nodes, please let us (porta...@cloudlab.us) know why you use them: