Can't use BlueField-2 SmartNICs in r7525 nodes

Ashfaqur Rahaman

May 4, 2026, 1:39:47 PM
to cloudlab-users
Hi,

I have the following two experiments, and I can't use the BF2 SmartNICs in either of them:

https://www.cloudlab.us/status.php?uuid=f09d282d-91d7-48bb-ba1c-7bb01f5a1508
https://www.cloudlab.us/status.php?uuid=6d6bb9e7-7884-4edb-844d-80575ed35c21

Here are the issues I am facing with these nodes:

clgpu009, clgpu017: The BF2 SmartNICs don't show up as netdevs on the host, and the host dmesg shows the following errors:

[  112.551147] mlx5_core 0000:81:00.0: wait_fw_init:291:(pid 211): Waiting for FW pre-initializing, timeout abort in 19s (0x87010000)
[  132.490145] mlx5_core 0000:81:00.0: wait_fw_init:281:(pid 211): Firmware over 120000 MS in pre-initializing state, aborting
[  132.501287] mlx5_core 0000:81:00.0: probe_one:2409:(pid 211): mlx5_init_one failed with error code -110
[  132.526361] mlx5_core: probe of 0000:81:00.0 failed with error -110
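
For reference, here is a minimal sketch of how the failed probes could be pulled out of the host dmesg output. The function name and the small errno table are illustrative assumptions, not part of any CloudLab or NVIDIA tooling. On Linux, error -110 corresponds to ETIMEDOUT, which is consistent with the firmware pre-initialization timeout in the log above.

```python
import re

# Assumption: Linux errno numbering (110 is ETIMEDOUT on Linux).
# Only the code seen in this log is listed; extend as needed.
LINUX_ERRNO = {110: "ETIMEDOUT"}

# Matches lines such as:
#   mlx5_core: probe of 0000:81:00.0 failed with error -110
PROBE_FAIL = re.compile(
    r"mlx5_core: probe of "
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"failed with error -(?P<err>\d+)"
)


def failed_probes(dmesg_text):
    """Return (PCI address, errno name) for each failed mlx5_core probe."""
    results = []
    for line in dmesg_text.splitlines():
        match = PROBE_FAIL.search(line)
        if match:
            code = int(match.group("err"))
            name = LINUX_ERRNO.get(code, f"errno {code}")
            results.append((match.group("bdf"), name))
    return results


# Two lines copied from the dmesg output in this post.
SAMPLE = (
    "[  132.501287] mlx5_core 0000:81:00.0: probe_one:2409:(pid 211): "
    "mlx5_init_one failed with error code -110\n"
    "[  132.526361] mlx5_core: probe of 0000:81:00.0 failed with error -110\n"
)

if __name__ == "__main__":
    for bdf, name in failed_probes(SAMPLE):
        print(f"{bdf}: probe failed ({name})")
```

Running the same function over the full dmesg text from each node would show whether the same PCI function (0000:81:00.0) fails on both clgpu009 and clgpu017.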

I tried resetting them with the /share/bf2/bfreset.sh script, but it didn't work.

clgpu015, clgpu013: On clgpu015, I can't log in to the BF2 with the default CloudLab BlueField password. I can access the BF2 on clgpu013, and it appears to work without issues, but it has an old version of the BF2 OS installed. In both cases, I could reinstall the new BF2 OS to regain access, but I am worried that doing so might leave these BF2s in the same state as the ones on clgpu009 and clgpu017. So I will wait for your suggestions before trying.

Please let me know how I can resolve these issues.

Thank you.

Kind regards,
Ashfaq 