Hi all, I really need your help!
I am new to InfiniBand, SRP, and ESOS. I have a SAN server running ESOS with a Mellanox ConnectX-3 FCBT card installed, and the same model of card in a diskless Dell R720 server. I want to install Debian 12 for the R720 onto SRP shared block storage and boot from it.
On ESOS, I created a dev_disk device and an initiator. On the Dell R720, I booted a Debian 12 live system and installed the `rdma*`, `srptools`, `ibutils`, `lsscsi`, and `mstflint` packages in it. After that, I ran `lsscsi`, and the SRP block storage shared from ESOS was detected automatically as `/dev/sdb`. I then completed the installation on `/dev/sdb` (GPT + GRUB in BIOS mode instead of UEFI, because I heard the ConnectX-3 doesn't support UEFI boot).
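For anyone who wants to reproduce the live-system side, the steps described above amount to roughly the following sketch. The package list comes from the post, except `rdma-core`, which is an assumption standing in for `rdma*`. The `run` helper is purely illustrative: it only prints each command unless `RUN=1` is set, in which case the commands need root.

```shell
# Packages named in the post; rdma-core is an assumption standing in for "rdma*".
PKGS="rdma-core srptools ibutils lsscsi mstflint"

# Illustrative helper: print each command, and execute it only when RUN=1.
run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi; }

run apt-get update
run apt-get install -y $PKGS   # unquoted on purpose so each package is its own argument
run lsscsi                     # in the live system the ESOS SRP LUN appeared as /dev/sdb
```

Running it as `RUN=1 sh script.sh` (the script name is hypothetical) executes the commands for real instead of printing them.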
But after I rebooted the machine and removed the USB stick, it would not boot. While I was in the live system, I could see the R720 connected to the SRP target in the ESOS TUI, but that connection no longer appears after I reboot the server without the USB drive.
What should I do? Where should I start to troubleshoot?
Version info:
Mellanox CX-3 FCBT: firmware v2.42.5000 and FlexBoot v3.4.752
Dell R720: BIOS version 2.9.0.
I have been stuck on this for 2 weeks, please help me! Please let me know if you need any more information!
Thanks in advance