Mellanox ConnectX-3


Jeff Davis

Oct 8, 2016, 3:11:04 PM
to Warewulf
My most recent issue and hopefully my last :)

I want to configure the Mellanox drivers to initialize the CX3 devices for Ethernet, but they are coming up as InfiniBand. This is normally done by setting /etc/rdma/mlx4.conf properly, which is read when modprobe loads the module.

I think what's happening is that, since /etc/modprobe.d/mlx4.conf and /etc/rdma/mlx4.conf live in the VNFS, the configuration files are not yet available when the kernel loads the modules, so the driver defaults to InfiniBand.  If I unload and reload the module manually after booting (or even in the init scripts), the devices are configured for Ethernet, but this seems like a hack.
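
(For anyone following along, here is a minimal sketch of what an /etc/rdma/mlx4.conf entry looks like. It assumes the format shipped with the RHEL rdma package: a PCI address followed by one port type per port, each being ib, eth, or auto. The PCI address 0000:05:00.0 is hypothetical, and the sample is written to a scratch file rather than /etc so it can be run safely anywhere.)

```shell
# Write a sample mlx4.conf to a scratch file (NOT /etc) to illustrate the format.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# <pci_device_of_card> <port1_type> [port2_type]
0000:05:00.0 eth eth
EOF

# Summarize each non-comment, non-empty entry: card address and port types.
summary=$(awk '!/^#/ && NF {
    printf "card=%s port1=%s port2=%s\n", $1, $2, ($3 == "" ? "none" : $3)
}' "$conf")
echo "$summary"
rm -f "$conf"
```

On a real node, the PCI addresses of the cards bound to the driver should be listed under /sys/bus/pci/drivers/mlx4_core once mlx4_core is loaded.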

Is there a more elegant way to get these config files into an earlier stage of the file system?  Maybe the initrd or something?  Or is there some other trick anyone happens to know?

Thanks!

Jason Stover

Oct 8, 2016, 3:16:32 PM
to ware...@lbl.gov
Hi Jeff,

You'll probably want to add the following to the node's KARGS value:

wwignoremod=mlx4_en,mlx4_core,mlx4_ib,cxgb3

During bootstrap, if the IB modules are present, they'll get loaded.
The 'wwignoremod' option keeps the detect script from loading specific
modules.
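
(A minimal sketch of setting this with wwsh, assuming Warewulf 3 and its `wwsh provision set --kargs` option. The node name n0000 is hypothetical. As far as I know, setting kargs replaces the existing value rather than appending to it, so include any arguments the node already needs. The snippet only prints the command so it can be reviewed before running it on the master.)

```shell
node="n0000"   # hypothetical node name; substitute your own
kargs="wwignoremod=mlx4_en,mlx4_core,mlx4_ib,cxgb3"

# Build the command as a string and print it instead of executing it.
cmd="wwsh provision set $node --kargs=\"$kargs\""
echo "$cmd"

# After running the printed command on the Warewulf master, check it with:
#   wwsh provision print n0000
# and, after rebooting the node, on the node itself:
#   cat /proc/cmdline
```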

At init, the modules should be loaded once everything is provisioned
and the openibd script is run. All of the initially provisioned files
should be there at that point.

-J

Jeff Davis

Oct 8, 2016, 3:26:52 PM
to Warewulf

Thanks...  I'll try this on Monday.


Jeff Davis

Oct 8, 2016, 4:16:31 PM
to Warewulf

I don't think I have an openibd script.  I won't be able to check until Monday, but I think this is going to be helpful anyway...

The mlx4 packages were installed with yum into the generic RHEL chroot.  Does that provide the script?



Jason Stover

Oct 8, 2016, 4:40:18 PM
to ware...@lbl.gov
I'm not sure if the RHEL packages provide one... Maybe an rdma init
script? I'm used to using OFED or MLNX OFED, not the distro
packages.

-J

Jeff Davis

Oct 8, 2016, 4:41:16 PM
to Warewulf

Yes...  There is an rdma script.


Jeff Davis

Oct 10, 2016, 12:05:56 PM
to Warewulf
You are the man!

John Hearns

Oct 10, 2016, 12:31:54 PM
to ware...@lbl.gov
Jeff,

If I am not wrong, the software will switch between InfiniBand and Ethernet modes depending on which type of switch the cable is connected to.
I have a Mellanox IB switch and an Ethernet switch in the same rack, and ConnectX-3 cards.  I can check for you if you like,
but as I recall, the last time I tried this 'it just worked'. I too was looking at changing the mode in
mlx4.conf (this is not a Warewulf install, but I could provision with OpenHPC/Warewulf if it helps).






John Hearns

Oct 10, 2016, 12:58:15 PM
to ware...@lbl.gov
I just rebooted a node with a stateful install of CentOS 7.3 and a ConnectX-3 card.
I moved the cable between a Mellanox IB switch and a Mellanox Ethernet switch.

I must say that a whole heap of drivers are still loaded in Ethernet mode; here is the list:

lsmod | grep mlx
mlx5_ib               195825  0
mlx5_core             379515  1 mlx5_ib
mlx4_ib               195768  0
ib_sa                  33950  5 rdma_cm,ib_cm,mlx4_ib,rdma_ucm,ib_ipoib
ib_mad                 55975  4 ib_cm,ib_sa,mlx4_ib,ib_umad
ib_core               141088  12 rdma_cm,ib_cm,ib_sa,iw_cm,mlx4_ib,mlx5_ib,ib_mad,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
mlx4_en               132565  0
vxlan                  41236  2 mlx4_en,mlx5_core
mlx4_core             348954  2 mlx4_en,mlx4_ib
mlx_compat             16639  17 rdma_cm,ib_cm,ib_sa,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,ib_mad,ib_ucm,ib_addr,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
ptp                    19231  7 igb,tg3,bnx2x,ixgbe,e1000e,mlx4_en,mlx5_core




Before:
[root@comp04 ~]# ibstat
CA 'mlx4_0'
        CA type: MT4099
        Number of ports: 1
        Firmware version: 2.33.5100
        Hardware version: 1
        Node GUID: 0xe41d2d0300b09470
        System image GUID: 0xe41d2d0300b09473
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 56
                Base lid: 6
                LMC: 0
                SM lid: 1
                Capability mask: 0x02514868
                Port GUID: 0xe41d2d0300b09471
                Link layer: InfiniBand



After:

[root@comp04 ~]# ethtool ens1
Settings for ens1:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseKX/Full
                                10000baseKX4/Full
                                10000baseKR/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseKX/Full
                                10000baseKX4/Full
                                10000baseKR/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Link partner advertised link modes:  40000baseCR4/Full
        Link partner advertised pause frame use: No
        Link partner advertised auto-negotiation: Yes
        Speed: 40000Mb/s
        Duplex: Full
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000014 (20)
                               link ifdown
        Link detected: yes
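
(A quicker way to see which mode a port came up in is the link_layer attribute under /sys/class/infiniband. It should report the same "Link layer" value that ibstat prints. The sketch below assumes the CA name mlx4_0 and port 1 from the ibstat output above. The helper function name and the fake sysfs tree are only for illustration, so the snippet runs without the hardware.)

```shell
# Print the link layer of a given CA port, or "unknown" if it is not present.
# $1: sysfs root (normally /sys), $2: CA name, $3: port number
link_layer() {
    cat "$1/class/infiniband/$2/ports/$3/link_layer" 2>/dev/null || echo "unknown"
}

# On a real node you would call:  link_layer /sys mlx4_0 1

# Self-contained demo against a fake sysfs tree.
fake=$(mktemp -d)
mkdir -p "$fake/class/infiniband/mlx4_0/ports/1"
echo "Ethernet" > "$fake/class/infiniband/mlx4_0/ports/1/link_layer"

layer=$(link_layer "$fake" mlx4_0 1)
missing=$(link_layer "$fake" mlx4_0 2)   # no such port in the fake tree
echo "port 1: $layer, port 2: $missing"
rm -rf "$fake"
```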



Jeff Davis

Oct 14, 2016, 10:07:40 PM
to Warewulf
Ignoring it in KARGS was the trick...  the module was defaulting to IB before the conf file was present...
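
(To confirm on a booted node that the kernel argument actually arrived, something like the sketch below can pull the wwignoremod list out of the kernel command line. The function name is made up for illustration, and the sample string stands in for the contents of /proc/cmdline.)

```shell
# Print each module named in wwignoremod=..., one per line.
# $1: a kernel command line string
ignored_modules() {
    for arg in $1; do            # intentionally unquoted: split on whitespace
        case "$arg" in
            wwignoremod=*) echo "${arg#wwignoremod=}" | tr ',' '\n' ;;
        esac
    done
}

# On a real node:  ignored_modules "$(cat /proc/cmdline)"
sample="quiet wwignoremod=mlx4_en,mlx4_core,mlx4_ib,cxgb3"
result=$(ignored_modules "$sample")
echo "$result"
```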