please help running software iWarp on CentOS 7.6

Olga Kornievskaia

Apr 11, 2019, 3:41:12 PM
to zrlio...@googlegroups.com
Hello folks,

I'm trying to run software iWarp on CentOS 7.6 and having
difficulties. I was wondering if somebody can help (I'm not a
subscriber so please cc me to the reply, I would greatly appreciate
it).

I have installed the build and loaded the kernel module. I have built
the userland component by doing "bash ./build.sh". But I haven't done
anything resembling a "make install" as there is nothing like that in
the instructions on https://github.com/zrlio/softiwarp. CentOS comes
with an rdma configuration, so I have done "systemctl start rdma", which
starts successfully.

[aglo@localhost ~]$ sudo systemctl status rdma
● rdma.service - Initialize the iWARP/InfiniBand/RDMA stack in the kernel
Loaded: loaded (/usr/lib/systemd/system/rdma.service; disabled; vendor preset: disabled)
Active: active (exited) since Thu 2019-04-11 14:27:32 EDT; 1h 7min ago
Docs: file:/etc/rdma/rdma.conf
Process: 10099 ExecStart=/usr/libexec/rdma-init-kernel (code=exited, status=0/SUCCESS)
Main PID: 10099 (code=exited, status=0/SUCCESS)
Tasks: 0
CGroup: /system.slice/rdma.service
Apr 11 14:27:32 localhost.localdomain systemd[1]: Starting Initialize the iWARP/InfiniBand/RDMA stack in the kernel...
Apr 11 14:27:32 localhost.localdomain systemd[1]: Started Initialize the iWARP/InfiniBand/RDMA stack in the kernel.

When I try to do "ibv_devinfo" I get no ib devices found.

[aglo@localhost build]$ lsmod | grep siw
siw 217088 0
ib_core 282624 11 rdma_cm,ib_ipoib,rpcrdma,ib_srp,iw_cm,ib_iser,ib_umad,rdma_ucm,ib_uverbs,siw,ib_cm

/var/log/messages has
Apr 11 14:57:01 localhost kernel: SoftiWARP attached
Apr 11 14:57:01 localhost kernel: Started siw TX thread on CPU 0

[aglo@localhost build]$ sudo ibv_devinfo
No IB devices found

What am I missing?

Thank you.

Bernard Metzler

Apr 12, 2019, 7:05:40 AM
to Olga Kornievskaia, zrlio...@googlegroups.com
Hi Olga,

We are in the process of making siw acceptable for upstream
inclusion. With that, we froze our development at
https://github.com/zrlio/softiwarp. But, if you
can still compile the kernel found there, you can use
that. It contains some known bugs which are fixed in
our upstream effort, but should be stable. To run that
kernel, you would have to build the user lib also found
there.


I suggest you take the kernel from
https://github.com/zrlio/softiwarp-for-linux-rdma.git
branch 'siw-for-rdma-next-v6'

You would have to take the matching user lib from
https://github.com/zrlio/softiwarp-user-for-linux-rdma
branch 'siw-for-rdma-next'

The build process for the user library does not contain
an 'install' method to let the libs land at the usual
lib dirs. Let your shell know where it is - e.g. something
like (pls adapt to your env):
export LD_LIBRARY_PATH=/home/bmt/siw-next/rdma-core/build/lib:$LD_LIBRARY_PATH

If you installed those latest siw versions, you furthermore
need the latest rdma tools to attach siw to interfaces
(attachment is no longer a kernel module parameter, and siw does not
automatically attach to the Ethernet interfaces it finds during
module loading). Those tools are found at
https://github.com/larrystevenwise/iproute2.git
branch 'wip/newlink-2019-03-19'. If you have built that
into <dir>, you can use the following command to add siw interfaces:
sudo <dir>/rdma/rdma link add siw0 type siw netdev enp1s0f4
This adds siw to the link 'enp1s0f4' and gives it the name
'siw0'.
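
For anyone scripting this attach step, the command above can be assembled programmatically. This is only a minimal Python sketch: the helper name `siw_link_add_cmd` and the example tool path are hypothetical, and the argument order simply mirrors the command quoted above.

```python
import shlex

def siw_link_add_cmd(rdma_tool, dev_name, netdev):
    """Return the argv for attaching siw to an Ethernet interface.

    Mirrors: <dir>/rdma/rdma link add siw0 type siw netdev enp1s0f4
    rdma_tool is the path to the rdma binary built from the iproute2 branch.
    """
    for label, value in (("device name", dev_name), ("netdev", netdev)):
        if not value or any(c.isspace() for c in value):
            raise ValueError(f"invalid {label}: {value!r}")
    return [rdma_tool, "link", "add", dev_name, "type", "siw", "netdev", netdev]

if __name__ == "__main__":
    # "./rdma/rdma" is a placeholder path; adapt it to your build directory.
    argv = siw_link_add_cmd("./rdma/rdma", "siw0", "enp1s0f4")
    # Print only; run it with sudo once the path is right.
    print(shlex.join(["sudo"] + argv))
```

The sketch only prints the command, so it is safe to run on a machine without the patched rdma tool.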

Please let me know if you need further help!

Best regards,
Bernard.




Olga Kornievskaia

Apr 12, 2019, 1:59:03 PM
to Bernard Metzler, zrlio...@googlegroups.com
Hi Bernard,

Thank you very much for the pointer to iproute2. Binding an
interface was exactly the step that was missing (when I set up
softRoCE I use rxe_config to make an interface RoCE capable). However,
I'm still running into issues. With some struggle, I got the "rdma"
executable to build. I executed the "rdma link add" command (it came
back without any indication of a problem). But /var/log/messages has

Apr 12 13:10:29 localhost kernel: iwpm_register_pid: Unable to send a nlmsg (client = 2)
Apr 12 13:10:29 localhost kernel: infiniband siw0: RDMA CMA: cma_listen_on_dev, error -97
Apr 12 13:10:29 localhost systemd: Created slice system-rdma\x2dload\x2dmodules.slice.
Apr 12 13:10:29 localhost systemd: Starting Load RDMA modules from /etc/rdma/modules/rdma.conf..
Apr 12 13:10:29 localhost systemd: Starting RDMA Node Description Daemon...
Apr 12 13:10:29 localhost systemd: Started RDMA Node Description Daemon.
Apr 12 13:10:29 localhost systemd: Started Load RDMA modules from /etc/rdma/modules/rdma.conf.
Apr 12 13:10:29 localhost systemd: Reached target RDMA Hardware.

When I run ibv_devinfo it still comes back with "no IB devices found".

I would appreciate if you could provide some further assistance in
this area. Let me know if I can provide any further info about the
problem.
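
The "error -97" in the cma_listen_on_dev line is a negated Linux errno value. Below is a small Python sketch for decoding such codes. The table is a hand-picked subset of Linux errno numbers rather than something read from the running system, and the helper name `decode_kernel_errno` is made up for illustration.

```python
# Hand-picked subset of Linux errno values; extend as needed.
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    28: ("ENOSPC", "No space left on device"),
    97: ("EAFNOSUPPORT", "Address family not supported by protocol"),
}

def decode_kernel_errno(code):
    """Map a (possibly negated) kernel error code to (name, description)."""
    number = abs(code)
    return LINUX_ERRNO.get(number, (f"errno {number}", "not in this table"))

if __name__ == "__main__":
    name, text = decode_kernel_errno(-97)
    print(f"-97 -> {name}: {text}")
```

So -97 reads as EAFNOSUPPORT. What triggers it in this particular setup is not established by the log line alone.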

Bernard Metzler

Apr 14, 2019, 6:31:53 AM
to Olga Kornievskaia, zrlio...@googlegroups.com
This whole netlink stuff is a pain. Yes, softroce still uses
module parameters and its own proprietary way of adding interfaces.
I was asked not to use kernel module parameters at all.

What you see is output from the iWARP port mapper (iwpm),
another twist that a software iWARP would not need, but which
is needed for offloaded iWARP (to block the TCP port in the kernel
sw stack which is used by the offloaded hardware). softiwarp
tells the iwpm it does not want to use that service (since it is
going to use the TCP ports and does not want to have them blocked!).
There is a line in siw_main.c for that:

/* Disable TCP port mapper service */
base_dev->iwcm->driver_flags = IW_F_NO_PORT_MAP;

I am not sure which kernel you are running and if it has
that latest extension for the iwarp port mapper to support siw?
If possible, I therefore suggest you build the complete kernel
from the softiwarp-for-linux-rdma.git repo as mentioned above.
It has all the needed extensions which are not yet upstream.

Before starting testing, I'd also suggest opening
iptables rules to allow binding and connecting any
port from within the kernel (some distros have everything
closed):

sudo iptables -I INPUT 1 -j ACCEPT


You may also enable dynamic debugging in siw after module loading
to see what comes down to siw (do as root):

echo -n 'module siw +p' > /sys/kernel/debug/dynamic_debug/control

Since this is very verbose, don't use it when you test
data transfers ;)
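
The echo command above writes a dynamic-debug query to the control file: '+p' turns the module's debug prints on and '-p' turns them off again. The following Python sketch wraps that as a toggle. The function names are hypothetical, and the demo writes to a scratch file so it runs without root; on a real system the default path requires root and a mounted debugfs.

```python
import os
import tempfile

# Path used in the command above; requires root and a mounted debugfs.
DEFAULT_CONTROL = "/sys/kernel/debug/dynamic_debug/control"

def dyndbg_query(module, enable=True):
    """Build the dynamic-debug query string: '+p' enables, '-p' disables."""
    return f"module {module} {'+p' if enable else '-p'}"

def set_module_debug(module, enable=True, control_path=DEFAULT_CONTROL):
    """Write the query to the control file and return what was written."""
    query = dyndbg_query(module, enable)
    with open(control_path, "w") as ctl:
        ctl.write(query)
    return query

if __name__ == "__main__":
    # Dry run against a scratch file so the sketch runs without root.
    with tempfile.TemporaryDirectory() as tmp:
        scratch = os.path.join(tmp, "control")
        print(set_module_debug("siw", enable=True, control_path=scratch))
        print(set_module_debug("siw", enable=False, control_path=scratch))
```

The messages enabled this way are ordinary kernel log messages, so `dmesg` is the usual place to look for them.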

Cheers,
Bernard.

Olga Kornievskaia

Apr 15, 2019, 12:29:33 PM
to Bernard Metzler, zrlio...@googlegroups.com
Hi Bernard,

Again, I much appreciate the help you have provided so far. Let me tell
you what I have done so that you can perhaps see where I went
wrong.

1. git clone https://github.com/zrlio/softiwarp-for-linux-rdma.git and
checked out and built the siw-for-rdma-next-v7 branch (it's currently a
5.1-rc2 kernel)
2. git clone https://github.com/zrlio/softiwarp-user-for-linux-rdma and
checked out and built the siw-for-rdma-next branch.
3. export LD_LIBRARY_PATH to include the <path>/build/lib
4. insmod ./siw.ko
5. sudo iptables -F (no iptables). I also did
echo -n 'module siw +p' > /sys/kernel/debug/dynamic_debug/control
Where does the output of the debugging go: /var/log/messages, the
trace_pipe, or perhaps somewhere else? I've checked both places while
executing ibv_devinfo and I see no output.
6. sudo systemctl start rdma
7. ibv_devinfo returns no IB devices.


Perhaps you find "strace ibv_devinfo" useful

open("/sys/class/infiniband/siw0/node_type", O_RDONLY|O_CLOEXEC) = 3
read(3, "4: RNIC\n", 16) = 8
close(3) = 0
open("/dev/infiniband/uverbs0", O_RDWR|O_CLOEXEC) = 3
ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x1b, 0x01, 0x18), 0x7ffdeb198860)
= -1 ENOSPC (No space left on device)
ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x1b, 0x01, 0x18), 0x7ffdeb1987b0) = 0
ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x1b, 0x01, 0x18), 0x7ffdeb198730) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f8d1b3f5000
write(1, "hca_id:\tsiw0\n", 13hca_id: siw0
) = 13
write(1, "\ttransport:\t\t\tiWARP (1)\n", 24 transport: iWARP (1)
) = 24
write(1, "\tfw_ver:\t\t\t\t0.0.0\n", 18 fw_ver: 0.0.0
) = 18
write(1, "\tnode_guid:\t\t\t000c:293d:c685:000"..., 34 node_guid:
000c:293d:c685:0000
) = 34
write(1, "\tsys_image_guid:\t\t\t000c:293d:c68"..., 39 sys_image_guid:
000c:293d:c685:0000
) = 39
write(1, "\tvendor_id:\t\t\t0x626d74\n", 23 vendor_id: 0x626d74
) = 23
write(1, "\tvendor_part_id:\t\t\t0\n", 21 vendor_part_id: 0
) = 21
write(1, "\thw_ver:\t\t\t\t0x0\n", 16 hw_ver: 0x0
) = 16
open("/sys/class/infiniband/siw0/board_id", O_RDONLY|O_CLOEXEC) = -1
ENOENT (No such file or directory)
write(1, "\tphys_port_cnt:\t\t\t1\n", 20 phys_port_cnt: 1
) = 20
ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x1b, 0x01, 0x18), 0x7ffdeb198880) = 0
write(1, "\t\tport:\t1\n", 10 port: 1
) = 10
write(1, "\t\t\tstate:\t\t\tPORT_ACTIVE (4)\n", 28 state: PORT_ACTIVE (4)
) = 28
write(1, "\t\t\tmax_mtu:\t\t1024 (3)\n", 22 max_mtu: 1024 (3)
) = 22
write(1, "\t\t\tactive_mtu:\t\tinvalid MTU (0)\n", 32 active_mtu:
invalid MTU (0)
) = 32
write(1, "\t\t\tsm_lid:\t\t\t0\n", 15 sm_lid: 0
) = 15
write(1, "\t\t\tport_lid:\t\t0\n", 16 port_lid: 0
) = 16
write(1, "\t\t\tport_lmc:\t\t0x00\n", 19 port_lmc: 0x00
) = 19
write(1, "\t\t\tlink_layer:\t\tEthernet\n", 25 link_layer: Ethernet
) = 25
write(1, "\n", 1
) = 1
close(3) = 0
close(4) = 0
exit_group(0) = ?
+++ exited with 0 +++

[root@localhost events]# ls /sys/class/infiniband/siw0/
device node_desc node_type power sys_image_guid
fw_ver node_guid ports subsystem uevent
[root@localhost siw0]# cat node_type
4: RNIC
[root@localhost siw0]# cat node_desc
Software iWARP stack
[root@localhost siw0]# cat node_guid
000c:293d:c685:0000
[root@localhost siw0]# cat fw_ver

[root@localhost siw0]# cat uevent
NAME=siw0
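
The sysfs checks above can also be collected in one go. A minimal Python sketch, assuming the standard /sys/class/infiniband layout shown in the listing; the function names are hypothetical, and the demo builds a fake tree in a temporary directory so it runs on machines without an RDMA device.

```python
import os
import tempfile

SYSFS_IB = "/sys/class/infiniband"

def list_ib_devices(sysfs_root=SYSFS_IB):
    """Return the RDMA device names registered in sysfs (empty if none)."""
    try:
        return sorted(os.listdir(sysfs_root))
    except FileNotFoundError:
        return []

def read_ib_device(name, sysfs_root=SYSFS_IB):
    """Read a few identifying attributes of one device; None if unreadable."""
    info = {}
    for attr in ("node_type", "node_desc", "node_guid", "fw_ver"):
        try:
            with open(os.path.join(sysfs_root, name, attr)) as f:
                info[attr] = f.read().strip()
        except OSError:
            info[attr] = None
    return info

if __name__ == "__main__":
    # Fake tree with values copied from the listing above.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "siw0"))
        for attr, value in (("node_type", "4: RNIC\n"),
                            ("node_desc", "Software iWARP stack\n")):
            with open(os.path.join(root, "siw0", attr), "w") as f:
                f.write(value)
        for dev in list_ib_devices(root):
            print(dev, read_ib_device(dev, root))
```

A device that shows up here but not in ibv_devinfo points at the userspace side rather than at the kernel module.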

Bernard Metzler

Apr 16, 2019, 5:12:56 AM
to Olga Kornievskaia, zrlio...@googlegroups.com
>1. git clone https://github.com/zrlio/softiwarp-for-linux-rdma.git and
>checked out and built the siw-for-rdma-next-v7 branch (it's currently a
>5.1-rc2 kernel)
>2. git clone https://github.com/zrlio/softiwarp-user-for-linux-rdma and
>checked out and built the siw-for-rdma-next branch.

Oh, if you want to use the latest v7 branch, you would need the
matching user lib as well (branch 'siw-for-rdma-next-v7'), since
the ABI has changed. Please update your user rdma-core git repo...

From sysfs it looks like siw has attached to the Ethernet device.
With the right user lib it should be good to go.

Best,
Bernard
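
Since the kernel and user lib branches have to be paired, a quick check can catch the mismatch before building. A minimal Python sketch: the table holds only the two pairings mentioned in this thread, the function name is hypothetical, and any branch not listed is reported as unknown rather than guessed.

```python
# Kernel branch -> matching user lib branch, as stated in this thread only.
KNOWN_PAIRS = {
    "siw-for-rdma-next-v6": "siw-for-rdma-next",
    "siw-for-rdma-next-v7": "siw-for-rdma-next-v7",
}

def user_branch_matches(kernel_branch, user_branch):
    """Return True/False for known kernel branches, None if unknown."""
    expected = KNOWN_PAIRS.get(kernel_branch)
    if expected is None:
        return None
    return expected == user_branch

if __name__ == "__main__":
    # The combination from the numbered steps quoted above.
    print(user_branch_matches("siw-for-rdma-next-v7", "siw-for-rdma-next"))
```

Run as is, it flags the v7 kernel with the unsuffixed user lib branch as a mismatch.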

Olga Kornievskaia

Apr 16, 2019, 8:23:20 AM
to Bernard Metzler, zrlio...@googlegroups.com
Is there a different git url for the user lib? The one I'm using only
has 'siw-for-rdma-next' branch.
[aglo@localhost softiwarp-user-for-linux-rdma]$ pwd
/home/aglo/siw/softiwarp-user-for-linux-rdma
[aglo@localhost softiwarp-user-for-linux-rdma]$ git branch -r
origin/HEAD -> origin/master
origin/master
origin/siw-for-rdma-next

url = https://github.com/zrlio/softiwarp-user-for-linux-rdma.git

Bernard Metzler

Apr 16, 2019, 8:29:14 AM
to Olga Kornievskaia, zrlio...@googlegroups.com
...
>>
>> Oh if you want to use the latest v7 branch, you would need the
>> matching user lib as well (branch 'siw-for-rdma-next-v7'), since
>> the abi has changed. Please update your user rdma-core git repo...
>
>Is there a different git url for the user lib? The one I'm using only
>has 'siw-for-rdma-next' branch.
>[aglo@localhost softiwarp-user-for-linux-rdma]$ pwd
>/home/aglo/siw/softiwarp-user-for-linux-rdma
>[aglo@localhost softiwarp-user-for-linux-rdma]$ git branch -r
> origin/HEAD -> origin/master
> origin/master
> origin/siw-for-rdma-next
>

Please update your userlib git repo. I checked in a new v7 branch
quite soon after pushing the v7 kernel branch. You seem to have hit
the gap between these two updates?

Best
Bernard.

Olga Kornievskaia

Apr 16, 2019, 4:20:39 PM
to Bernard Metzler, zrlio...@googlegroups.com
Aha. Thank you. That did the trick. I have the same kernel (v7) and
userland (v7) now.

ibv_devinfo is working but that's it. I have tried "rping" (also tried
ib_send_bw, ibv_rc_pingpong, udaddy) and none of them work. NFSoRDMA
also doesn't work, but that's why I'm sticking with the simple RDMA
tools first to get stuff working. What tools do you recommend to show
that stuff works? Can you think of something else I failed to configure
that would explain my failures?

[aglo@localhost ~]$ ibv_devinfo
hca_id: siw0
        transport:                      iWARP (1)
        fw_ver:                         0.0.0
        node_guid:                      000c:293d:c685:0000
        sys_image_guid:                 000c:293d:c685:0000
        vendor_id:                      0x626d74
        vendor_part_id:                 0
        hw_ver:                         0x0
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                1024 (3)
                        active_mtu:             invalid MTU (0)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet
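
For repeated checks, output like the above can be parsed into fields. A minimal Python sketch, assuming a single device with a single port (later duplicate keys would overwrite earlier ones); the function names are hypothetical and the sample text is an excerpt of the output above. Whether the "invalid MTU" value matters for the failures is not established here; the sketch only flags it.

```python
SAMPLE = """\
hca_id: siw0
transport: iWARP (1)
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 1024 (3)
active_mtu: invalid MTU (0)
link_layer: Ethernet
"""

def parse_ibv_devinfo(text):
    """Parse 'key: value' lines into a flat dict (single device assumed)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def port_warnings(info):
    """Return human-readable notes about fields that look off."""
    warnings = []
    if not info.get("state", "").startswith("PORT_ACTIVE"):
        warnings.append("port is not active")
    if "invalid" in info.get("active_mtu", ""):
        warnings.append("active_mtu reported as invalid")
    return warnings

if __name__ == "__main__":
    parsed = parse_ibv_devinfo(SAMPLE)
    print(parsed["hca_id"], parsed["transport"], port_warnings(parsed))
```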

rping just hangs (no data is exchanged). Server: "rping -s -a
192.168.1.7 -v -C 3"; client: "rping -c -a 192.168.1.7 -v -C 3".

The network trace has some iWARP messages (frame 51 (call), 53 (reply),
55 (call)). The client sends last and doesn't hear a reply; I gather
that's why it hangs:

kolga-mac-0:Downloads aglo$ tshark -Y "ip.addr == 192.168.1.7" -r 1.pcap
48 15.125869 192.168.1.5 → 192.168.1.7 TCP 74 36670 → 7174 [SYN] Seq=1133900989 Win=64240 Len=0 MSS=1460 SACK_PERM=1 TSval=1412835379 TSecr=0 WS=128
49 15.126154 192.168.1.7 → 192.168.1.5 TCP 74 7174 → 36670 [SYN, ACK] Seq=2996329872 Ack=1133900990 Win=65160 Len=0 MSS=1460 SACK_PERM=1 TSval=1154543397 TSecr=1412835379 WS=128
50 15.126343 192.168.1.5 → 192.168.1.7 TCP 66 36670 → 7174 [ACK] Seq=1133900990 Ack=2996329873 Win=64256 Len=0 TSval=1412835379 TSecr=1154543397
51 15.126427 192.168.1.5 → 192.168.1.7 MPA 90 36670 > 7174 MPA Request Frame
52 15.126671 192.168.1.7 → 192.168.1.5 TCP 66 7174 → 36670 [ACK] Seq=2996329873 Ack=1133901014 Win=65152 Len=0 TSval=1154543397 TSecr=1412835380
53 15.128533 192.168.1.7 → 192.168.1.5 MPA 90 7174 > 36670 MPA Reply Frame
54 15.129113 192.168.1.5 → 192.168.1.7 TCP 66 36670 → 7174 [ACK] Seq=1133901014 Ack=2996329897 Win=64256 Len=0 TSval=1412835382 TSecr=1154543399
55 15.130667 192.168.1.5 → 192.168.1.7 DDP/RDMA 86 36670 > 7174 Write [last DDP segment]
56 15.131123 192.168.1.7 → 192.168.1.5 TCP 66 7174 → 36670 [ACK] Seq=2996329897 Ack=1133901034 Win=65152 Len=0 TSval=1154543402 TSecr=1412835384
372 63.093312 192.168.1.5 → 192.168.1.7 TCP 66 36670 → 7174 [FIN, ACK] Seq=1133901034 Ack=2996329897 Win=64256 Len=0 TSval=1412883346 TSecr=1154543402
373 63.094275 192.168.1.7 → 192.168.1.5 TCP 66 7174 → 36670 [FIN, ACK] Seq=2996329897 Ack=1133901035 Win=65152 Len=0 TSval=1154591365 TSecr=1412883346
374 63.094467 192.168.1.5 → 192.168.1.7 TCP 66 36670 → 7174 [ACK] Seq=1133901035 Ack=2996329898 Win=64256 Len=0 TSval=1412883348 TSecr=1154591365

Bernard Metzler

Apr 17, 2019, 4:43:16 AM
to Olga Kornievskaia, zrlio...@googlegroups.com
Could you please switch on debugging in siw and let me know what
the console shows?

What is your setup? (1) siw <-> siw or (2) siw <-> rnic/hardware

If (2) - is siw on the client or the server side?

The wireshark output doesn't show me the MPA options
selected. If it is setup (2), maybe the hardware settings
are incompatible with the siw settings? These are statically selected in
siw_main.c:

...
/* We try to negotiate CRC on, if true */
const bool mpa_crc_required;

/* MPA CRC on/off enforced */
const bool mpa_crc_strict;

/* Control TCP_NODELAY socket option */
const bool siw_tcp_nagle;

/* Select MPA version to be used during connection setup */
u_char mpa_version = MPA_REVISION_2;

/* Selects MPA P2P mode (additional handshake during connection
* setup, if true.
*/
const bool peer_to_peer;
...

So, CRC is set to off, but if the peer insists, it
gets used.

MPA is set to version 2.

Peer-to-peer mode MPA is set to off.
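
To see which MPA options a peer actually put on the wire, the fixed part of an MPA request/reply frame can be decoded from the TCP payload. A minimal Python sketch following the header layout in RFC 5044 (16-byte key, one flags byte with M/C/R bits, revision byte, 16-bit private data length). The function name is hypothetical, the demo frame is synthetic and not taken from the trace above, and the sketch does not decode the private data that follows the header.

```python
import struct

MPA_KEY_REQ = b"MPA ID Req Frame"
MPA_KEY_REP = b"MPA ID Rep Frame"

def parse_mpa_header(payload):
    """Decode the 20-byte MPA request/reply header from a TCP payload."""
    if len(payload) < 20:
        raise ValueError("MPA request/reply header needs at least 20 bytes")
    key = payload[:16]
    if key not in (MPA_KEY_REQ, MPA_KEY_REP):
        raise ValueError("not an MPA request/reply frame")
    flags, revision, pd_length = struct.unpack("!BBH", payload[16:20])
    return {
        "kind": "request" if key == MPA_KEY_REQ else "reply",
        "markers": bool(flags & 0x80),   # M bit
        "crc": bool(flags & 0x40),       # C bit
        "reject": bool(flags & 0x20),    # R bit
        "revision": revision,
        "private_data_length": pd_length,
    }

if __name__ == "__main__":
    # Synthetic example: CRC requested, revision 2, 4 bytes of private data.
    frame = MPA_KEY_REQ + struct.pack("!BBH", 0x40, 2, 4) + b"\x00" * 4
    print(parse_mpa_header(frame))
```

Feeding it the payloads of frames 51 and 53 would show whether both sides agree on CRC and revision.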

I see the client sends an RDMA WRITE. This is not normal
rping behavior (which starts with an RDMA Send handshake).
Maybe the client side is in peer2peer mode and expects a reply
from the peer side?
That might be true if we have a setup with RNIC hardware at
one side in peer2peer mode, and the hardware not correctly
negotiating down from peer-to-peer mode if the siw peer
refuses that mode?

You can try rebuilding siw with
const bool peer_to_peer = true;

in siw_main.c

If that works, and the peer is a hardware rnic, let me know what
hardware you are using ;)

Thanks,
Bernard.