[syzbot] unregister_netdevice: waiting for DEV to become free (7)

25 views
Skip to first unread message

syzbot

unread,
Nov 18, 2022, 6:39:43 AM11/18/22
to linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 9c8774e629a1 net: eql: Use kzalloc instead of kmalloc/memset
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17bf6cc8f00000
kernel config: https://syzkaller.appspot.com/x/.config?x=9eb259db6b1893cf
dashboard link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1136d592f00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1193ae64f00000

Bisection is inconclusive: the issue happens on the oldest tested release.

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=167c33a2f00000
final oops: https://syzkaller.appspot.com/x/report.txt?x=157c33a2f00000
console output: https://syzkaller.appspot.com/x/log.txt?x=117c33a2f00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5e70d0...@syzkaller.appspotmail.com

iwpm_register_pid: Unable to send a nlmsg (client = 2)
infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
unregister_netdevice: waiting for vlan0 to become free. Usage count = 2


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Dmitry Vyukov

unread,
Nov 18, 2022, 8:29:05 AM11/18/22
to syzbot, Jason Gunthorpe, Leon Romanovsky, chenzh...@huawei.com, RDMA mailing list, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
On Fri, 18 Nov 2022 at 12:39, syzbot
<syzbot+5e70d0...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 9c8774e629a1 net: eql: Use kzalloc instead of kmalloc/memset
> git tree: net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=17bf6cc8f00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=9eb259db6b1893cf
> dashboard link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1136d592f00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1193ae64f00000
>
> Bisection is inconclusive: the issue happens on the oldest tested release.
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=167c33a2f00000
> final oops: https://syzkaller.appspot.com/x/report.txt?x=157c33a2f00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=117c33a2f00000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+5e70d0...@syzkaller.appspotmail.com
>
> iwpm_register_pid: Unable to send a nlmsg (client = 2)
> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
> unregister_netdevice: waiting for vlan0 to become free. Usage count = 2

+RDMA maintainers

There are 4 reproducers and all contain:

r0 = socket$nl_rdma(0x10, 0x3, 0x14)
sendmsg$RDMA_NLDEV_CMD_NEWLINK(...)

Also the preceding print looks related (a bug in the error handling
path there?):

infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98

Jason Gunthorpe

unread,
Nov 21, 2022, 9:13:16 PM11/21/22
to Dmitry Vyukov, syzbot, Leon Romanovsky, chenzh...@huawei.com, RDMA mailing list, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com, Zhu Yanjun, Bob Pearson
I'm pretty sure it is an rxe bug

ib_device_set_netdev() will hold the netdev until the caller destroys
the ib_device

rxe calls it during rxe_register_device() because the user asked for a
stacked ib_device on top of the netdev

Presumably rxe needs to have a notifier to also self destroy the rxe
device if the underlying net device is to be destroyed?

Can someone from rxe check into this?

Jason

wangyufen

unread,
Nov 21, 2022, 10:28:32 PM11/21/22
to Jason Gunthorpe, Dmitry Vyukov, syzbot, Leon Romanovsky, chenzh...@huawei.com, RDMA mailing list, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com, Zhu Yanjun, Bob Pearson
The following patch may fix the issue:

--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -4049,6 +4049,9 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
return 0;
err:
id_priv->backlog = 0;
+ if (id_priv->cma_dev)
+ cma_release_dev(id_priv);
+
/*
* All the failure paths that lead here will not allow the
req_handler's
* to have run.



The causes are as follows:

rdma_listen()
rdma_bind_addr()
cma_acquire_dev_by_src_ip()
cma_attach_to_dev()
_cma_attach_to_dev()
cma_dev_get()

cma_check_port()
<--The return value is not zero, goto err

err:
<-- The error handling here is missing the operation of cma_release_dev.

>
> Jason

Guoqing Jiang

unread,
Nov 23, 2022, 4:45:58 AM11/23/22
to wangyufen, Jason Gunthorpe, Dmitry Vyukov, syzbot, Leon Romanovsky, chenzh...@huawei.com, RDMA mailing list, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com, Zhu Yanjun, Bob Pearson
But it is the caller's responsibility to destroy it since commit
dd37d2f59eb8.

> The causes are as follows:
>
> rdma_listen()
>   rdma_bind_addr()
>     cma_acquire_dev_by_src_ip()
>       cma_attach_to_dev()
>         _cma_attach_to_dev()
>           cma_dev_get()

Thanks for the analysis.

And for the two callers of cma_listen_on_dev, looks they have
different behaviors with regard to handling failure.

1. cma_listen_on_all which calls both
            list_del_init(&to_destroy->device_item)
    and
            rdma_destroy_id(&to_destroy->id)

2. cma_add_one invokes cma_process_remove to delete to_destroy,
cma_process_remove call both list_del_init(&id_priv->listen_item)
and list_del_init(&id_priv->device_item), but it doesn't call
rdma_destroy_id(&dev_id_priv->id) which is also different with
_cma_cancel_listens.

I am wondering if this is needed.

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc2222b85c88..48e283d1389b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -5231,6 +5231,7 @@ static void cma_process_remove(struct cma_device
*cma_dev)
                cma_id_get(id_priv);
                mutex_unlock(&lock);

+               rdma_destroy_id(&dev_id_priv->id);
                cma_send_device_removal_put(id_priv);

                mutex_lock(&lock);

Thanks,
Guoqing

Jason Gunthorpe

unread,
Nov 23, 2022, 7:22:38 PM11/23/22
to Guoqing Jiang, Bernard Metzler, wangyufen, Dmitry Vyukov, syzbot, Leon Romanovsky, chenzh...@huawei.com, RDMA mailing list, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com, Zhu Yanjun, Bob Pearson
On Wed, Nov 23, 2022 at 05:45:53PM +0800, Guoqing Jiang wrote:
> But it is the caller's responsibility to destroy it since commit
> dd37d2f59eb8.
>
> > The causes are as follows:
> >
> > rdma_listen()
> >   rdma_bind_addr()
> >     cma_acquire_dev_by_src_ip()
> >       cma_attach_to_dev()
> >         _cma_attach_to_dev()
> >           cma_dev_get()
>
> Thanks for the analysis.
>
> And for the two callers of cma_listen_on_dev, looks they have
> different behaviors with regard to handling failure.

Yes, the CM is not the problem, and that print from it is unrelated

I patched in netdevice_tracker and get this:

[ 237.475070][ T7541] unregister_netdevice: waiting for vlan0 to become free. Usage count = 2
[ 237.477311][ T7541] leaked reference.
[ 237.478378][ T7541] ib_device_set_netdev+0x266/0x730
[ 237.479848][ T7541] siw_newlink+0x4e0/0xfd0
[ 237.481100][ T7541] nldev_newlink+0x35c/0x5c0
[ 237.482121][ T7541] rdma_nl_rcv_msg+0x36d/0x690
[ 237.483312][ T7541] rdma_nl_rcv+0x2ee/0x430
[ 237.484483][ T7541] netlink_unicast+0x543/0x7f0
[ 237.485746][ T7541] netlink_sendmsg+0x918/0xe20
[ 237.486866][ T7541] sock_sendmsg+0xcf/0x120
[ 237.488006][ T7541] ____sys_sendmsg+0x70d/0x8b0
[ 237.489294][ T7541] ___sys_sendmsg+0x11d/0x1b0
[ 237.490404][ T7541] __sys_sendmsg+0xfa/0x1d0
[ 237.491451][ T7541] do_syscall_64+0x35/0xb0
[ 237.492566][ T7541] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Which seems to confirm my original prediction, except this is siw not
rxe..

Maybe rxe was the wrong guess, or maybe it is troubled too in other
reports?

Jason

wangyufen

unread,
Nov 23, 2022, 8:43:01 PM11/23/22
to Jason Gunthorpe, Guoqing Jiang, Bernard Metzler, Dmitry Vyukov, syzbot, Leon Romanovsky, chenzh...@huawei.com, RDMA mailing list, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com, Zhu Yanjun, Bob Pearson


在 2022/11/24 8:22, Jason Gunthorpe 写道:
> On Wed, Nov 23, 2022 at 05:45:53PM +0800, Guoqing Jiang wrote:
>> But it is the caller's responsibility to destroy it since commit
>> dd37d2f59eb8.
>>
>>> The causes are as follows:
>>>
>>> rdma_listen()
>>>   rdma_bind_addr()
>>>     cma_acquire_dev_by_src_ip()
>>>       cma_attach_to_dev()
>>>         _cma_attach_to_dev()
>>>           cma_dev_get()
>>
>> Thanks for the analysis.
>>
>> And for the two callers of cma_listen_on_dev, looks they have
>> different behaviors with regard to handling failure.
>
> Yes, the CM is not the problem, and that print from it is unrelated
>
Yes, I misanalyzed earlier.

> I patched in netdevice_tracker and get this:
>
> [ 237.475070][ T7541] unregister_netdevice: waiting for vlan0 to become free. Usage count = 2
> [ 237.477311][ T7541] leaked reference.
> [ 237.478378][ T7541] ib_device_set_netdev+0x266/0x730
> [ 237.479848][ T7541] siw_newlink+0x4e0/0xfd0
> [ 237.481100][ T7541] nldev_newlink+0x35c/0x5c0
> [ 237.482121][ T7541] rdma_nl_rcv_msg+0x36d/0x690
> [ 237.483312][ T7541] rdma_nl_rcv+0x2ee/0x430
> [ 237.484483][ T7541] netlink_unicast+0x543/0x7f0
> [ 237.485746][ T7541] netlink_sendmsg+0x918/0xe20
> [ 237.486866][ T7541] sock_sendmsg+0xcf/0x120
> [ 237.488006][ T7541] ____sys_sendmsg+0x70d/0x8b0
> [ 237.489294][ T7541] ___sys_sendmsg+0x11d/0x1b0
> [ 237.490404][ T7541] __sys_sendmsg+0xfa/0x1d0
> [ 237.491451][ T7541] do_syscall_64+0x35/0xb0
> [ 237.492566][ T7541] entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Which seems to confirm my original prediction, except this is siw not
> rxe..
>
Rxe dose not have this issue, maybe because it does not support vlan dev.

Tetsuo Handa

unread,
Mar 25, 2023, 12:02:47 PM3/25/23
to Jason Gunthorpe, Leon Romanovsky, chenzh...@huawei.com, Zhu Yanjun, Bob Pearson, Guoqing Jiang, Bernard Metzler, linux...@vger.kernel.org, wangyufen, Dmitry Vyukov, syzbot+5e70d0...@syzkaller.appspotmail.com, syzkaller-bugs
I made a simplified reproducer, using
https://syzkaller.appspot.com/text?tag=ReproC&x=1193ae64f00000 as a base
with main() rewritten like below.

----------
int main(void)
{
unshare(CLONE_NEWNET);
initialize_netdevices();
const int fd1 = socket(AF_NETLINK, SOCK_RAW, NETLINK_RDMA);
struct iovec iov = {
.iov_base = "\x38\x00\x00\x00\x03\x14\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x09"
"\x00\x02\x00\x73\x79\x6a\x31\x00\x00\x00\x00\x08\x00\x41\x00\x73\x69"
"\x77\x00\x14\x00\x33\x00\x76\x6c\x61\x6e\x30",
.iov_len = 0x38
};
struct msghdr msg = {
.msg_iov = &iov,
.msg_iovlen = 1
};
sendmsg(fd1, &msg, 0);
if (0) { /* Enable this block if you want to observe hung without exit(0). */
const int fd2 = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
struct /* vlan_ioctl_args */ {
int cmd;
char device1[24];
union {
char device2[24];
int VID;
unsigned int skb_priority;
unsigned int name_type;
unsigned int bind_type;
unsigned int flag;
} u;
short vlan_qos;
} args = {
.cmd = 1 /* DEL_VLAN_CMD */,
.device1 = "vlan0",
};
ioctl(fd2, 0x8982 /* SIOCGIFVLAN */, &args);
}
return 0;
}
----------

Using this reproducer, I did several tests.

This problem is reproduced with below diff (which pretends as if e.g.
cma_add_one(), ib_uverbs_add_one(), srp_add_one() failed with -ENOMEM
error because they do GFP_KERNEL allocation).

----------
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index a666847bd714..f8f593bd81e5 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -710,23 +710,23 @@ static int add_client_context(struct ib_device *device,
*/
if (xa_get_mark(&device->client_data, client->client_id,
CLIENT_DATA_REGISTERED))
goto out;

ret = xa_err(xa_store(&device->client_data, client->client_id, NULL,
GFP_KERNEL));
if (ret)
goto out;
downgrade_write(&device->client_data_rwsem);
if (client->add) {
- if (client->add(device)) {
+ if (1) {
/*
* If a client fails to add then the error code is
* ignored, but we won't call any more ops on this
* client.
*/
xa_erase(&device->client_data, client->client_id);
up_read(&device->client_data_rwsem);
ib_device_put(device);
ib_client_put(client);
return 0;
}
----------

Since client->add() can fail with -ENOMEM, pretending as if
client->add() always fails with -ENOMEM hides this problem.

----------
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index a666847bd714..4c986f62514f 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -695,6 +695,9 @@ static int add_client_context(struct ib_device *device,
if (!device->kverbs_provider && !client->no_kverbs_req)
return 0;

+ if (client->add)
+ return -ENOMEM;
+
down_write(&device->client_data_rwsem);
/*
* So long as the client is registered hold both the client and device
----------

However, changing to return 0 instead of -ENOMEM again shows this problem.

----------
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index a666847bd714..9dcf712e542d 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -695,6 +695,9 @@ static int add_client_context(struct ib_device *device,
if (!device->kverbs_provider && !client->no_kverbs_req)
return 0;

+ if (client->add)
+ return 0;
+
down_write(&device->client_data_rwsem);
/*
* So long as the client is registered hold both the client and device
----------

These results suggest that the "If a client fails to add then the error code
is ignored" part is wrong. We need to teach the caller about the failure, even
if we don't call any more ops on this client.

Please re-evaluate commit 11a0ae4c4bff ("RDMA: Allow ib_client's to fail when
add() is called"). I don't know why that commit ignores failures. There could
be some error code which should be ignored, but I feel that there are error
codes which should not be ignored. What do you think?




From 3e1cd08cb206afa827d196e4730ba9c3670a7fe7 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Date: Sun, 26 Mar 2023 00:34:05 +0900
Subject: [PATCH] RDMA: don't ignore client->add() failures

syzbot is reporting refcount leak when client->add() from
add_client_context() fails, for add_client_context() is ignoring error from
client->add(). We need to return an error in order to indicate that client
could not be added, or the caller will get confused.

Reported-by: syzbot <syzbot+5e70d0...@syzkaller.appspotmail.com>
Link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
---
drivers/infiniband/core/device.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index a666847bd714..c72f810ceae1 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -718,17 +718,17 @@ static int add_client_context(struct ib_device *device,
goto out;
downgrade_write(&device->client_data_rwsem);
if (client->add) {
- if (client->add(device)) {
+ ret = client->add(device);
+ if (ret) {
/*
- * If a client fails to add then the error code is
- * ignored, but we won't call any more ops on this
- * client.
+ * If a client fails to add, we won't call any more
+ * ops on this client.
*/
xa_erase(&device->client_data, client->client_id);
up_read(&device->client_data_rwsem);
ib_device_put(device);
ib_client_put(client);
- return 0;
+ return ret;
}
}

--
2.34.1

Tetsuo Handa

unread,
Apr 8, 2023, 10:08:38 AM4/8/23
to syzbot, syzkaller-bugs
Reply all
Reply to author
Forward
0 new messages