[PATCH v2 0/2] netrom: fix deadlock and refcount leak in nr_rt_device_down

1 view
Skip to first unread message

Junjie Cao

unread,
4:09 AM (10 hours ago) 4:09 AM
to pab...@redhat.com, da...@davemloft.net, edum...@google.com, ku...@kernel.org, syzbot+14afda...@syzkaller.appspotmail.com, ho...@kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, sta...@vger.kernel.org, junji...@intel.com
Hi,

syzbot reported a circular locking dependency in the NET/ROM routing
code involving nr_neigh_list_lock, nr_node_list_lock and
nr_node->node_lock when nr_rt_device_down() interacts with the
ioctl path. This series fixes that deadlock and also addresses a
long-standing reference count leak found while auditing the same
code.

Patch 1/2 refactors nr_rt_device_down() to avoid nested locking
between nr_neigh_list_lock and nr_node_list_lock by doing two
separate passes over nodes and neighbours, and adjusts nr_rt_free()
to follow the same lock ordering.

Patch 2/2 fixes a per-route reference count leak by dropping
nr_neigh->count and calling nr_neigh_put() when removing routes
from nr_rt_device_down(), mirroring the behaviour of
nr_dec_obs()/nr_del_node().

[1] https://syzkaller.appspot.com/bug?extid=14afda08dc3484d5db82

Thanks,
Junjie

Junjie Cao

unread,
4:09 AM (10 hours ago) 4:09 AM
to pab...@redhat.com, da...@davemloft.net, edum...@google.com, ku...@kernel.org, syzbot+14afda...@syzkaller.appspotmail.com, ho...@kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, sta...@vger.kernel.org, junji...@intel.com
syzbot reported a circular locking dependency involving
nr_neigh_list_lock, nr_node_list_lock and nr_node->node_lock in the
NET/ROM routing code [1].

One of the problematic scenarios looks like this:

CPU0 CPU1
---- ----
nr_rt_device_down() nr_rt_ioctl()
lock(nr_neigh_list_lock); nr_del_node()
... lock(nr_node_list_lock);
lock(nr_node_list_lock); nr_remove_neigh();
lock(nr_neigh_list_lock);

This creates the following lock chain:

nr_neigh_list_lock -> nr_node_list_lock -> &nr_node->node_lock

while the ioctl path may acquire the locks in the opposite order via
nr_dec_obs()/nr_del_node(), which makes lockdep complain about a
possible deadlock.

Refactor nr_rt_device_down() to avoid nested locking of
nr_neigh_list_lock and nr_node_list_lock. The function now performs
two separate passes: one that walks all nodes under nr_node_list_lock
and drops routes, and a second one that removes unused neighbours
under nr_neigh_list_lock.

Also adjust nr_rt_free() to acquire nr_node_list_lock before
nr_neigh_list_lock so that the global lock ordering remains
consistent.

[1] https://syzkaller.appspot.com/bug?extid=14afda08dc3484d5db82

Reported-by: syzbot+14afda...@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=14afda08dc3484d5db82
Tested-by: syzbot+14afda...@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: sta...@vger.kernel.org
Signed-off-by: Junjie Cao <junji...@intel.com>
---
net/netrom/nr_route.c | 65 ++++++++++++++++++++++---------------------
1 file changed, 33 insertions(+), 32 deletions(-)

diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..20aacfdfccd4 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -508,40 +508,41 @@ void nr_rt_device_down(struct net_device *dev)
{
struct nr_neigh *s;
struct hlist_node *nodet, *node2t;
- struct nr_node *t;
+ struct nr_node *t;
int i;

- spin_lock_bh(&nr_neigh_list_lock);
- nr_neigh_for_each_safe(s, nodet, &nr_neigh_list) {
- if (s->dev == dev) {
- spin_lock_bh(&nr_node_list_lock);
- nr_node_for_each_safe(t, node2t, &nr_node_list) {
- nr_node_lock(t);
- for (i = 0; i < t->count; i++) {
- if (t->routes[i].neighbour == s) {
- t->count--;
-
- switch (i) {
- case 0:
- t->routes[0] = t->routes[1];
- fallthrough;
- case 1:
- t->routes[1] = t->routes[2];
- break;
- case 2:
- break;
- }
- }
- }
+ spin_lock_bh(&nr_node_list_lock);
+ nr_node_for_each_safe(t, node2t, &nr_node_list) {
+ nr_node_lock(t);
+ for (i = 0; i < t->count; i++) {
+ s = t->routes[i].neighbour;
+ if (s->dev == dev) {
+ t->count--;

- if (t->count <= 0)
- nr_remove_node_locked(t);
- nr_node_unlock(t);
+ switch (i) {
+ case 0:
+ t->routes[0] = t->routes[1];
+ fallthrough;
+ case 1:
+ t->routes[1] = t->routes[2];
+ break;
+ case 2:
+ break;
+ }
+ i--;
}
- spin_unlock_bh(&nr_node_list_lock);
+ }

+ if (t->count <= 0)
+ nr_remove_node_locked(t);
+ nr_node_unlock(t);
+ }
+ spin_unlock_bh(&nr_node_list_lock);
+
+ spin_lock_bh(&nr_neigh_list_lock);
+ nr_neigh_for_each_safe(s, nodet, &nr_neigh_list) {
+ if (s->dev == dev)
nr_remove_neigh_locked(s);
- }
}
spin_unlock_bh(&nr_neigh_list_lock);
}
@@ -962,23 +963,23 @@ const struct seq_operations nr_neigh_seqops = {
void nr_rt_free(void)
{
struct nr_neigh *s = NULL;
- struct nr_node *t = NULL;
+ struct nr_node *t = NULL;
struct hlist_node *nodet;

- spin_lock_bh(&nr_neigh_list_lock);
spin_lock_bh(&nr_node_list_lock);
+ spin_lock_bh(&nr_neigh_list_lock);
nr_node_for_each_safe(t, nodet, &nr_node_list) {
nr_node_lock(t);
nr_remove_node_locked(t);
nr_node_unlock(t);
}
nr_neigh_for_each_safe(s, nodet, &nr_neigh_list) {
- while(s->count) {
+ while (s->count) {
s->count--;
nr_neigh_put(s);
}
nr_remove_neigh_locked(s);
}
- spin_unlock_bh(&nr_node_list_lock);
spin_unlock_bh(&nr_neigh_list_lock);
+ spin_unlock_bh(&nr_node_list_lock);
}
--
2.43.0

Junjie Cao

unread,
4:09 AM (10 hours ago) 4:09 AM
to pab...@redhat.com, da...@davemloft.net, edum...@google.com, ku...@kernel.org, syzbot+14afda...@syzkaller.appspotmail.com, ho...@kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, sta...@vger.kernel.org, junji...@intel.com
When a device goes down, nr_rt_device_down() clears the routes that
use this device. However, it fails to drop the per-route reference to
the neighbour (nr_neigh): neither nr_neigh->count nor its refcount
is decremented when the route is removed.

Mirror the behaviour of nr_dec_obs() / nr_del_node() by decrementing
nr_neigh->count and calling nr_neigh_put(), so that the neighbour
reference is properly released when a device goes down.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: sta...@vger.kernel.org
Signed-off-by: Junjie Cao <junji...@intel.com>
---
net/netrom/nr_route.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index 20aacfdfccd4..35d10985fd7a 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -517,6 +517,9 @@ void nr_rt_device_down(struct net_device *dev)
for (i = 0; i < t->count; i++) {
s = t->routes[i].neighbour;
if (s->dev == dev) {
+ s->count--;
+ nr_neigh_put(s);
+
t->count--;

switch (i) {
--
2.43.0

Reply all
Reply to author
Forward
0 new messages