[syzbot] [virt?] KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed (9)

12 views
Skip to first unread message

syzbot

unread,
Mar 4, 2025, 4:52:28 PM3/4/25
to eper...@redhat.com, jaso...@redhat.com, linux-...@vger.kernel.org, m...@redhat.com, syzkall...@googlegroups.com, virtual...@lists.linux.dev, xuan...@linux.alibaba.com
Hello,

syzbot found the following issue on:

HEAD commit: 99fa936e8e4f Merge tag 'affs-6.14-rc5-tag' of git://git.ke..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14629464580000
kernel config: https://syzkaller.appspot.com/x/.config?x=523b0e2f15224775
dashboard link: https://syzkaller.appspot.com/bug?extid=efe683d57990864b8c8e
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a28c1cb92a01/disk-99fa936e.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/229349d043c8/vmlinux-99fa936e.xz
kernel image: https://storage.googleapis.com/syzbot-assets/49768322c46d/bzImage-99fa936e.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+efe683...@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed

write to 0xffff8881025bc452 of 1 bytes by task 3288 on cpu 0:
virtqueue_enable_cb_delayed+0x42/0x3c0 drivers/virtio/virtio_ring.c:2653
start_xmit+0x230/0x1310 drivers/net/virtio_net.c:3264
__netdev_start_xmit include/linux/netdevice.h:5151 [inline]
netdev_start_xmit include/linux/netdevice.h:5160 [inline]
xmit_one net/core/dev.c:3800 [inline]
dev_hard_start_xmit+0x119/0x3f0 net/core/dev.c:3816
sch_direct_xmit+0x1a9/0x580 net/sched/sch_generic.c:343
__dev_xmit_skb net/core/dev.c:4039 [inline]
__dev_queue_xmit+0xf6a/0x2090 net/core/dev.c:4615
dev_queue_xmit include/linux/netdevice.h:3313 [inline]
neigh_hh_output include/net/neighbour.h:523 [inline]
neigh_output include/net/neighbour.h:537 [inline]
ip_finish_output2+0x71d/0x880 net/ipv4/ip_output.c:236
ip_finish_output+0x11a/0x2a0 net/ipv4/ip_output.c:324
NF_HOOK_COND include/linux/netfilter.h:303 [inline]
ip_output+0xab/0x170 net/ipv4/ip_output.c:434
dst_output include/net/dst.h:459 [inline]
ip_local_out net/ipv4/ip_output.c:130 [inline]
__ip_queue_xmit+0xb2c/0xb50 net/ipv4/ip_output.c:528
ip_queue_xmit+0x38/0x50 net/ipv4/ip_output.c:542
__tcp_transmit_skb+0x15ca/0x19d0 net/ipv4/tcp_output.c:1471
tcp_transmit_skb net/ipv4/tcp_output.c:1489 [inline]
tcp_write_xmit+0x1217/0x3020 net/ipv4/tcp_output.c:2832
__tcp_push_pending_frames+0x6a/0x1a0 net/ipv4/tcp_output.c:3015
tcp_push+0x320/0x340 net/ipv4/tcp.c:751
tcp_sendmsg_locked+0x21a1/0x26a0 net/ipv4/tcp.c:1326
tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1358
inet_sendmsg+0x77/0xd0 net/ipv4/af_inet.c:851
sock_sendmsg_nosec net/socket.c:718 [inline]
__sock_sendmsg+0x102/0x180 net/socket.c:733
sock_write_iter+0x15e/0x1a0 net/socket.c:1137
new_sync_write fs/read_write.c:586 [inline]
vfs_write+0x77b/0x920 fs/read_write.c:679
ksys_write+0xe8/0x1b0 fs/read_write.c:731
__do_sys_write fs/read_write.c:742 [inline]
__se_sys_write fs/read_write.c:739 [inline]
__x64_sys_write+0x42/0x50 fs/read_write.c:739
x64_sys_call+0x287e/0x2dc0 arch/x86/include/generated/asm/syscalls_64.h:2
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff8881025bc452 of 1 bytes by interrupt on cpu 1:
virtqueue_disable_cb_split drivers/virtio/virtio_ring.c:880 [inline]
virtqueue_disable_cb+0x92/0x180 drivers/virtio/virtio_ring.c:2566
skb_xmit_done+0x5f/0x140 drivers/net/virtio_net.c:777
vring_interrupt+0x161/0x190 drivers/virtio/virtio_ring.c:2715
__handle_irq_event_percpu+0x95/0x490 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0x64/0xf0 kernel/irq/handle.c:210
handle_edge_irq+0x16d/0x5b0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:173 [inline]
handle_irq arch/x86/kernel/irq.c:249 [inline]
call_irq_handler arch/x86/kernel/irq.c:261 [inline]
__common_interrupt+0x58/0xe0 arch/x86/kernel/irq.c:287
common_interrupt+0x7c/0x90 arch/x86/kernel/irq.c:280
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
rep_movs_alternative+0x33/0x70 arch/x86/lib/copy_user_64.S:57
copy_user_generic arch/x86/include/asm/uaccess_64.h:126 [inline]
raw_copy_from_user arch/x86/include/asm/uaccess_64.h:141 [inline]
_inline_copy_from_user include/linux/uaccess.h:178 [inline]
_copy_from_user+0x6f/0xa0 lib/usercopy.c:18
copy_from_user include/linux/uaccess.h:212 [inline]
copy_msghdr_from_user+0x54/0x2a0 net/socket.c:2503
recvmsg_copy_msghdr net/socket.c:2759 [inline]
___sys_recvmsg net/socket.c:2831 [inline]
do_recvmmsg+0x256/0x6d0 net/socket.c:2930
__sys_recvmmsg net/socket.c:3004 [inline]
__do_sys_recvmmsg net/socket.c:3027 [inline]
__se_sys_recvmmsg net/socket.c:3020 [inline]
__x64_sys_recvmmsg+0xe2/0x170 net/socket.c:3020
x64_sys_call+0x2a9a/0x2dc0 arch/x86/include/generated/asm/syscalls_64.h:300
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0x01 -> 0x00

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 4167 Comm: syz.0.259 Not tainted 6.14.0-rc5-syzkaller-00013-g99fa936e8e4f #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Zhongqiu Han

unread,
Mar 11, 2025, 9:17:57 AM3/11/25
to m...@redhat.com, jaso...@redhat.com, xuan...@linux.alibaba.com, eper...@redhat.com, syzbot+efe683...@syzkaller.appspotmail.com, virtual...@lists.linux.dev, linux-...@vger.kernel.org, quic_z...@quicinc.com, syzkall...@googlegroups.com
Syzkaller reports a data-race when accessing the event_triggered field of
vring_virtqueue in virtqueue_disable_cb / virtqueue_enable_cb_delayed.
Here is the simplified stack when the issue occurred:

==================================================================
BUG: KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed

write to 0xffff8881025bc452 of 1 bytes by task 3288 on cpu 0:
virtqueue_enable_cb_delayed+0x42/0x3c0 drivers/virtio/virtio_ring.c:2653
start_xmit+0x230/0x1310 drivers/net/virtio_net.c:3264
__netdev_start_xmit include/linux/netdevice.h:5151 [inline]
netdev_start_xmit include/linux/netdevice.h:5160 [inline]
xmit_one net/core/dev.c:3800 [inline]
dev_hard_start_xmit+0x119/0x3f0 net/core/dev.c:3816
sch_direct_xmit+0x1a9/0x580 net/sched/sch_generic.c:343
__dev_xmit_skb net/core/dev.c:4039 [inline]
__dev_queue_xmit+0xf6a/0x2090 net/core/dev.c:4615

read to 0xffff8881025bc452 of 1 bytes by interrupt on cpu 1:
virtqueue_disable_cb_split drivers/virtio/virtio_ring.c:880 [inline]
virtqueue_disable_cb+0x92/0x180 drivers/virtio/virtio_ring.c:2566
skb_xmit_done+0x5f/0x140 drivers/net/virtio_net.c:777
vring_interrupt+0x161/0x190 drivers/virtio/virtio_ring.c:2715
__handle_irq_event_percpu+0x95/0x490 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0x64/0xf0 kernel/irq/handle.c:210
handle_edge_irq+0x16d/0x5b0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:173 [inline]
handle_irq arch/x86/kernel/irq.c:249 [inline]

value changed: 0x01 -> 0x00
==================================================================

After an interrupt is triggered, event_triggered can be set to true in the
func vring_interrupt(). Then virtqueue_disable_cb_split() will read it as
true and stop further work of disabling cbs. During this time, if another
virtqueue processing sets same event_triggered to false in func
virtqueue_enable_cb_delayed(), a race condition will occur, potentially
leading to further vq data inconsistency because both
virtqueue_disable_cb_split() and virtqueue_enable_cb_delayed() can
continue read/write multiple field members of vring_virtqueue.

Fix this by using smp_load_acquire() and smp_store_release().

Additionally, virtqueue_disable_cb_packed() may be called in the same
stack as virtqueue_disable_cb_split() while vq->packed_ring is true in
func virtqueue_disable_cb(), so event_triggered should also be protected
in it.

Reported-by: syzbot+efe683...@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67c7761a.050a022...@google.com/
Signed-off-by: Zhongqiu Han <quic_z...@quicinc.com>
---
drivers/virtio/virtio_ring.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index fdd2d2b07b5a..b8ff82730618 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -875,9 +875,10 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)

/*
* If device triggered an event already it won't trigger one again:
- * no need to disable.
+ * no need to disable. smp_load_acquire pairs with smp_store_release()
+ * in virtqueue_enable_cb_delayed()
*/
- if (vq->event_triggered)
+ if (smp_load_acquire(&vq->event_triggered))
return;

if (vq->event)
@@ -1802,9 +1803,10 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)

/*
* If device triggered an event already it won't trigger one again:
- * no need to disable.
+ * no need to disable. smp_load_acquire pairs with smp_store_release()
+ * in virtqueue_enable_cb_delayed()
*/
- if (vq->event_triggered)
+ if (smp_load_acquire(&vq->event_triggered))
return;

vq->packed.vring.driver->flags =
@@ -2650,7 +2652,8 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
struct vring_virtqueue *vq = to_vvq(_vq);

if (vq->event_triggered)
- vq->event_triggered = false;
+ /* Pairs with smp_load_acquire in virtqueue_disable_cb_split/packed() */
+ smp_store_release(&vq->event_triggered, false);

return vq->packed_ring ? virtqueue_enable_cb_delayed_packed(_vq) :
virtqueue_enable_cb_delayed_split(_vq);
--
2.25.1

Jason Wang

unread,
Mar 11, 2025, 9:12:31 PM3/11/25
to Zhongqiu Han, m...@redhat.com, xuan...@linux.alibaba.com, eper...@redhat.com, syzbot+efe683...@syzkaller.appspotmail.com, virtual...@lists.linux.dev, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Do we have performance numbers for this change?

Btw event_triggered is just a hint, using barriers seems to be an overkill.

What's more the current implementation is buggy:

1) event_triggered should be only called when event idx is used
2) the assumption of device won't raise the interrupt is not ture,
this is especially obvious in the case of packed ring, when the
wrap_counter warps twice, we could still get an interrupt from the
device. This means when the virtqueue size is 256 we will get 1
unnecessary notification every 512 packets etc.

So I wonder just a data_race() hint would be more than sufficient.

Thanks

Zhongqiu Han

unread,
Mar 12, 2025, 5:44:34 AM3/12/25
to Jason Wang, m...@redhat.com, xuan...@linux.alibaba.com, eper...@redhat.com, syzbot+efe683...@syzkaller.appspotmail.com, virtual...@lists.linux.dev, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Zhongqiu Han
Thanks a lot Jason for the review and discussion.

Hence event_triggered is unreliable hint and used as an optimization,
the racy is expected.

I will use data_race for KCSAN and arise V2. Thanks
--
Thx and BRs,
Zhongqiu Han

Zhongqiu Han

unread,
Mar 12, 2025, 9:04:32 AM3/12/25
to m...@redhat.com, jaso...@redhat.com, xuan...@linux.alibaba.com, eper...@redhat.com, syzbot+efe683...@syzkaller.appspotmail.com, virtual...@lists.linux.dev, linux-...@vger.kernel.org, quic_z...@quicinc.com, syzkall...@googlegroups.com
syzbot reports a data-race when accessing the event_triggered, here is the
simplified stack when the issue occurred:

==================================================================
BUG: KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed

write to 0xffff8881025bc452 of 1 bytes by task 3288 on cpu 0:
virtqueue_enable_cb_delayed+0x42/0x3c0 drivers/virtio/virtio_ring.c:2653
start_xmit+0x230/0x1310 drivers/net/virtio_net.c:3264
__netdev_start_xmit include/linux/netdevice.h:5151 [inline]
netdev_start_xmit include/linux/netdevice.h:5160 [inline]
xmit_one net/core/dev.c:3800 [inline]

read to 0xffff8881025bc452 of 1 bytes by interrupt on cpu 1:
virtqueue_disable_cb_split drivers/virtio/virtio_ring.c:880 [inline]
virtqueue_disable_cb+0x92/0x180 drivers/virtio/virtio_ring.c:2566
skb_xmit_done+0x5f/0x140 drivers/net/virtio_net.c:777
vring_interrupt+0x161/0x190 drivers/virtio/virtio_ring.c:2715
__handle_irq_event_percpu+0x95/0x490 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]

value changed: 0x01 -> 0x00
==================================================================

When the data race occurs, the function virtqueue_enable_cb_delayed() sets
event_triggered to false, and virtqueue_disable_cb_split/packed() reads it
as false due to the race condition. Since event_triggered is an unreliable
hint used for optimization, this should only cause the driver temporarily
suggest that the device not send an interrupt notification when the event
index is used.

Fix this KCSAN reported data-race issue by explicitly tagging the access as
data_racy.
---
v1 -> v2:
- Use data_race() instead of memory barriers.
- Simplify and rewrite commit messages.
- Link to v1: https://lore.kernel.org/all/20250311131735.320...@quicinc.com/

drivers/virtio/virtio_ring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index fdd2d2b07b5a..b784aab66867 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2650,7 +2650,7 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
struct vring_virtqueue *vq = to_vvq(_vq);

if (vq->event_triggered)
- vq->event_triggered = false;
+ data_race(vq->event_triggered = false);

Jason Wang

unread,
Mar 16, 2025, 8:26:46 PM3/16/25
to Zhongqiu Han, m...@redhat.com, xuan...@linux.alibaba.com, eper...@redhat.com, syzbot+efe683...@syzkaller.appspotmail.com, virtual...@lists.linux.dev, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Acked-by: Jason Wang <jaso...@redhat.com>

Thanks

Reply all
Reply to author
Forward
0 new messages