[v5.15] INFO: rcu detected stall in __do_softirq

0 views
Skip to first unread message

syzbot

unread,
Oct 12, 2023, 7:34:47 AM10/12/23
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 02e21884dcf2 Linux 5.15.135
git tree: linux-5.15.y
console output: https://syzkaller.appspot.com/x/log.txt?x=17076f95680000
kernel config: https://syzkaller.appspot.com/x/.config?x=e4cc6ae646bdb7fa
dashboard link: https://syzkaller.appspot.com/bug?extid=82bb8c294e6a20f1a15e
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/070649b789aa/disk-02e21884.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b4dc88e68ad3/vmlinux-02e21884.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3bf6aaa2a543/bzImage-02e21884.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+82bb8c...@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-0 rcu_node (CPUs 0-1): P5864/1:b..l
(detected by 1, t=10506 jiffies, g=219225, q=106)
task:kworker/u4:22 state:R running task stack:22712 pid: 5864 ppid: 2 flags:0x00004000
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5026 [inline]
__schedule+0x12c4/0x45b0 kernel/sched/core.c:6372
preempt_schedule_irq+0xf7/0x1c0 kernel/sched/core.c:6776
irqentry_exit+0x53/0x80 kernel/entry/common.c:426
asm_sysvec_reschedule_ipi+0x16/0x20 arch/x86/include/asm/idtentry.h:643
RIP: 0010:batadv_iv_ogm_schedule_buff net/batman-adv/bat_iv_ogm.c:844 [inline]
RIP: 0010:batadv_iv_ogm_schedule+0x943/0x1000 net/batman-adv/bat_iv_ogm.c:869
Code: 20 ec 7d 91 49 8d 5f 70 48 89 d8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df 80 3c 08 00 74 08 48 89 df e8 50 c6 d4 f7 48 8b 1b <48> b8 00 00 00 00 00 fc ff df 48 8b 4c 24 78 80 3c 01 00 74 08 4c
RSP: 0018:ffffc9000495fa60 EFLAGS: 00000246
RAX: 1ffff110080af10e RBX: 0000000000000000 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffffffff8ad87960 RDI: ffffffff8ad87920
RBP: ffffc9000495fb98 R08: ffffffff89f50ce4 R09: fffffbfff1f7b221
R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888075d6a870
R13: ffff88802a7181c0 R14: ffffffff917dec20 R15: ffff888040578800
batadv_iv_send_outstanding_bat_ogm_packet+0x6fa/0x800 net/batman-adv/bat_iv_ogm.c:1713
process_one_work+0x8a1/0x10c0 kernel/workqueue.c:2310
worker_thread+0xaca/0x1280 kernel/workqueue.c:2457
kthread+0x3f6/0x4f0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
</TASK>
rcu: rcu_preempt kthread timer wakeup didn't happen for 10505 jiffies! g219225 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: Possible timer handling issue on cpu=0 timer-softirq=162818
rcu: rcu_preempt kthread starved for 10506 jiffies! g219225 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:I stack:26200 pid: 15 ppid: 2 flags:0x00004000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5026 [inline]
__schedule+0x12c4/0x45b0 kernel/sched/core.c:6372
schedule+0x11b/0x1f0 kernel/sched/core.c:6455
schedule_timeout+0x1b9/0x300 kernel/time/timer.c:1884
rcu_gp_fqs_loop+0x2af/0xf70 kernel/rcu/tree.c:1959
rcu_gp_kthread+0xa4/0x360 kernel/rcu/tree.c:2132
kthread+0x3f6/0x4f0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
</TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 8901 Comm: syz-executor.2 Not tainted 5.15.135-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
RIP: 0010:arch_atomic_read arch/x86/include/asm/atomic.h:29 [inline]
RIP: 0010:atomic_read include/linux/atomic/atomic-instrumented.h:28 [inline]
RIP: 0010:queued_spin_is_locked include/asm-generic/qspinlock.h:28 [inline]
RIP: 0010:debug_spin_unlock kernel/locking/spinlock_debug.c:100 [inline]
RIP: 0010:do_raw_spin_unlock+0x5f/0x8b0 kernel/locking/spinlock_debug.c:140
Code: 0f 85 2e 04 00 00 81 3a ad 4e ad de 0f 85 01 01 00 00 4c 89 ff be 04 00 00 00 e8 ec df 65 00 4c 89 f8 48 c1 e8 03 0f b6 04 18 <84> c0 0f 85 24 04 00 00 41 83 3f 00 0f 84 9a 01 00 00 4d 8d 6f 10
RSP: 0000:ffffc900000072b0 EFLAGS: 00000802
RAX: 0000000000000000 RBX: dffffc0000000000 RCX: ffffffff8163f5d4
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff888030270ae8
RBP: ffff888030270ae8 R08: dffffc0000000000 R09: ffffed100604e15e
R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888030270800
R13: ffff888030270b40 R14: dffffc0000000000 R15: ffff888030270ae8
FS: 00007f417b4276c0(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f41736f8000 CR3: 00000000233bf000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<IRQ>
__raw_spin_unlock include/linux/spinlock_api_smp.h:151 [inline]
_raw_spin_unlock+0x1a/0x40 kernel/locking/spinlock.c:186
spin_unlock include/linux/spinlock.h:403 [inline]
advance_sched+0x68b/0x940 net/sched/sch_taprio.c:765
__run_hrtimer kernel/time/hrtimer.c:1685 [inline]
__hrtimer_run_queues+0x598/0xcf0 kernel/time/hrtimer.c:1749
hrtimer_interrupt+0x392/0x980 kernel/time/hrtimer.c:1811
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1085 [inline]
__sysvec_apic_timer_interrupt+0x139/0x470 arch/x86/kernel/apic/apic.c:1102
sysvec_apic_timer_interrupt+0x3e/0xb0 arch/x86/kernel/apic/apic.c:1096
asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:stack_trace_consume_entry+0x12f/0x270 kernel/stacktrace.c:95
Code: 89 f9 4c 89 ee 4c 89 f2 48 89 33 41 8b 19 41 0f b6 04 14 84 c0 0f 85 30 01 00 00 3b 5d 00 0f 92 c0 48 83 c4 18 5b 41 5c 41 5d <41> 5e 41 5f 5d c3 44 89 c9 80 e1 07 80 c1 03 38 c1 0f 8c ed fe ff
RSP: 0000:ffffc90000007720 EFLAGS: 00000282
RAX: 0000000000000001 RBX: ffffc90000007740 RCX: 0000000000000101
RDX: dffffc0000000000 RSI: ffffffff8a400cc6 RDI: ffffc9000000782c
RBP: ffffc90000007828 R08: 000000000000000a R09: ffffc90000007830
R10: 0000000000000000 R11: dffffc0000000001 R12: ffff88807ca19dc0
R13: ffffffff816dc510 R14: 000000000000000b R15: 1ffff92000000f06
arch_stack_walk+0x101/0x140 arch/x86/kernel/stacktrace.c:27
stack_trace_save+0x113/0x1c0 kernel/stacktrace.c:122
kasan_save_stack mm/kasan/common.c:38 [inline]
kasan_set_track+0x4b/0x80 mm/kasan/common.c:46
kasan_set_free_info+0x1f/0x40 mm/kasan/generic.c:360
____kasan_slab_free+0xd8/0x120 mm/kasan/common.c:366
kasan_slab_free include/linux/kasan.h:230 [inline]
slab_free_hook mm/slub.c:1705 [inline]
slab_free_freelist_hook+0xdd/0x160 mm/slub.c:1731
slab_free mm/slub.c:3499 [inline]
kmem_cache_free+0x91/0x1f0 mm/slub.c:3515
rcu_do_batch kernel/rcu/tree.c:2510 [inline]
rcu_core+0xa15/0x1650 kernel/rcu/tree.c:2750
__do_softirq+0x3b3/0x93a kernel/softirq.c:558
invoke_softirq kernel/softirq.c:432 [inline]
__irq_exit_rcu+0x155/0x240 kernel/softirq.c:637
irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1096
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:check_preemption_disabled+0x2/0x110 lib/smp_processor_id.c:13
Code: 48 c7 c7 c0 db 92 8a e8 2c 01 00 00 65 ff 0d b1 db e6 75 5b c3 cc cc cc 48 c7 c7 20 79 d8 8a 48 c7 c6 60 79 d8 8a eb 00 41 57 <41> 56 41 54 53 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24
RSP: 0000:ffffc900049bfb10 EFLAGS: 00000282
RAX: 0000000000000002 RBX: 0000000000000001 RCX: ffff88807ca19dc0
RDX: ffff88807ca19dc0 RSI: ffffffff8ad87960 RDI: ffffffff8ad87920
RBP: ffffc900049bfc10 R08: ffffffff81b8b5ed R09: fffffbfff1f7b221
R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000014
R13: dffffc0000000000 R14: 1ffff92000937f6c R15: ffff88807a384600
rcu_dynticks_curr_cpu_in_eqs kernel/rcu/tree.c:330 [inline]
rcu_is_watching+0x11/0xa0 kernel/rcu/tree.c:1122
rcu_read_lock include/linux/rcupdate.h:697 [inline]
count_memcg_event_mm+0xa6/0x3e0 include/linux/memcontrol.h:1078
handle_mm_fault+0x161/0x5950 mm/memory.c:4834
do_user_addr_fault arch/x86/mm/fault.c:1397 [inline]
handle_page_fault arch/x86/mm/fault.c:1485 [inline]
exc_page_fault+0x271/0x740 arch/x86/mm/fault.c:1541
asm_exc_page_fault+0x22/0x30 arch/x86/include/asm/idtentry.h:568
RIP: 0033:0x7f417ce516e6
Code: f0 72 6e 48 63 cd 48 01 c1 49 39 4f 08 72 4c 8d 4d ff 85 ed 74 33 66 0f 1f 44 00 00 48 39 f0 72 1b 4d 8b 07 49 89 c1 49 29 f1 <47> 0f b6 0c 08 45 84 c9 74 08 45 88 0c 00 49 8b 47 10 48 83 c0 01
RSP: 002b:00007f417b426520 EFLAGS: 00010206
RAX: 00000000006f1001 RBX: 00007f417b4265c0 RCX: 000000000000006a
RDX: 0000000000000055 RSI: 0000000000000001 RDI: 00007f417b426660
RBP: 0000000000000102 R08: 00007f4173007000 R09: 00000000006f1000
R10: 0000000000000000 R11: 00007f417b4265d0 R12: 0000000000000001
R13: 00007f417cf134a0 R14: 0000000000000000 R15: 00007f417b426660
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

unread,
Feb 8, 2024, 7:45:15 AMFeb 8
to syzkaller...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.
Reply all
Reply to author
Forward
0 new messages