INFO: rcu detected stall in batadv_nc_worker (3)

syzbot

Oct 1, 2020, 6:35:20 AM
to linux-...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot found the following issue on:

HEAD commit: fffe3ae0 Merge tag 'for-linus-hmm' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17e03342900000
kernel config: https://syzkaller.appspot.com/x/.config?x=226c7a97d80bec54
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+69904c...@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 0-...!: (1 GPs behind) idle=22e/1/0x4000000000000000 softirq=85361/85365 fqs=8
(detected by 1, t=10502 jiffies, g=114061, q=1511)
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 194 Comm: kworker/u4:4 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_nc_worker
RIP: 0010:arch_local_save_flags arch/x86/include/asm/paravirt.h:765 [inline]
RIP: 0010:arch_local_irq_save arch/x86/include/asm/paravirt.h:787 [inline]
RIP: 0010:__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
RIP: 0010:_raw_spin_lock_irqsave+0x3c/0xc0 kernel/locking/spinlock.c:159
Code: 03 49 bf 00 00 00 00 00 fc ff df 42 80 3c 38 00 74 0c 48 c7 c7 b0 d5 4b 89 e8 a0 97 92 f9 48 83 3d 38 61 2a 01 00 74 79 9c 58 <0f> 1f 44 00 00 49 89 c6 48 c7 c0 c0 d5 4b 89 48 c1 e8 03 42 80 3c
RSP: 0018:ffffc90000007d40 EFLAGS: 00000082
RAX: 0000000000000082 RBX: ffffffff8ba182b8 RCX: 000000000000d6e0
RDX: 0000000080010001 RSI: ffffffff894eec80 RDI: ffffffff8ba182b8
RBP: ffffffff894eec80 R08: ffffffff816542f4 R09: fffff52000000fb0
R10: fffff52000000fb0 R11: 0000000000000000 R12: dffffc0000000000
R13: 1ffff11015d04ed2 R14: dffffc0000000000 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004ddaf0 CR3: 0000000090ca1000 CR4: 00000000001506f0
DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Call Trace:
<IRQ>
debug_object_activate+0x62/0x5f0 lib/debugobjects.c:636
debug_hrtimer_activate kernel/time/hrtimer.c:416 [inline]
debug_activate kernel/time/hrtimer.c:476 [inline]
enqueue_hrtimer kernel/time/hrtimer.c:965 [inline]
__run_hrtimer kernel/time/hrtimer.c:1537 [inline]
__hrtimer_run_queues+0x510/0x930 kernel/time/hrtimer.c:1584
hrtimer_interrupt+0x373/0xd60 kernel/time/hrtimer.c:1646
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1080 [inline]
__sysvec_apic_timer_interrupt+0xf0/0x260 arch/x86/kernel/apic/apic.c:1097
asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:706
</IRQ>
__run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
sysvec_apic_timer_interrupt+0x94/0xf0 arch/x86/kernel/apic/apic.c:1091
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:581
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:770 [inline]
RIP: 0010:lock_release+0x3c4/0x750 kernel/locking/lockdep.c:5026
Code: 48 c1 e8 03 42 80 3c 28 00 74 0c 48 c7 c7 b8 d5 4b 89 e8 3f 0b 5a 00 48 83 3d df d4 f1 07 00 0f 84 5c 03 00 00 4c 89 e7 57 9d <0f> 1f 44 00 00 65 48 8b 04 25 28 00 00 00 48 3b 44 24 50 0f 85 40
RSP: 0018:ffffc90001037bf8 EFLAGS: 00000282
RAX: 1ffffffff1297ab7 RBX: 1ffff110151d2949 RCX: ffff8880a8e94180
RDX: ffff8880a8e94a48 RSI: ffffffff894ea290 RDI: 0000000000000282
RBP: 9644b5a36c253f0c R08: dffffc0000000000 R09: fffffbfff131b08e
R10: fffffbfff131b08e R11: 0000000000000000 R12: 0000000000000282
R13: dffffc0000000000 R14: dffffc0000000000 R15: ffff8880a8e94a4c
rcu_read_unlock include/linux/rcupdate.h:688 [inline]
batadv_nc_purge_orig_hash net/batman-adv/network-coding.c:411 [inline]
batadv_nc_worker+0x261/0x5c0 net/batman-adv/network-coding.c:718
process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
rcu: rcu_preempt kthread starved for 10486 jiffies! g114061 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
rcu_preempt I28608 10 2 0x00004000
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0x979/0xce0 kernel/sched/core.c:4527


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
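
The "(detected by 1, t=10502 jiffies, g=114061, q=1511)" line in the report above packs several numbers together. Following the kernel's RCU stall-warning documentation, they are read here as the detecting CPU, the jiffies elapsed since the grace period started, the grace-period sequence number, and the number of queued RCU callbacks. A minimal userspace sketch for pulling them apart is below. The struct and function names are hypothetical, and the tick rate passed to `jiffies_to_seconds()` is an assumption, because HZ is not stated anywhere in this thread. The self-detected form seen in later reports ("(t=... jiffies g=... q=... ncpus=...)") uses a different layout and is not handled.

```c
#include <stdio.h>

/* Fields of the "(detected by ...)" line in an RCU CPU stall warning.
 * Names are hypothetical; meanings follow the kernel's stallwarn docs. */
struct stall_info {
    int detector_cpu;        /* CPU that noticed the stall */
    unsigned long t_jiffies; /* jiffies elapsed since the grace period began */
    unsigned long gp_seq;    /* grace-period sequence number */
    unsigned long queued;    /* RCU callbacks queued across all CPUs */
};

/* Returns 1 if all four fields were parsed, 0 otherwise. */
int parse_stall_line(const char *line, struct stall_info *out)
{
    int n = sscanf(line, " (detected by %d, t=%lu jiffies, g=%lu, q=%lu)",
                   &out->detector_cpu, &out->t_jiffies,
                   &out->gp_seq, &out->queued);
    return n == 4;
}

/* Convert a jiffies count to seconds for a caller-supplied tick rate.
 * The tick rate (HZ) is an assumption: the thread does not state it. */
double jiffies_to_seconds(unsigned long jiffies, unsigned int hz)
{
    return (double)jiffies / (double)hz;
}
```

Feeding the line from the report to `parse_stall_line()` fills in the four fields; converting `t_jiffies` to wall-clock time additionally requires the CONFIG_HZ value from the linked kernel config.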

Dmitry Vyukov

Oct 1, 2020, 6:46:55 AM
to syzbot, marekl...@neomailbox.ch, s...@simonwunderlich.de, a...@unstable.cc, Sven Eckelmann, David Miller, Jakub Kicinski, b.a.t...@lists.open-mesh.org, netdev, LKML, syzkaller-bugs, Thomas Gleixner
On Thu, Oct 1, 2020 at 12:35 PM syzbot
<syzbot+69904c...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: fffe3ae0 Merge tag 'for-linus-hmm' of git://git.kernel.org..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=17e03342900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=226c7a97d80bec54
> dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
> compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+69904c...@syzkaller.appspotmail.com

+batadv maintainers

May be related to:
KMSAN: uninit-value in batadv_nc_worker
https://syzkaller.appspot.com/bug?extid=da9194708de785081f11

syzbot

Oct 15, 2022, 3:01:40 PM
to a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, dvy...@google.com, edum...@google.com, j...@mojatatu.com, ji...@resnulli.us, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, pab...@redhat.com, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com, tg...@linutronix.de, tonymaris...@yandex.com, xiyou.w...@gmail.com
syzbot has found a reproducer for the following issue on:

HEAD commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1623ec72880000
kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2e478880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149ca17c880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9d967e5d91fa/disk-55be6084.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9a8cffcbc089/vmlinux-55be6084.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+69904c...@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-...!: (1 GPs behind) idle=d61c/1/0x4000000000000000 softirq=5548/5551 fqs=5
(t=10501 jiffies g=4985 q=1169 ncpus=2)
rcu: rcu_preempt kthread starved for 10488 jiffies! g4985 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:R running task stack:28728 pid:17 ppid:2 flags:0x00004000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5178 [inline]
__schedule+0xadf/0x5270 kernel/sched/core.c:6490
schedule+0xda/0x1b0 kernel/sched/core.c:6566
schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1935
rcu_gp_fqs_loop+0x190/0x910 kernel/rcu/tree.c:1658
rcu_gp_kthread+0x236/0x360 kernel/rcu/tree.c:1857
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 47 Comm: kworker/u4:3 Not tainted 6.0.0-syzkaller-09589-g55be6084c8e0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
Workqueue: bat_events batadv_nc_worker
RIP: 0010:check_kcov_mode kernel/kcov.c:166 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x7/0x60 kernel/kcov.c:200
Code: 4c 00 5d be 03 00 00 00 e9 d6 43 84 02 66 0f 1f 44 00 00 48 8b be a8 01 00 00 e8 b4 ff ff ff 31 c0 c3 90 65 8b 05 f9 24 87 7e <89> c1 48 8b 34 24 81 e1 00 01 00 00 65 48 8b 14 25 80 6f 02 00 a9
RSP: 0018:ffffc900001f0c48 EFLAGS: 00000286
RAX: 0000000000000101 RBX: ffff88806b299c90 RCX: ffffffff878c4a1d
RDX: ffff888017893b00 RSI: 0000000000000100 RDI: 0000000000000007
RBP: fffffff0a3da8872 R08: 0000000000000007 R09: 0000000000000000
R10: fffffff0a3da8872 R11: 000000000008c07d R12: fffffff0a3da8872
R13: ffff888018f5ab00 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000026ef0000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
pie_calculate_probability+0x32b/0x7c0 net/sched/sch_pie.c:387
fq_pie_timer+0x170/0x2a0 net/sched/sch_fq_pie.c:380
call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1107
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:rcu_preempt_read_exit kernel/rcu/tree_plugin.h:382 [inline]
RIP: 0010:__rcu_read_unlock+0x2d/0x570 kernel/rcu/tree_plugin.h:421
Code: 55 41 54 55 65 48 8b 2c 25 80 6f 02 00 53 48 8d bd 3c 04 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f b6 14 02 <48> 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 24 02 00 00 65
RSP: 0018:ffffc90000b87c58 EFLAGS: 00000a07
RAX: dffffc0000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff891cd30e RDI: ffff888017893f3c
RBP: ffff888017893b00 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000000 R14: dffffc0000000000 R15: 0000000000000345
rcu_read_unlock include/linux/rcupdate.h:770 [inline]
batadv_nc_purge_orig_hash net/batman-adv/network-coding.c:412 [inline]
batadv_nc_worker+0x853/0xfa0 net/batman-adv/network-coding.c:719
process_one_work+0x991/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.452 msecs
CPU: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.0.0-syzkaller-09589-g55be6084c8e0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
RIP: 0010:pie_calculate_probability+0x1a5/0x7c0 net/sched/sch_pie.c:354
Code: 20 48 b8 82 be e0 12 01 00 00 00 48 89 fa 48 c1 ea 03 4c 0f af e0 48 b8 00 00 00 00 00 fc ff df 80 3c 02 00 0f 85 e4 05 00 00 <4c> 89 ea 4c 8b 7b 20 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80
RSP: 0018:ffffc90000157b40 EFLAGS: 00000246
RAX: dffffc0000000000 RBX: ffff88806b81acc0 RCX: 0000000000000100
RDX: 1ffff1100d70359c RSI: ffffffff878c480f RDI: ffff88806b81ace0
RBP: 0000000225c17d04 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000008c07d R12: 00000015798ee228
R13: ffff888017498300 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffddaac90a8 CR3: 0000000011aec000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
fq_pie_timer+0x170/0x2a0 net/sched/sch_fq_pie.c:380
call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
run_ksoftirqd kernel/softirq.c:934 [inline]
run_ksoftirqd+0x2d/0x60 kernel/softirq.c:926
smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>

Hillf Danton

Oct 15, 2022, 7:48:42 PM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 15 Oct 2022 12:01:38 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1623ec72880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
> dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2e478880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149ca17c880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9d967e5d91fa/disk-55be6084.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/9a8cffcbc089/vmlinux-55be6084.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+69904c...@syzkaller.appspotmail.com
>
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu: 0-...!: (1 GPs behind) idle=d61c/1/0x4000000000000000 softirq=5548/5551 fqs=5
> (t=10501 jiffies g=4985 q=1169 ncpus=2)
> rcu: rcu_preempt kthread starved for 10488 jiffies! g4985 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1

See if rcu read lock protects a hog.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 55be6084c8e0

--- l/net/batman-adv/network-coding.c
+++ n/net/batman-adv/network-coding.c
@@ -403,12 +403,18 @@ static void batadv_nc_purge_orig_hash(st

 	/* For each orig_node */
 	for (i = 0; i < hash->size; i++) {
+		unsigned long ts;
+
 		head = &hash->table[i];

 		rcu_read_lock();
-		hlist_for_each_entry_rcu(orig_node, head, hash_entry)
+		ts = jiffies + 20;
+		hlist_for_each_entry_rcu(orig_node, head, hash_entry) {
+			if (time_after(jiffies, ts))
+				break;
 			batadv_nc_purge_orig(bat_priv, orig_node,
 					     batadv_nc_to_purge_nc_node);
+		}
 		rcu_read_unlock();
 	}
 }
--
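
The test patch above bounds the time spent in one hash bucket by recording a deadline in jiffies and leaving the loop once `time_after()` reports the deadline has passed. A rough userspace sketch of that pattern follows. `tick_after()` mirrors the signed-difference comparison the kernel's `time_after()` uses so that it stays correct across counter wraparound. `bounded_walk()`, `tick_fn`, and `fake_clock()` are hypothetical names introduced only for this illustration and are not part of the patch.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's time_after(a, b): true if a is later
 * than b. The signed difference keeps the result correct across wraparound
 * as long as the two values are less than half the counter range apart. */
int tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Hypothetical tick source standing in for the jiffies counter. */
typedef unsigned long (*tick_fn)(void *ctx);

/* Fake clock for demonstration: advances by one tick on every read. */
unsigned long fake_clock(void *ctx)
{
    unsigned long *t = ctx;
    return (*t)++;
}

/* Visit up to n items, stopping once the clock passes start + budget.
 * Returns how many items were visited before the deadline expired. */
size_t bounded_walk(size_t n, unsigned long budget, tick_fn now, void *ctx)
{
    unsigned long deadline = now(ctx) + budget;
    size_t visited = 0;

    for (size_t i = 0; i < n; i++) {
        if (tick_after(now(ctx), deadline))
            break;          /* budget exhausted: give up on the rest */
        visited++;          /* per-item work would go here */
    }
    return visited;
}
```

As in the patch, items left unvisited when the deadline expires are simply skipped for that pass, which is acceptable for a diagnostic experiment but would need more thought in a real fix.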

syzbot

Oct 15, 2022, 8:13:21 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4077 } 2636 jiffies s: 2521 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):


Tested on:

commit: 55be6084 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=12ab7aaa880000
kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=14854b8a880000

Hillf Danton

Oct 15, 2022, 9:40:34 PM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 15 Oct 2022 12:01:38 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1623ec72880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
> dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2e478880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149ca17c880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9d967e5d91fa/disk-55be6084.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/9a8cffcbc089/vmlinux-55be6084.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+69904c...@syzkaller.appspotmail.com
>
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu: 0-...!: (1 GPs behind) idle=d61c/1/0x4000000000000000 softirq=5548/5551 fqs=5
> (t=10501 jiffies g=4985 q=1169 ncpus=2)
> rcu: rcu_preempt kthread starved for 10488 jiffies! g4985 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1

See if rcu read lock protects a hog.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 55be6084c8e0

--- l/net/batman-adv/network-coding.c
+++ n/net/batman-adv/network-coding.c
@@ -403,12 +403,18 @@ static void batadv_nc_purge_orig_hash(st

/* For each orig_node */
for (i = 0; i < hash->size; i++) {
+ unsigned long ts;
+
head = &hash->table[i];

rcu_read_lock();
- hlist_for_each_entry_rcu(orig_node, head, hash_entry)
+ ts = jiffies + 2;

syzbot

Oct 15, 2022, 11:03:22 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4088 } 2636 jiffies s: 2445 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):


Tested on:

commit: 55be6084 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=12b0a464880000
kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=136f787c880000

Hillf Danton

Oct 16, 2022, 12:48:48 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 15 Oct 2022 12:01:38 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1623ec72880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
> dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2e478880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149ca17c880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9d967e5d91fa/disk-55be6084.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/9a8cffcbc089/vmlinux-55be6084.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+69904c...@syzkaller.appspotmail.com
>
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu: 0-...!: (1 GPs behind) idle=d61c/1/0x4000000000000000 softirq=5548/5551 fqs=5
> (t=10501 jiffies g=4985 q=1169 ncpus=2)
> rcu: rcu_preempt kthread starved for 10488 jiffies! g4985 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1

See if rcu read lock protects a hog.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 55be6084c8e0

--- l/net/batman-adv/network-coding.c
+++ n/net/batman-adv/network-coding.c
@@ -341,10 +341,13 @@ batadv_nc_purge_orig_nc_nodes(struct bat
struct batadv_nc_node *))
{
struct batadv_nc_node *nc_node, *nc_node_tmp;
+ int i = 0;

/* For each nc_node in list */
spin_lock_bh(lock);
list_for_each_entry_safe(nc_node, nc_node_tmp, list, list) {
+ if (i++ == 2)
+ break;
/* if an helper function has been passed as parameter,
* ask it if the entry has to be purged or not
*/
@@ -403,12 +406,18 @@ static void batadv_nc_purge_orig_hash(st

/* For each orig_node */
for (i = 0; i < hash->size; i++) {
+ unsigned long ts;
+
head = &hash->table[i];

rcu_read_lock();
- hlist_for_each_entry_rcu(orig_node, head, hash_entry)
+ ts = jiffies + 1;

Hillf Danton

Oct 16, 2022, 7:03:10 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 15 Oct 2022 12:01:38 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1623ec72880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
> dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2e478880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149ca17c880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9d967e5d91fa/disk-55be6084.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/9a8cffcbc089/vmlinux-55be6084.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+69904c...@syzkaller.appspotmail.com
>
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu: 0-...!: (1 GPs behind) idle=d61c/1/0x4000000000000000 softirq=5548/5551 fqs=5
> (t=10501 jiffies g=4985 q=1169 ncpus=2)
> rcu: rcu_preempt kthread starved for 10488 jiffies! g4985 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1

See if pie timer is a hog.
--- l/net/sched/sch_fq_pie.c
+++ n/net/sched/sch_fq_pie.c
@@ -372,13 +372,16 @@ static void fq_pie_timer(struct timer_li
 	struct Qdisc *sch = q->sch;
 	spinlock_t *root_lock; /* to lock qdisc for probability calculations */
 	u32 idx;
+	unsigned long ts = jiffies + 2;

 	root_lock = qdisc_lock(qdisc_root_sleeping(sch));
 	spin_lock(root_lock);

-	for (idx = 0; idx < q->flows_cnt; idx++)
+	for (idx = 0; idx < q->flows_cnt; idx++) {
 		pie_calculate_probability(&q->p_params, &q->flows[idx].vars,
 					  q->flows[idx].backlog);
+		WARN_ON_ONCE(time_after(jiffies, ts));
+	}

 	/* reset the timer to fire after 'tupdate' jiffies. */
 	if (q->p_params.tupdate)
--

syzbot

Oct 16, 2022, 8:27:27 AM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4079 } 2683 jiffies s: 2553 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):


Tested on:

commit: 55be6084 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=123f4ad6880000
kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=15e1f4e6880000

syzbot

Oct 16, 2022, 8:42:24 AM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4084 } 2677 jiffies s: 2541 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):


Tested on:

commit: 55be6084 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=12c5ccb4880000
kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=131f08aa880000

syzbot

Oct 16, 2022, 1:34:24 PM
to a...@unstable.cc, alsa-...@alsa-project.org, b.a.t...@lists.open-mesh.org, bro...@kernel.org, da...@davemloft.net, dvy...@google.com, edum...@google.com, hda...@sina.com, j...@mojatatu.com, ji...@resnulli.us, ku...@kernel.org, lgir...@gmail.com, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, pab...@redhat.com, pe...@perex.cz, povi...@cutebit.org, st...@sk2.org, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com, tg...@linutronix.de, ti...@suse.com, tonymaris...@yandex.com, xiyou.w...@gmail.com
syzbot has bisected this issue to:

commit f8a4018c826fde6137425bbdbe524d5973feb173
Author: Mark Brown <bro...@kernel.org>
Date: Thu Jun 2 13:53:04 2022 +0000

ASoC: tas2770: Use modern ASoC DAI format terminology

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=164d4978880000
start commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=154d4978880000
console output: https://syzkaller.appspot.com/x/log.txt?x=114d4978880000
Reported-by: syzbot+69904c...@syzkaller.appspotmail.com
Fixes: f8a4018c826f ("ASoC: tas2770: Use modern ASoC DAI format terminology")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Hillf Danton

Oct 17, 2022, 7:41:58 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 15 Oct 2022 12:01:38 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1623ec72880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
> dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2e478880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149ca17c880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9d967e5d91fa/disk-55be6084.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/9a8cffcbc089/vmlinux-55be6084.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+69904c...@syzkaller.appspotmail.com
>
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu: 0-...!: (1 GPs behind) idle=d61c/1/0x4000000000000000 softirq=5548/5551 fqs=5
> (t=10501 jiffies g=4985 q=1169 ncpus=2)
> rcu: rcu_preempt kthread starved for 10488 jiffies! g4985 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1

See if pie timer is a hog.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 55be6084c8e0

--- x/net/sched/sch_fq_pie.c
+++ s/net/sched/sch_fq_pie.c
@@ -372,9 +372,11 @@ static void fq_pie_timer(struct timer_li
 	struct Qdisc *sch = q->sch;
 	spinlock_t *root_lock; /* to lock qdisc for probability calculations */
 	u32 idx;
+	unsigned long ts = jiffies + 2;

 	root_lock = qdisc_lock(qdisc_root_sleeping(sch));
-	spin_lock(root_lock);
+	if (!spin_trylock(root_lock))
+		return;

 	for (idx = 0; idx < q->flows_cnt; idx++)
 		pie_calculate_probability(&q->p_params, &q->flows[idx].vars,
@@ -385,6 +387,7 @@ static void fq_pie_timer(struct timer_li
 		mod_timer(&q->adapt_timer, jiffies + q->p_params.tupdate);

 	spin_unlock(root_lock);
+	WARN_ON_ONCE(time_after(jiffies, ts));
 }

 static int fq_pie_init(struct Qdisc *sch, struct nlattr *opt,
--
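
The test patch above swaps `spin_lock()` for `spin_trylock()`, so the timer callback returns immediately when the qdisc root lock is contended instead of spinning on it. In the diff as posted, that early return also skips the later `mod_timer()` call. A minimal userspace sketch of the "try, and skip the round on contention" pattern is below. It uses a C11 `atomic_flag` as a stand-in for the kernel spinlock, and `struct try_lock`, `try_lock_acquire()`, `try_lock_release()`, and `periodic_update()` are hypothetical names for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal try-lock built on a C11 atomic flag; a userspace stand-in for
 * spin_trylock(), not the kernel implementation. */
struct try_lock {
    atomic_flag held;
};

/* Returns true if the lock was free and is now held by the caller. */
bool try_lock_acquire(struct try_lock *l)
{
    /* test_and_set returns the previous value: false means we took it */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void try_lock_release(struct try_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical periodic callback: does its work only when the lock is free.
 * Returns true if the work ran, false if this round was skipped. */
bool periodic_update(struct try_lock *l, unsigned int *runs)
{
    if (!try_lock_acquire(l))
        return false;       /* contended: skip instead of spinning */
    (*runs)++;              /* the protected work would go here */
    try_lock_release(l);
    return true;
}
```

A caller that skips a round this way has to decide separately how the next round gets scheduled, since nothing in the skipped path re-arms it.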

syzbot

Oct 17, 2022, 9:41:23 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-...!: (10500 ticks this GP) idle=1a74/1/0x4000000000000000 softirq=7901/7901 fqs=0
(t=10500 jiffies g=7625 q=3882 ncpus=2)
rcu: rcu_preempt kthread starved for 10500 jiffies! g7625 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:R running task stack:29320 pid:17 ppid:2 flags:0x00004000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5178 [inline]
__schedule+0xadf/0x5270 kernel/sched/core.c:6490
schedule+0xda/0x1b0 kernel/sched/core.c:6566
schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1935
rcu_gp_fqs_loop+0x190/0x910 kernel/rcu/tree.c:1658
rcu_gp_kthread+0x236/0x360 kernel/rcu/tree.c:1857
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
rcu: Stack dump where RCU GP kthread last ran:
CPU: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.0.0-syzkaller-09589-g55be6084c8e0-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
RIP: 0010:write_comp_data+0x7/0x90 kernel/kcov.c:223
Code: ff 00 75 10 65 48 8b 04 25 80 6f 02 00 48 8b 80 b0 15 00 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 65 8b 05 89 29 87 7e <49> 89 f1 89 c6 49 89 d2 81 e6 00 01 00 00 49 89 f8 65 48 8b 14 25
RSP: 0018:ffffc90000157b30 EFLAGS: 00000246
RAX: 0000000000000101 RBX: ffff888066e69060 RCX: ffffffff878c48ee
RDX: 0000000000000000 RSI: 0019999999999998 RDI: 0000000000000007
RBP: 0000000225c17d04 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000b1b78399 R12: 00000015798ee228
R13: ffff8880690a7b00 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f801f5b0000 CR3: 0000000020ffc000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
pie_calculate_probability+0x1ee/0x7c0 net/sched/sch_pie.c:340
fq_pie_timer+0x1b3/0x360 net/sched/sch_fq_pie.c:382
call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
run_ksoftirqd kernel/softirq.c:934 [inline]
run_ksoftirqd+0x2d/0x60 kernel/softirq.c:926
smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
CPU: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.0.0-syzkaller-09589-g55be6084c8e0-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
RIP: 0010:write_comp_data+0x7/0x90 kernel/kcov.c:223
Code: ff 00 75 10 65 48 8b 04 25 80 6f 02 00 48 8b 80 b0 15 00 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 65 8b 05 89 29 87 7e <49> 89 f1 89 c6 49 89 d2 81 e6 00 01 00 00 49 89 f8 65 48 8b 14 25
RSP: 0018:ffffc90000157b30 EFLAGS: 00000246
RAX: 0000000000000101 RBX: ffff888066e69060 RCX: ffffffff878c48ee
RDX: 0000000000000000 RSI: 0019999999999998 RDI: 0000000000000007
RBP: 0000000225c17d04 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000b1b78399 R12: 00000015798ee228
R13: ffff8880690a7b00 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f801f5b0000 CR3: 0000000020ffc000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
pie_calculate_probability+0x1ee/0x7c0 net/sched/sch_pie.c:340
fq_pie_timer+0x1b3/0x360 net/sched/sch_fq_pie.c:382
call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
run_ksoftirqd kernel/softirq.c:934 [inline]
run_ksoftirqd+0x2d/0x60 kernel/softirq.c:926
smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
------------[ cut here ]------------
WARNING: CPU: 0 PID: 16 at net/sched/sch_fq_pie.c:390 fq_pie_timer+0x2ba/0x360 net/sched/sch_fq_pie.c:390
Modules linked in:
CPU: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.0.0-syzkaller-09589-g55be6084c8e0-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
RIP: 0010:fq_pie_timer+0x2ba/0x360 net/sched/sch_fq_pie.c:390
Code: 48 c1 ea 03 80 3c 02 00 0f 85 a9 00 00 00 48 8b 35 eb 5f 34 04 48 89 ef 48 01 de e8 e0 70 de f9 e9 62 ff ff ff e8 16 b8 ee f9 <0f> 0b eb a9 48 89 cf e8 0a 1f 3c fa e9 3d fe ff ff e8 00 1f 3c fa
RSP: 0018:ffffc90000157bb8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffffffffffb7 RCX: 0000000000000100
RDX: ffff888011a7d880 RSI: ffffffff878c922a RDI: 0000000000000007
RBP: ffff8880690a7b50 R08: 0000000000000007 R09: 0000000000000000
R10: ffffffffffffffb7 R11: 00000000b1b78399 R12: dffffc0000000000
R13: 0000000000000007 R14: 0000000000000400 R15: ffff8880690a7b00
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f801f5b0000 CR3: 000000000bc8e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
run_ksoftirqd kernel/softirq.c:934 [inline]
run_ksoftirqd+0x2d/0x60 kernel/softirq.c:926
smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>


Tested on:

commit: 55be6084 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=14d888aa880000
kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1503ca3c880000

Hillf Danton

Oct 18, 2022, 2:16:54 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 15 Oct 2022 12:01:38 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 55be6084c8e0 Merge tag 'timers-core-2022-10-05' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1623ec72880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
> dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2e478880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149ca17c880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9d967e5d91fa/disk-55be6084.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/9a8cffcbc089/vmlinux-55be6084.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+69904c...@syzkaller.appspotmail.com
>
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu: 0-...!: (1 GPs behind) idle=d61c/1/0x4000000000000000 softirq=5548/5551 fqs=5
> (t=10501 jiffies g=4985 q=1169 ncpus=2)
> rcu: rcu_preempt kthread starved for 10488 jiffies! g4985 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1

Let's see whether the fq_pie timer is hogging the CPU.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 55be6084c8e0

--- x/net/sched/sch_fq_pie.c
+++ s/net/sched/sch_fq_pie.c
@@ -372,19 +372,22 @@ static void fq_pie_timer(struct timer_li
 	struct Qdisc *sch = q->sch;
 	spinlock_t *root_lock; /* to lock qdisc for probability calculations */
 	u32 idx;
+	unsigned long ts;

 	root_lock = qdisc_lock(qdisc_root_sleeping(sch));
-	spin_lock(root_lock);
+	if (!spin_trylock(root_lock))
+		return;

-	for (idx = 0; idx < q->flows_cnt; idx++)
+	ts = jiffies + 2;
+	for (idx = 0; idx < q->flows_cnt; idx++) {
 		pie_calculate_probability(&q->p_params, &q->flows[idx].vars,
 					  q->flows[idx].backlog);

-	/* reset the timer to fire after 'tupdate' jiffies. */
-	if (q->p_params.tupdate)
-		mod_timer(&q->adapt_timer, jiffies + q->p_params.tupdate);
-
+		if (time_after(jiffies, ts))
+			break;
+	}
 	spin_unlock(root_lock);
+	mod_timer(&q->adapt_timer, jiffies + HZ / 2);

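For readers following along outside the kernel tree, the idea the test patch probes (cap the per-flow loop with a small time budget, using a wraparound-safe tick comparison) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `tick_after`, `struct fake_clock` and `process_bounded` are hypothetical names, the fake clock stands in for `jiffies`, and none of this is the kernel's implementation. The signed cast relies on two's-complement conversion, as common compilers provide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wraparound-safe "a is later than b" for a free-running 32-bit tick
 * counter, in the spirit of the kernel's time_after(). Hypothetical helper. */
static bool tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical stand-in for jiffies: every processed item costs a fixed
 * number of ticks so the behaviour is deterministic. */
struct fake_clock {
	uint32_t ticks;
	uint32_t cost_per_item;
};

/* Walk up to n_items entries, but stop once the clock has moved past
 * start + budget. Returns how many entries were actually processed. */
static size_t process_bounded(struct fake_clock *clk, size_t n_items,
			      uint32_t budget)
{
	uint32_t deadline = clk->ticks + budget; /* may wrap, which is fine */
	size_t done = 0;

	for (size_t i = 0; i < n_items; i++) {
		clk->ticks += clk->cost_per_item; /* stand-in for real work */
		done++;
		if (tick_after(clk->ticks, deadline))
			break;
	}
	return done;
}
```

With a cost of one tick per item and a budget of two ticks, the loop stops after the third item, including when the counter wraps around in the middle of the walk. A caller that stops early would still need some way to resume the remaining entries on a later run; the sketch leaves that out.
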
syzbot

Oct 18, 2022, 2:58:25 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4085 } 2680 jiffies s: 2529 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):


Tested on:

commit: 55be6084 Merge tag 'timers-core-2022-10-05' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=11c5acba880000
kernel config: https://syzkaller.appspot.com/x/.config?x=df75278aabf0681a
dashboard link: https://syzkaller.appspot.com/bug?extid=69904c3b4a09e8fa2e1b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=15c2fd8a880000
