Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

[syzbot] [arm?] WARNING in delayed_work_timer_fn

18 views
Skip to first unread message

syzbot

unread,
Jun 22, 2023, 11:48:05 PM6/22/23
to catalin...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, wi...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: 177239177378 Merge branch 'for-next/core', remote-tracking..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=13d8a337280000
kernel config: https://syzkaller.appspot.com/x/.config?x=8776b67768a3c9af
dashboard link: https://syzkaller.appspot.com/bug?extid=e13e654d315d4da1277c
compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: arm64

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0c47a40dd633/disk-17723917.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/0ff319b6fb50/vmlinux-17723917.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3ce1ea9e3b7e/Image-17723917.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e13e65...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 0 PID: 15561 at kernel/workqueue.c:1447 current_wq_worker kernel/workqueue_internal.h:68 [inline]
WARNING: CPU: 0 PID: 15561 at kernel/workqueue.c:1447 is_chained_work kernel/workqueue.c:1384 [inline]
WARNING: CPU: 0 PID: 15561 at kernel/workqueue.c:1447 __queue_work+0xe48/0x1374 kernel/workqueue.c:1446
Modules linked in:
CPU: 0 PID: 15561 Comm: syz-executor.2 Not tainted 6.4.0-rc5-syzkaller-g177239177378 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __queue_work+0xe48/0x1374 kernel/workqueue_internal.h:68
lr : current_wq_worker kernel/workqueue_internal.h:68 [inline]
lr : is_chained_work kernel/workqueue.c:1384 [inline]
lr : __queue_work+0xe48/0x1374 kernel/workqueue.c:1446
sp : ffff800080007b70
x29: ffff800080007bb0 x28: 1fffe000195506f1 x27: 0000000000000008
x26: ffff0000f6cfb800 x25: dfff800000000000 x24: ffff0000caa83788
x23: ffff0000f6cfb9c0 x22: ffff0000caa83780 x21: 0000000000000000
x20: 0000000000000100 x19: ffff000140bd8b30 x18: 1fffe00019bd827e
x17: ffff800126418000 x16: ffff80008a43bfbc x15: 0000000000010001
x14: 1ffff00011bd80ac x13: dfff800000000000 x12: 00000000beb1c7bb
x11: 0000000000000101 x10: 0000000000000100 x9 : 0000000000000000
x8 : ffff0000caa83780 x7 : ffff800080412e8c x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff80008c805480
x2 : ffff000140bd8b30 x1 : 0000000000000100 x0 : 0000000000000000
Call trace:
current_wq_worker kernel/workqueue_internal.h:68 [inline]
is_chained_work kernel/workqueue.c:1384 [inline]
__queue_work+0xe48/0x1374 kernel/workqueue.c:1446
delayed_work_timer_fn+0x74/0x90 kernel/workqueue.c:1652
call_timer_fn+0x19c/0x8cc kernel/time/timer.c:1700
expire_timers kernel/time/timer.c:1746 [inline]
__run_timers+0x5b4/0x734 kernel/time/timer.c:2022
run_timer_softirq+0x7c/0x114 kernel/time/timer.c:2035
__do_softirq+0x2d0/0xd54 kernel/softirq.c:571
____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:80
call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:882
do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:85
invoke_softirq kernel/softirq.c:452 [inline]
__irq_exit_rcu+0x28c/0x534 kernel/softirq.c:650
irq_exit_rcu+0x14/0x84 kernel/softirq.c:662
__el1_irq arch/arm64/kernel/entry-common.c:474 [inline]
el1_interrupt+0x38/0x68 arch/arm64/kernel/entry-common.c:488
el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:493
el1h_64_irq+0x64/0x68 arch/arm64/kernel/entry.S:587
__daif_local_irq_restore arch/arm64/include/asm/irqflags.h:182 [inline]
arch_local_irq_restore arch/arm64/include/asm/irqflags.h:202 [inline]
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
_raw_spin_unlock_irqrestore+0x44/0x98 kernel/locking/spinlock.c:194
spin_unlock_irqrestore include/linux/spinlock.h:405 [inline]
__wake_up_common_lock kernel/sched/wait.c:140 [inline]
__wake_up_sync_key+0x124/0x1ac kernel/sched/wait.c:208
__unix_dgram_recvmsg+0x4c4/0xe2c net/unix/af_unix.c:2479
unix_dgram_recvmsg+0xd0/0xe8 net/unix/af_unix.c:2549
sock_recvmsg_nosec+0x90/0xec net/socket.c:1019
____sys_recvmsg+0x5c0/0x6e8 net/socket.c:2720
___sys_recvmsg net/socket.c:2764 [inline]
do_recvmmsg+0x41c/0xb60 net/socket.c:2858
__sys_recvmmsg net/socket.c:2937 [inline]
__do_sys_recvmmsg net/socket.c:2960 [inline]
__se_sys_recvmmsg net/socket.c:2953 [inline]
__arm64_sys_recvmmsg+0x180/0x23c net/socket.c:2953
__invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52
el0_svc_common+0x138/0x244 arch/arm64/kernel/syscall.c:142
do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:191
el0_svc+0x4c/0x160 arch/arm64/kernel/entry-common.c:647
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:665
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
irq event stamp: 295995
hardirqs last enabled at (295994): [<ffff80008a521910>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
hardirqs last enabled at (295994): [<ffff80008a521910>] _raw_spin_unlock_irqrestore+0x38/0x98 kernel/locking/spinlock.c:194
hardirqs last disabled at (295995): [<ffff80008a5217b8>] __raw_spin_lock_irq include/linux/spinlock_api_smp.h:117 [inline]
hardirqs last disabled at (295995): [<ffff80008a5217b8>] _raw_spin_lock_irq+0x28/0x70 kernel/locking/spinlock.c:170
softirqs last enabled at (293288): [<ffff8000800218ec>] softirq_handle_end kernel/softirq.c:414 [inline]
softirqs last enabled at (293288): [<ffff8000800218ec>] __do_softirq+0xac0/0xd54 kernel/softirq.c:600
softirqs last disabled at (295981): [<ffff80008002b660>] ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:80
---[ end trace 0000000000000000 ]---


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

unread,
Dec 13, 2024, 7:30:27 AM12/13/24
to catalin...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, wi...@kernel.org
syzbot has found a reproducer for the following issue on:

HEAD commit: 2e7aff49b5da Merge branches 'for-next/core' and 'for-next/..
console output: https://syzkaller.appspot.com/x/log.txt?x=11ed5be8580000
kernel config: https://syzkaller.appspot.com/x/.config?x=696fb014d05da3a3
dashboard link: https://syzkaller.appspot.com/bug?extid=e13e654d315d4da1277c
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10192d44580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ef408f67fde3/disk-2e7aff49.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/414ac17a20dc/vmlinux-2e7aff49.xz
kernel image: https://storage.googleapis.com/syzbot-assets/a93415d2a7e7/Image-2e7aff49.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e13e65...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 current_wq_worker kernel/workqueue_internal.h:69 [inline]
WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 is_chained_work kernel/workqueue.c:2199 [inline]
WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 __queue_work+0xe50/0x1308 kernel/workqueue.c:2256
Modules linked in:
CPU: 1 UID: 0 PID: 6023 Comm: klogd Not tainted 6.13.0-rc2-syzkaller-g2e7aff49b5da #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __queue_work+0xe50/0x1308 kernel/workqueue_internal.h:69
lr : current_wq_worker kernel/workqueue_internal.h:69 [inline]
lr : is_chained_work kernel/workqueue.c:2199 [inline]
lr : __queue_work+0xe50/0x1308 kernel/workqueue.c:2256
sp : ffff800080017b40
x29: ffff800080017b90 x28: dfff800000000000 x27: ffff700010002f88
x26: dfff800000000000 x25: ffff0000d1c63800 x24: 0000000000000008
x23: 1fffe0001b14d791 x22: ffff0000d1c639c0 x21: 0000000000000100
x20: ffff0000d8a6bc80 x19: ffff0000da2f0930 x18: 1fffe000366c5e7e
x17: ffff800123d64000 x16: ffff80008326d65c x15: 0000000000000002
x14: 1ffff00011f300ca x13: dfff800000000000 x12: 00000000b1aebe1c
x11: 00000000beaa9035 x10: 0000000000ff0100 x9 : 0000000000000000
x8 : ffff0000d8a6bc80 x7 : ffff8000805819c0 x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff80008df6a320
x2 : ffff0000da2f0930 x1 : 0000000000000100 x0 : 0000000000000000
Call trace:
current_wq_worker kernel/workqueue_internal.h:69 [inline] (P)
is_chained_work kernel/workqueue.c:2199 [inline] (P)
__queue_work+0xe50/0x1308 kernel/workqueue.c:2256 (P)
current_wq_worker kernel/workqueue_internal.h:69 [inline] (L)
is_chained_work kernel/workqueue.c:2199 [inline] (L)
__queue_work+0xe50/0x1308 kernel/workqueue.c:2256 (L)
delayed_work_timer_fn+0x74/0x90 kernel/workqueue.c:2485
call_timer_fn+0x1b4/0x8b8 kernel/time/timer.c:1793
expire_timers kernel/time/timer.c:1839 [inline]
__run_timers kernel/time/timer.c:2418 [inline]
__run_timer_base+0x59c/0x7b4 kernel/time/timer.c:2430
run_timer_base kernel/time/timer.c:2439 [inline]
run_timer_softirq+0xcc/0x194 kernel/time/timer.c:2449
handle_softirqs+0x320/0xd34 kernel/softirq.c:554
__do_softirq+0x14/0x20 kernel/softirq.c:588
____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:81
call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:891
do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:86
invoke_softirq kernel/softirq.c:435 [inline]
__irq_exit_rcu+0x1d8/0x544 kernel/softirq.c:655
irq_exit_rcu+0x14/0x84 kernel/softirq.c:671
__el1_irq arch/arm64/kernel/entry-common.c:561 [inline]
el1_interrupt+0x38/0x68 arch/arm64/kernel/entry-common.c:575
el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:580
el1h_64_irq+0x6c/0x70 arch/arm64/kernel/entry.S:596
__daif_local_irq_enable arch/arm64/include/asm/irqflags.h:26 [inline] (P)
arch_local_irq_enable arch/arm64/include/asm/irqflags.h:48 [inline] (P)
raw_spin_rq_unlock_irq kernel/sched/sched.h:1526 [inline] (P)
finish_lock_switch+0xc0/0x1e4 kernel/sched/core.c:5123 (P)
raw_spin_rq_unlock_irq kernel/sched/sched.h:1526 [inline] (L)
finish_lock_switch+0xbc/0x1e4 kernel/sched/core.c:5123 (L)
finish_task_switch+0x120/0x608 kernel/sched/core.c:5241
context_switch kernel/sched/core.c:5372 [inline]
__schedule+0x1748/0x27f0 kernel/sched/core.c:6756
preempt_schedule_irq+0x80/0x188 kernel/sched/core.c:7078
arm64_preempt_schedule_irq arch/arm64/kernel/entry-common.c:301 [inline]
__el1_irq arch/arm64/kernel/entry-common.c:563 [inline]
el1_interrupt+0x4c/0x68 arch/arm64/kernel/entry-common.c:575
el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:580
el1h_64_irq+0x6c/0x70 arch/arm64/kernel/entry.S:596
__daif_local_irq_enable arch/arm64/include/asm/irqflags.h:26 [inline] (P)
arch_local_irq_enable arch/arm64/include/asm/irqflags.h:48 [inline] (P)
do_notify_resume+0x5c/0x1f4 arch/arm64/kernel/entry-common.c:133 (P)
do_notify_resume+0x58/0x1f4 arch/arm64/kernel/entry-common.c:133 (L)
exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
el0_interrupt+0xc8/0x248 arch/arm64/kernel/entry-common.c:832
__el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:837
el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:842
el0t_64_irq+0x198/0x19c arch/arm64/kernel/entry.S:601
irq event stamp: 35689
hardirqs last enabled at (35688): [<ffff80008b687f2c>] __exit_to_kernel_mode arch/arm64/kernel/entry-common.c:85 [inline]
hardirqs last enabled at (35688): [<ffff80008b687f2c>] exit_to_kernel_mode+0xdc/0x10c arch/arm64/kernel/entry-common.c:95
hardirqs last disabled at (35689): [<ffff80008b6a9b44>] __raw_spin_lock_irq include/linux/spinlock_api_smp.h:117 [inline]
hardirqs last disabled at (35689): [<ffff80008b6a9b44>] _raw_spin_lock_irq+0x28/0x70 kernel/locking/spinlock.c:170
softirqs last enabled at (34558): [<ffff800080129934>] local_bh_enable+0x10/0x34 include/linux/bottom_half.h:32
softirqs last disabled at (35685): [<ffff800080020db4>] __do_softirq+0x14/0x20 kernel/softirq.c:588
---[ end trace 0000000000000000 ]---


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

Will Deacon

unread,
Feb 11, 2025, 6:10:54 AMFeb 11
to syzbot, catalin...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, m...@kernel.org, mark.r...@arm.com, richard....@gmail.com, jiangs...@gmail.com, t...@kernel.org, mar...@holtmann.org, johan....@gmail.com, luiz....@gmail.com, linux-b...@vger.kernel.org
[+ some other folks]

On Fri, Dec 13, 2024 at 04:30:24AM -0800, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 2e7aff49b5da Merge branches 'for-next/core' and 'for-next/..
> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> console output: https://syzkaller.appspot.com/x/log.txt?x=11ed5be8580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=696fb014d05da3a3
> dashboard link: https://syzkaller.appspot.com/bug?extid=e13e654d315d4da1277c
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> userspace arch: arm64
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10192d44580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/ef408f67fde3/disk-2e7aff49.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/414ac17a20dc/vmlinux-2e7aff49.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/a93415d2a7e7/Image-2e7aff49.gz.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+e13e65...@syzkaller.appspotmail.com

I spent a bit of time looking at the reports in the "dashboard link"
above. There are quite a few in there and I ended up suspecting the
bluetooth subsystem and drivers using hci_register_dev(). Specifically,
it looks like one of the workqueues is serviced in softirq context after
its corresponding device has been destroyed.

I wasn't able to reproduce it locally (I don't have a setup for running
the 'syz' reproducers), but the program is pretty small (see the "syz repro"
link above).

Given that this seems to explode a few times a month, I wonder if it's
worth adding some instrumentation to e.g. dump the name of the workqueue?

Will

--->8

(mammoth backtrace below)

Tejun Heo

unread,
Feb 11, 2025, 3:59:33 PMFeb 11
to Will Deacon, syzbot, catalin...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, m...@kernel.org, mark.r...@arm.com, richard....@gmail.com, jiangs...@gmail.com, mar...@holtmann.org, johan....@gmail.com, luiz....@gmail.com, linux-b...@vger.kernel.org
On Tue, Feb 11, 2025 at 11:10:46AM +0000, Will Deacon wrote:
> Given that this seems to explode a few times a month, I wonder if it's
> worth adding some instrumentation to e.g. dump the name of the workqueue?

It's usually most useful to print out the work function when identifying the
culprit. I'd be happy to take the patch.

Thanks.

--
tejun

Will Deacon

unread,
Feb 14, 2025, 11:46:05 AMFeb 14
to Tejun Heo, syzbot, catalin...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, m...@kernel.org, mark.r...@arm.com, richard....@gmail.com, jiangs...@gmail.com, mar...@holtmann.org, johan....@gmail.com, luiz....@gmail.com, linux-b...@vger.kernel.org
Thanks, Tejun. I sent a patch adding some more diagnostics:

https://lore.kernel.org/r/2025021416434...@kernel.org

Will

Will Deacon

unread,
Mar 13, 2025, 3:31:06 PMMar 13
to Tejun Heo, syzbot, catalin...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, m...@kernel.org, mark.r...@arm.com, richard....@gmail.com, jiangs...@gmail.com, mar...@holtmann.org, johan....@gmail.com, luiz....@gmail.com, linux-b...@vger.kernel.org
Ok, we have our first crash [1] with the new debug information:

[ 77.133818][ T10] Bluetooth: hci3: Opcode 0x0c1a failed: -110
[ 77.135329][ T10] Bluetooth: hci3: Error when powering off device on rfkill (-110)
[ 77.143656][ C1] ------------[ cut here ]------------
[ 77.145336][ C1] workqueue: cannot queue hci_cmd_timeout on wq hci3
[ 77.147101][ C1] WARNING: CPU: 1 PID: 7433 at kernel/workqueue.c:2258 __queue_work+0xe94/0x1324
[ 77.149301][ C1] Modules linked in:
[ 77.150287][ C1] CPU: 1 UID: 0 PID: 7433 Comm: syz.2.193 Not tainted 6.14.0-rc5-syzkaller-g77c95b8c7a16 #0
[ 77.152960][ C1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
[ 77.155583][ C1] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 77.157546][ C1] pc : __queue_work+0xe94/0x1324
[ 77.158811][ C1] lr : __queue_work+0xe94/0x1324
[ 77.160114][ C1] sp : ffff800080017b40
[ 77.161145][ C1] x29: ffff800080017b90 x28: dfff800000000000 x27: ffff700010002f88
[ 77.163285][ C1] x26: ffff80008fbbd600 x25: ffff0000dacac800 x24: 0000000000000008
[ 77.165326][ C1] x23: ffff0000dacac9c0 x22: 0000000000000100 x21: 1fffe0001b595938
[ 77.167421][ C1] x20: ffff800092f09000 x19: ffff0000ecf20948 x18: 0000000000000008
[ 77.169356][ C1] x17: 0000000000000000 x16: ffff80008b72ce6c x15: ffff700011f87b38
[ 77.171279][ C1] x14: 1ffff00011f87b38 x13: 0000000000000004 x12: ffffffffffffffff
[ 77.173271][ C1] x11: 0000000000000103 x10: 0000000000ff0100 x9 : d5bcdc7e4519df00
[ 77.175255][ C1] x8 : d5bcdc7e4519df00 x7 : 0000000000000001 x6 : 0000000000000001
[ 77.177277][ C1] x5 : ffff8000800172f8 x4 : ffff80008fcaf7c0 x3 : ffff8000804a7bb4
[ 77.179372][ C1] x2 : 0000000000000000 x1 : 0000000100000101 x0 : 0000000000000000
[ 77.181374][ C1] Call trace:
[ 77.182230][ C1] __queue_work+0xe94/0x1324 (P)
[ 77.183476][ C1] delayed_work_timer_fn+0x74/0x90
[ 77.184755][ C1] call_timer_fn+0x1b4/0x8b8
[ 77.185932][ C1] __run_timer_base+0x59c/0x7b4
[ 77.187119][ C1] run_timer_softirq+0xcc/0x194
[ 77.188360][ C1] handle_softirqs+0x320/0xd34
[ 77.189576][ C1] __do_softirq+0x14/0x20
[ 77.190672][ C1] ____do_softirq+0x14/0x20
[ 77.191884][ C1] call_on_irq_stack+0x24/0x4c
[ 77.193050][ C1] do_softirq_own_stack+0x20/0x2c
[ 77.194311][ C1] __irq_exit_rcu+0x1d8/0x544
[ 77.195550][ C1] irq_exit_rcu+0x14/0x84
[ 77.196669][ C1] el1_interrupt+0x38/0x68
[ 77.197818][ C1] el1h_64_irq_handler+0x18/0x24
[ 77.199152][ C1] el1h_64_irq+0x6c/0x70
[ 77.200245][ C1] lock_acquire+0x278/0x724 (P)
[ 77.201520][ C1] rcu_lock_acquire+0x44/0x54
[ 77.202722][ C1] page_ext_get+0x2c/0x2e8
[ 77.203908][ C1] page_table_check_set+0xa0/0x408
[ 77.205117][ C1] __page_table_check_ptes_set+0x2d0/0x398
[ 77.206601][ C1] set_pte_range+0x618/0x644
[ 77.207792][ C1] finish_fault+0x968/0xd6c
[ 77.208968][ C1] handle_pte_fault+0x3528/0x57b0
[ 77.210314][ C1] handle_mm_fault+0xfa8/0x188c
[ 77.211592][ C1] __get_user_pages+0x1878/0x3400
[ 77.212967][ C1] populate_vma_page_range+0x220/0x2f0
[ 77.214435][ C1] __mm_populate+0x240/0x3d8
[ 77.215600][ C1] vm_mmap_pgoff+0x304/0x3c4
[ 77.216833][ C1] ksys_mmap_pgoff+0xd0/0x5c8
[ 77.218066][ C1] __arm64_sys_mmap+0xf8/0x110
[ 77.219195][ C1] invoke_syscall+0x98/0x2b8
[ 77.220453][ C1] el0_svc_common+0x130/0x23c
[ 77.221747][ C1] do_el0_svc+0x48/0x58
[ 77.222881][ C1] el0_svc+0x54/0x168
[ 77.223946][ C1] el0t_64_sync_handler+0x84/0x108
[ 77.225305][ C1] el0t_64_sync+0x198/0x19c
[ 77.226425][ C1] irq event stamp: 3967
[ 77.227573][ C1] hardirqs last enabled at (3966): [<ffff80008b7ec538>] _raw_spin_unlock_irqrestore+0x38/0x98
[ 77.230143][ C1] hardirqs last disabled at (3967): [<ffff80008b7ec3e0>] _raw_spin_lock_irq+0x28/0x70
[ 77.232639][ C1] softirqs last enabled at (2318): [<ffff800080311b48>] handle_softirqs+0xb44/0xd34
[ 77.235151][ C1] softirqs last disabled at (3923): [<ffff800080020dbc>] __do_softirq+0x14/0x20
[ 77.237528][ C1] ---[ end trace 0000000000000000 ]---


So it looks like I was right about bluetooth (the work function is
"hci_cmd_timeout") but, unfortunately, "hci3" doesn't tell us much about
the client.

Perhaps the failed power-off right before the warning suggests that
we're not tearing down the wq properly on that error path? Let's see...

hci_rfkill_set_block() calls hci_dev_do_close() if the power-off fails.
That in turn cancels the delayed 'cmd_timer' work but only if HCI_UP is
not set in the device flags. But if hci_dev_do_poweroff() failed, HCI_UP
may well still be set. Then somehow the workqueue is destroyed, I guess
via hci_unregister_dev() setting HCI_UNREGISTER which enables
hci_release_dev() but at that point it's hard to know which bluetooth
driver is responsible.

Assuming hci_unregister_dev() is triggered via hci_dev_do_close()
closing the device, then unconditionally cancelling the delayed cmd work
when clearing HCI_UP sounds like a reasonable starter for ten?

Dunno, this is all guesswork in code that I'm not familiar with, so I'm
hoping somebody can tell me why I'm wrong :)

Will

--->8

diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
index dd770ef5ec36..d2623f6d0593 100644
--- a/net/bluetooth/hci_sync.c
+++ b/net/bluetooth/hci_sync.c
@@ -5147,11 +5147,9 @@ int hci_dev_close_sync(struct hci_dev *hdev)
}

err = hci_dev_shutdown(hdev);
-
- if (!test_and_clear_bit(HCI_UP, &hdev->flags)) {
- cancel_delayed_work_sync(&hdev->cmd_timer);
+ cancel_delayed_work_sync(&hdev->cmd_timer);
+ if (!test_and_clear_bit(HCI_UP, &hdev->flags))
return err;
- }

hci_leds_update_powered(hdev, false);

Reply all
Reply to author
Forward
0 new messages