[syzbot] WARNING in call_timer

syzbot

unread,

Nov 16, 2022, 10:25:38 AM11/16/22

to ben-...@fluff.org, b...@alien8.de, daniel....@linux.intel.com, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, net...@vger.kernel.org, pbon...@redhat.com, sathyanarayan...@linux.intel.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org

Hello,

syzbot found the following issue on:

HEAD commit: 7eba4505394e net: dcb: move getapptrust to separate function
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=16626ce9880000
kernel config: https://syzkaller.appspot.com/x/.config?x=893a728fb1a6b263
dashboard link: https://syzkaller.appspot.com/bug?extid=6fb78d577e89e69602f9
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16899f35880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/68b1a18c1aad/disk-7eba4505.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/f03970850c94/vmlinux-7eba4505.xz
kernel image: https://storage.googleapis.com/syzbot-assets/4775c120c4c6/bzImage-7eba4505.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+6fb78d...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 0 PID: 5978 at kernel/workqueue.c:1438 __queue_work+0xf70/0x13b0 kernel/workqueue.c:1438
Modules linked in:
CPU: 0 PID: 5978 Comm: syz-executor.3 Not tainted 6.1.0-rc4-syzkaller-01070-g7eba4505394e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
RIP: 0010:__queue_work+0xf70/0x13b0 kernel/workqueue.c:1438
Code: e0 07 83 c0 03 38 d0 7c 09 84 d2 74 05 e8 e8 64 7b 00 8b 5b 2c 31 ff 83 e3 20 89 de e8 09 5e 2e 00 85 db 75 42 e8 50 61 2e 00 <0f> 0b e9 7e f7 ff ff e8 44 61 2e 00 0f 0b e9 10 f7 ff ff e8 38 61
RSP: 0018:ffffc90000007c98 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000100 RCX: 0000000000000100
RDX: ffff888022ba9d40 RSI: ffffffff8151be20 RDI: 0000000000000005
RBP: ffffc90000007d60 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000100 R11: 0000000000000001 R12: ffff88806ccf4b30
R13: ffff88806ccf4b78 R14: 0000000000000000 R15: ffff88807568f000
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555558e1708 CR3: 0000000076b84000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
call_timer_fn+0x1da/0x7c0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1514 [inline]
__run_timers.part.0+0x4a3/0xaf0 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb7/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1fb/0xadc kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x9/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x97/0xc0 arch/x86/kernel/apic/apic.c:1107
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:lock_acquire+0x227/0x630 kernel/locking/lockdep.c:5636
Code: 76 9f 7e 83 f8 01 0f 85 3a 03 00 00 9c 58 f6 c4 02 0f 85 25 03 00 00 48 83 7c 24 08 00 74 01 fb 48 b8 00 00 00 00 00 fc ff df <48> 01 c3 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 48 8b 84 24
RSP: 0018:ffffc90006f678d0 EFLAGS: 00000206
RAX: dffffc0000000000 RBX: 1ffff92000decf1d RCX: 0000000000000001
RDX: 1ffff110045754f1 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000001 R09: ffffffff912a75ef
R10: fffffbfff2254ebd R11: 1ffffffff212a1df R12: 0000000000000000
R13: 0000000000000000 R14: ffff88807568f138 R15: 0000000000000000
__flush_workqueue+0x118/0x13a0 kernel/workqueue.c:2809
drain_workqueue+0x1a9/0x3c0 kernel/workqueue.c:2974
hci_dev_close_sync+0x2f3/0x1200 net/bluetooth/hci_sync.c:4809
hci_dev_do_close+0x31/0x70 net/bluetooth/hci_core.c:554
hci_unregister_dev+0x183/0x4e0 net/bluetooth/hci_core.c:2702
vhci_release+0x80/0xf0 drivers/bluetooth/hci_vhci.c:568
__fput+0x27c/0xa90 fs/file_table.c:320
task_work_run+0x16f/0x270 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xb3d/0x2a30 kernel/exit.c:820
do_group_exit+0xd4/0x2a0 kernel/exit.c:950
__do_sys_exit_group kernel/exit.c:961 [inline]
__se_sys_exit_group kernel/exit.c:959 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:959
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f3d8a68b639
Code: Unable to access opcode bytes at 0x7f3d8a68b60f.
RSP: 002b:00007ffd554b9878 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000029 RCX: 00007f3d8a68b639
RDX: 00007f3d8a68cc8a RSI: 0000000000000000 RDI: 0000000000000007
RBP: 0000000000000007 R08: ffffffffff000000 R09: 0000000000000029
R10: 00000000000003b8 R11: 0000000000000246 R12: 00007ffd554b9f00
R13: 0000000000000003 R14: 00007ffd554b9e9c R15: 00007f3d8a782b60
</TASK>
----------------
Code disassembly (best guess):
0: 76 9f jbe 0xffffffa1
2: 7e 83 jle 0xffffff87
4: f8 clc
5: 01 0f add %ecx,(%rdi)
7: 85 3a test %edi,(%rdx)
9: 03 00 add (%rax),%eax
b: 00 9c 58 f6 c4 02 0f add %bl,0xf02c4f6(%rax,%rbx,2)
12: 85 25 03 00 00 48 test %esp,0x48000003(%rip) # 0x4800001b
18: 83 7c 24 08 00 cmpl $0x0,0x8(%rsp)
1d: 74 01 je 0x20
1f: fb sti
20: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
27: fc ff df
* 2a: 48 01 c3 add %rax,%rbx <-- trapping instruction
2d: 48 c7 03 00 00 00 00 movq $0x0,(%rbx)
34: 48 c7 43 08 00 00 00 movq $0x0,0x8(%rbx)
3b: 00
3c: 48 rex.W
3d: 8b .byte 0x8b
3e: 84 .byte 0x84
3f: 24 .byte 0x24

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Thomas Gleixner

unread,

Nov 16, 2022, 1:29:32 PM11/16/22

to syzbot, ben-...@fluff.org, b...@alien8.de, daniel....@linux.intel.com, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, net...@vger.kernel.org, pbon...@redhat.com, sathyanarayan...@linux.intel.com, syzkall...@googlegroups.com, x...@kernel.org, Steven Rostedt, Marcel Holtmann, Luiz Augusto von Dentz

That's precisely the case I described here:

https://lore.kernel.org/all/87iljhsftt.ffs@tglx

and the final fix is here:

https://lore.kernel.org/all/202211151958...@linutronix.de

but that will take a bit to get upstream. In the meanwhile someone could
sprinkle 'if (in_shutdown)' conditionals into the code to prevent the
splat.

Thanks,

tglx

Hillf Danton

unread,

Nov 16, 2022, 8:08:50 PM11/16/22

to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com

On 16 Nov 2022 07:25:36 -0800

> syzbot found the following issue on:
>
> HEAD commit: 7eba4505394e net: dcb: move getapptrust to separate function
> git tree: net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=16626ce9880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=893a728fb1a6b263
> dashboard link: https://syzkaller.appspot.com/bug?extid=6fb78d577e89e69602f9
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16899f35880000

See if it is the cmd_timer that triggers the warning.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git 7eba4505394e

--- x/net/bluetooth/hci_sync.c
+++ h/net/bluetooth/hci_sync.c
@@ -4803,6 +4803,7 @@ int hci_dev_close_sync(struct hci_dev *h
cancel_delayed_work_sync(&adv_instance->rpa_expired_cb);
}

+ cancel_delayed_work_sync(&hdev->cmd_timer);
/* Avoid potential lockdep warnings from the *_flush() calls by
* ensuring the workqueue is empty up front.
*/
--

syzbot

unread,

Nov 16, 2022, 8:35:19 PM11/16/22

to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4107 } 2643 jiffies s: 2809 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):

Tested on:

commit: 7eba4505 net: dcb: move getapptrust to separate function
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=12b65845880000

kernel config: https://syzkaller.appspot.com/x/.config?x=893a728fb1a6b263
dashboard link: https://syzkaller.appspot.com/bug?extid=6fb78d577e89e69602f9
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

patch: https://syzkaller.appspot.com/x/patch.diff?x=14ef644e880000

Hillf Danton

unread,

Nov 16, 2022, 9:45:25 PM11/16/22

to Thomas Gleixner, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Luiz Augusto von Dentz

On 16 Nov 2022 19:29:27 +0100 Thomas Gleixner <tg...@linutronix.de>

I doubt your fix helps in the domain of workqueue in general, and in this
report if it is delayed work that triggered the warning.

Hillf

--- x/net/bluetooth/hci_sync.c
+++ y/net/bluetooth/hci_sync.c
@@ -4806,6 +4806,7 @@ int hci_dev_close_sync(struct hci_dev *h

/* Avoid potential lockdep warnings from the *_flush() calls by
* ensuring the workqueue is empty up front.
*/

+ cancel_delayed_work_sync(&hdev->cmd_timer);
drain_workqueue(hdev->workqueue);

hci_dev_lock(hdev);

Thomas Gleixner

unread,

Nov 17, 2022, 6:54:30 AM11/17/22

to Hillf Danton, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Luiz Augusto von Dentz

On Thu, Nov 17 2022 at 10:45, Hillf Danton wrote:
> On 16 Nov 2022 19:29:27 +0100 Thomas Gleixner <tg...@linutronix.de>
>>
>

> I doubt your fix helps in the domain of workqueue in general, and in this
> report if it is delayed work that triggered the warning.

Indeed. I misread the backtrace and my brain connected the wrong dots
here. It's a different bluethooth hickup.

> --- x/net/bluetooth/hci_sync.c
> +++ y/net/bluetooth/hci_sync.c
> @@ -4806,6 +4806,7 @@ int hci_dev_close_sync(struct hci_dev *h
> /* Avoid potential lockdep warnings from the *_flush() calls by
> * ensuring the workqueue is empty up front.
> */
> + cancel_delayed_work_sync(&hdev->cmd_timer);
> drain_workqueue(hdev->workqueue);

The work has been canceled already before in the same function and there
are some more delayed works which can trigger this.

So no, this whole close_sync() function is prone to teardown races and
just slapping a single cancel here without deeper analysis does not cut
it.

Thanks,

tglx

Hillf Danton

unread,

Nov 17, 2022, 7:55:38 AM11/17/22

to Thomas Gleixner, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Luiz Augusto von Dentz

On 17 Nov 2022 12:54:28 +0100 Thomas Gleixner <tg...@linutronix.de>

>
> The work has been canceled already before in the same function and there
> are some more delayed works which can trigger this.
>
> So no, this whole close_sync() function is prone to teardown races and
> just slapping a single cancel here without deeper analysis does not cut
> it.

Agree.

A set of sync cancelations can do the job, given what is defined in struct
hci_dev wrt workqueue.

Hillf

struct workqueue_struct *workqueue;
struct workqueue_struct *req_workqueue;

struct delayed_work power_off;
struct delayed_work discov_off;
struct delayed_work service_cache;
struct delayed_work cmd_timer;
struct delayed_work ncmd_timer;
struct delayed_work le_scan_disable;
struct delayed_work le_scan_restart;
struct delayed_work adv_instance_expire;
struct delayed_work rpa_expired;
struct delayed_work interleave_scan;

struct work_struct power_on;
struct work_struct error_reset;
struct work_struct cmd_sync_work;
struct work_struct cmd_sync_cancel_work;
struct work_struct rx_work;
struct work_struct cmd_work;
struct work_struct tx_work;

Thomas Gleixner

unread,

Nov 17, 2022, 11:06:50 AM11/17/22

to Hillf Danton, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Luiz Augusto von Dentz

On Thu, Nov 17 2022 at 20:55, Hillf Danton wrote:
> On 17 Nov 2022 12:54:28 +0100 Thomas Gleixner <tg...@linutronix.de>
>>
>> The work has been canceled already before in the same function and there
>> are some more delayed works which can trigger this.
>>
>> So no, this whole close_sync() function is prone to teardown races and
>> just slapping a single cancel here without deeper analysis does not cut
>> it.
>
> Agree.
>
> A set of sync cancelations can do the job, given what is defined in struct
> hci_dev wrt workqueue.

It's only part of the solution because you also have to prevent that
work is queued from other parts of the code....

Luiz Augusto von Dentz

unread,

Nov 17, 2022, 4:04:26 PM11/17/22

to Thomas Gleixner, Hillf Danton, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann

Hi Thomas,

I thought we would have something similar to shutdown_timer (e.g.
shutdown_delayed_work) so we can safely free its object/struct, at
least that was the impression I got when discussing with Steven.

--
Luiz Augusto von Dentz

Luiz Augusto von Dentz

unread,

Nov 17, 2022, 4:16:22 PM11/17/22

to Thomas Gleixner, Hillf Danton, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Tetsuo Handa

Hi Tetsuo,

Wasn't the following patch suppose to address such problem:

https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git/commit/?id=deee93d13d385103205879a8a0915036ecd83261

It was merged in the last pull request to net-next:

https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git/commit/?id=a507ea32b9c2c407012bf89109ac0cf89fae313c

Tetsuo Handa

unread,

Nov 17, 2022, 7:53:51 PM11/17/22

to Luiz Augusto von Dentz, Thomas Gleixner, Hillf Danton, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann

On 2022/11/18 6:16, Luiz Augusto von Dentz wrote:
> Wasn't the following patch suppose to address such problem:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git/commit/?id=deee93d13d385103205879a8a0915036ecd83261
>
> It was merged in the last pull request to net-next:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git/commit/?id=a507ea32b9c2c407012bf89109ac0cf89fae313c
>

No. Commit deee93d13d38 ("Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer works")
is for handling queue_work() from hci_cmd_timeout(struct work_struct *work) from process_one_work()
(that is, called from a kernel workqueue thread from process context).

But this report says that queue_work() is called from timer interrupt handler from interrupt context
while drain_workqueue(hdev->workqueue) is in progress from process context.

But... is is_chained_work() check appropriate?
Why can't we exclude "timer interrupt handler" from "somebody else" ?

The comment for drain_workqueue() says

* Wait until the workqueue becomes empty. While draining is in progress,
* only chain queueing is allowed. IOW, only currently pending or running
* work items on @wq can queue further work items on it. @wq is flushed
* repeatedly until it becomes empty. The number of flushing is determined
* by the depth of chaining and should be relatively short. Whine if it
* takes too long.

but why limited to "only currently pending or running work items on @wq" (that is,
only process context) ?

Although drain_workqueue() is also called from destroy_workqueue() (which would cause
use-after-free bug if interrupt handler calls queue_work() some time after drain_workqueue()),
I think that we can wait for drain_workqueue() to call __flush_workqueue() again if a further
work item is queued from interrupt handler...

Anyway, stopping all works and delayed works before calling drain_workqueue() would address
this problem.

Thomas Gleixner

unread,

Nov 17, 2022, 8:18:00 PM11/17/22

to Tetsuo Handa, Luiz Augusto von Dentz, Hillf Danton, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann

Which is correct because at that point it's expecting to only accept the
pending and chained pending work otherwise it will go on in circles and/or
just be unable to provide the functionality of draining work, right?

> Anyway, stopping all works and delayed works before calling
> drain_workqueue() would address this problem.

Only partially.

You also have to make sure that none of the works can be rearmed or
rescheduled after that point by any other context, e.g interrupts ...

Thanks,

tglx

Hillf Danton

unread,

Nov 17, 2022, 8:28:21 PM11/17/22

to Luiz Augusto von Dentz, Thomas Gleixner, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Tetsuo Handa

On Thu, Nov 17, 2022 at 1:04 PM Luiz Augusto von Dentz <luiz....@gmail.com> wrote:
> >
> > I thought we would have something similar to shutdown_timer (e.g.
> > shutdown_delayed_work) so we can safely free its object/struct, at
> > least that was the impression I got when discussing with Steven.

Because of the cough in Bluetooth's throat does not mean it makes sense
to ask workqueue to take a flu jab. Why not cure it directly locally,
given the cases of workqueue under the drivers dir. And timer?

Luiz Augusto von Dentz

unread,

Nov 18, 2022, 3:01:43 PM11/18/22

to Hillf Danton, Thomas Gleixner, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Tetsuo Handa

Hi Hillf,

Like Thomas said we can only resolve this partially with the likes of
cancel_workqueue(_sync), though we can use HCI_UNREGISTER to stop
rescheduling cmd_timer, so imo having something similar to
shutdown_timer makes things simpler for subsystems/drivers, anyway I'm
fine fixing it directly since it is probably going to be simpler to
backport, perhaps something like the following is all we need:

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 0540555b3704..977684f5fb57 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -4090,6 +4090,7 @@ static void hci_cmd_work(struct work_struct *work)

rcu_read_lock();
if (test_bit(HCI_RESET, &hdev->flags) ||
+ test_bit(HCI_UNREGISTER, &hdev->flags) ||
hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
cancel_delayed_work(&hdev->cmd_timer);
else

Hillf Danton

unread,

Nov 18, 2022, 9:28:13 PM11/18/22

to Luiz Augusto von Dentz, Thomas Gleixner, syzbot, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Steven Rostedt, Marcel Holtmann, Tetsuo Handa

On Fri, 18 Nov 2022 12:01:29 -0800 Luiz Augusto von Dentz <luiz....@gmail.com> wrote:
> Hi Hillf,

Hi Luiz

>
> On Thu, Nov 17, 2022 at 5:28 PM Hillf Danton <hda...@sina.com> wrote:
> >
> > On Thu, Nov 17, 2022 at 1:04 PM Luiz Augusto von Dentz <luiz....@gmail.com> wrote:
> > > >
> > > > I thought we would have something similar to shutdown_timer (e.g.
> > > > shutdown_delayed_work) so we can safely free its object/struct, at
> > > > least that was the impression I got when discussing with Steven.
> >
> > Because of the cough in Bluetooth's throat does not mean it makes sense
> > to ask workqueue to take a flu jab. Why not cure it directly locally,
> > given the cases of workqueue under the drivers dir. And timer?
>
> Like Thomas said we can only resolve this partially with the likes of
> cancel_workqueue(_sync), though we can use HCI_UNREGISTER to stop
> rescheduling cmd_timer, so imo having something similar to
> shutdown_timer makes things simpler for subsystems/drivers, anyway I'm
> fine fixing it directly since it is probably going to be simpler to
> backport, perhaps something like the following is all we need:

Forward step like this is welcome.

>
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 0540555b3704..977684f5fb57 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -4090,6 +4090,7 @@ static void hci_cmd_work(struct work_struct *work)
>
> rcu_read_lock();
> if (test_bit(HCI_RESET, &hdev->flags) ||
> + test_bit(HCI_UNREGISTER, &hdev->flags) ||
> hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> cancel_delayed_work(&hdev->cmd_timer);

Given known missing sync cancelation before draining workqueue, it makes more
sense to audit re-arming of cmd_timer, and other works/delayed works defined
in struct hci_dev, than canceling.

Hillf

Reply all

Reply to author

Forward

[syzbot] WARNING in call_timer_fn

syzbot

Thomas Gleixner

Hillf Danton

syzbot

Hillf Danton

Thomas Gleixner

Hillf Danton

Thomas Gleixner

Luiz Augusto von Dentz

Luiz Augusto von Dentz

Tetsuo Handa

Thomas Gleixner

Hillf Danton

Luiz Augusto von Dentz

Hillf Danton