What is preventing the linux-next tree from reproducing bugs that are reproducible in other trees?


Tetsuo Handa

Apr 18, 2026, 2:35:39 AM
to syzkaller
Hello.

I'm currently trying to debug "WARNING in kcov_remote_start (6)" at
https://syzkaller.appspot.com/bug?extid=3f51ad7ac3ae57a6fdcc .

Since this bug is reproduced in upstream kernels where the kernel config has
CONFIG_PREEMPT_RT=y but is not reproduced in linux-next kernels where
the kernel config has CONFIG_PREEMPT_RT=n, I tried temporarily enabling
CONFIG_PREEMPT_RT using CONFIG_DEBUG_AID_FOR_SYZBOT kernel config option
( https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches ).

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20260417&id=f9f786564e1ea6622a8983726134b1b07c56371d
made linux-next kernels use CONFIG_PREEMPT_RT=y, but the problem was still not reproduced.
Since I don't see any big difference between the upstream kernel config that is reporting
the problem and the linux-next kernel config that started using CONFIG_PREEMPT_RT=y,
I suspect that there are some differences (such as testcases to run, hardware used
for testing) between "upstream" and "linux-next".

Since linux-next is the only kernel to which I can temporarily apply custom patches
via my tree, being able to reproduce the problem in "linux-next" is important.
How can we make "upstream" and "linux-next" testing behave identically except for the
kernel and the kernel config? (The same situation applies to "unregister_netdevice:
waiting for DEV to become free (8)" at https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 .
Several remaining bugs are reproduced only about once a month in linux-next kernels
while they are reproduced almost every day in other kernels, which makes it too difficult
to apply custom patches via my tree for obtaining more debug information.)



By the way, the patch above resulted in the discovery of "BUG: sleeping function called from
invalid context" bugs that had not been discovered previously, due to CONFIG_PREEMPT_RT=n
or a difference in behavior. I think that testing with various kernel configs (e.g.
both CONFIG_PREEMPT_RT=y and CONFIG_PREEMPT_RT=n) for each kernel (e.g. "upstream",
"linux-next", "net" and so on) would help discover more bugs.

Alexander Potapenko

May 7, 2026, 6:22:31 AM
to Tetsuo Handa, syzkaller
On Sat, Apr 18, 2026 at 8:35 AM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> Hello.
>
> I'm currently trying to debug "WARNING in kcov_remote_start (6)" at
> https://syzkaller.appspot.com/bug?extid=3f51ad7ac3ae57a6fdcc .
>
> Since this bug is reproduced in upstream kernels where the kernel config has
> CONFIG_PREEMPT_RT=y but is not reproduced in linux-next kernels where
> the kernel config has CONFIG_PREEMPT_RT=n, I tried temporarily enabling
> CONFIG_PREEMPT_RT using CONFIG_DEBUG_AID_FOR_SYZBOT kernel config option
> ( https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches ).
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20260417&id=f9f786564e1ea6622a8983726134b1b07c56371d
> made linux-next kernels use CONFIG_PREEMPT_RT=y, but the problem was still not reproduced.
> Since I don't see any big difference between the upstream kernel config that is reporting
> the problem and the linux-next kernel config that started using CONFIG_PREEMPT_RT=y,
> I suspect that there are some differences (such as testcases to run, hardware used
> for testing) between "upstream" and "linux-next".

Hi Tetsuo,

The problem is that we are interested in testing many upstream configurations.
CONFIG_PREEMPT_RT=y/n is just one dimension, there are also different
arches, different tools (KASAN/KMSAN/KCSAN), different security
modules, and many more.
However, we cannot afford to test every combination, so we must
combine several choices into a single instance, hoping to cover each
option at least once.
"upstream" vs "linux-next" is yet another dimension, and it isn't
special enough to justify multiplying the number of instances by two.
Mirroring a particular upstream instance on linux-next will result in
someone else requesting to switch to another upstream instance.
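
The trade-off described above can be sketched roughly as follows. The dimension names and values are hypothetical placeholders rather than syzbot's real instance list, and real-world constraints (for example, a tool that is unavailable on some arch) are ignored:

```python
from itertools import product

# Hypothetical dimensions and values, for illustration only.
dimensions = {
    "arch": ["x86_64", "arm64", "riscv64"],
    "tool": ["KASAN", "KMSAN", "KCSAN"],
    "preempt_rt": ["y", "n"],
    "tree": ["upstream", "linux-next"],
}

# Testing every combination needs one instance per element of the product.
full = list(product(*dimensions.values()))


def cover_each_option_once(dims):
    """Build the smallest set of instances in which every option of every
    dimension appears at least once, by cycling through each value list."""
    width = max(len(values) for values in dims.values())
    return [
        {name: values[i % len(values)] for name, values in dims.items()}
        for i in range(width)
    ]


instances = cover_each_option_once(dimensions)

print(len(full), "instances for every combination")
print(len(instances), "instances to cover each option once")
for instance in instances:
    print(instance)
```

With these made-up values, each option is covered, yet no instance pairs tree=linux-next with preempt_rt=y, which is the kind of gap discussed in this thread.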

Dynamic instance creation that you suggested on another thread would
have resolved this situation, but we aren't there yet.

> Since linux-next is the only kernel to which I can temporarily apply custom patches
> via my tree, being able to reproduce the problem in "linux-next" is important.
> How can we make "upstream" and "linux-next" testing behave identically except for the
> kernel and the kernel config? (The same situation applies to "unregister_netdevice:
> waiting for DEV to become free (8)" at https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 .
> Several remaining bugs are reproduced only about once a month in linux-next kernels
> while they are reproduced almost every day in other kernels, which makes it too difficult
> to apply custom patches via my tree for obtaining more debug information.)

Aleksandr says we can create a special mailing list to test draft
patches, so the series sent to that list is fuzzed for some time,
similarly to how upstream patch testing works now.
Do you think that would help you to debug these issues?

>
> By the way, the patch above resulted in the discovery of "BUG: sleeping function called from
> invalid context" bugs that had not been discovered previously, due to CONFIG_PREEMPT_RT=n
> or a difference in behavior. I think that testing with various kernel configs (e.g.
> both CONFIG_PREEMPT_RT=y and CONFIG_PREEMPT_RT=n) for each kernel (e.g. "upstream",
> "linux-next", "net" and so on) would help discover more bugs.



--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

Tetsuo Handa

May 7, 2026, 12:29:32 PM
to Alexander Potapenko, syzkaller
On 2026/05/07 19:21, Alexander Potapenko wrote:
> On Sat, Apr 18, 2026 at 8:35 AM Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
>>
>> I'm currently trying to debug "WARNING in kcov_remote_start (6)" at
>> https://syzkaller.appspot.com/bug?extid=3f51ad7ac3ae57a6fdcc .

A patch for the above problem has been proposed. Please review and test it.



>> I suspect that there are some differences (such as testcases to run, hardware used
>> for testing) between "upstream" and "linux-next".

> The problem is that we are interested in testing many upstream configurations.
> CONFIG_PREEMPT_RT=y/n is just one dimension, there are also different
> arches, different tools (KASAN/KMSAN/KCSAN), different security
> modules, and many more.

Yes.

> However, we cannot afford to test every combination, so we must
> combine several choices into a single instance, hoping to cover each
> option at least once.

Yes.

> "upstream" vs "linux-next" is yet another dimension, and it isn't
> special enough to justify multiplying the number of instances by two.
> Mirroring a particular upstream instance on linux-next will result in
> someone else requesting to switch to another upstream instance.

My question is what factors are making problems impossible or difficult
to reproduce.

For example, if upstream is running on 10000 VM instances and linux-next is
running on 10 VM instances, it might be 1000 times more difficult to reproduce
the same problem in linux-next than in upstream. Does the number of VM instances
vary depending on the manager (or tree)?
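
A rough sketch of the arithmetic behind this example, assuming every VM hits the bug independently at the same constant rate. The rate below is a made-up number, and the instance counts are the hypothetical ones from the example, not syzbot's actual figures:

```python
def expected_hours_to_first_crash(num_vms: int, crashes_per_vm_hour: float) -> float:
    """Expected wait until the first crash when each VM crashes independently
    at a constant rate: 1 / (num_vms * crashes_per_vm_hour)."""
    if num_vms <= 0 or crashes_per_vm_hour <= 0:
        raise ValueError("num_vms and crashes_per_vm_hour must be positive")
    return 1.0 / (num_vms * crashes_per_vm_hour)


RATE = 1e-4  # hypothetical: crashes per VM per hour

upstream_hours = expected_hours_to_first_crash(10_000, RATE)
linux_next_hours = expected_hours_to_first_crash(10, RATE)

print(f"upstream:   {upstream_hours:.1f} hours")
print(f"linux-next: {linux_next_hours:.1f} hours")
print(f"ratio:      {linux_next_hours / upstream_hours:.0f}x")
```

Under this assumption the ratio depends only on the number of VMs, so any difference in fuzzing targets or hardware settings would come on top of it.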

For another example, if upstream is fuzzing functionality1, functionality2
and functionality3 while linux-next is fuzzing only functionality1 and functionality2,
it is impossible to reproduce bugs in functionality3 using linux-next.
Do the fuzzing targets vary depending on the manager (or tree)?

For yet another example, if upstream is using hardware settings 1 while
linux-next is using hardware settings 2, it is impossible to reproduce bugs
which happen only with hardware settings 1 using linux-next.
Do the hardware settings vary depending on the manager (or tree)?

As a final example, if the syzbot dashboard prefers saving crashes from trees other than
"linux-next" over the same crashes from the "linux-next" tree, it will become difficult to obtain
debug information from a debug printk() patch guarded by CONFIG_DEBUG_AID_FOR_SYZBOT=y.
Are all crashes saved to the dashboard evenly, or is there some barrier that prevents
newer crashes (which might include debug information that does not exist in older
crashes) from being saved to the dashboard?

> Aleksandr says we can create a special mailing list to test draft
> patches, so the series sent to that list is fuzzed for some time,
> similarly to how upstream patch testing works now.
> Do you think that would help you to debug these issues?

That sounds like "carrying custom patches to some degree". But the answer
depends on whether the series sent to that list is fuzzed under the same conditions
(such as the number of VM instances, fuzzing targets and hardware settings) as the
unpatched managers / trees.

Tetsuo Handa

May 9, 2026, 10:20:31 AM
to Alexander Potapenko, syzkaller
On 2026/05/08 1:29, Tetsuo Handa wrote:
> As a final example, if the syzbot dashboard prefers saving crashes from trees other than
> "linux-next" over the same crashes from the "linux-next" tree, it will become difficult to obtain
> debug information from a debug printk() patch guarded by CONFIG_DEBUG_AID_FOR_SYZBOT=y.
> Are all crashes saved to the dashboard evenly, or is there some barrier that prevents
> newer crashes (which might include debug information that does not exist in older
> crashes) from being saved to the dashboard?

I'm currently monitoring https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28 ,
and I can observe that the crash counter sometimes increases without a corresponding crash row appearing.

516th = 2026/05/09 10:17
518th = 2026/05/09 11:40
519th = 2026/05/09 13:21
521st = 2026/05/09 14:10

Something is preventing several crash records from becoming visible on the dashboard page.
I hope that the dropped records are not from the linux-next tree, which I'm monitoring.
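
The gaps can be listed mechanically. This is a minimal sketch that assumes the four ordinals above are the only rows visible for that time window; the function name is made up for illustration:

```python
def missing_ordinals(observed):
    """Return the crash ordinals between the smallest and largest observed
    value that have no visible row."""
    seen = set(observed)
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Ordinals that had a visible row on the dashboard page.
observed = [516, 518, 519, 521]

print(missing_ordinals(observed))  # prints [517, 520]
```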
