A patch for the above problem was proposed. Please review and test.
>> I suspect that there are some differences (such as testcases to run, hardware used
>> for testing) between "upstream" and "linux-next".
> The problem is that we are interested in testing many upstream configurations.
> CONFIG_PREEMPT_RT=y/n is just one dimension, there are also different
> arches, different tools (KASAN/KMSAN/KCSAN), different security
> modules, and many more.
Yes.
> However, we cannot afford to test every combination, so we must
> combine several choices into a single instance, hoping to cover each
> option at least once.
Yes.
> "upstream" vs "linux-next" is yet another dimension, and it isn't
> special enough to justify multiplying the number of instances by two.
> Mirroring a particular upstream instance on linux-next will result in
> someone else requesting to switch to another upstream instance.
My question is what factors are making problems impossible or difficult
to reproduce.
For example, if upstream is running on 10000 VM instances and linux-next is
running on 10 VM instances, the same problem might be 1000 times harder to
reproduce in linux-next than in upstream. Does the number of VM instances
vary depending on the manager (or tree)?
As another example, if upstream is fuzzing functionality1, functionality2
and functionality3 while linux-next is fuzzing only functionality1 and
functionality2, it is impossible to reproduce bugs in functionality3 using
linux-next. Do fuzzing targets vary depending on the manager (or tree)?
As yet another example, if upstream is using hardware settings 1 while
linux-next is using hardware settings 2, it is impossible to reproduce bugs
which happen only with hardware settings 1 using linux-next.
Do hardware settings vary depending on the manager (or tree)?
As a final example, if the syzbot dashboard prefers saving crashes from trees
other than "linux-next" over the same crashes from the "linux-next" tree, it
will become difficult to obtain debug information from debug printk() patches
guarded by CONFIG_DEBUG_AID_FOR_SYZBOT=y. Are all crashes saved to the
dashboard evenly, or is there some barrier that prevents newer crashes (which
might include debug information that does not exist in older crashes) from
being saved to the dashboard?
> Aleksandr says we can create a special mailing list to test draft
> patches, so the series sent to that list is fuzzed for some time,
> similarly to how upstream patch testing works now.
> Do you think that would help you to debug these issues?
That sounds like "to some degree carrying custom patches". But the answer
depends on whether the series sent to that list is fuzzed under the same
conditions (such as number of VM instances, fuzzing targets, and hardware
settings) as the unpatched managers / trees.