Question about prog minimization

Daniel Wappner

Jun 5, 2026, 4:34:59 PM
to syzkaller
Hi,

As I'm working on some modifications on Syzkaller, something caught my
attention regarding the minimization process that gets triggered
during a triageJob.

The way I understand it, at the beginning of the triage process,
deflake computes in stableSignal and newStableSignal the set of all
signal that is "stable" (i.e. it can be reproduced across many
executions of the program) and the set of such signal that is new to
the fuzzer, respectively.
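
A minimal sketch of the deflake step as described above. The `Signal` type, the `minRuns` threshold, and both function names are hypothetical and chosen for illustration; they are not syzkaller's actual types or API. The sketch assumes "stable" means "seen in at least `minRuns` of the executions".

```go
package main

// Signal is a hypothetical stand-in for a set of coverage signal elements;
// it is not syzkaller's actual signal type.
type Signal map[uint32]struct{}

// stableSignal returns the elements that appear in at least minRuns of the
// given runs (the assumed definition of "stable" in this sketch).
func stableSignal(runs []Signal, minRuns int) Signal {
	counts := make(map[uint32]int)
	for _, run := range runs {
		for elem := range run {
			counts[elem]++
		}
	}
	stable := make(Signal)
	for elem, n := range counts {
		if n >= minRuns {
			stable[elem] = struct{}{}
		}
	}
	return stable
}

// newSignal returns the elements of stable that the fuzzer has not seen yet,
// i.e. the ones missing from known.
func newSignal(stable, known Signal) Signal {
	fresh := make(Signal)
	for elem := range stable {
		if _, seen := known[elem]; !seen {
			fresh[elem] = struct{}{}
		}
	}
	return fresh
}
```

In these terms, stableSignal would correspond to the result of `stableSignal(runs, minRuns)` and newStableSignal to `newSignal(stable, known)`, where `known` is whatever signal the fuzzer already has.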

Later down the line, after it's been confirmed that newStableSignal is
nonempty, minimize attempts to make the program smaller before
smashing it and saving it into the corpus, and relies on
minimization.go's Minimize by passing a lambda function that checks
whether any signal was lost as the algorithm minimizes the program.

Said lambda executes the new minimized program a number of times, and
certifies it's acceptable if the union of the signal gathered along
those executions adds up to newStableSignal. In other words, it's a
bit generous about the minimized program maintaining newStableSignal;
while it makes no promises on whether stableSignal was preserved or
not.
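
The acceptance check described above could look roughly like the following. This is a sketch under stated assumptions, not the code in syzkaller: `Signal`, `keepsNewSignal`, the `execute` callback, and the `attempts` parameter are all hypothetical names. It only illustrates the "union across runs must cover newStableSignal" rule, and that stableSignal plays no part in the decision.

```go
package main

// Signal is a hypothetical stand-in for a set of coverage signal elements.
type Signal map[uint32]struct{}

// containsAll reports whether every element of want is present in have.
func containsAll(have, want Signal) bool {
	for elem := range want {
		if _, ok := have[elem]; !ok {
			return false
		}
	}
	return true
}

// keepsNewSignal runs the candidate program up to attempts times through the
// hypothetical execute callback, unions the signal of all runs, and accepts
// the candidate as soon as the union covers newStable. Elements that are in
// stableSignal but not in newStable are never checked.
func keepsNewSignal(execute func() Signal, attempts int, newStable Signal) bool {
	union := make(Signal)
	for i := 0; i < attempts; i++ {
		for elem := range execute() {
			union[elem] = struct{}{}
		}
		if containsAll(union, newStable) {
			return true
		}
	}
	return containsAll(union, newStable)
}
```

Note that a candidate is accepted even if no single run produces all of newStable, as long as the runs together do.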

What bugs me is that when the minimized program is
eventually saved into the corpus, it's reported to provide the same
stableSignal that was calculated during deflake. The way I understand
it, the place that this may mainly affect is corpus minimization down
the line, where the saved program could theoretically be kept as a
witness of some signal that it doesn't actually provide.

I mainly have questions regarding what I just described. Have I
understood the implications of how this process works correctly? If
so, I'd love to know some of the reasoning behind these design
decisions, as to me it seems a bit like buggy behaviour.

Thanks a lot in advance!

Best,
Daniel

Aleksandr Nogikh

7:22 AM
to Daniel Wappner, syzkaller
Hi Daniel,

On Fri, Jun 5, 2026 at 10:34 PM Daniel Wappner <danieli...@gmail.com> wrote:
>
> Hi,
>
> As I'm working on some modifications on Syzkaller, something caught my
> attention regarding the minimization process that gets triggered
> during a triageJob.
>
> The way I understand it, at the beginning of the triage process,
> deflake computes in stableSignal and newStableSignal the set of all
> signal that is "stable" (i.e. it can be reproduced across many
> executions of the program) and the set of such signal that is new to
> the fuzzer, respectively.
>
> Later down the line, after it's been confirmed that newStableSignal is
> nonempty, minimize attempts to make the program smaller before
> smashing it and saving it into the corpus, and relies on
> minimization.go's Minimize by passing a lambda function that checks
> whether any signal was lost as the algorithm minimizes the program.
>
> Said lambda executes the new minimized program a number of times, and
> certifies it's acceptable if the union of the signal gathered along
> those executions adds up to newStableSignal. In other words, it's a
> bit generous about the minimized program maintaining newStableSignal;
> while it makes no promises on whether stableSignal was preserved or
> not.

The description above is correct.

>
> What bugs me is that when the minimized program is
> eventually saved into the corpus, it's reported to provide the same
> stableSignal that was calculated during deflake. The way I understand
> it, the place that this may mainly affect is corpus minimization down
> the line, where the saved program could theoretically be kept as a
> witness of some signal that it doesn't actually provide.
>
> I mainly have questions regarding what I just described. Have I
> understood the implications of how this process works correctly? If
> so, I'd love to know some of the reasoning behind these design
> decisions, as to me it seems a bit like buggy behaviour.

If some signal is in stableSignal, it must have appeared at least a
few times during the deflake runs. I don't have any figures, but I'd
assume that it shouldn't be very common for a syscall to preserve
newStableSignal while losing the ability to trigger some parts of
stableSignal (though this should be fairly easy to check in an
experiment).

When writing all that code, we were concerned about not losing
signal/coverage during corpus triage after a syz-manager restart:
coverage figures used to jump up and down from day to day instead of
growing slowly. They still jump, but to a smaller degree,
especially on the snapshot instance.

We've also been facing issues with kernel state accumulation. We do
isolate program executions from each other, but not perfectly, so it's
totally expected that the same program may give slightly different
coverage across runs on different VMs.

So these are the two reasons why the logic ended up being generous to
the programs that don't produce 100% stable coverage.

AFAIK we haven't looked at signal set calculation from the perspective
of the corpus minimization, this is a very interesting point. If we do
overestimate the signal a program can trigger, the corpus minimization
logic could indeed end up deleting the programs that actually trigger
that signal.
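
To make the failure mode concrete, here is a small sketch. It assumes a greedy, largest-first corpus minimization that trusts the recorded signal of each program; that strategy, and the names `Entry`, `Claimed`, `Actual`, `minimizeCorpus`, and `actualCoverage`, are assumptions made for illustration and may not match what syzkaller does.

```go
package main

import "sort"

// Signal is a hypothetical stand-in for a set of coverage signal elements.
type Signal map[uint32]struct{}

// Entry is a hypothetical corpus entry. Claimed is the signal recorded when
// the program was saved; Actual is what the saved program really triggers.
type Entry struct {
	Name    string
	Claimed Signal
	Actual  Signal
}

// minimizeCorpus keeps an entry only if its claimed signal adds something
// not yet covered, visiting entries with larger claimed signal first.
func minimizeCorpus(entries []Entry) []Entry {
	sorted := append([]Entry(nil), entries...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return len(sorted[i].Claimed) > len(sorted[j].Claimed)
	})
	covered := make(Signal)
	var kept []Entry
	for _, e := range sorted {
		adds := false
		for elem := range e.Claimed {
			if _, ok := covered[elem]; !ok {
				covered[elem] = struct{}{}
				adds = true
			}
		}
		if adds {
			kept = append(kept, e)
		}
	}
	return kept
}

// actualCoverage unions the signal that the kept entries really trigger.
func actualCoverage(kept []Entry) Signal {
	total := make(Signal)
	for _, e := range kept {
		for elem := range e.Actual {
			total[elem] = struct{}{}
		}
	}
	return total
}
```

With an entry A that claims {1, 2, 3} but actually triggers only {1, 2}, and an entry B that both claims and triggers {3}, this sketch keeps A and drops B, so element 3 is no longer triggered by anything left in the corpus.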

>
> Thanks a lot in advance!
>
> Best,
> Daniel
>

--
Aleksandr