syzbot bisection analysis

Dmitry Vyukov

Mar 27, 2019, 1:20:13 PM
to syzkaller, LKML, Linus Torvalds, Tetsuo Handa, Theodore Ts'o, Andrey Ryabinin, Ido Schimmel, Al Viro
Hello,

As most of you probably already noticed, syzbot started bisecting
cause commits for crashes about 2 weeks ago and sending emails like
this:
https://groups.google.com/d/msg/syzkaller-bugs/2XhfN2Kfbqs/0U3YnKsGBQAJ
The bisection results are also available on the dashboard, e.g.:
https://syzkaller.appspot.com/bug?id=02bde0600a225e8efa31bdce2e7f1b822542fef1

Bisection was probably the most popular feature request for syzbot.
Cause commits make it possible to CC the right people and should also
help pinpoint the harder bugs. If you are interested in details of the
bisection process, some are described here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md#bisection
The next step will be fix commit bisection, to help identify and
close bugs that are already fixed but that syzbot is not yet aware of.

As expected, automatic bisection of kernel bugs is not completely
trivial and we've got lots of incorrect results. To better understand
what happens, why and how we are doing, I've analyzed the 118
bisections that we have so far for the following metrics:
- if the bisection was correct or not
- if the crash has multiple manifestations (on the same commit or on
different commits)
- if the bug being hard to reproduce contributed to incorrect bisection
- if unrelated bugs contributed to incorrect bisection
- if skipped commits contributed to incorrect bisection
- if disabled configs contributed to incorrect bisection
There are also some auto-extracted metrics like the start release of
bisection, start/end crash, etc. I won't claim that the analysis is
100% correct, which would require spending a day on each case. But it
should be 95% correct or so. The results are here (there is a second
tab with raw data):
https://docs.google.com/spreadsheets/d/1WdBAN54-csaZpD3LgmTcIMR7NDFuQoOZZqPZ-CUqQgA

The total success rate is slightly above 50%. But there is a strong
correlation with how far back in history we have to go: for recently
introduced bugs the rate is 70+%. And for bugs introduced since v5.0
it's 95%. So hopefully this is a good forecast for the future.

The 2 major contributors to incorrect results look quite fundamental:
- unrelated bugs contributed to 66% of incorrect results
- hard to reproduce bugs contributed to 46% of incorrect results
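
To illustrate why hard-to-reproduce bugs hurt so much, here is a rough
back-of-the-envelope sketch. It is not how syzbot computes anything: the
reproduction probability, the number of runs per commit and the number of
bisection steps are all made-up inputs, and runs are assumed to be
independent.

```python
def miss_probability(p_repro: float, runs: int) -> float:
    """Chance that a bad commit looks good: none of `runs` runs crashes.

    Assumes every run reproduces the bug independently with
    probability p_repro (a hypothetical simplification).
    """
    return (1.0 - p_repro) ** runs


def wrong_bisection_probability(p_repro: float, runs: int, bad_steps: int) -> float:
    """Chance that at least one of `bad_steps` tests on bad commits
    is misjudged as good, which sends the bisection the wrong way."""
    return 1.0 - (1.0 - miss_probability(p_repro, runs)) ** bad_steps


# Hypothetical numbers: the bug fires in 20% of runs, each commit is
# tested 10 times, and 6 bisection steps land on bad commits.
print(miss_probability(0.2, 10))                # about 0.11
print(wrong_bisection_probability(0.2, 10, 6))  # roughly 0.49
```

Under these assumed numbers a single step is wrong only about one time
in nine, but the errors compound over the steps, so the bisection as a
whole goes wrong roughly half of the time.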

I've started collecting feedback/ideas re improving bisection quality here:
https://github.com/google/syzkaller/issues/1051
But so far no magic bullet has come up. So please continue treating the
results with understanding. The incorrect results were usually easy to
identify: a commit to a completely unrelated subsystem, or even to a
non-current arch. There is always a detailed bisection log attached as
well.

If you are still here, there were some curious cases too, e.g.:
A bug bisected to a comment-only commit:
https://groups.google.com/d/msg/syzkaller-bugs/1BSkmb_fawo/vz7GhBd0CQAJ
A bug bisected to a release tag:
https://groups.google.com/d/msg/syzkaller-bugs/38HP_pUXJ3s/ehD37HSxDAAJ
And a fault-injection-provoked bug bisected to addition of the fault
injection facility by me (which is, well, kinda expected):
https://groups.google.com/d/msg/syzkaller-bugs/GYiA5CKTPXw/MA4mO01wDAAJ

Thanks

Dmitry Vyukov

Jul 29, 2019, 7:08:30 AM
to syzkaller, LKML, Tetsuo Handa, Eric Biggers, Eric Dumazet, Catalin Marinas
A mini analysis of memory leaks bisection:
https://docs.google.com/spreadsheets/d/1WdBAN54-csaZpD3LgmTcIMR7NDFuQoOZZqPZ-CUqQgA/edit#gid=1421280815

Out of 12 leak bisections that happened so far only 2 are semi-correct:
1. The bug is too old, so syzbot just proved that it happens back to
the horizon (v4.1):
https://syzkaller.appspot.com/bug?id=ecc7f04cd94b5c062c000865d43bfb682d718b8e
2. The bug requires both leak detection and fault injection, and was
bisected to the commit that allows them to work together:
https://syzkaller.appspot.com/bug?id=58c436c13ed984480edba66a224daff9c184de12
Neither result is very useful.

The remaining 10 were all diverted by other unrelated memory leaks
and other non-leak bugs. There seem to be 2 main reasons for this:
1. Lots of leaks are old (kernel is under-tested with KMEMLEAK).
2. Lots of unrelated bugs.
It's unclear how much KMEMLEAK's potential for false positives is in
play. For example, lots of bisections are diverted by "memory leak in
batadv_tvlv_handler_register", but this is a true bug reported at:
https://syzkaller.appspot.com/bug?id=0654529ad3cc1d67a6d9812d8b75489c03dfb983
However, some are diverted by e.g. "memory leak in __neigh_create" and
"memory leak in copy_process", and these were not reported as separate
leaks, so they are either false positives or true leaks fixed in
previous releases.

I am going to turn off leak bisection for now (the easiest way to raise
average bisection precision).
Maybe when we get overall bug levels down, and in particular the number
of memory leaks, newly introduced leaks will be at least as
bisectable as other bug types. But this will probably require at least
several years of active work.

Catalin Marinas

Jul 29, 2019, 9:36:23 AM
to Dmitry Vyukov, syzkaller, LKML, Tetsuo Handa, Eric Biggers, Eric Dumazet
Hi Dmitry,

On Mon, Jul 29, 2019 at 01:08:16PM +0200, Dmitry Vyukov wrote:
> The remaining 10 were all diverged due to other unrelated memory leaks
> and other non-leak bugs. It seems the main 2 reasons for this:
> 1. Lots of leaks are old (kernel is under-tested with KMEMLEAK).
> 2. Lots of unrelated bugs.
> It's unclear how much KMEMLEAK potential for false positives is in
> play. For example, lots of bisections are diverged by "memory leak in
> batadv_tvlv_handler_register", but this is a true bug reported at:
> https://syzkaller.appspot.com/bug?id=0654529ad3cc1d67a6d9812d8b75489c03dfb983
> However, some are diverged by e.g. "memory leak in __neigh_create" and
> "memory leak in copy_process" and these were not reported as separate
> leaks, so either false positives or true leaks fixed in previous
> releases.

Out of curiosity, when the tool tries to bisect a memory leak, does it
check for precisely that leak (e.g. by function name, object size) or
can any other unrelated leak confuse the bisection?

--
Catalin

Dmitry Vyukov

Jul 30, 2019, 5:16:55 AM
to Catalin Marinas, syzkaller, LKML, Tetsuo Handa, Eric Biggers, Eric Dumazet
Bisection of leaks uses the common scheme which is just "crashed"/"not
crashed" (black/white, no further classification) for reasons outlined
here:
https://groups.google.com/forum/#!msg/syzkaller/sR8aAXaWEF4/tTWYRgvmAwAJ
Consider that the object size may change across revisions, or the
function may be renamed, or the code may change. Even if we take just
leaks, I am not sure it's possible to tell whether it's the same leak
or not.
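
To make the black/white scheme concrete, here is a minimal sketch of how
a plain binary search with a "crashed"/"not crashed" predicate gets
pulled away by an unrelated bug. This is not syzbot's code: the commit
names, the two bug ranges and the helper functions are all hypothetical,
and the predicate is assumed to be deterministic.

```python
from typing import Callable, Sequence


def bisect_first_bad(commits: Sequence[str],
                     crashes: Callable[[str], bool]) -> str:
    """Binary search for the first commit where crashes() is True.

    Assumes commits[0] is known good and commits[-1] is known bad.
    The result is only meaningful if the predicate is monotonic
    (all good commits come before all bad ones).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history c0..c9. The leak we care about appears in c7.
# An unrelated bug exists in c2..c4 and is fixed in c5.
commits = [f"c{i}" for i in range(10)]
target_bug = {f"c{i}" for i in range(7, 10)}
unrelated_bug = {f"c{i}" for i in range(2, 5)}


def crashed_target_only(commit: str) -> bool:
    # Idealized predicate that recognizes only the bug being bisected.
    return commit in target_bug


def crashed_any(commit: str) -> bool:
    # Black/white predicate: any crash counts as "bad".
    return commit in target_bug or commit in unrelated_bug


print(bisect_first_bad(commits, crashed_target_only))  # c7
print(bisect_first_bad(commits, crashed_any))          # c2
```

With the black/white predicate the search converges on c2, the commit
that introduced the unrelated bug, because the crashes at c2..c4 break
the monotonicity that binary search relies on.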