Column for keywords?


Tetsuo Handa

Apr 25, 2018, 7:15:56 AM
to dvy...@google.com, syzk...@googlegroups.com
Hello.

Is it possible to append a column to the tables at https://syzkaller.appspot.com/ ?

Since there are many reports, it would be helpful to add a keywords column (e.g.
the pathname of the file which seems to be the culprit, the name of a directory or
subsystem which seems to be involved, or a few words on the problem description/status)
controlled by users (e.g. a syzbot command for filling in the content of that column,
which is overwritten by the latest post:

#syz keywords: subsystem1, subsystem2
#syz keywords: path/to/file.c
#syz keywords: bisected
#syz keywords: patch needs reviewers
#syz keywords:

).
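
A minimal sketch of how such a command could be interpreted, assuming the "#syz keywords:" syntax proposed above (it is only a proposal, not an existing syzbot command) and assuming posts are available as plain-text bodies in chronological order. The function names are hypothetical.

```python
import re

# Hypothetical command syntax taken from the proposal above; this is not
# an existing syzbot command.
KEYWORDS_RE = re.compile(r"^#syz keywords:(.*)$", re.MULTILINE)


def parse_keywords(body):
    """Return keywords from the last '#syz keywords:' line, or None if absent."""
    matches = KEYWORDS_RE.findall(body)
    if not matches:
        return None
    # An empty command ("#syz keywords:") yields an empty list, i.e. clears the column.
    return [k.strip() for k in matches[-1].split(",") if k.strip()]


def current_keywords(posts):
    """Replay posts oldest-first; the latest post containing the command wins."""
    keywords = []
    for body in posts:
        parsed = parse_keywords(body)
        if parsed is not None:
            keywords = parsed
    return keywords
```

Posts without the command leave the column untouched, so ordinary replies would not erase keywords set earlier.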

Dmitry Vyukov

Apr 25, 2018, 8:25:07 AM
to Tetsuo Handa, syzkaller
Hi Tetsuo,

I need to think more about this.

Some assorted first thoughts:

1. Collecting/showing subsystem was already proposed by Guenter Roeck:
https://github.com/google/syzkaller/issues/544

2. Syzkaller already knows about "guilty" file (that's how it finds
maintainers). File name should be mappable to a subsystem either via
MAINTAINERS or with some custom path-based logic if MAINTAINERS gives
too fine-grained split (MAINTAINERS has 1771 subsystems). We can thread
this info to the UI. Though, there is currently no way to override it
if it's not correct.

3. Can we extract more info automatically? My concern is that few
people (like you) will set some tags for few bugs. But they generally
won't be maintained, thus any reasoning based on manual tags won't be
very useful. Consider that a kernel developer wants to look only at
subsystem:CRYPTO bugs, there can be lots of them, but untagged, so they
will see 0 bugs and conclude they have nothing to do. Any automatic
tagging is more valuable than manual (and less burden on humans).

4. How do you want to use these tags? What's the workflow?

5. Some of the examples you provided seem to relate to the "current
status/progress" for a bug. What we do sometimes is the following (and
I think it's a common practice for "bug tracking" systems). We briefly
describe current status in the email thread (e.g. "patch X mailed,
waiting for review"). Then when you later open it, you can see the
last message and recover the current status. It's also useful for
other people to see, e.g. to not do duplicate work. Will it work for
your cases?

6. Do we want to just show them? Or also somehow affect syzbot
behavior? E.g. filtering by subsystem, or preventing future pings for
this bug? How many of them need to be machine-understandable? My
concern is that if they all are just unsystematic "plain English",
then there will be no way for computer programs to understand them and
change behavior. Do we want some system there?
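
For point 2, the "custom path-based logic" could look roughly like the sketch below. The prefix table, the subsystem names, and the `override` argument are all hypothetical; nothing here is taken from syzkaller's code. The override illustrates one way to handle the case where the automatic guess is not correct.

```python
# Hypothetical prefix table for illustration only. A real table could be
# generated from the "F:" patterns in MAINTAINERS or curated by hand.
SUBSYSTEM_PREFIXES = {
    "fs/ext4/": "ext4",
    "fs/": "fs",
    "net/": "net",
    "crypto/": "crypto",
    "drivers/infiniband/": "rdma",
}


def subsystem_for(path, override=None):
    """Return a coarse subsystem name for a guilty file path.

    A non-empty `override` (e.g. set manually for one bug) takes precedence.
    Otherwise the longest matching prefix wins.
    """
    if override:
        return override
    best = ""
    for prefix in SUBSYSTEM_PREFIXES:
        if path.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return SUBSYSTEM_PREFIXES.get(best, "unknown")
```

Longest-prefix matching keeps the table coarse where that is enough ("fs/") while still allowing finer entries ("fs/ext4/") where a subsystem has its own maintainers.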

Tetsuo Handa

Apr 26, 2018, 9:41:51 AM
to dvy...@google.com, syzk...@googlegroups.com
Dmitry Vyukov wrote:
> On Wed, Apr 25, 2018 at 1:15 PM, Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
> > Hello.
> >
> > Is it possible to append a column to the tables at https://syzkaller.appspot.com/ ?
> >
> > Since there are many reports, it would be helpful to add a keywords column (e.g.
> > the pathname of the file which seems to be the culprit, the name of a directory or
> > subsystem which seems to be involved, or a few words on the problem description/status)
> > controlled by users (e.g. a syzbot command for filling in the content of that column,
> > which is overwritten by the latest post:
> >
> > #syz keywords: subsystem1, subsystem2
> > #syz keywords: path/to/file.c
> > #syz keywords: bisected
> > #syz keywords: patch needs reviewers
> > #syz keywords:
>
>
> Hi Tetsuo,
>
> I need to think more about this.
>
> Some assorted first thoughts:
>
> 1. Collecting/showing subsystem was already proposed by Guenter Roeck:
> https://github.com/google/syzkaller/issues/544

OK.

>
> 2. Syzkaller already knows about "guilty" file (that's how it finds
> maintainers). File name should be mappable to a subsystem either via
> MAINTAINERS or with some custom path-based logic if MAINTAINERS gives
> too fine-grained split (MAINTAINERS has 1771 subsystems). We can thread
> this info to the UI. Though, there is currently no way to override it
> if it's not correct.

I wonder whether the Maintainers column of the "All crashes" table is useful.
In most cases, the same addresses are shown repeatedly for each row.
It could be removed or moved to the {Status:,Reported-by:,First:,last:} area.

>
> 3. Can we extract more info automatically? My concern is that few
> people (like you) will set some tags for few bugs. But they generally
> won't be maintained, thus any reasoning based on manual tags won't be
> very useful. Consider that a kernel developer wants to look only at
> subsystem:CRYPTO bugs, there can be lots of them, but untagged, so they
> will see 0 bugs and conclude they have nothing to do. Any automatic
> tagging is more valuable than manual (and less burden on humans).

Sorry. My concern is how to manually compensate where automatic
processing fails. Automatic tagging is valuable and less of a burden
on humans only if the tags are appropriate. I want manual tagging.

>
> 4. How do you want to use these tags? What's the workflow?

For describing the current status, possibly including trivial notes like
"$(MyName) is analyzing now" which are not worth spamming LKML with.

>
> 5. Some of the examples you provided seem to relate to the "current
> status/progress" for a bug. What we do sometimes is the following (and
> I think it's a common practice for "bug tracking" systems). We briefly
> describe current status in the email thread (e.g. "patch X mailed,
> waiting for review"). Then when you later open it, you can see the
> last message and recover the current status. It's also useful for
> other people to see, e.g. to not do duplicate work. Will it work for
> your cases?

The current table lacks a "Last modified" field which would be updated when
there is news other than "the crash occurred again" (e.g. a syz/C
reproducer was found, a message was posted). Therefore, I can't check
whether somebody is working on a report unless I open each entry.

>
> 6. Do we want to just show them? Or also somehow affect syzbot
> behavior? E.g. filtering by subsystem, or preventing future pings for
> this bug? How many of them need to be machine-understandable? My
> concern is that if they all are just unsystematic "plain English",
> then there will be no way for computer programs to understand them and
> change behavior. Do we want some system there?

This tag does not affect syzbot behavior.

Dmitry Vyukov

May 15, 2018, 2:27:31 PM
to Tetsuo Handa, syzkaller
I've filed https://github.com/google/syzkaller/issues/608 to not lose
track of this.


Tetsuo Handa

May 15, 2018, 4:59:19 PM
to dvy...@google.com, syzk...@googlegroups.com
Dmitry Vyukov wrote:
> I've filed https://github.com/google/syzkaller/issues/608 to not lose
> track of this.

Thanks. Since the time lag between a patch being proposed and that patch being
applied to a git tree tends to be long, duplicated work like
https://www.spinics.net/lists/linux-fsdevel/msg125240.html and
http://lkml.kernel.org/r/964a8b27-cd69-357c...@I-love.SAKURA.ne.jp
is already occurring. Therefore, it is important that the state of the bug (e.g.
bisected, cause identified, patch proposed) is visible from the table.

Dmitry Vyukov

May 16, 2018, 10:15:01 AM
to Tetsuo Handa, syzkaller
On Tue, May 15, 2018 at 10:59 PM, Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
> Dmitry Vyukov wrote:
>> I've filed https://github.com/google/syzkaller/issues/608 to not lose
>> track of this.
>
> Thanks. Since the time lag between a patch being proposed and that patch being
> applied to a git tree tends to be long, duplicated work like
> https://www.spinics.net/lists/linux-fsdevel/msg125240.html and
> http://lkml.kernel.org/r/964a8b27-cd69-357c...@I-love.SAKURA.ne.jp
> is already occurring.

This is bad.

> Therefore, it is important that the state of the bug (e.g.
> bisected, cause identified, patch proposed) is visible from the table.

What do you think about the last section of:
https://groups.google.com/d/msg/syzkaller-bugs/nw7BIW9V2wk/NE0P_Au4AQAJ
there is already a way to say "there is a pending fix for this".

But one problem with manual tagging is how to make everybody update
these tags. If only few people do it, it can still lead to duplicate
work. And it's not syzbot-specific. Can happen with just any bug
report on kernel mailing lists. Traditionally it's solved with bug
tracking systems and assigning bugs when a developer starts working on
it. But kernel does not have a working bug tracker.

One simple thing we can do is make syzbot poll more trees to discover
Reported-by tags faster. This will automatically update status on
dashboard to "fix pending". I've filed
https://github.com/google/syzkaller/issues/610 for this. Ideally, we
would intercept all mailed patches, but it's hard with kernel
development process because there is no system that tracks all pending
patches.
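
The polling idea can be sketched as follows, assuming commit messages have already been fetched from the polled trees as (title, message) pairs. The address pattern is an assumption about how syzbot Reported-by tags look, and the function names are hypothetical; this is not syzbot's actual implementation.

```python
import re
from collections import defaultdict

# Assumed address shape: syzbot+<hex id>@syzkaller.appspotmail.com
REPORTED_BY_RE = re.compile(
    r"^Reported-by:.*?(syzbot\+[0-9a-f]+@syzkaller\.appspotmail\.com)",
    re.MULTILINE | re.IGNORECASE,
)


def reported_bugs(commit_message):
    """Return the syzbot addresses named in Reported-by tags of one commit."""
    return [addr.lower() for addr in REPORTED_BY_RE.findall(commit_message)]


def pending_fixes(commits):
    """Map each reported bug address to the titles of commits that mention it.

    `commits` is an iterable of (title, full_message) pairs gathered from
    whichever trees are being polled.
    """
    fixes = defaultdict(list)
    for title, message in commits:
        for addr in reported_bugs(message):
            fixes[addr].append(title)
    return dict(fixes)
```

A bug whose address appears in the result could then be shown as "fix pending" on the dashboard. Patches that were mailed but not yet committed anywhere remain invisible to this approach.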

Tetsuo Handa

May 16, 2018, 12:50:20 PM
to dvy...@google.com, syzk...@googlegroups.com
Dmitry Vyukov wrote:
> On Tue, May 15, 2018 at 10:59 PM, Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
> > Dmitry Vyukov wrote:
> >> I've filed https://github.com/google/syzkaller/issues/608 to not lose
> >> track of this.
> >
> > Thanks. Since the time lag between a patch being proposed and that patch being
> > applied to a git tree tends to be long, duplicated work like
> > https://www.spinics.net/lists/linux-fsdevel/msg125240.html and
> > http://lkml.kernel.org/r/964a8b27-cd69-357c...@I-love.SAKURA.ne.jp
> > is already occurring.
>
> This is bad.
>
> > Therefore, it is important that the state of the bug (e.g.
> > bisected, cause identified, patch proposed) is visible from the table.
>
> What do you think about the last section of:
> https://groups.google.com/d/msg/syzkaller-bugs/nw7BIW9V2wk/NE0P_Au4AQAJ
> there is already a way to say "there is a pending fix for this".

That lacks a way to annotate "there is a pending fix for this, but the fix
is not yet applied to any git tree". I mean not only "git trees which syzbot
is checking" but also "git trees which are publicly visible".

(Also, while we can later correct the patch using "#syz fix:" in case the patch
title was renamed, it is not clear how to specify multiple patches using
"#syz fix:" when a patch which was meant to fix the reported problem contained
a regression or was incomplete, and thus a fixup patch followed shortly. An
example is commit 5f3e3b85cc0a5eae and commit ef95a90ae6f4f219 in
"WARNING: kmalloc bug in memdup_user (2)". I've tried

#syz fix: RDMA/ucma: Correct option size check using optlen
#syz fix: RDMA/ucma: ucma_context reference leak in error path

but only the former patch was recorded.)
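
One way a handler could accept several titles is to collect every "#syz fix:" line in a message instead of stopping at the first. This is a hypothetical sketch of that behavior, not a description of how syzbot parses the command.

```python
import re

# Matches one "#syz fix: <title>" line and captures the title without
# surrounding spaces or tabs.
FIX_RE = re.compile(r"^#syz fix:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def fix_titles(body):
    """Return all distinct fix titles named in a message, in order of appearance."""
    titles = []
    for title in FIX_RE.findall(body):
        if title not in titles:
            titles.append(title)
    return titles
```

With the two lines tried above in one message, this would return both patch titles rather than only the first.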

>
> But one problem with manual tagging is how to make everybody update
> these tags. If only few people do it, it can still lead to duplicate
> work. And it's not syzbot-specific. Can happen with just any bug
> report on kernel mailing lists. Traditionally it's solved with bug
> tracking systems and assigning bugs when a developer starts working on
> it. But kernel does not have a working bug tracker.
>
> One simple thing we can do is make syzbot poll more trees to discover
> Reported-by tags faster. This will automatically update status on
> dashboard to "fix pending". I've filed
> https://github.com/google/syzkaller/issues/610 for this. Ideally, we
> would intercept all mailed patches, but it's hard with kernel
> development process because there is no system that tracks all pending
> patches.
>

The problem is that the pending fix may not be applied to any git tree yet.
It depends on when reviewers and maintainers can find time for
reviewing/committing the fix. Scanning all git trees is unlikely to help.

The criterion in that section,

  the criteria is that you are "reasonably sure that the commit will
  reach upstream under this title", for whatever reason

won't apply to not yet reviewed patches. What I want is a way to specify
"a patch was proposed but the patch is not yet reviewed/tested/applied".

Generally, progress is not recorded frequently enough to avoid duplicated
work. I want to check not only the "fix pending" stage but also e.g. the "problem
guessed", "bisected", "cause identified", "patch proposed", "patch reviewed"
stages from the top page's table.

Dmitry Vyukov

Jan 4, 2019, 8:01:25 AM
to Tetsuo Handa, syzkaller, LKML, Linus Torvalds, Greg Kroah-Hartman
1. This sounds very much like a general bug tracking system. We
specifically didn't want to go down the slippery slope of implementing
yet another bug tracking system.
2. This problem is not specific to syzbot in any way (just like lost
bug reports). Kernel developers waste time on duplicate work for other
bug reports too.
So I think (1) we need a bug tracking system, (2) use that system for
syzbot to solve this local problem.

Dmitry Vyukov

Jan 4, 2019, 8:26:15 AM
to Tetsuo Handa, syzkaller, LKML, Linus Torvalds, Greg Kroah-Hartman, Theodore Ts'o, dled...@redhat.com
+Ted who also says that it is not possible to make sense out of
current state of kernel bug reports (e.g. what are open bugs for ext4
sorted by priority).
+Doug who says the same re rdma_cm subsystem.

Both said this in the context of syzbot, but I fail to see how this is
in any way syzbot-specific. This highlights a broader problem with the
kernel development process.