Re: "Learning-based Controlled Concurrency Testing"

Dmitry Vyukov

May 17, 2021, 11:36:30 AM5/17/21
to Paul E. McKenney, syzkaller, Marco Elver, kasan-dev
On Wed, May 12, 2021 at 8:18 PM Paul E. McKenney <pau...@kernel.org> wrote:
>
> Hello, Dmitry!
>
> On the perhaps-unlikely off-chance that this is useful new news, there
> is a paper by Mukherjee et al. entitled "Learning-based Controlled
> Concurrency Testing" that suggests use of an augmented coverage state
> as a goal driving random testing. The meat of this discussion is on
> the eighth page, the one labeled as "230:8".
>
> This builds on tools such as American Fuzzy Lop (AFL) that use straight
> coverage as a testing goal by adding carefully abstracted concurrency
> state, such as which locks are held and which threads are blocked/spinning
> on which lock. This of course does not help for lockless algorithms,
> but there are plenty of bugs involving straight locking.
>
> Thoughts?
>
> Thanx, Paul


+syzkaller, kasan-dev

Hi Paul,

Thanks for notifying me, I wasn't aware of this work.

FTR here is a link to the paper I found:
https://www.microsoft.com/en-us/research/uploads/prod/2019/12/QL-OOPSLA-2020.pdf

That's an interesting approach. Initially I wondered how they obtain
the program "state" and calculate the reward, but the "default
observation" thing answered my question.
I think such approaches may be useful in SPIN territory, where we
verify a reasonably local and isolated algorithm, e.g. the Raft
verification they used for benchmarking.
But if we take, say, the whole Linux kernel, then such approaches become
somewhat fragile and impractical: capturing all tasks and mutexes may be
inefficient (state explosion), and controlling all sources of
non-determinism may be infeasible. And at the same time it's
unnecessary, because we still don't have even the most basic
implementation, a random scheduler; that's not even what they are
trying to improve on, it's several steps back from that.
I would start with a random scheduler, maybe with a few simple
heuristics. That should be simple and robust, and I am sure it will
give us enough low-hanging fruit to keep us busy for a prolonged
period of time :) Here are the tracking issues for that:
https://bugzilla.kernel.org/show_bug.cgi?id=209219
https://github.com/google/syzkaller/issues/1891

Maybe you did not mean the Linux kernel at all, I don't know. For
something like RCU verification (like what you did with SPIN) it's
definitely more suitable.
Interestingly, if we have a notion of "state", we can use
coverage-guided fuzzing techniques as well, though I don't see that
mentioned in the text explicitly. You mentioned AFL; did you see this
mentioned in the paper?
They set a goal of maximizing state coverage, but they don't seem to
preserve a "corpus" of schedules that give maximum coverage. If we did
that, we could mutate schedules in the corpus, splice them, or prime the
corpus with context-bound schedules (see CHESS, another seminal paper
from MS Research). Generally, the more techniques we include in the same
feedback loop, the better, because they all start helping each other
progress deeper.

Paul E. McKenney

May 17, 2021, 12:44:12 PM5/17/21
to Dmitry Vyukov, syzkaller, Marco Elver, kasan-dev
My hope is that some very clever notion of "state" would allow
coverage-guided fuzzing techniques to be applied across the full kernel.
Here are a few not-so-clever notions I have thought of, in the hope that
they inspire some notion that is within the realm of sanity:

1. The current coverage state plus the number of locks held by the
current CPU/task. This is not so clever because the PC value
normally implies the number of locks.

It might be possible to do a little bit better by using the
lockdep hash instead of the number of locks, which could help
with code that is protected by a lock selected by the caller.
(A rough sketch of this idea appears right after this list.)

2. #1 above, but the number of locks held globally, not just by
the current CPU/task. This is not so clever because maintaining
the global number of locks held is quite expensive.

3. #2 above, but approximate the number of locks held. The
question is whether there is an approximation that is
both efficient and useful to fuzzing.

4. Run lockdep and periodically stop all the CPUs to gather the
hashes of their current lock state plus PC. The result is a set
of states, one for each pair of CPUs, consisting of the first
CPU's PC and both CPU's lockdep hash. Combine this with the
usual PC-only state.

I could probably talk myself into believing that this one is
clever, but who knows? One not-so-clever aspect is the size of
the state space, but perhaps bloom-filter techniques can help.

5. KCSAN-like techniques, but where marking accesses forgives
nothing. No splats, but instead hash the "conflicting" accesses,
preferably abstracting with type information, and add this hash
to the notion of state. This might not be so clever given how
huge the state space would be, but again, perhaps bloom-filter
techniques can help.

6. Your more-clever ideas here!
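
To make #1 slightly more concrete, here is a rough and purely
hypothetical sketch of folding the held-lock state into the per-PC
coverage element; cover_element() and its hookup to KCOV are invented
for illustration, and the lockdep fields exist only in CONFIG_LOCKDEP
builds:

    /* Hypothetical: fold the current task's held locks into the
     * coverage element recorded for a PC. */
    static unsigned long cover_element(unsigned long pc)
    {
            unsigned long h = pc;
            int i;

            for (i = 0; i < current->lockdep_depth; i++)
                    h = h * 31 + current->held_locks[i].class_idx;

            return h;       /* effectively a (PC, locks-held) pair */
    }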

Thanx, Paul

Marco Elver

May 17, 2021, 1:12:48 PM5/17/21
to Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev
On Mon, 17 May 2021 at 18:44, Paul E. McKenney <pau...@kernel.org> wrote:
[...]
All the above sound like "functional coverage" to me, and could be
implemented on top of a well-thought-out functional coverage API.
Functional coverage is common in the hardware verification space to
drive simulation and model checking; for example, functional coverage
could be "buffer is full" vs just structural (code) coverage which
cannot capture complex state properties like that easily.

Similarly, you could then say things like "number of held locks" or
even alluding to your example (5) above, "observed race on address
range". In the end, with decent functional coverage abstractions,
anything should hopefully be possible.

I've been wondering if this could be something useful for the Linux
kernel, but my guess has always been that it'd not be too-well
received because people don't like to see strange annotations in their
code. But maybe I'm wrong.

The ideal abstraction I've been thinking of isn't just for coverage,
but would also capture temporal properties (inspired by something like
LTL or such), on top of which you can also build coverage. Then we can
specify things like "if I observe some state X, then eventually we
observe state Y", and such logic can also just be used to define
functional coverage of interest (again, all this inspired by what's
already done in hardware verification).

This is of course a ton of work, and I wouldn't want this to be a
pre-requisite for the more concurrency-oriented functional coverage
you suggest above. Just wanted to throw it out there. The major
technical hurdle I think is that of generalization vs. specialization,
and I think specialized functional coverage can probably be
implemented more efficiently. But if it's not supposed to be used in
production, but only for debugging, maybe it's possible.

Thanks,
-- Marco

Vegard Nossum

May 17, 2021, 2:15:09 PM5/17/21
to pau...@kernel.org, Dmitry Vyukov, syzkaller, Marco Elver, kasan-dev
Somewhat tangential in the context of the paper posted (and probably
less clever), and not based on state... but how about a new gcc plugin
that records which struct members are being accessed? You could for
example hash struct name + member name into a single number that can be
recorded AFL-style in a fixed-size bitmap or kcov-style...
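
As a purely illustrative sketch (the callback name, bitmap and hash
layout are invented here, not an existing kcov or plugin interface),
the emitted instrumentation could boil down to something like:

    static u8 member_bitmap[1 << 13];   /* 2^16 bits, AFL-style */

    /* The plugin would emit a call like this before each member access,
     * passing compile-time hashes of the struct and member names. */
    void __record_member_access(u32 struct_hash, u32 member_hash)
    {
            u32 idx = hash_32(struct_hash ^ member_hash, 16);  /* linux/hash.h */

            member_bitmap[idx >> 3] |= 1 << (idx & 7);
    }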

The fundamental idea is to just ignore everything about locking and
concurrent accesses -- if you have the data above you'll know which
independent test cases are likely to *try* accessing the same data (but
from different code paths), so if there's a race somewhere it might be
triggered more easily if they're run concurrently.


Vegard

Paul E. McKenney

May 18, 2021, 4:42:28 PM5/18/21
to Marco Elver, Dmitry Vyukov, syzkaller, kasan-dev
Those were in fact the lines along which I was thinking.

> I've been wondering if this could be something useful for the Linux
> kernel, but my guess has always been that it'd not be too-well
> received because people don't like to see strange annotations in their
> code. But maybe I'm wrong.

I agree that it is much easier to get people to use a tool that does not
require annotations. In fact, it is best if it requires nothing at all
from them...

> My ideal abstractions I've been thinking of isn't just for coverage,
> but to also capture temporal properties (which should be inspired by
> something like LTL or such), on top of which you can also build
> coverage. Then we can specify things like "if I observe some state X,
> then eventually we observe state Y", and such logic can also just be
> used to define functional coverage of interest (again all this
> inspired by what's already done in hardware verification).

Promela/spin provides an LTL interface, but of course cannot handle
much of RCU, let alone of the entire kernel. And LTL can be quite
useful. But in a runtime system, how do you decide when "eventually"
has arrived? The lockdep system does so by tracking entry to idle
and to userspace execution, along with exit from interrupt handlers.
Or did you have something else in mind?

> This is of course a ton of work, and I wouldn't want this to be a
> pre-requisite for the more concurrency-oriented functional coverage
> you suggest above. Just wanted to throw it out there. The major
> technical hurdle I think is that of generalization vs. specialization,
> and I think specialized functional coverage can probably be
> implemented more efficiently. But if it's not supposed to be used in
> production, but only for debugging, maybe it's possible.

No argument with any of this!

Thanx, Paul

Dmitry Vyukov

May 19, 2021, 3:19:45 AM5/19/21
to Vegard Nossum, Paul E. McKenney, syzkaller, Marco Elver, kasan-dev, Mathias Payer
Hi Vegard,

Interesting idea.
Also +Mathias who was interested in dependency analysis between syscalls.

A similar analysis can be done statically as well... I can't make up
my mind which one would be better... both have pros and cons...

However, again, I think we are missing some lower-hanging fruit here.
The current collide mode is super dumb and simple; I added it very
early to trigger at least some races, and it turned out to be effective
enough for now that we never got back to it. The tracking issue for a
better collider, with some ideas, is:
https://github.com/google/syzkaller/issues/612
I think we need to implement that before we do anything more fancy,
simply because we need an engine that can accept and act on the signal
you describe. That engine is independent of the actual signal we use to
determine related syscalls, and it's useful on its own. And we already
have some easy-to-extract dependency information in the syscall
descriptions in the form of /resources/. Namely, if we have two syscalls
operating on, say, SCTP sockets, that's a pretty good signal that they
are related and may operate on the same data.
Once we have that engine, we could plug in more elaborate dynamic
analysis info that gives a much higher quality signal about the
relation of two specific syscall invocations in a specific program.

Marco Elver

May 19, 2021, 5:02:57 AM5/19/21
to Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev
On Tue, 18 May 2021 at 22:42, Paul E. McKenney <pau...@kernel.org> wrote:
[...]
> > All the above sound like "functional coverage" to me, and could be
> > implemented on top of a well-thought-out functional coverage API.
> > Functional coverage is common in the hardware verification space to
> > drive simulation and model checking; for example, functional coverage
> > could be "buffer is full" vs just structural (code) coverage which
> > cannot capture complex state properties like that easily.
> >
> > Similarly, you could then say things like "number of held locks" or
> > even alluding to your example (5) above, "observed race on address
> > range". In the end, with decent functional coverage abstractions,
> > anything should hopefully be possible.
>
> Those were in fact the lines along which I was thinking.
>
> > I've been wondering if this could be something useful for the Linux
> > kernel, but my guess has always been that it'd not be too-well
> > received because people don't like to see strange annotations in their
> > code. But maybe I'm wrong.
>
> I agree that it is much easier to get people to use a tool that does not
> require annotations. In fact, it is best if it requires nothing at all
> from them...

While I'd like to see something like that, because it'd be beneficial
to have properties of the code written down, documenting its behaviour
better while at the same time being machine checkable, like you say, if
it requires additional effort it's a difficult sell. (Although the same
is true for all other efforts to improve reliability that require a
departure from the "way it used to be done", be it data_race(), or
even efforts introducing whole new programming languages to the
kernel.)

> > My ideal abstractions I've been thinking of isn't just for coverage,
> > but to also capture temporal properties (which should be inspired by
> > something like LTL or such), on top of which you can also build
> > coverage. Then we can specify things like "if I observe some state X,
> > then eventually we observe state Y", and such logic can also just be
> > used to define functional coverage of interest (again all this
> > inspired by what's already done in hardware verification).
>
> Promela/spin provides an LTL interface, but of course cannot handle
> much of RCU, let alone of the entire kernel. And LTL can be quite
> useful. But in a runtime system, how do you decide when "eventually"
> has arrived? The lockdep system does so by tracking entry to idle
> and to userspace execution, along with exit from interrupt handlers.
> Or did you have something else in mind?

For coverage, one could simply await the transition to the "eventually
state" indefinitely; once reached we have coverage.

But for verification, because unlike explicit state model checkers
like Spin, we don't have the complete state and can't build an
exhaustive state-graph, we'd have to approximate. And without knowing
exactly what it is we're waiting for, the simplest option would be to
just rely on a timeout, either part of the property or implicit. What
the units of that timeout are I'm not sure, because a system might
e.g. be put to sleep.

Also see Dmitry's answer, where he has concerns adding more dimensions
to coverage.

Thanks,
-- Marco

Paul E. McKenney

May 19, 2021, 2:53:07 PM5/19/21
to Marco Elver, Dmitry Vyukov, syzkaller, kasan-dev
Fair point! But what exactly did you have in mind?

> > > My ideal abstractions I've been thinking of isn't just for coverage,
> > > but to also capture temporal properties (which should be inspired by
> > > something like LTL or such), on top of which you can also build
> > > coverage. Then we can specify things like "if I observe some state X,
> > > then eventually we observe state Y", and such logic can also just be
> > > used to define functional coverage of interest (again all this
> > > inspired by what's already done in hardware verification).
> >
> > Promela/spin provides an LTL interface, but of course cannot handle
> > much of RCU, let alone of the entire kernel. And LTL can be quite
> > useful. But in a runtime system, how do you decide when "eventually"
> > has arrived? The lockdep system does so by tracking entry to idle
> > and to userspace execution, along with exit from interrupt handlers.
> > Or did you have something else in mind?
>
> For coverage, one could simply await the transition to the "eventually
> state" indefinitely; once reached we have coverage.
>
> But for verification, because unlike explicit state model checkers
> like Spin, we don't have the complete state and can't build an
> exhaustive state-graph, we'd have to approximate. And without knowing
> exactly what it is we're waiting for, the simplest option would be to
> just rely on a timeout, either part of the property or implicit. What
> the units of that timeout are I'm not sure, because a system might
> e.g. be put to sleep.
>
> Also see Dmitry's answer, where he has concerns adding more dimensions
> to coverage.

And I must of course defer to Dmitry's much greater experience with this
sort of thing.

Thanx, Paul

Marco Elver

May 19, 2021, 4:24:59 PM5/19/21
to Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev
Good question, I'll try to be more concrete -- most of these are
half-baked ideas and questions ;-), but if any of it makes sense, I
should maybe write a doc to summarize.

What I had in mind is a system to write properties both for functional
coverage and for checking more general properties of the kernel. How
useful the latter is, I'm not sure. But all of this would only really
be used in debug builds.

Assume we start with macros such as "ASSERT_COVER(...)" (for
functional coverage) and "ASSERT(...)" (just plain-old assertions).
The former is a way to document potentially interesting states (useful
for fuzzers to reach them), and the latter is just a way to specify
properties of the system (useful for finding the actual bugs).
Implementation-wise the latter is trivial; the former requires some
thought on how to expose that information to fuzzers and how to use it
(as Dmitry suggested, it's not trivial). I'd also imagine we can have
module-level variants ("GLOBAL_ASSERT*(...)") that monitor some global
state, and also add support for some subset of temporal properties
like "GLOBAL_ASSERT_EVENTUALLY(precond, eventually_holds)" as
suggested below.
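
A minimal sketch of what the two simplest ones could look like,
assuming a hypothetical CONFIG_DEBUG_ASSERT option and a made-up
kcov_cover_hit() hook for reporting functional coverage (neither
exists today):

    #ifdef CONFIG_DEBUG_ASSERT
    #define ASSERT(cond)            WARN_ON(!(cond))
    #define ASSERT_COVER(cond)      do {                                    \
                    if (cond)                                               \
                            kcov_cover_hit(__FILE__, __LINE__);             \
            } while (0)
    #else
    #define ASSERT(cond)            do { } while (0)
    #define ASSERT_COVER(cond)      do { } while (0)
    #endif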

I guess maybe I'd have to take a step back and just ask why we have no
way to write plain and simple assertions that are removed in non-debug
builds. Some subsystems seem to roll their own, as a 'git grep
"#define ASSERT"' shows.

Is there a fundamental reason why we shouldn't have them; perhaps
there was some past discussion? Today we have things like
lockdep_assert_held(), but otherwise nothing to write even a simple
assert. If I had to guess why something like ASSERT is considered bad,
it is because it gives people a way to check for unexpected conditions,
but if those checks disappear in non-debug builds, the kernel might be
unstable; therefore every possible state must be handled and we must
always be able to recover. The argument in favor is that the ASSERT()s
would be proven invariants, or conditions we'd recover from either way,
only there to catch accidental regressions during testing; and in
non-debug builds we don't suffer the performance overheads.

Thoughts?

Thanks,
-- Marco

Paul E. McKenney

May 19, 2021, 7:22:24 PM5/19/21
to Marco Elver, Dmitry Vyukov, syzkaller, kasan-dev
One thing that I personally might find useful would be a way of marking
sections of interest from a concurrency viewpoint. "Here is region
A and here is region B. You are not done validating until you have a
goodly number of samples showing concurrent execution of A with A, B
with B, and A with B." Easy to imagine, perhaps somewhat more difficult
to implement efficiently. To say nothing of designing a good way of
marking the regions. Exactly where is A? Well, I likely am concerned
about particular operations...
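
Purely as speculation in code form (all the names are invented,
nothing like this exists), the marking might look roughly like:

    enum cc_region { CC_REGION_A, CC_REGION_B, CC_NR_REGIONS };

    static atomic_t cc_active[CC_NR_REGIONS];   /* tasks currently inside each region */
    static atomic_t cc_overlap[CC_NR_REGIONS][CC_NR_REGIONS];

    static inline void cc_region_enter(enum cc_region r)
    {
            int o;

            atomic_inc(&cc_active[r]);
            for (o = 0; o < CC_NR_REGIONS; o++)
                    if (atomic_read(&cc_active[o]) > (o == r ? 1 : 0))
                            atomic_inc(&cc_overlap[r][o]);  /* r observed concurrent with o */
    }

    static inline void cc_region_exit(enum cc_region r)
    {
            atomic_dec(&cc_active[r]);
    }

Validation would then mean accumulating a goodly number of samples in
each cc_overlap cell.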

So please treat this as mostly speculation.

Thanx, Paul

Dmitry Vyukov

May 20, 2021, 12:46:48 AM5/20/21
to Marco Elver, Paul E. McKenney, syzkaller, kasan-dev
There are some (see below) and I am sure there are precedents in other
subsystems as well.
What's the rationale behind not having a common debug assert/config?
Maybe because nobody cared enough. The current approach is poorly
suited for CIs/generic testing, but fine for human-oriented workflows
that test a single subsystem only.

$ grep DEBUG_VM mm/*.c
mm/debug.c:#ifdef CONFIG_DEBUG_VM
mm/debug.c:#endif /* CONFIG_DEBUG_VM */
mm/filemap.c: if (!IS_ENABLED(CONFIG_DEBUG_VM) && unlikely(page_mapped(page))) {
mm/huge_memory.c: if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
mm/interval_tree.c:#ifdef CONFIG_DEBUG_VM_RB
mm/interval_tree.c:#ifdef CONFIG_DEBUG_VM_RB
mm/ksm.c:#ifdef CONFIG_DEBUG_VM
mm/ksm.c:#if defined (CONFIG_DEBUG_VM) && defined(CONFIG_NUMA)
mm/memcontrol.c:#ifdef CONFIG_DEBUG_VM
mm/memcontrol.c:#ifdef CONFIG_DEBUG_VM
mm/mmap.c:#ifdef CONFIG_DEBUG_VM_RB
mm/page_alloc.c:#ifdef CONFIG_DEBUG_VM
mm/page_alloc.c: if (!IS_ENABLED(CONFIG_DEBUG_VM)) {
mm/page_alloc.c:#ifdef CONFIG_DEBUG_VM
mm/page_alloc.c: * With DEBUG_VM enabled, order-0 pages are checked immediately when being freed
mm/page_alloc.c: * With DEBUG_VM disabled, order-0 pages being freed are checked only when
mm/page_alloc.c:#endif /* CONFIG_DEBUG_VM */
mm/page_alloc.c:#ifdef CONFIG_DEBUG_VM
mm/page_alloc.c: * With DEBUG_VM enabled, order-0 pages are checked for expected state when
mm/page_alloc.c: * With DEBUG_VM disabled, free order-0 pages are checked for expected state
mm/page_alloc.c:#endif /* CONFIG_DEBUG_VM */
mm/slab_common.c:#ifdef CONFIG_DEBUG_VM
mm/vmacache.c:#ifdef CONFIG_DEBUG_VM_VMACACHE
mm/vmstat.c:#ifdef CONFIG_DEBUG_VM_VMACACHE

Paul E. McKenney

May 20, 2021, 12:21:04 PM5/20/21
to Dmitry Vyukov, Marco Elver, syzkaller, kasan-dev
One possible work-around would be to create a CONFIG_DEBUG Kconfig
option that selected all of these subsystem-specific CONFIG_DEBUG_*
Kconfig options. But I would not necessarily expect that the resulting
kernel would be stable.

Here are RCU's:

CONFIG_DEBUG_OBJECTS_RCU_HEAD, which checks for things like double
call_rcu()s. It depends on CONFIG_DEBUG_OBJECTS.

CONFIG_PROVE_RCU, which is equivalent to CONFIG_PROVE_LOCKING.

CONFIG_PROVE_RCU_LIST, which enables additional lockdep checking for
RCU-protected linked lists, and which is supposed to be retired after a
conversion process is completed, and one that I had completely forgotten
about.

CONFIG_RCU_TRACE, which enables additional RCU event tracing. Not sure
that this is particularly relevant.

CONFIG_RCU_EQS_DEBUG, which provides additional idle-entry checks that
have proven valuable for hardware bringup.

CONFIG_RCU_STRICT_GRACE_PERIOD, which shortens RCU grace periods, to
the detriment of system performance.

It is not clear to me that blanket-enabling of these guys would be all
that helpful.

Thanx, Paul

Mathias Payer

May 20, 2021, 3:59:57 PM5/20/21
to Dmitry Vyukov, Vegard Nossum, Paul E. McKenney, syzkaller, Marco Elver, kasan-dev
Thanks for the include and hi everyone! I run the HexHive research
lab at EPFL; we develop techniques to find bugs and also target the
kernel. So far, we have focused mostly on spatial/temporal memory safety
and type safety.

As I'm late to the party, I may be missing some context. I assume the
goal is to develop fuzzers that explore more complex kernel state and
find unsynchronized concurrent access to the same state.


> A similar analysis can be done statically as well... I can't make up
> my mind which one would be better... both have pros and cons..
> However, again, I think we are missing some lower hanging fruit here.
> The current collide mode is super dumb and simple, I added it very
> early to trigger at least some races. It turned out to be efficient
> enough for now to never get back to it. The tracking issues for better
> collider with some ideas is:
> https://github.com/google/syzkaller/issues/612
> I think we need to implement it before we do anything more fancy. Just
> because we need an engine that could accept and act on the signal you
> describe. That engine is indepent of the actual signal we use to
> determine related syscalls, and it's useful on its own. And we have
> some easy to extract dependency information already in syscall
> descriptions in the form of /resources/. Namely, if we have 2 syscalls
> operating on, say, SCTP sockets, that's a pretty good signal that they
> are related and may operate on the same data.
> Once we have it, we could plug in more elaborate dynamic analysis info
> that will give a much higher quality signal regarding the relation of
> 2 exact syscall invocations in the exact program.

There were a couple of static analyses that applied to the whole kernel.
K-Miner from NDSS'18 comes to mind:
http://lib.21h.io/library/XHEQU6AX/download/SLDEJFQG/2018_K-Miner_-_Uncovering_Memory_Corruption_in_Linux_Internet_Society.pdf

Now, such researchy approaches may be a bit too brittle (and imprecise)
if done statically only, due to the potentially large number of false
positives. IMO we can profit from a combination of static and dynamic
analyses: dynamic analysis to get an idea of how control flow connects
different parts of the kernel (the massive number of indirect control
flow transfers would make a purely static analysis next to impossible),
along with a marking technique such as the one proposed by Vegard.
Then, based on "matches", follow up with a static analysis that tracks
state along the observed control flow to see if the target state is
feasible. Not sure if this is already too complex though...

Best,
Mathias

kaip...@gmail.com

May 21, 2021, 2:59:36 AM5/21/21
to syzkaller
Hi folks!

I replied earlier, but I'm not sure whether it was delivered successfully; sorry for the noise if this is a duplicate.

Glad to see that syzkaller may introduce state feedback for fuzzing!

The goal of HardenedLinux's "kernel state based fuzzer" is not
concurrency testing.
Also, it's not only for sockets. It's very similar to what Vegard
mentioned above.
I implemented an LLVM instrumentation pass that traces when a struct
member is written.
Writes to any type of struct can be traced, not just sockets.

Since syzkaller has merged the coverage filter, I have implemented a new
kernel state trace for it.
For every "kernel state", I collect three 64-bit values:
1. a hash of the struct and member names, similar to what Vegard
described above (I also encode the member length in some of these bits);
2. the value of the member, which is exactly the value of the state;
3. the address of the instrumentation site, which is used by the
coverage filter.
I instrument the whole kernel wherever a "store" through a
"getelementptr" occurs.
The syzkaller coverage filter and the struct-member hash are then used to
select which kind of "kernel state", from which code, is used as
syzkaller feedback.
(This is a little different from what I first introduced on the syzkaller
mailing list, and some of the code is not published yet.)
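
In other words, each instrumented store produces a record roughly like
the following (field names and exact layout here are illustrative, not
the published code):

    struct kernel_state_record {
            u64 struct_member_hash; /* hash of struct + member name, plus member-length bits */
            u64 member_value;       /* the stored value, i.e. the state itself */
            u64 site_addr;          /* address of the instrumented store, for the coverage filter */
    };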

This work has a basic assumption: the kernel state of interest lives in
some kind of struct.
A socket is an obvious case, but not the only one.

I didn't consider using this for concurrency testing before; my goal in
implementing it was to improve fuzzing coverage.
For example, by using some static analysis such as symbolic execution,
we can know which kernel state would be used under some conditions.
Programs in the syzkaller corpus that exercise more useful kernel
states would then get higher priority as seeds for generation and
mutation.
Going further, automatic generation of syscall descriptions may benefit
from this as well, since it can tell syzkaller which data structs would
be more useful.
(I have experimented with this, but the results are not yet satisfying.)

I don't know whether concurrency testing could also benefit from this
kind of state; I have experimented less with that.
But I believe syzkaller can certainly benefit from "state" feedback;
maybe others will offer a better way to do it.

Looking forward to your progress on kernel state feedback.
Thanks!

Dmitry Vyukov

May 21, 2021, 3:27:44 AM5/21/21
to Mathias Payer, Vegard Nossum, Paul E. McKenney, syzkaller, Marco Elver, kasan-dev
Hi Mathias,

There are now actually several branches in this thread and some don't
have you in CC (hard to synchronize now), but the whole thread is
available here:
https://groups.google.com/g/syzkaller/c/yFtW39rcWyQ
It all started with Paul sending a link to the "Learning-based
Controlled Concurrency Testing" paper.
We can tolerate some impreciseness because our end goal is triggering
bugs at runtime, which is the ultimate proof.

For very targeted provocation of concurrency bugs, dynamic analysis may
work well, because we don't care about the parts of the code we can't
trigger (so far), and for the parts we can trigger and plan to collide
we can just as well do precise dynamic tracing.
Dynamically we could also trace actual addresses rather than just fields.
However, both addresses and struct fields will suffer from common
background noise (accesses to common shared facilities, kmalloc,
lockdep, etc.).

But having some notion of relation statically may be useful when we
generate/mutate programs and need to select syscalls for inclusion.
Say, we have 5000 syscalls and want to generate a program with 10
syscalls. Which ones do we choose? We have some analysis for this, but
there is always room for improvement :)

Vegard Nossum

May 21, 2021, 7:31:52 AM5/21/21
to Dmitry Vyukov, Paul E. McKenney, syzkaller, Marco Elver, kasan-dev, Mathias Payer
I understand what you wrote about improving the collider support, but
unfortunately I don't think my Go skills are sufficient to make a
contribution on the syzkaller side here...

However, I was too curious to stop myself, so I went ahead and
implemented a gcc plugin for collecting struct member derefs + a kcov
mode, see the attachment. It seems to work here but I'm probably missing
some subtler cases in the gcc code (e.g. anonymous structs).

I'll play a bit with this, and the plugin is there in case somebody does
end up doing something on the syzkaller side :-)


Vegard
0001-kcov-add-dereference-tracing-mode.patch

Marco Elver

Jun 17, 2021, 7:20:16 AM6/17/21
to Paul E. McKenney, bri...@redhat.com, Dmitry Vyukov, syzkaller, kasan-dev, LKML
[+Daniel, just FYI. We had a discussion about "functional coverage"
and fuzzing, and I've just seen your wonderful work on RV. If you have
thought about fuzzing with RV and how coverage of the model impacts
test generation, I'd be curious to hear.]

Looks like there is ongoing work on specifying models and running them
along with the kernel: https://lwn.net/Articles/857862/

Those models that are run alongside the kernel would have their own
coverage, and since there's a mapping between real code and model, a
fuzzer trying to reach new code in one or the other will ultimately
improve coverage for both.

Just wanted to document this here, because it seems quite relevant.
I'm guessing that "functional coverage" would indeed be a side-effect
of a good RV model?

Previous discussion below.

Thanks,
-- Marco
..

Dmitry Vyukov

Jun 17, 2021, 7:38:34 AM6/17/21
to Marco Elver, Paul E. McKenney, bri...@redhat.com, syzkaller, kasan-dev, LKML
On Thu, Jun 17, 2021 at 1:20 PM 'Marco Elver' via syzkaller
<syzk...@googlegroups.com> wrote:
>
> [+Daniel, just FYI. We had a discussion about "functional coverage"
> and fuzzing, and I've just seen your wonderful work on RV. If you have
> thought about fuzzing with RV and how coverage of the model impacts
> test generation, I'd be curious to hear.]
>
> Looks like there is ongoing work on specifying models and running them
> along with the kernel: https://lwn.net/Articles/857862/
>
> Those models that are run alongside the kernel would have their own
> coverage, and since there's a mapping between real code and model, a
> fuzzer trying to reach new code in one or the other will ultimately
> improve coverage for both.
>
> Just wanted to document this here, because it seems quite relevant.
> I'm guessing that "functional coverage" would indeed be a side-effect
> of a good RV model?

Ha! That's interesting. RV can indeed be a source of high-quality
meaningful states.

The idea behind states is to "multiply" code coverage by the dimension
of states, right? Instead of checking "have we covered this code?", we
will be checking "have we covered this code in this state or not?".
This will require some way of figuring out what code is affected by what
model, right? Otherwise it can still lead to state explosion, I think.
E.g. if we have 5 models with 5 states each, it will multiply the
amount of effective coverage by 5^5.

The preemption model in the example is "global" (per-task), but there
are also per-object models. I remember we discussed sockets as an
example at LPC. But I don't remember what the proposed API was for tying
states to objects. Maybe that API will help with code regions as
well?...

Daniel Bristot de Oliveira

Jun 18, 2021, 3:59:05 AM6/18/21
to Marco Elver, Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev, LKML
On 6/17/21 1:20 PM, Marco Elver wrote:
> [+Daniel, just FYI. We had a discussion about "functional coverage"
> and fuzzing, and I've just seen your wonderful work on RV. If you have
> thought about fuzzing with RV and how coverage of the model impacts
> test generation, I'd be curious to hear.]

One aspect of RV is that we verify the actual execution of the system instead of
a complete model of the system, so we depend on the testing to cover all the
aspects of the system <-> model.

There is a natural relation between testing/fuzzing & friends and RV.

> Looks like there is ongoing work on specifying models and running them
> along with the kernel: https://lwn.net/Articles/857862/
>
> Those models that are run alongside the kernel would have their own
> coverage, and since there's a mapping between real code and model, a
> fuzzer trying to reach new code in one or the other will ultimately
> improve coverage for both.

Perfect!

> Just wanted to document this here, because it seems quite relevant.
> I'm guessing that "functional coverage" would indeed be a side-effect
> of a good RV model?

So, let me see if I understood the terms. Functional coverage is a way to check
if all the desired aspects of a code/system/subsystem/functionality were covered
by a set of tests?

If that is correct, we could use RV to:

- create an explicit model of the states we want to cover.
- check if all the desired states were visited during testing.

?

-- Daniel

Marco Elver

Jun 18, 2021, 7:27:00 AM6/18/21
to Daniel Bristot de Oliveira, Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev, LKML
On Fri, Jun 18, 2021 at 09:58AM +0200, Daniel Bristot de Oliveira wrote:
> On 6/17/21 1:20 PM, Marco Elver wrote:
> > [+Daniel, just FYI. We had a discussion about "functional coverage"
> > and fuzzing, and I've just seen your wonderful work on RV. If you have
> > thought about fuzzing with RV and how coverage of the model impacts
> > test generation, I'd be curious to hear.]
>
> One aspect of RV is that we verify the actual execution of the system instead of
> a complete model of the system, so we depend of the testing to cover all the
> aspects of the system <-> model.
>
> There is a natural relation with testing/fuzzing & friends with RV.
>
> > Looks like there is ongoing work on specifying models and running them
> > along with the kernel: https://lwn.net/Articles/857862/
> >
> > Those models that are run alongside the kernel would have their own
> > coverage, and since there's a mapping between real code and model, a
> > fuzzer trying to reach new code in one or the other will ultimately
> > improve coverage for both.
>
> Perfect!
>
> > Just wanted to document this here, because it seems quite relevant.
> > I'm guessing that "functional coverage" would indeed be a side-effect
> > of a good RV model?
>
> So, let me see if I understood the terms. Functional coverage is a way to check
> if all the desired aspects of a code/system/subsystem/functionality were covered
> by a set of tests?

Yes, unlike code/structural coverage (which is what we have today via
KCOV) functional coverage checks if some interesting states were reached
(e.g. was buffer full/empty, did we observe transition a->b etc.).

Functional coverage is common in hardware verification, but of course
software verification would benefit just as much -- just haven't seen it
used much in practice yet.
[ Example for HW verification: https://www.chipverify.com/systemverilog/systemverilog-functional-coverage ]

It still requires some creativity from the designer/developer to come up
with suitable functional coverage. State explosion is a problem, too,
and naturally it is impractical to capture all possible states ... after
all, functional coverage is meant to direct the test generator/fuzzer
into more interesting states -- we're not doing model checking after all.

> If that is correct, we could use RV to:
>
> - create an explicit model of the states we want to cover.
> - check if all the desired states were visited during testing.
>
> ?

Yes, pretty much. On one hand there could be an interface to query if
all states were covered, but I think this isn't useful out-of-the box.
Instead, I was thinking we can simply get KCOV to help us out: my
hypothesis is that most of this would happen automatically if dot2k's
generated code has distinct code paths per transition.

If KCOV covers the RV model (since it's executable kernel C code), then
having distinct code paths for "state transitions" will effectively give
us functional coverage indirectly through code coverage (via KCOV) of
the RV model.

From what I can tell this doesn't quite happen today, because
automaton::function is a lookup table as an array. Could this just
become a generated function with a switch statement? Because then I
think we'd pretty much have all the ingredients we need.

Then:

1. Create RV models for states of interests not covered by normal code
coverage of code under test.

2. Enable KCOV for everything.

3. KCOV's coverage of the RV model will tell us if we reached the
desired "functional coverage" (and can be used by e.g. syzbot to
generate better tests without any additional changes because it
already talks to KCOV).

Thoughts?

Thanks,
-- Marco

Dmitry Vyukov

Jun 19, 2021, 7:08:31 AM6/19/21
to Marco Elver, Daniel Bristot de Oliveira, Paul E. McKenney, syzkaller, kasan-dev, LKML
I think there is usually already some code for any important state
transitions. E.g. I can't imagine how a socket can transition to
active/listen/shutdown/closed states w/o any code.

I see RV as potentially more useful for the "coverage dimensions"
idea. I.e. for sockets that would mean treating coverage for a socket
function X as different coverage based on the current socket state,
effectively considering (PC, state) as the feedback signal.
But my concern is that we don't want to simply consider combinations
of all kernel code multiplied by all combinations of states of all RV
models. Most likely this will lead to severe feedback signal
explosion. So the question is: how do we understand that the socket
model relates only to this restricted set of code?

Daniel Bristot de Oliveira

Jun 21, 2021, 4:24:01 AM6/21/21
to Marco Elver, Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev, LKML
So you want to observe a given a->b transition, not that B was visited?

> Functional coverage is common in hardware verification, but of course
> software verification would benefit just as much -- just haven't seen it
> used much in practice yet.
> [ Example for HW verification: https://www.chipverify.com/systemverilog/systemverilog-functional-coverage ]
>
> It still requires some creativity from the designer/developer to come up
> with suitable functional coverage.

That is where the fun lives.

> State explosion is a problem, too,
> and naturally it is impractical to capture all possible states ... after
> all, functional coverage is meant to direct the test generator/fuzzer
> into more interesting states -- we're not doing model checking after all.


I still need to understand what you are aiming to verify, and what is the
approach that you would like to use to express the specifications of the systems...

Can you give me a simple example?

>> If that is correct, we could use RV to:
>>
>> - create an explicit model of the states we want to cover.
>> - check if all the desired states were visited during testing.
>>
>> ?
>
> Yes, pretty much. On one hand there could be an interface to query if
> all states were covered, but I think this isn't useful out-of-the box.
> Instead, I was thinking we can simply get KCOV to help us out: my
> hypothesis is that most of this would happen automatically if dot2k's
> generated code has distinct code paths per transition.

...

>
> If KCOV covers the RV model (since it's executable kernel C code), then
> having distinct code paths for "state transitions" will effectively give
> us functional coverage indirectly through code coverage (via KCOV) of
> the RV model.

so, you want to have a different function for every transition so KCOV can
observe that?

>
> From what I can tell this doesn't quite happen today, because
> automaton::function is a lookup table as an array.

It is the transition function of the formal automaton definition. Check this:

https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf

page 9.

> Could this just
> become a generated function with a switch statement? Because then I
> think we'd pretty much have all the ingredients we need.

a switch statement that would.... call a different function for each transition?

Daniel Bristot de Oliveira

Jun 21, 2021, 4:40:01 AM6/21/21
to Dmitry Vyukov, Marco Elver, Paul E. McKenney, syzkaller, kasan-dev, LKML
makes sense...

> I see RV to be potentially more useful for the "coverage dimensions"
> idea. I.e. for sockets that would be treating coverage for a socket
> function X as different coverage based on the current socket state,
> effectively consider (PC,state) as feedback signal.

How can the RV subsystem talk to KCOV?

> But my concern is that we don't want to simply consider combinations
> of all kernel code multiplied by all combinations of states of all RV
> models.

I agree! Also because RV monitors will generally monitor a specific part of the
code (with exceptions for models like the preemption one).

> Most likely this will lead to severe feedback signal
> explosion. So the question is: how do we understand that the socket
> model relates only to this restricted set of code?
>

Should we annotate a model, saying which subsystem it monitors/verifies?

-- Daniel

Marco Elver

Jun 21, 2021, 6:30:46 AM6/21/21
to Daniel Bristot de Oliveira, Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev, LKML
On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
[...]
> > Yes, unlike code/structural coverage (which is what we have today via
> > KCOV) functional coverage checks if some interesting states were reached
> > (e.g. was buffer full/empty, did we observe transition a->b etc.).
>
> So you want to observe a given a->b transition, not that B was visited?

An a->b transition would imply that a and b were visited.

> I still need to understand what you are aiming to verify, and what is the
> approach that you would like to use to express the specifications of the systems...
>
> Can you give me a simple example?

The older discussion started around how to get the fuzzer into more
interesting states in complex concurrent algorithms. But otherwise I
have no idea ... we were just brainstorming and got to the point where
it looked like "functional coverage" would improve automated test
generation in general. And then I found RV, which pretty much can
specify "functional coverage" and almost gets that information to KCOV
"for free".

> so, you want to have a different function for every transition so KCOV can
> observe that?

Not a different function, just distinct "basic blocks". KCOV uses
compiler instrumentation, and a sequence of non-branching instructions
denotes one point of coverage; at the next branch (conditional or otherwise)
it then records which branch was taken and therefore we know which code
paths were covered.
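
For reference, the callback the compiler inserts boils down to roughly
this (heavily simplified from kernel/kcov.c):

    void notrace __sanitizer_cov_trace_pc(void)
    {
            struct task_struct *t = current;
            unsigned long *area = t->kcov_area;
            unsigned long pos;

            if (t->kcov_mode != KCOV_MODE_TRACE_PC || !area)
                    return;
            pos = area[0] + 1;      /* slot 0 holds the count of recorded PCs */
            if (pos < t->kcov_size) {
                    area[pos] = _RET_IP_;
                    area[0] = pos;
            }
    }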

> >
> > From what I can tell this doesn't quite happen today, because
> > automaton::function is a lookup table as an array.
>
> It is a the transition function of the formal automaton definition. Check this:
>
> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
>
> page 9.
>
> Could this just
> > become a generated function with a switch statement? Because then I
> > think we'd pretty much have all the ingredients we need.
>
> a switch statement that would.... call a different function for each transition?

No, just a switch statement that returns the same thing as it does
today. But KCOV wouldn't see different coverage with the current
version, because it's all in one basic block: it looks up the next
state given the current state out of the array. If it were a switch
statement doing the same thing, the compiler would turn it into
conditional branches, and KCOV would then know which code path
(effectively the transition) was covered.
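
A sketch of the difference, with made-up state/event names rather than
the actual dot2k output:

    static char model_next_state(char curr, char event)
    {
            /* Today (roughly): a single array lookup, one basic block:
             *   return model.function[curr][event];
             */

            /* Alternative: one branch per transition, visible to KCOV. */
            switch (curr) {
            case STATE_A:
                    if (event == EVENT_GO_B)
                            return STATE_B;
                    break;
            case STATE_B:
                    if (event == EVENT_GO_A)
                            return STATE_A;
                    break;
            }
            return INVALID_STATE;
    }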

Daniel Bristot de Oliveira

Jun 21, 2021, 3:25:55 PM6/21/21
to Marco Elver, Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev, LKML
On 6/21/21 12:30 PM, Marco Elver wrote:
> On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
> [...]
>>> Yes, unlike code/structural coverage (which is what we have today via
>>> KCOV) functional coverage checks if some interesting states were reached
>>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
>>
>> So you want to observe a given a->b transition, not that B was visited?
>
> An a->b transition would imply that a and b were visited.

HA! let's try again with a less abstract example...


| +------------ on --+----------------+
v ^ +--------v v
+========+ | +===========+>--- suspend ---->+===========+
| OFF | +- on --<| ON | | SUSPENDED |
+========+ <------ shutdown -----<+===========+<----- on -------<+===========+
^ v v
+--------------- off ----------------+-----------------------------+

Do you care about:

1) states [OFF|ON|SUSPENDED] being visited a # of times; or
2) the occurrence of the [on|suspend|off] events a # of times; or
3) the language generated by the "state machine"; like:

the occurrence of *"on -> suspend -> on -> off"*

which is != of

the occurrence of *"on -> on -> suspend -> off"*

although the same events and states occurred the same # of times
?

RV can give you all of these... but the way to report them might be different.

>> I still need to understand what you are aiming to verify, and what is the
>> approach that you would like to use to express the specifications of the systems...
>>
>> Can you give me a simple example?
>
> The older discussion started around a discussion how to get the fuzzer
> into more interesting states in complex concurrent algorithms. But
> otherwise I have no idea ... we were just brainstorming and got to the
> point where it looked like "functional coverage" would improve automated
> test generation in general. And then I found RV which pretty much can
> specify "functional coverage" and almost gets that information to KCOV
> "for free".

I think we will end up with an almost-for-free solution, but one worth the price.

>> so, you want to have a different function for every transition so KCOV can
>> observe that?
>
> Not a different function, just distinct "basic blocks". KCOV uses
> compiler instrumentation, and a sequence of non-branching instructions
> denote one point of coverage; at the next branch (conditional or otherwise)
> it then records which branch was taken and therefore we know which code
> paths were covered.

ah, got it. But can't KCOV be extended with another source of information?

>>>
>>> From what I can tell this doesn't quite happen today, because
>>> automaton::function is a lookup table as an array.
>>
>> It is a the transition function of the formal automaton definition. Check this:
>>
>> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
>>
>> page 9.
>>
>> Could this just
>>> become a generated function with a switch statement? Because then I
>>> think we'd pretty much have all the ingredients we need.
>>
>> a switch statement that would.... call a different function for each transition?
>
> No, just a switch statement that returns the same thing as it does
> today. But KCOV wouldn't see different different coverage with the
> current version because it's all in one basic block because it looks up
> the next state given the current state out of the array. If it was a
> switch statement doing the same thing, the compiler will turn the thing
> into conditional branches and KCOV then knows which code path
> (effectively the transition) was covered.

[ the answer to these points will depend on your answer to my first question
in this email, so... I will reply to it later ].

-- Daniel

Dmitry Vyukov

Jun 22, 2021, 1:17:26 AM6/22/21
to Marco Elver, Daniel Bristot de Oliveira, Paul E. McKenney, syzkaller, kasan-dev, LKML
If we do this, we need to watch out for compiler optimizations. In
both clang and gcc the KCOV pass runs in the middle of the middle-end,
after some optimizations. It's possible that some trivial branches have
already been merged back into unconditional code (e.g. turned into
table lookups or conditional moves).

Dmitry Vyukov

Jun 22, 2021, 2:33:14 AM6/22/21
to Daniel Bristot de Oliveira, Marco Elver, Paul E. McKenney, syzkaller, kasan-dev, LKML
KCOV collects a trace of covered PCs. One natural way for this
interface would be a callback that allows injecting RV state events
into the KCOV trace. To make it possible to associate states with
code, these events need to be scoped, e.g.:

void kcov_state_start(int model, int state);
void kcov_state_end(int model, int state);
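
For example (purely hypothetical; neither the functions above nor the
names below exist today):

    static int do_something_monitored(struct sock *sk)
    {
            int ret;

            /* Coverage collected inside this scope would be attributed to
             * the socket model's current state, so the fuzzer effectively
             * sees (PC, state) pairs. */
            kcov_state_start(RV_MODEL_SOCKET, rv_socket_state(sk));
            ret = do_protocol_work(sk);
            kcov_state_end(RV_MODEL_SOCKET, rv_socket_state(sk));

            return ret;
    }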

There is no prior art that I am aware of, so I assume it will require
some experimentation and research work to figure out exactly what
interface works best, whether it works at all, how much it helps fuzzing,
whether it is a good metric for assessing testing coverage, etc.

> > But my concern is that we don't want to simply consider combinations
> > of all kernel code multiplied by all combinations of states of all RV
> > models.
>
> I agree! Also because RV monitors will generally monitor an specific part of the
> code (with exceptions for models like the preemption one).
>
> > Most likely this will lead to severe feedback signal
> > explosion. So the question is: how do we understand that the socket
> > model relates only to this restricted set of code?
> >
> Should we annotate a model, saying which subsystem it monitors/verify?

Yes. The main question I see is how to specify what a "subsystem" is.

Besides dynamic scoping we could use a static mapping of models to code.
E.g. the socket model covers net/core/*.c and net/tcp/*.c. Then maybe we
don't need dynamic scopes (?), however it then becomes trickier for
models that are associated with objects. Namely, if we traced
different states for different objects, which object does the current
execution belong to? Does it belong to any of them at all?

Marco Elver

Jun 22, 2021, 6:48:24 AM6/22/21
to Daniel Bristot de Oliveira, Paul E. McKenney, Dmitry Vyukov, syzkaller, kasan-dev, LKML
On Mon, Jun 21, 2021 at 09:25PM +0200, Daniel Bristot de Oliveira wrote:
> On 6/21/21 12:30 PM, Marco Elver wrote:
> > On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
> > [...]
> >>> Yes, unlike code/structural coverage (which is what we have today via
> >>> KCOV) functional coverage checks if some interesting states were reached
> >>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
> >>
> >> So you want to observe a given a->b transition, not that B was visited?
> >
> > An a->b transition would imply that a and b were visited.
>
> HA! let's try again with a less abstract example...

Terminology misunderstanding.

I mean "state transition". Writing "a->b transition" led me to infer 'a'
and 'b' are states, but from below I infer that you meant an "event
trace" (viz. event sequence). So it seems I was wrong.

Let me be clearer: transition A -[a]-> B implies states A and B were
visited. Hence, knowing that event 'a' occurred is sufficient, and
actually provides a little more information than just "A and B were
visited".

>
> | +------------ on --+----------------+
> v ^ +--------v v
> +========+ | +===========+>--- suspend ---->+===========+
> | OFF | +- on --<| ON | | SUSPENDED |
> +========+ <------ shutdown -----<+===========+<----- on -------<+===========+
> ^ v v
> +--------------- off ----------------+-----------------------------+
>
> Do you care about:
>
> 1) states [OFF|ON|SUSPENDED] being visited a # of times; or
> 2) the occurrence of the [on|suspend|off] events a # of times; or
> 3) the language generated by the "state machine"; like:
>
> the occurrence of *"on -> suspend -> on -> off"*
>
> which is != of
>
> the occurrence of *"on -> on -> suspend -> off"*
>
> although the same events and states occurred the same # of times
> ?

They are all interesting, but unrealistic for a fuzzer to keep track of.
We can't realistically keep track of all possible event traces. Nor that
some state or event was visited # of times.

What I did mean is as described above: the simple occurrence of an
event, as it implies some previous and next state were visited.

The fuzzer then builds up knowledge of which inputs cause some events to
occur. Because it knows it has inputs for such events, it will then try
to further combine these inputs hoping to reach new coverage. This leads
to various distinct event traces using the events it has already
observed. All of this is somewhat random of course, because fuzzers are
not meant to be model checkers.

If someone wants something more complex as you describe, it'd have to
explicitly become part of the model (if possible?). The problem of
coverage explosion applies, and we may not recommend such usage anyway.

> RV can give you all... but the way to inform this might be different.
>
> >> I still need to understand what you are aiming to verify, and what is the
> >> approach that you would like to use to express the specifications of the systems...
> >>
> >> Can you give me a simple example?
> >
> > The older discussion started around a discussion how to get the fuzzer
> > into more interesting states in complex concurrent algorithms. But
> > otherwise I have no idea ... we were just brainstorming and got to the
> > point where it looked like "functional coverage" would improve automated
> > test generation in general. And then I found RV which pretty much can
> > specify "functional coverage" and almost gets that information to KCOV
> > "for free".
>
> I think we will end up having an almost for free solution, but worth the price.
>
> >> so, you want to have a different function for every transition so KCOV can
> >> observe that?
> >
> > Not a different function, just distinct "basic blocks". KCOV uses
> > compiler instrumentation, and a sequence of non-branching instructions
> > denote one point of coverage; at the next branch (conditional or otherwise)
> > it then records which branch was taken and therefore we know which code
> > paths were covered.
>
> ah, got it. But can't KCOV be extended with another source of information?

Not without changing KCOV. And I think we're wary of something like
that due to the potential for coverage explosion. -fsanitize-coverage
has various options to capture different types of coverage actually, not
purely basic block based coverage. (KCOV already supports
KCOV_ENABLE_COMPARISONS, perhaps that could help somehow. It captures
arguments of comparisons.)

> >>>
> >>> From what I can tell this doesn't quite happen today, because
> >>> automaton::function is a lookup table as an array.
> >>
> >> It is a the transition function of the formal automaton definition. Check this:
> >>
> >> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
> >>
> >> page 9.
> >>
> >> Could this just
> >>> become a generated function with a switch statement? Because then I
> >>> think we'd pretty much have all the ingredients we need.
> >>
> >> a switch statement that would.... call a different function for each transition?
> >
> > No, just a switch statement that returns the same thing as it does
> > today. But KCOV wouldn't see different different coverage with the
> > current version because it's all in one basic block because it looks up
> > the next state given the current state out of the array. If it was a
> > switch statement doing the same thing, the compiler will turn the thing
> > into conditional branches and KCOV then knows which code path
> > (effectively the transition) was covered.

Per Dmitry's comment, yes we need to be careful that the compiler
doesn't collapse the switch statement somehow. But this should be
achievable with a bunch of 'barrier()'s after every 'case ...:'.
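
I.e. something like (illustrative only; state/event names are made up):

    switch (curr) {
    case STATE_A:
            barrier();      /* keep the cases from being folded back into a table */
            next = (event == EVENT_GO_B) ? STATE_B : INVALID_STATE;
            break;
    case STATE_B:
            barrier();
            next = (event == EVENT_GO_A) ? STATE_A : INVALID_STATE;
            break;
    }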

Daniel Bristot de Oliveira

Jun 23, 2021, 5:10:24 AM6/23/21
to Marco Elver, Dmitry Vyukov, Paul E. McKenney, syzkaller, kasan-dev, LKML
On 6/22/21 12:48 PM, Marco Elver wrote:
> On Mon, Jun 21, 2021 at 09:25PM +0200, Daniel Bristot de Oliveira wrote:
>> On 6/21/21 12:30 PM, Marco Elver wrote:
>>> On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
>>> [...]
>>>>> Yes, unlike code/structural coverage (which is what we have today via
>>>>> KCOV) functional coverage checks if some interesting states were reached
>>>>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
>>>>
>>>> So you want to observe a given a->b transition, not that B was visited?
>>>
>>> An a->b transition would imply that a and b were visited.
>>
>> HA! let's try again with a less abstract example...
>
> Terminology misunderstanding.
>
> I mean "state transition". Writing "a->b transition" led me to infer 'a'
> and 'b' are states, but from below I infer that you meant an "event
> trace" (viz. event sequence). So it seems I was wrong.
>
> Let me be clearer: transition A -[a]-> B implies states A and B were
> visited.

right

> Hence, knowing that event 'a' occurred is sufficient, and
> actually provides a little more information than just "A and B were
> visited".

iff [a] happens only from A to B...

>
>>
>> | +------------ on --+----------------+
>> v ^ +--------v v
>> +========+ | +===========+>--- suspend ---->+===========+
>> | OFF | +- on --<| ON | | SUSPENDED |
>> +========+ <------ shutdown -----<+===========+<----- on -------<+===========+
>> ^ v v
>> +--------------- off ----------------+-----------------------------+
>>
>> Do you care about:
>>
>> 1) states [OFF|ON|SUSPENDED] being visited a # of times; or
>> 2) the occurrence of the [on|suspend|off] events a # of times; or
>> 3) the language generated by the "state machine"; like:
>>
>> the occurrence of *"on -> suspend -> on -> off"*
>>
>> which is != of
>>
>> the occurrence of *"on -> on -> suspend -> off"*
>>
>> although the same events and states occurred the same # of times
>> ?
>
> They are all interesting, but unrealistic for a fuzzer to keep track of.
> We can't realistically keep track of all possible event traces. Nor that
> some state or event was visited # of times.

We can track this easily via RV, and doing that is already on my todo list. But
now I understand that we do not need all this information for the functional coverage.

> What I did mean is as described above: the simple occurrence of an
> event, as it implies some previous and next state were visited.
>
> The fuzzer then builds up knowledge of which inputs cause some events to
> occur. Because it knows it has inputs for such events, it will then try
> to further combine these inputs hoping to reach new coverage. This leads
> to various distinct event traces using the events it has already
> observed. All of this is somewhat random of course, because fuzzers are
> not meant to be model checkers.
>
> If someone wants something more complex as you describe, it'd have to
> explicitly become part of the model (if possible?). The problem of
> coverage explosion applies, and we may not recommend such usage anyway.

I did not mean to make GCOV/the fuzzer keep track of this information. I was
trying to understand the best way to provide the information that you
all need.
Changing the "function" will add some overhead for the runtime-monitoring use case,
for example for safety-critical systems that will run with a monitor
enabled to detect a failure and react to it.

But! I can extend the idea of the reactor to receive the successful state
transitions, or create an "observer" abstraction to which we can attach a
generic function that implements the switch statements. This function can be
auto-generated by dot2k as well...

This reactor/observer can be enabled/disabled so... we can add as many annotations
and barriers as we want.

Thoughts?

-- Daniel

Marco Elver

Jun 24, 2021, 4:09:32 AM6/24/21
to Daniel Bristot de Oliveira, Dmitry Vyukov, Paul E. McKenney, syzkaller, kasan-dev, LKML
That sounds reasonable. Simply having an option (Kconfig would be
ideal) to enable the KCOV-friendly version of the transition function
is good enough for the fuzzer use case. The kernels built for fuzzing
usually include lots of other debug options anyway, and aren't
production kernels.

Thanks,
-- Marco