Formal description of system call interface

Dmitry Vyukov

unread,

Nov 6, 2016, 5:39:49 PM11/6/16

to linu...@vger.kernel.org, LKML, mtk.ma...@gmail.com, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

Hello,

This is notes from the discussion we had at Linux Plumbers this week
regarding providing a formal description of system calls (user API).

The idea come up in the context of syzkaller, syscall fuzzer, which
has descriptions for 1000+ syscalls mostly concentrating on types of
arguments and return values. However, problems are that a small group
of people can't write descriptions for all syscalls; can't keep them
up-to-date and doesn't have necessary domain expertise to do correct
descriptions in some cases.

We identified a surprisingly large number of potential users for such
descriptions:
- fuzzers (syzkaller, trinity, iknowthis)
- strace/syscall tracepoints (capturing indirect arguments and
printing human-readable info)
- generation of entry points for C libraries (glibc, liblinux
(raw syscalls), Go runtime, clang/gcc sanitizers)
- valgrind/sanitizers checking of input/output values of syscalls
- seccomp filters (minijail, libseccomp) need to know interfaces
to generate wrappers
- safety certification (requires syscall specifications)
- man pages (could provide actual syscall interface rather than
glibc wrapper interface, it was noted that possible errno values
is an important part here)
- generation of syscall argument validation in kernel (fast version
is enabled all the time, extended is optional)

It's worth noting that number of these users already have some
descriptions that suffer from the same problems of being
incomplete/outdated. See also linux-api mailing list description
which lists an overlapping set of cases:
https://www.kernel.org/doc/man-pages/linux-api-ml.html

We discussed several implementation approaches:
- Extracting the interface from kernel code either by parsing
sources or using dwarf. However, current source doesn't have
enough info: fd are specified as int, while we need to know exact
fd type (e.g. fd_epoll_t); not possible to extract flag set for
'int flags'; don't know what is 'char*'.
- Making the formal description the master copy and generating
kernel code from it (structs, flags, syscall entry points).
This is quite pervasive, but otherwise should work.
- Doing what syzkaller currently does: providing the description
on side. Verifying that description and implementation match
is an important piece here. We can do dynamic checking in syscall
entry points (print warnings on anything that does not match
descriptions); or static checking (but again kernel code doesn't
have enough info for checking).

We decided to pursue the last option as the least pervasive for now.
Several locations for the descriptions were proposed: with source code,
include/uapi, Documentation.

Action points:
- polish DSL for description (must be extensible)
- write a parser for DSL
- provide definition for mm syscalls (mm is reasonably simple
and self-contained)
- see if we can do validation of mm arguments

It was acknowledged that whatever we do now it will probably
significantly change and evolve over time as we better understand
what we need and what works.

For the reference, current syzkaller descriptions are in txt files here:
https://github.com/google/syzkaller/tree/master/sys
The most generic syscalls are here:
https://github.com/google/syzkaller/blob/master/sys/sys.txt
Specific subsystems are described in separate files, e.g.:
https://github.com/google/syzkaller/blob/master/sys/bpf.txt
https://github.com/google/syzkaller/blob/master/sys/tty.txt
https://github.com/google/syzkaller/blob/master/sys/sndseq.txt
The descriptions should be self-explanatory, but just in case there
is also a semi-formal DSL specification here:
https://github.com/google/syzkaller/blob/master/sys/README.md

Taking the opportunity, if you see that something is missing/wrong
in the descriptions of the subsystem you care about, or if it is not
described at all, fixes are welcome.

Thanks

Szabolcs Nagy

unread,

Nov 6, 2016, 7:28:41 PM11/6/16

to Dmitry Vyukov, linu...@vger.kernel.org, LKML, mtk.ma...@gmail.com, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

* Dmitry Vyukov <dvy...@google.com> [2016-11-06 14:39:28 -0800]:

> This is notes from the discussion we had at Linux Plumbers this week
> regarding providing a formal description of system calls (user API).

yes a database of the syscall abis would be useful
..depending on the level of detail it has.

(right now there is not even spec about what registers
the syscall entry point may clobber on the various abis
which would be useful to know when making syscalls)

> Action points:
> - polish DSL for description (must be extensible)
> - write a parser for DSL
> - provide definition for mm syscalls (mm is reasonably simple
> and self-contained)
> - see if we can do validation of mm arguments

for all abi variants? e.g. mmap offset range is abi dependent.

> For the reference, current syzkaller descriptions are in txt files here:
> https://github.com/google/syzkaller/tree/master/sys

...

> Taking the opportunity, if you see that something is missing/wrong
> in the descriptions of the subsystem you care about, or if it is not
> described at all, fixes are welcome.

abi variants are missing (abi variation makes a lot of
syscall interface related work painful).

Cyril Hrubis

unread,

Nov 7, 2016, 5:38:26 AM11/7/16

to Dmitry Vyukov, linu...@vger.kernel.org, LKML, mtk.ma...@gmail.com, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

Hi!

> We identified a surprisingly large number of potential users for such
> descriptions:
> - fuzzers (syzkaller, trinity, iknowthis)
> - strace/syscall tracepoints (capturing indirect arguments and
> printing human-readable info)
> - generation of entry points for C libraries (glibc, liblinux
> (raw syscalls), Go runtime, clang/gcc sanitizers)
> - valgrind/sanitizers checking of input/output values of syscalls
> - seccomp filters (minijail, libseccomp) need to know interfaces
> to generate wrappers
> - safety certification (requires syscall specifications)
> - man pages (could provide actual syscall interface rather than
> glibc wrapper interface, it was noted that possible errno values
> is an important part here)
> - generation of syscall argument validation in kernel (fast version
> is enabled all the time, extended is optional)

I was thinking of generating LTP testcases from a well defined format for
quite some time. Maybe that would be a good way how to keep the
descriptions to match the reality.

LTP test would however need a bit more information that just syscall
parameter anotation. We would need also description of the expected
return value in some cases, annotating it as "returns fd" or "zero on
success" would be good enough for basic tests. Better testing would need
to describe the "side efect" as well (file was created, filesystem
mounted, etc.) then we could write a classes of verify functions,
something as check_file_exits() and use them to check the results
accordingly.

I'm not sure if something like this is really doable or in the scope of
this project, but it may be worth a try.

--
Cyril Hrubis
chr...@suse.cz

Andy Lutomirski

unread,

Nov 11, 2016, 12:11:05 PM11/11/16

to Dmitry Vyukov, Linux API, LKML, Michael Kerrisk, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, Carlos O'Donell, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

On Sun, Nov 6, 2016 at 2:39 PM, Dmitry Vyukov <dvy...@google.com> wrote:
> Hello,
>
> This is notes from the discussion we had at Linux Plumbers this week
> regarding providing a formal description of system calls (user API).
>
> The idea come up in the context of syzkaller, syscall fuzzer, which
> has descriptions for 1000+ syscalls mostly concentrating on types of
> arguments and return values. However, problems are that a small group
> of people can't write descriptions for all syscalls; can't keep them
> up-to-date and doesn't have necessary domain expertise to do correct
> descriptions in some cases.
>
> We identified a surprisingly large number of potential users for such
> descriptions:

Let me add one more: consolidation of all the incompatible arch
syscall tables. A sufficiently descriptive format should be parseable
at build time to generate the syscall tables.

Dmitry Vyukov

unread,

Nov 21, 2016, 10:03:25 AM11/21/16

to syzkaller, linu...@vger.kernel.org, LKML, Michael Kerrisk-manpages, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

On Mon, Nov 7, 2016 at 1:28 AM, Szabolcs Nagy <n...@port70.net> wrote:
> * Dmitry Vyukov <dvy...@google.com> [2016-11-06 14:39:28 -0800]:
>> This is notes from the discussion we had at Linux Plumbers this week
>> regarding providing a formal description of system calls (user API).
> yes a database of the syscall abis would be useful
> ..depending on the level of detail it has.
>
> (right now there is not even spec about what registers
> the syscall entry point may clobber on the various abis
> which would be useful to know when making syscalls)

Hi Szabolcs,

Level of detail is discus-sable. I would say that a detail is worth
supporting if:
1. there is an intent to actually use in a foreseeable future
2. there is somebody who is ready to spend time describing the detail

>> Action points:
>> - polish DSL for description (must be extensible)
>> - write a parser for DSL
>> - provide definition for mm syscalls (mm is reasonably simple
>> and self-contained)
>> - see if we can do validation of mm arguments
>
> for all abi variants? e.g. mmap offset range is abi dependent.

I don't think we draw exact line between what will be verified and will not.
There are simpler predicates (e.g. memory is addressable) and more complex
predicates (e.g. this flag can have this value iff that other flags is
specified and
a valid fd is passed in that field of a struct).

>> For the reference, current syzkaller descriptions are in txt files here:
>> https://github.com/google/syzkaller/tree/master/sys
> ...
>> Taking the opportunity, if you see that something is missing/wrong
>> in the descriptions of the subsystem you care about, or if it is not
>> described at all, fixes are welcome.
>
> abi variants are missing (abi variation makes a lot of
> syscall interface related work painful).

What exactly do you mean by "abi variants"? Is it architecture?
What exactly needs to be added to the descriptions support "abi variants?

Thanks

Dmitry Vyukov

unread,

Nov 21, 2016, 10:15:18 AM11/21/16

to Cyril Hrubis, linu...@vger.kernel.org, LKML, Michael Kerrisk-manpages, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

Hi Cyril,

I think I heard something similar from Tavis in the iknowthis context.

Description of "returns fd or this set of errors" looks simple and useful.
Any test system or fuzzer will be able to verify that kernel actually returns
claimed return values. Also Carlos expressed interested in errno values
in the context of glibc.
I would do it from day one.

Re more complex side effects. I always feared that a description suitable
for automatic verification (i.e. zero false positives, otherwise it is useless)
may be too difficult to achieve.

Cyril, Tavis, can you come up with some set of predicates that can be
checked automatically yet still useful?
We can start small, e.g. "must not alter virtual address space".

Dmitry Vyukov

unread,

Nov 21, 2016, 10:18:05 AM11/21/16

to Andy Lutomirski, Linux API, LKML, Michael Kerrisk, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, Carlos O'Donell, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

Hi Andy,

What exactly info do we need for this?

Anything that just requires a bit of info per syscall looks like the
lowest hanging fruit
(as compared to info per every ioctl discrimination, and complete args
description).

Tavis Ormandy

unread,

Nov 21, 2016, 10:35:10 AM11/21/16

to Dmitry Vyukov, Cyril Hrubis, linu...@vger.kernel.org, LKML, Michael Kerrisk-manpages, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones

On Mon, Nov 21, 2016 at 7:14 AM, Dmitry Vyukov <dvy...@google.com> wrote:
>
>
> Re more complex side effects. I always feared that a description suitable
> for automatic verification (i.e. zero false positives, otherwise it is useless)
> may be too difficult to achieve.
>
> Cyril, Tavis, can you come up with some set of predicates that can be
> checked automatically yet still useful?
> We can start small, e.g. "must not alter virtual address space".

Yes, I've been working on creating something like this, I have a
simple working prototype. I cant promise it has zero false positives
right now, but I think that is achievable.

Let me dig it up (I had put it on the back burner).

Tavis.

Steven Rostedt

unread,

Nov 21, 2016, 10:37:57 AM11/21/16

to Cyril Hrubis, Dmitry Vyukov, linu...@vger.kernel.org, LKML, mtk.ma...@gmail.com, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Arnd Bergmann, car...@redhat.com, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

On Mon, 7 Nov 2016 11:38:20 +0100
Cyril Hrubis <chr...@suse.cz> wrote:

> I'm not sure if something like this is really doable or in the scope of
> this project, but it may be worth a try.
>

Looking ahead into the future, I was also thinking that if this becomes
robust, we could also start an integration specification, that could
describe how different system calls interact with each other. Like
open() to read(), write() and close().

But this is just an idea that popped in my head while reading this
thread. We want to start small first, but still could keep this in the
back of our minds for future enhancements.

-- Steve

Dmitry Vyukov

unread,

Nov 21, 2016, 10:49:10 AM11/21/16

to syzkaller, Cyril Hrubis, Linux API, LKML, Michael Kerrisk-manpages, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Arnd Bergmann, Carlos O'Donell, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

FWIW syzkaller does something along these lines.
It has notion of 'resources' and of input/output arguments.
Then it can figure out that e.g. open creates fd's, so it should be called
before any reads/writes (provided that we want to pass in valid fd's).
It does not have notion of "destructors" for resources (e.g. close
destroys the passed in resource). But it should be easy to describe.

Cyril Hrubis

unread,

Nov 21, 2016, 11:10:54 AM11/21/16

to Dmitry Vyukov, linu...@vger.kernel.org, LKML, Michael Kerrisk-manpages, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

Hi!

> Description of "returns fd or this set of errors" looks simple and useful.
> Any test system or fuzzer will be able to verify that kernel actually returns
> claimed return values. Also Carlos expressed interested in errno values
> in the context of glibc.
> I would do it from day one.
>
> Re more complex side effects. I always feared that a description suitable
> for automatic verification (i.e. zero false positives, otherwise it is useless)
> may be too difficult to achieve.

I'm afraid it may be as well. I would expect that we will end up with
something quite complex with a large set of exceptions from the rules.

> Cyril, Tavis, can you come up with some set of predicates that can be
> checked automatically yet still useful?
> We can start small, e.g. "must not alter virtual address space".

I will try to thing about this a bit.

--
Cyril Hrubis
chr...@suse.cz

Cyril Hrubis

unread,

Nov 21, 2016, 11:59:14 AM11/21/16

to Dmitry Vyukov, syzkaller, Linux API, LKML, Michael Kerrisk-manpages, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Arnd Bergmann, Carlos O'Donell, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

Hi!

> > Looking ahead into the future, I was also thinking that if this becomes
> > robust, we could also start an integration specification, that could
> > describe how different system calls interact with each other. Like
> > open() to read(), write() and close().
> >
> > But this is just an idea that popped in my head while reading this
> > thread. We want to start small first, but still could keep this in the
> > back of our minds for future enhancements.
>
>
> FWIW syzkaller does something along these lines.
> It has notion of 'resources' and of input/output arguments.
> Then it can figure out that e.g. open creates fd's, so it should be called
> before any reads/writes (provided that we want to pass in valid fd's).
> It does not have notion of "destructors" for resources (e.g. close
> destroys the passed in resource). But it should be easy to describe.

Logical extension would be to teach it that creat(), for instance,
returns file descriptor and creates a file as a side effect. That file
then could be used for a stat() or unlink() concurently, etc. But we
should also consider that not all file descriptors or files are equal,
so we may end up with some classes of files and file descriptors
some of them suitable for different subsets of operations.

I think that defining classes of objects and defining how syscalls
transform their state may yield something usable. But that would require
some serious thinking and a few trial and error implementations.

--
Cyril Hrubis
chr...@suse.cz

Szabolcs Nagy

unread,

Nov 22, 2016, 8:07:39 AM11/22/16

to Dmitry Vyukov, syzkaller, linu...@vger.kernel.org, LKML, Michael Kerrisk-manpages, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, car...@redhat.com, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy

* Dmitry Vyukov <dvy...@google.com> [2016-11-21 16:03:04 +0100]:

> On Mon, Nov 7, 2016 at 1:28 AM, Szabolcs Nagy <n...@port70.net> wrote:
> > * Dmitry Vyukov <dvy...@google.com> [2016-11-06 14:39:28 -0800]:

> >> For the reference, current syzkaller descriptions are in txt files here:
> >> https://github.com/google/syzkaller/tree/master/sys
> > ...
> >> Taking the opportunity, if you see that something is missing/wrong
> >> in the descriptions of the subsystem you care about, or if it is not
> >> described at all, fixes are welcome.
> >
> > abi variants are missing (abi variation makes a lot of
> > syscall interface related work painful).
>
>
> What exactly do you mean by "abi variants"? Is it architecture?
> What exactly needs to be added to the descriptions support "abi variants?
>

abi variant is a supported syscall abi
(linux supports several abis on the same arch:
e.g. x86 has i386,x86_64,x32 one might also
call different endianness a different abi, but
i'd expect le/be to be handled together.)

available syscalls, argument types and ordering
can be different across abis

i may be wrong, but i did not see those handled
in sys.txt, looking at it now i see missing
paddings in ipc structs (*_ds) so it may not
work the way i assumed (these are types which
have some abi variation).

Carlos O'Donell

unread,

Apr 21, 2017, 11:14:33 AM4/21/17

to Dmitry Vyukov, linu...@vger.kernel.org, LKML, mtk.ma...@gmail.com, Thomas Gleixner, Sasha Levin, Mathieu Desnoyers, scie...@fb.com, Steven Rostedt, Arnd Bergmann, syzkaller, Kostya Serebryany, Mike Frysinger, Dave Jones, Tavis Ormandy, Florian Weimer

On 11/06/2016 05:39 PM, Dmitry Vyukov wrote:
> Hello,
>
> This is notes from the discussion we had at Linux Plumbers this week
> regarding providing a formal description of system calls (user API).
>
> The idea come up in the context of syzkaller, syscall fuzzer, which
> has descriptions for 1000+ syscalls mostly concentrating on types of
> arguments and return values. However, problems are that a small group
> of people can't write descriptions for all syscalls; can't keep them
> up-to-date and doesn't have necessary domain expertise to do correct
> descriptions in some cases.
>
> We identified a surprisingly large number of potential users for such
> descriptions:
> - fuzzers (syzkaller, trinity, iknowthis)
> - strace/syscall tracepoints (capturing indirect arguments and
> printing human-readable info)
> - generation of entry points for C libraries (glibc, liblinux
> (raw syscalls), Go runtime, clang/gcc sanitizers)

To add another:

Auto-generation of SYS_* macros (sys/syscalls.h) in glibc which are
required for syscall().

It would mean we could copy the list directly from the most recently
released kernel instead of relying on distro kernel UAPI headers package.

We need this information in the released kernel.

Have we made any progress on these points?

> It was acknowledged that whatever we do now it will probably
> significantly change and evolve over time as we better understand
> what we need and what works.
>
> For the reference, current syzkaller descriptions are in txt files here:
> https://github.com/google/syzkaller/tree/master/sys
> The most generic syscalls are here:
> https://github.com/google/syzkaller/blob/master/sys/sys.txt
> Specific subsystems are described in separate files, e.g.:
> https://github.com/google/syzkaller/blob/master/sys/bpf.txt
> https://github.com/google/syzkaller/blob/master/sys/tty.txt
> https://github.com/google/syzkaller/blob/master/sys/sndseq.txt
> The descriptions should be self-explanatory, but just in case there
> is also a semi-formal DSL specification here:
> https://github.com/google/syzkaller/blob/master/sys/README.md
>
> Taking the opportunity, if you see that something is missing/wrong
> in the descriptions of the subsystem you care about, or if it is not
> described at all, fixes are welcome.
>
> Thanks

--
Cheers,
Carlos.

Reply all

Reply to author

Forward