Tutorial on describing new subsystems

Dmitry Vyukov

Apr 16, 2021, 5:09:51 AM

to syzkaller, Andrey Konovalov

[Originally written by Andrey; posting it here so that it's not lost. I've moved some bits to issue #533; ideally we convert the rest to docs/linux/something.md.]

How syscalls work

To get some generic understanding of how Linux kernel syscalls work it's recommended to read through:

Identifying new kernel interfaces/subsystems

Most of the things are already mentioned in go/syzkaller-intern:

Looking at syzkaller's issue #533.
Browsing through the kernel code (Elixir cross reference is extremely useful).
Looking at existing CVEs [1, 2, 3].
Looking at new things that were added over the past releases.

Example. Tracing stty to see what kind of syscalls it uses.

# strace stty -F /dev/tty0 size
...
open("/dev/tty0", O_RDONLY|O_NONBLOCK) = 3
dup2(3, 0) = 0
close(3) = 0
fcntl(0, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
fcntl(0, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
ioctl(0, TCGETS, {B38400 opost isig -icanon -echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=53, ws_col=183, ws_xpixel=0, ws_ypixel=0}) = 0
ioctl(0, TIOCGWINSZ, {ws_row=25, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
...

Here we can see that stty opens /dev/tty0 and executes certain ioctls on it.

Running other bizarre system utilities under strace might point to new subsystems.

Tracing existing applications can also be useful for understanding how a particular kernel interface works.
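When a trace is long, it can help to tally which syscalls and ioctl requests show up. The following is a minimal sketch, not part of syzkaller: the function name summarize and both regular expressions are made up for illustration, and it assumes plain strace output without the "[pid N]" prefixes that strace -f adds.

```python
import re
from collections import Counter

# Start of an strace line, e.g.: ioctl(0, TCGETS, {...}) = 0
CALL_RE = re.compile(r"^(\w+)\((.*)$")
# Second argument of an ioctl call (the request), e.g. TCGETS or 0x5401
REQUEST_RE = re.compile(r"^[^,]+,\s*(\w+)")

# Lines copied from the stty trace above.
SAMPLE = """\
open("/dev/tty0", O_RDONLY|O_NONBLOCK) = 3
dup2(3, 0) = 0
close(3) = 0
fcntl(0, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
fcntl(0, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
ioctl(0, TCGETS, {B38400 opost isig -icanon -echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=53, ws_col=183, ws_xpixel=0, ws_ypixel=0}) = 0
ioctl(0, TIOCGWINSZ, {ws_row=25, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
"""


def summarize(trace):
    """Count syscalls by name and ioctl requests by name in strace output."""
    calls, requests = Counter(), Counter()
    for line in trace.splitlines():
        m = CALL_RE.match(line.strip())
        if not m:
            continue  # "...", signal and exit lines, blank lines
        name, args = m.groups()
        calls[name] += 1
        if name == "ioctl":
            r = REQUEST_RE.match(args)
            if r:
                requests[r.group(1)] += 1
    return calls, requests


if __name__ == "__main__":
    calls, requests = summarize(SAMPLE)
    print(dict(calls))
    print(dict(requests))
```

Request names that appear here but not in sys/linux/*.txt are candidates for new descriptions.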

Example. Looking through /dev/ entries available on some Linux machine.

# ls -al /dev/
...
crw-r--r--. 1 root root 10, 235 Jun 17 14:21 autofs
...
crw-------. 1 root root 10, 231 Jun 17 14:21 snapshot
...

(I've removed from the output everything but a couple of entries that are not described in syzkaller.)
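The manual filtering described above can be roughly automated. This is only a sketch under stated assumptions: char_devices and undescribed are hypothetical helper names, and the check is a crude heuristic that looks for the quoted device path (as in string["/dev/autofs"]) in the text of the descriptions, so devices opened through patterns or pseudo-syscalls would be reported as missing.

```python
import os
import stat


def char_devices(directory="/dev"):
    """Return (name, major, minor, mode) for each character device in directory."""
    found = []
    for entry in os.scandir(directory):
        try:
            st = entry.stat(follow_symlinks=False)
        except OSError:
            continue  # entry vanished or is not accessible
        if stat.S_ISCHR(st.st_mode):
            found.append((entry.name,
                          os.major(st.st_rdev),
                          os.minor(st.st_rdev),
                          stat.filemode(st.st_mode)))
    return sorted(found)


def undescribed(devices, descriptions_text):
    """Keep devices whose quoted "/dev/<name>" path never appears in the text."""
    return [d for d in devices if f'"/dev/{d[0]}"' not in descriptions_text]
```

A possible use is to concatenate the contents of sys/linux/*.txt into one string and pass it as descriptions_text together with the result of char_devices().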

Note. Finding "interesting" kernel interfaces.

It's worth noting that the most interesting interfaces are those accessible to unprivileged users (e.g. the default user on gLinux/Ubuntu) or from within user namespaces (run unshare -U -r to get a root shell in a user namespace).

Checking whether an interface is accessible in a particular environment is sometimes not trivial (there can be different kinds of kernel checks, filesystem permissions, selinux/apparmor, etc.); the most reliable way to do that is to build and run a program that uses that interface.
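As a first approximation of such a program, the sketch below only tries to open a device node and reports the errno on failure. The helper name probe is made up for illustration. A successful open does not prove that the ioctls behind it are reachable, since handlers may perform their own capability checks, so treat this as a starting point.

```python
import errno
import os


def probe(path, flags=os.O_RDONLY):
    """Try to open path; return "ok" or the symbolic errno name (e.g. "EACCES")."""
    try:
        # O_NONBLOCK avoids hanging on devices that block in open().
        fd = os.open(path, flags | os.O_NONBLOCK)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    os.close(fd)
    return "ok"


if __name__ == "__main__":
    for dev in ("/dev/autofs", "/dev/snapshot"):
        print(dev, probe(dev))
```

Running the same script as the default user and again inside unshare -U -r shows whether the user namespace changes the result.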

Another indicator of whether an interface is interesting is whether it's enabled in the kernel configs of popular distributions. You can grep /boot/config* on a gLinux/Ubuntu machine to find out which config options are enabled.

Checking if descriptions are present

Example. Checking whether the descriptions for /dev/autofs are present by grepping through the existing descriptions.

$ cd sys/linux/
$ grep -rniI autofs *.txt
...
sys.txt:521:openat$autofs(fd const[AT_FDCWD], file ptr[in, string["/dev/autofs"]], flags flags[open_flags], mode const[0]) fd

Inspecting sys.txt shows that we only have a generic openat syscall description, but nothing autofs specific (like e.g. descriptions for ioctl syscalls).
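The same check can be scripted when there are many candidate interfaces. The sketch below is an illustration rather than a syzkaller tool: calls_mentioning and has_ioctl_descriptions are hypothetical names, and the regular expression only approximates what a syscall line in a description file looks like.

```python
import re

# A syscall description line starts with "name(" or "name$variant(".
CALL_RE = re.compile(r"^([a-z0-9_]+(?:\$\w+)?)\(")

# The line found by grep above.
SAMPLE = ('openat$autofs(fd const[AT_FDCWD], file ptr[in, string["/dev/autofs"]], '
          'flags flags[open_flags], mode const[0]) fd\n')


def calls_mentioning(text, needle):
    """Names of described syscalls whose line mentions needle (case-insensitive)."""
    names = []
    for line in text.splitlines():
        m = CALL_RE.match(line)
        if m and needle.lower() in line.lower():
            names.append(m.group(1))
    return names


def has_ioctl_descriptions(text, needle):
    """True if at least one ioctl$... description mentions needle."""
    return any(n.startswith("ioctl$") for n in calls_mentioning(text, needle))


if __name__ == "__main__":
    print(calls_mentioning(SAMPLE, "autofs"))
    print(has_ioctl_descriptions(SAMPLE, "autofs"))
```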

Example. Checking code coverage for /dev/snapshot on syzbot.

(Note: normally you should expect coverage for the ci-upstream-kasan-gce-root instance, but it's currently broken. Use ci2-upstream-kcsan-gce instead for now.)

(Note: see the "Finding relevant kernel source code parts" section to find the relevant source code files.)

By checking coverage for snapshot_ioctl (which is the ioctl handler for /dev/snapshot) in kernel/power/user.c you can see that many case clauses are not covered.

Finding relevant kernel source code parts

Usually a kernel interface consists of two parts: UAPI (User API) headers typically found in include/uapi, and C implementation (can be in different directories depending on the subsystem).

Tip. There are some standard ways interfaces are integrated into the kernel. For example if a file descriptor is used as a part of an interface, there will likely be a related file_operations structure (contains pointers to handlers of syscalls that are typically used to interact with file descriptors (read, write, ioctl, etc.)).

Example. Finding kernel implementation for /dev/autofs ioctl handler.

$ find . -name '*autofs*'
./fs/autofs
./fs/autofs/autofs_i.h
./Documentation/filesystems/autofs.rst
./Documentation/filesystems/autofs-mount-control.rst
./include/config/autofs
./include/config/autofs4

$ ls fs/autofs/*.c
fs/autofs/dev-ioctl.c fs/autofs/expire.c fs/autofs/init.c fs/autofs/inode.c fs/autofs/root.c fs/autofs/symlink.c fs/autofs/waitq.c

$ cd fs/autofs && grep -rnI '.unlocked_ioctl'
dev-ioctl.c:703: .unlocked_ioctl = autofs_dev_ioctl,
root.c:35: .unlocked_ioctl = autofs_root_ioctl
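The grep above looks for a single file_operations field. A slightly broader sketch is shown below: it extracts all syscall-related handlers from a piece of C source. The function name syscall_handlers and the chosen subset of fields are assumptions for illustration, and the sample struct is made up except for the .unlocked_ioctl line, which comes from the grep output.

```python
import re

# Designated initializer inside a struct, e.g. ".unlocked_ioctl = autofs_dev_ioctl,"
INIT_RE = re.compile(r"^\s*\.(\w+)\s*=\s*(\w+)", re.MULTILINE)

# A subset of struct file_operations fields that correspond to syscalls.
SYSCALL_FIELDS = {
    "open", "release", "read", "write", "read_iter", "write_iter",
    "unlocked_ioctl", "compat_ioctl", "mmap", "poll", "llseek",
}

# Hypothetical struct; only the .unlocked_ioctl line is taken from the output above.
SAMPLE = """
static const struct file_operations example_fops = {
	.owner = THIS_MODULE,
	.unlocked_ioctl = autofs_dev_ioctl,
};
"""


def syscall_handlers(source):
    """Return (field, function) pairs for syscall-related initializers."""
    return [(field, func) for field, func in INIT_RE.findall(source)
            if field in SYSCALL_FIELDS]


if __name__ == "__main__":
    print(syscall_handlers(SAMPLE))
```

The function names it returns are good starting points for a cross-reference search.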

Example. Using cross reference to find kernel code for /dev/snapshot.

Let's say you googled for /dev/snapshot and found that SNAPSHOT_FREEZE is one of the ioctls it supports. Searching cross reference for this constant yields:

Defined in 1 files:

include/uapi/linux/suspend_ioctls.h, line 17 (as a macro)

Referenced in 2 files:

include/uapi/linux/suspend_ioctls.h, line 17

kernel/power/user.c, line 266

From this you can conclude that ioctl implementation is in kernel/power/user.c and UAPI headers are in include/uapi/linux/suspend_ioctls.h.

Enabling required kernel configs

Once you have located the *.c files of the implementation, the easiest way to check which configs are required is to look at the relevant Makefiles.

Note. Config dependencies.

Some config options depend on others, so before a particular config option can be enabled, you need to enable all of its dependencies. If you've edited .config manually, it's a good idea to run make oldconfig and check that your changes were preserved.

Example. Finding out kernel configs required for /dev/snapshot.

$ cat kernel/power/Makefile | grep user.o
obj-$(CONFIG_HIBERNATION_SNAPSHOT_DEV) += user.o

$ cat kernel/Makefile | grep power
obj-y += power/

Here we can see that the power/ directory is always included in the kernel build regardless of configuration (obj-y), and that kernel/power/user.c requires CONFIG_HIBERNATION_SNAPSHOT_DEV to be enabled.
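The Makefile lookup can be sketched in code as follows. This is a simplified illustration: configs_for is a hypothetical helper, and it only understands single-line obj-y, obj-m and obj-$(CONFIG_...) assignments, not line continuations or composite objects.

```python
import re

# Matches "obj-y += a.o", "obj-m := b.o" and "obj-$(CONFIG_FOO) += c.o dir/".
OBJ_RE = re.compile(r"^obj-(\$\((CONFIG_\w+)\)|[ym])\s*[:+]?=\s*(.*)$")


def configs_for(makefile_text, target):
    """Return the conditions under which target is built.

    Each condition is either "y"/"m" (unconditional) or a CONFIG_ option name.
    """
    conditions = []
    for line in makefile_text.splitlines():
        m = OBJ_RE.match(line.strip())
        if not m:
            continue
        condition = m.group(2) or m.group(1)
        if target in m.group(3).split():
            conditions.append(condition)
    return conditions


if __name__ == "__main__":
    print(configs_for("obj-$(CONFIG_HIBERNATION_SNAPSHOT_DEV) += user.o", "user.o"))
    print(configs_for("obj-y += power/", "power/"))
```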

Tip. Kernel configuration is stored in the .config file in the kernel directory. It can be edited manually, or with one of the following make commands:

make menuconfig opens a terminal-based interface for editing current .config.
make oldconfig takes the existing .config file and prompts the user for reachable options that are not found in the file. Run automatically when you run make.
make olddefconfig does the same, but uses default values instead of prompting the user.
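To check that manual edits survived make oldconfig or make olddefconfig, one option is to compare the options you asked for with what ended up in .config. The sketch below assumes the usual .config syntax ("CONFIG_X=y" and "# CONFIG_X is not set"); parse_config and dropped are hypothetical names and the sample text in the usage part is made up.

```python
import re


def parse_config(text):
    """Map option name to value; "# CONFIG_X is not set" maps to "n"."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if m:
            options[m.group(1)] = "n"
    return options


def dropped(config_text, wanted):
    """Return {option: actual value} for wanted options that did not stick.

    Options absent from the file are treated as "n".
    """
    have = parse_config(config_text)
    return {name: have.get(name, "n")
            for name, value in wanted.items()
            if have.get(name, "n") != value}


if __name__ == "__main__":
    # Made-up .config excerpt for illustration.
    text = "CONFIG_HIBERNATION=y\n# CONFIG_HIBERNATION_SNAPSHOT_DEV is not set\n"
    print(dropped(text, {"CONFIG_HIBERNATION": "y",
                         "CONFIG_HIBERNATION_SNAPSHOT_DEV": "y"}))
```

A non-empty result usually means that a dependency of the reported option is still disabled.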

Understanding the interface and writing descriptions

The general outline is here.

Tip. Write runtests for tricky subsystems which use custom pseudo-syscalls. Runtests are stored in sys/linux/test/. The tests can be run with the syz-execprog utility inside a VM, or with syz-runtest: go install ./tools/syz-runtest && syz-runtest -config=manager.config -tests yourtestfilename. Check out other test programs for the use of AUTO; it's handy for specifying address and const values.

Running syzkaller

Running syzkaller generally follows the available docs and tutorials.

Tip. When testing newly written descriptions, it makes sense to restrict the syscalls used for fuzzing to the ones that are relevant to the interface that's being fuzzed. This can be done with the enable_syscalls parameter in the syz-manager config file.
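As an illustration, the sketch below adds an enable_syscalls list to an existing manager config and prints the resulting JSON. The helper name restrict is made up, the base config values are placeholders, and the two syscall names stand in for whatever descriptions you have written; a trailing * is used here as a prefix wildcard.

```python
import json


def restrict(config, syscalls):
    """Return a copy of a syz-manager config dict with enable_syscalls set."""
    restricted = dict(config)
    restricted["enable_syscalls"] = list(syscalls)
    return restricted


if __name__ == "__main__":
    # Placeholder values; a real config needs the usual kernel, image and VM settings.
    base = {"target": "linux/amd64", "workdir": "/path/to/workdir"}
    # Hypothetical description names for the interface being fuzzed.
    config = restrict(base, ["openat$snapshot", "ioctl$SNAPSHOT_*"])
    print(json.dumps(config, indent=2))
```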

Note. syzkaller can mutate constant values via hint or ANYBLOB mutations. This might complicate targeted fuzzing (e.g. syzkaller may mutate the first argument of the socket syscall and thereby create sockets of unwanted types despite the syscall restrictions). However, you can disable hint mutations by disabling CONFIG_KCOV_ENABLE_COMPARISONS, and you can disable ANYBLOB mutations via:

diff --git a/prog/hints.go b/prog/hints.go
index 9163dd41..26d080f3 100644
--- a/prog/hints.go
+++ b/prog/hints.go
@@ -113,8 +113,10 @@ func generateHints(compMap CompMap, arg Arg, exec func()) {
 	}
 
 	switch a := arg.(type) {
+/*
 	case *ConstArg:
 		checkConstArg(a, compMap, exec)
+*/
 	case *DataArg:
 		checkDataArg(a, compMap, exec)
 	}

Tip. You can use the syz-db tool to unpack syzkaller corpus (stored in $WORKDIR/corpus.db) into a set of files with programs, manually view/edit them or add new ones, and pack them back.

Checking descriptions

Indicators that the written descriptions are working properly:

All relevant kernel code parts are covered.
syzkaller finds bugs.

Indicators that something is wrong:

Important parts of the relevant kernel code aren't covered even after running syzkaller for a substantial time (~1 hour).
Lots of "lost connection to machine" crashes (inspect the crash logs to find out the reason).
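To get a feel for how many crashes are of this infrastructure type, the crash directory can be tallied. This is a rough sketch under assumptions that should be checked against your setup: that syz-manager stores each crash under $WORKDIR/crashes/<id>/ with a description file holding the title and one log* file per occurrence, and that the relevant titles read as listed in INFRA_TITLES. The function names are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Assumed titles of crashes caused by the VM rather than by a kernel bug report.
INFRA_TITLES = {"lost connection to test machine", "no output from test machine"}


def crash_counts(workdir):
    """Count saved logs per crash title (assumed layout: crashes/<id>/description, log*)."""
    counts = Counter()
    for description in Path(workdir).glob("crashes/*/description"):
        title = description.read_text().strip()
        counts[title] += len(list(description.parent.glob("log*")))
    return counts


def infra_share(counts):
    """Fraction of counted crashes whose title is in INFRA_TITLES."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for title, n in counts.items() if title in INFRA_TITLES) / total
```

A high share is a hint to read the corresponding crash logs before spending more time on the descriptions themselves.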

Tip. You can manually introduce a bug somewhere deep in the subsystem you're fuzzing (by e.g. adding a BUG()), and see if syzkaller finds it. If the subsystem has previously known bugs that have already been fixed, you can revert the fix and see if syzkaller rediscovers the issue.

Tip. Use syz-check to statically check descriptions.

Debugging descriptions

If you observe that some of the kernel code parts aren't covered properly, it might point to a bug in the syscall descriptions. There are a few ways to debug that (besides rechecking everything manually).

Manually write a syzkaller program that is supposed to reach the uncovered code paths.

syzkaller programs can be executed manually with syz-execprog.

By tracing the kernel (see the kernel debugging section below) while executing a program, you can find out whether the program reaches the uncovered part, and try to find out the reason if not.

It's a good idea to take as a base a syzkaller program generated during fuzzing that goes as close as possible to the uncovered part of code (you can obtain those from the coverage page on syz-manager dashboard).

To simplify editing syzkaller programs manually, you can use syz-expand to expand all arguments in a syzkaller program (arguments are omitted when they have default values).

Edit a C program generated by syz-prog2c.

Instead of editing syzkaller programs, you can use syz-prog2c to generate C programs and edit them. It's generally easier to make changes to C code, but the downside is that even if you come up with a C program that reaches the uncovered code paths, you still need to make sure that the syzkaller descriptions are flexible enough to generate a matching syzkaller program. This is still useful for checking your understanding of the kernel interface you're trying to describe.

Investigating kernel crashes

To simplify reading kernel stack traces, you can use a symbolizer script as shown here.

Debugging the kernel

Sometimes it's useful to be able to debug the kernel during execution of a syzkaller program to see what happens internally and what kind of branches are taken.

Debug prints

Debug prints can be added with pr_err() anywhere in the kernel code, e.g. pr_err("addr: %px, size: %d\n", addr, size);. Note the use of %px to print pointers: in the kernel, plain %p prints a hashed value rather than the raw address, unlike %p in userspace printf().

If you want to see a call stack trace printed, use WARN_ON(condition) (or BUG_ON(condition) which will also panic the kernel).

GDB

Adding the -s argument to qemu-system-x86_64 turns on the GDB stub. Adding -S on top of that makes QEMU wait until you connect GDB before booting the kernel.

To connect GDB to QEMU:

cd $KERNEL
gdb ./vmlinux \
    -ex 'set confirm off' \
    -ex 'set verbose off' \
    -ex 'set architecture i386:x86-64:intel' \
    -ex 'target remote localhost:1234'

Optionally add -ex 'set disassembly-flavor intel' to turn on Intel assembly syntax.

perf-tools

perf-tools are sometimes useful to trace the kernel without rebuilding. The tools I find useful are kprobe and funcgraph, see their documentation for usage examples. Most of the tools require CONFIG_FTRACE to be enabled.

Hangbin Liu

Apr 21, 2021, 11:49:36 PM

to syzkaller

Very nice article. Thanks for sharing the experience!
