Re: KASAN: slab-out-of-bounds Read in __ext4_check_dir_entry

Theodore Y. Ts'o

Mar 31, 2018, 6:31:56 PM
to syzk...@googlegroups.com
Some minor nits about this report:

On Sat, Mar 31, 2018 at 01:47:06PM -0700, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> 9dd2326890d89a5179967c947dab2bab34d7ddee (Fri Mar 30 17:29:47 2018 +0000)
> Merge tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=730517f1d3fbe54a17c7

Reporting ext4 KASAN problems on a ceph tree is not particularly
edifying. It would be nice if the Syzbot could try to replicate the
bug on either (a) the latest upstream commit on Linus's tree, or (b)
the branch used by linux-ext4 to pull from the ext4 tree. (You'd have
to get that information from Stephen Rothwell in the general case, but
in the case of ext4.git tree, it's the dev branch.)

> So far this crash happened 3 times on upstream.

"Upstream" is a bit misleading here. The ceph tree isn't really the
upstream for anything. It's the dev tree for Ceph, sure, but
"upstream" has a fairly specific meaning for upstream, external
development --- which is to say, Linus's tree.

(I assume what syzbot meant by "upstream" was as opposed to some internal
Google or Android kernel tree?)

> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4574925402669056

For file system level crashes, it's often *critical* to
have the file system image that triggered the crash in order to get a
reliable repro. For example, see:

https://bugzilla.kernel.org/show_bug.cgi?id=199181

In this case, I was able to figure out what was going on w/o needing
to get an exact repro environment. But in some cases, it might be
useful if there was some standard GCE image that you were using that
could be made available to the developers, with some standard way to
feed it a kernel that can be used. I don't know if you're using kexec
to boot into an arbitrary kernel, ala gce-xfstests, but if you are,
that would be really handy.

Cheers,

- Ted

Eric Biggers

Mar 31, 2018, 7:11:54 PM
to Theodore Y. Ts'o, syzk...@googlegroups.com, Dmitry Vyukov
On Sat, Mar 31, 2018 at 06:31:53PM -0400, Theodore Y. Ts'o wrote:
> Some minor nits about this report:
>
> On Sat, Mar 31, 2018 at 01:47:06PM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot hit the following crash on upstream commit
> > 9dd2326890d89a5179967c947dab2bab34d7ddee (Fri Mar 30 17:29:47 2018 +0000)
> > Merge tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client
> > syzbot dashboard link:
> > https://syzkaller.appspot.com/bug?extid=730517f1d3fbe54a17c7
>
> Reporting ext4 KASAN problems on a ceph tree is not particularly
> edifying. It would be nice if the Syzbot could try to replicate the
> bug on either (a) the latest upstream commit on Linus's tree, or (b)
> the branch used by linux-ext4 to pull from the ext4 tree. (You'd have
> to get that information from Stephen Rothwell in the general case, but
> in the case of ext4.git tree, it's the dev branch.)
>
> > So far this crash happened 3 times on upstream.
>
> "Upstream" is a bit misleading here. The ceph tree isn't really the
> upstream for anything. It's the dev tree for Ceph, sure, but
> "upstream" has a fairly specific meaning for upstream, external
> development --- which is to say, Linus's tree.
>
> (I assume what syzbot meant by "upstream" was as opposed some internal
> Google or Android kernel tree?)
>

It *is* Linus' tree; the last commit just happened to be merging a ceph pull
request, and syzbot gave the commit title. Dmitry, this has confused several
people already since the given commit title usually has nothing to do with the
bug. Maybe it should show the output of 'git describe' instead, e.g.
v4.16-rc7-93-g9dd2326890d89 in this case.
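
For readers unfamiliar with the format, a 'git describe' name like the one above packs three pieces of information: the nearest tag, the number of commits since that tag, and an abbreviated commit hash. A minimal sketch in Python of how such a name could be split apart, assuming the usual <tag>-<count>-g<hash> layout (the function name parse_describe is made up for this illustration):

```python
import re

# Assumed layout of a `git describe` name:
#   <tag>-<commits since tag>-g<abbreviated hash>
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(name):
    """Split a describe name into (tag, commits_since_tag, abbreviated_hash)."""
    m = DESCRIBE_RE.match(name)
    if m is None:
        # A commit that sits exactly on a tag is described by the bare tag.
        return name, 0, None
    return m.group("tag"), int(m.group("count")), m.group("sha")


print(parse_describe("v4.16-rc7-93-g9dd2326890d89"))
```

Unlike a merge commit title, the tag and commit count place the commit relative to a release, which is the point of the suggestion.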

> > C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4574925402669056
>
> In many cases, for file system level crashes, it's often *critical* to
> have the file system image that triggered the crash in order to get a
> reliable repro. For example, see:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=199181
>
> In this case, I was able to figure out what was going on w/o needing
> to get an exact repro environment. But in some cases, it might be
> useful if there was some standard GCE image that you were using that
> could be made available to the developers, with some standard way to
> feed it a kernel that can be used. I don't know if you're using kexec
> to boot into an arbitrary kernel, ala gce-xfstests, but if you are,
> that would be really handy.
>

There is a link to a qemu-suitable Debian wheezy image in the syzbot
documentation linked to from the bottom of the bug report. It probably should
be made more prominent. A GCE image is a good idea too.

That being said, in my experience most syzbot reproducers don't care about the
root filesystem at all. It's only every once in a while that a bug like this
shows up, where the reproducer is implicitly assuming that the current
working directory is on an ext4 filesystem.

syzkaller also just gained the ability to mount filesystem images
(https://github.com/google/syzkaller/commit/7c923cf8d45b650c4251503c11e74653779c74c4),
and bugs found in that way (expect lots of them!) will actually have the code
that generates the filesystem image that triggers the crash embedded in the
reproducer, I believe.

Eric

Theodore Y. Ts'o

Apr 1, 2018, 1:12:58 AM
to Eric Biggers, syzk...@googlegroups.com, Dmitry Vyukov
On Sat, Mar 31, 2018 at 04:12:14PM -0700, Eric Biggers wrote:
>
> It *is* Linus' tree; the last commit just happened to be merging a ceph pull
> request, and syzbot gave the commit title. Dmitry, this has confused several
> people already since the given commit title usually has nothing to do with the
> bug. Maybe it should show the output of 'git describe' instead, e.g.
> v4.16-rc7-93-g9dd2326890d89 in this case.

Yes, that's the mistake I made. Reformatting the e-mail so it is much
cleaner would really help. Maybe something like this?

---------------------------
Syzbot found the following crash on:

Commit: v4.16-rc7-93-g9dd2326890d8
Git Tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Dashboard Link: https://syzkaller.appspot.com/bug?extid=730517f1d3fbe54a17c7
C Repro: https://syzkaller.appspot.com/x/repro.c?id=4574925402669056
Compiler: gcc (GCC) 7.1.1 20170620

If you fix the bug, please add the following tag to the commit:

Reported-by: syzbot+730517...@syzkaller.appspotmail.com

<Cleaned up console output with stack trace>

<Syzbot Footer>
------------------------------

The human eye will have a much easier time scanning the report;
sometimes *removing* extraneous information is just as important as
adding extra info. And if people need more details, they can always
get to things like the Syzkaller reproducer, raw console output, etc.,
from the dashboard link.
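
A header in this "Label: value" style is simple to generate. The following Python sketch is only an illustration of the proposed layout, not syzbot's actual reporting code; the function name format_report_header is hypothetical, and the field values are the ones from the mock-up above.

```python
def format_report_header(fields):
    """Render (label, value) pairs as aligned 'Label: value' lines."""
    # Pad every label to the width of the longest one, plus the colon.
    width = max(len(label) for label, _ in fields) + 1
    lines = [(label + ":").ljust(width) + " " + value for label, value in fields]
    return "\n".join(lines)


header = format_report_header([
    ("Commit", "v4.16-rc7-93-g9dd2326890d8"),
    ("Git Tree", "git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git"),
    ("Dashboard Link", "https://syzkaller.appspot.com/bug?extid=730517f1d3fbe54a17c7"),
    ("C Repro", "https://syzkaller.appspot.com/x/repro.c?id=4574925402669056"),
    ("Compiler", "gcc (GCC) 7.1.1 20170620"),
])
print(header)
```

Aligning the values in one column is what makes the block easy to scan and to cut and paste from.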

> There is a link to a qemu-suitable Debian wheezy image in the syzbot
> documentation linked to from the bottom of the bug report. It probably should
> be made more prominent. A GCE image is a good idea too.

I've looked at the Debian Wheezy image; as near as I can tell, it's
nothing special. It doesn't even have any of the syzkaller programs
(syz-executor or syz-execprog). So I might as well use "kvm-xfstests
shell" as my environment. This might actually be more convenient for
kernel developers, since the kvm-xfstests image is only 87 MiB (as
opposed to the 1 GiB Wheezy image).

If possible, it would be good to have the exact GCE image that is used
by the syzbot, since if a particular reproduction relies on how the
kernel memory / slab allocations are set up before the reproducer
program runs, the fact that GCE images tend to run all sorts of extra
bits from GCE specific init scripts means that it's quite possible
that something that repros in the GCE image might not repro reliably
either on Syzkaller's wheezy.img or kvm-xfstests's root_fs.img running
under KVM.

BTW, it would be useful if the Syzkaller docs gave people a hint how
to use the image, since often images are dependent on qemu/kvm boot-time
options. Fortunately, "kvm-xfstests -I /tmp/wheezy.img shell"
worked for me, but that wasn't guaranteed to work. Providing a shell
script which fires up the wheezy image using Syzkaller's preferred kvm
options would be nice. After all, kvm/qemu invocations can get quite
complex. For example, the kvm-xfstests invocation above translates
to:

ionice -n 5 /usr/bin/kvm -boot order=c -net none \
-machine type=pc,accel=kvm:tcg \
-drive file=/tmp/wheezy.img,if=virtio,snapshot=on \
-drive file=/dev/cwcc/test-4k,cache=none,if=virtio,format=raw,aio=native \
-drive file=/dev/cwcc/scratch,cache=none,if=virtio,format=raw,aio=native \
-drive file=/dev/cwcc/test-1k,cache=none,if=virtio,format=raw,aio=native \
-drive file=/dev/cwcc/scratch2,cache=none,if=virtio,format=raw,aio=native \
-drive file=/dev/cwcc/scratch3,cache=none,if=virtio,format=raw,aio=native \
-drive file=/dev/cwcc/results,cache=none,if=virtio,format=raw,aio=native \
-drive file=/tmp/kvm-upload.Jjt9LpNa,if=virtio,format=raw \
-vga none -nographic -smp 2 -m 2048 \
-fsdev local,id=v_tmp,path=/tmp/kvm-xfstests-tytso,security_model=none \
-device virtio-9p-pci,fsdev=v_tmp,mount_tag=v_tmp \
-object rng-random,filename=/dev/urandom,id=rng0 \
-device virtio-rng-pci,rng=rng0 \
-serial mon:stdio -monitor telnet:localhost:7498,server,nowait \
-serial telnet:localhost:7500,server,nowait \
-serial telnet:localhost:7501,server,nowait \
-serial telnet:localhost:7502,server,nowait \
-gdb tcp:localhost:7499 \
--kernel /build/ext4-64/arch/x86/boot/bzImage \
--append "quiet loglevel=0 root=/dev/vda console=ttyS0,115200 cmd=maint fstesttz=America/New_York fstesttyp=ext4 fstestapi=1.3"

:-)

> syzkaller also just gained the ability to mount
> filesystem images
> (https://github.com/google/syzkaller/commit/7c923cf8d45b650c4251503c11e74653779c74c4),
> and bugs found in that way (expect lots of them!) will actually have
> the code that generates the filesystem image that triggers the crash
> embedded in the reproducer, I believe.

Hmm, maybe it would be worth it to teach kvm-xfstests and gce-xfstests
how to run Syzkaller reproduction test cases. If we're going to be
seeing lots of bugs, the more automation, the better....

- Ted

Dmitry Vyukov

Apr 1, 2018, 8:32:36 AM
to Theodore Y. Ts'o, Eric Biggers, syzkaller
Hi Ted,

syzbot used to provide only commit hash, but then people complained
that syzbot provides bogus commits because when they try to check it
out git says that there is no such commit in the tree. That's how we
learned about kernel trees that are rebuilt, rebased, fixed up, etc.
So unfortunately only hash (or git describe name) is not always
meaningful.

Dmitry Vyukov

Apr 1, 2018, 8:35:56 AM
to Theodore Y. Ts'o, Eric Biggers, syzkaller
I will look into your other comments a bit later. Thanks for the feedback.

Theodore Y. Ts'o

Apr 1, 2018, 3:07:54 PM
to Dmitry Vyukov, Eric Biggers, syzkaller
On Sun, Apr 01, 2018 at 02:32:14PM +0200, Dmitry Vyukov wrote:
>
> syzbot used to provide only commit hash, but then people complained
> that syzbot provides bogus commits because when they try to check it
> out git says that there is no such commit in the tree. That's how we
> learned about kernel trees that are rebuilt, rebased, fixed up, etc.
> So unfortunately only hash (of git describe name) is not always
> meaningful.

Ah, yes. The problem is the right thing to do really depends on the
git tree in question. Linus's tree never rewinds, so git commits
never disappear (compared to say, linux-next, which rewinds every
day). Linus's tree is filled with merges, so the commit description
is almost always something like this:

Merge tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client

So combine that with the fact that the syzkaller e-mail uses the "wall
of text" reporting style, for example:

syzbot hit the following crash on upstream commit
9dd2326890d89a5179967c947dab2bab34d7ddee (Fri Mar 30 17:29:47 2018 +0000)
Merge tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=730517f1d3fbe54a17c7

This is *painful* for a human being to parse. Compare that to
something like this:

Syzbot found the following crash:

Commit: 9dd2326890d8: Merge tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client
Git Tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Syz Link: https://syzkaller.appspot.com/bug?extid=730517f1d3fbe54a17c7
C Repro: https://syzkaller.appspot.com/x/repro.c?id=4574925402669056


It's much easier to read, and it's also much easier to cut and paste.
(Note that I dropped the date. Given that Syzbot is just testing the
tip of the tree, and not doing any bisection, the date has a very low
amount of information content compared to the amount space it takes up
in the e-mail report.)

Regards,

- Ted

Theodore Y. Ts'o

Apr 1, 2018, 9:31:09 PM
to Eric Biggers, syzk...@googlegroups.com, Dmitry Vyukov
On Sun, Apr 01, 2018 at 01:12:55AM -0400, Theodore Y. Ts'o wrote:
> I've looked at the Debian Wheezy image, as near as I can tell, it's
> nothing special. It doesn't even have any of the syzkaller programs
> (syz-executor or syz-execprog). So I might as well use "kvm-xfstests
> shell" as my environment. This might actually be more convenient for
> kernel developers, since the kvm-xfstests image is only 87 MiB (as
> opposed to the 1 GiB Wheezy image).

I've just pushed out changes to the xfstests-bld repository so that
future kvm-xfstests and gce-xfstests images will have the syz-executor
and syz-execprog binaries built in. (For people building their own
images, adding the syzkaller repo will be optional, much like the
stress-ng repo. But it will be in the official gce-xfstests and
x86_64 and i386 kvm-xfstests images.)


In the future I would like to be able to teach kvm-xfstests and
gce-xfstests to be able to accept commands like:

{kvm,gce}-xfstests syz https://syzkaller.appspot.com/bug?extid=f3bd89a5ab3266b10540
{kvm,gce}-xfstests syz f3bd89a5ab3266b10540
{kvm,gce}-xfstests syz repro.c
{kvm,gce}-xfstests syz repro.syz
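
The four argument forms above can be told apart mechanically. A rough Python sketch of such a dispatcher follows; it is not part of xfstests-bld, the function name classify_syz_arg and the kind labels are invented for illustration, and it assumes an extid is always 20 lowercase hex digits, as in the examples in this thread.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: extids are 20 lowercase hex digits, like the ones quoted here.
EXTID_RE = re.compile(r"^[0-9a-f]{20}$")


def classify_syz_arg(arg):
    """Return (kind, value) for one 'syz' argument; the kinds are hypothetical."""
    if arg.startswith(("http://", "https://")):
        # Dashboard link: pull the extid out of the query string.
        extid = parse_qs(urlparse(arg).query).get("extid", [None])[0]
        if extid is None:
            raise ValueError("no extid in URL: " + arg)
        return "extid", extid
    if EXTID_RE.match(arg):
        return "extid", arg
    if arg.endswith(".syz"):
        return "syz-program", arg
    if arg.endswith(".c"):
        return "c-source", arg
    raise ValueError("unrecognized argument: " + arg)


print(classify_syz_arg("https://syzkaller.appspot.com/bug?extid=f3bd89a5ab3266b10540"))
```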

I might try to add support for "android-xfstests syz", but I'll
probably check and see if Eric is interested in taking on that
project. :-)


One thing which would really make this easier is if there was an easy
way to get the repro.c or repro.syz file for a particular syzbot bug.
Right now, for example, the download links for:

https://syzkaller.appspot.com/bug?extid=f3bd89a5ab3266b10540

are:

https://syzkaller.appspot.com/x/repro.c?id=6290970458980352
https://syzkaller.appspot.com/x/repro.syz?id=6577156880596992

and so on. It would be really nice if it were possible to look up
relevant files by extid, e.g.:

https://syzkaller.appspot.com/x/repro.c?extid=f3bd89a5ab3266b10540
https://syzkaller.appspot.com/x/repro.syz?extid=f3bd89a5ab3266b10540
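
With a lookup like that, a tool would need nothing but the extid to find both reproducers. A small Python sketch of the URL construction, with the caveat that the extid query parameter on /x/repro.c and /x/repro.syz is the proposal made here and not a confirmed dashboard endpoint (the function name is also made up):

```python
from urllib.parse import urlencode

BASE = "https://syzkaller.appspot.com"


def proposed_repro_urls(extid):
    """Build the *proposed* extid-based download URLs for both reproducers."""
    query = urlencode({"extid": extid})
    return {name: f"{BASE}/x/{name}?{query}" for name in ("repro.c", "repro.syz")}


for name, url in proposed_repro_urls("f3bd89a5ab3266b10540").items():
    print(name, url)
```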


Also, I'm curious --- why does syzbot rerun a particular failing test
so often?

Looking at the dashboard for the above-mentioned syzkaller failure, I
see it has run the commit 1379ef828a18d8f81c526b25e4d5685caa2cfd65
over 13 times. That seems highly wasteful. That's enough resources
that you should be able to do a full bisection search to find the
first guilty commit where a particular syzkaller repro started
failing. Is there a reason why the syzkaller repro is being run so
many times given the kernel commit and syzkaller commit haven't
changed?
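
To put the 13 runs in perspective: a binary search needs about log2(N) test runs to find the first bad commit among N candidates, so 13 runs cover a range of 8192 commits. The Python sketch below illustrates that arithmetic only. It assumes a deterministic reproducer and a single good-to-bad transition, neither of which is guaranteed for a syzbot crash, and is_bad is a stand-in for "build this commit and run the repro".

```python
def first_bad(commits, is_bad):
    """Return (index of the first bad commit, number of test runs used).

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    history flips from good to bad exactly once.
    """
    lo, hi = 0, len(commits) - 1  # invariant: lo is good, hi is bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return hi, tests


# 8193 commits means 8192 candidates after the known-good starting point.
commits = list(range(8193))
print(first_bad(commits, lambda c: c >= 5000))
```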

Thanks,

- Ted

Dmitry Vyukov

Apr 13, 2018, 5:31:36 AM
to Theodore Y. Ts'o, Eric Biggers, syzkaller
On Sun, Apr 1, 2018 at 7:12 AM, Theodore Y. Ts'o <ty...@mit.edu> wrote:
Hi Ted,

I've filed https://github.com/google/syzkaller/issues/565 for this
with some mocks based on your proposals. I have some substantial
backlog at the moment, but I will get back to it.
I've added reference qemu and ssh command lines to the docs:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md#crash-does-not-reproduce

The image is especially large because it is used for fuzzing, and
the fuzzer may need some space.

I am not sure it's feasible to store and export all images on GCE. The
list would grow by 500+ each month, which means 5000+ to date. Our
current GCE quota is 100 custom images. I guess it can be extended, but
we would need an increase by 3 orders of magnitude. Also I am not sure
how many kernel developers (besides you!) will use GCE images.

Dmitry Vyukov

May 9, 2018, 4:54:35 AM
to Theodore Y. Ts'o, Eric Biggers, syzkaller
Hi Ted,

This is now implemented and deployed; you can see an example of the new format here:

https://lkml.org/lkml/2018/5/8/36

Cleaned and tidied. Does it look better to you?

Thanks for your feedback.