> Thank you for your reply!
>
> Considering the duration of my internship, Matt and I think that going down the path of mounting an image of the filesystem type we want to fuzz would be the most accessible approach. The third suboption you mentioned ("mount a new copy in the pseudo-syscall right in the process work dir") seems to be the best option along this path, as it seems to handle the reproducibility problem. Going into the details of this, do we want to just create a new pseudo-syscall which performs the mounting task with a set of pre-set parameters, or do we want to force syzkaller to mount the given image every time a new fuzzing thread has started (and if so, how do we actually do it)?
Sounds reasonable.
I think the image should be mounted by a pseudo-syscall rather than on
every test process startup. This way we (1) can use some fuzzer-provided
randomness during mount, (2) don't pay the cost of the mount if it's
not necessary, and (3) can mount several filesystems (of the same type
or of different types).
There are still several options:
1. We can use images stored as files in known locations.
2. Embed some compressed version directly into executor (this can use
the existing syz_mount_image syscall, but with an "empty" image; if
image is empty, executor will use the "default" image).
3. Pre-seed the corpus with valid images in syz_mount_image syscall.
(3) looks like the best option to me so far. It does everything we
want and makes C reproducers work without any external dependencies.
On top of what option (2) gives, it also allows the fuzzer to mutate
the seed images to test the mount operation itself.
I injected some images into the corpus manually as a one-off effort at
some point. But now nobody knows which images I injected, how well
that worked, whether they are still in the corpus or were lost for
some reason, or how to inject more images for other filesystems
(e.g. f2fs).
So I wonder if it's possible to do this in a more principled,
controlled and extensible way...
We have these "unit-tests" for some descriptions:
https://github.com/google/syzkaller/tree/master/sys/linux/test
I have been wondering whether we could use these as seeds for the
corpus. Some of these contain very interesting non-trivial scenarios,
e.g.:
https://github.com/google/syzkaller/blob/master/sys/linux/test/binder
https://github.com/google/syzkaller/blob/master/sys/linux/test/wireguard
https://github.com/google/syzkaller/blob/master/sys/linux/test/io_uring
It looks very reasonable to use them as seeds: it is useful in itself
and also gives contributors a nice way not just to add descriptions,
but also to seed the corpus with some non-trivial initial usage
examples for the subsystem (currently that's not possible).
And it looks like a perfect option for what you are trying to achieve.
We could add a few seeds for fs there and then arrange for either
syz-ci or syz-manager and the build process (not sure yet what's the
best course here) to use them as initial seeds when starting fuzzing.
What do you think?
> Secondly, what do you think would be a suitable image to be mounted? I used the create_image.sh script to create an f2fs image with a minimal Debian stretch OS, and the size of that is 2GB (which is 134217729 lines of binary, probably too large to be directly embedded). I'm not really familiar with image-related stuff, but I kind of feel like we don't really need an OS on the image? Is there any way for us to downsize the image but still have it working for mounting purposes?
Yes, don't use that, we totally don't need a full distro image. We
just need a minimal empty (or almost empty) filesystem image.
Here is the command to create a minimal FAT image:
$ fallocate -l 56K disk.raw && mkfs.fat disk.raw
That's it. It gives you a 56K file that can be mounted as FAT fs.
And the nice thing is that it's actually mostly 0's, which plays well
with the syz_mount_image "compressed" format (which makes all regions
with 0's implicit):
$ hexdump -C disk.raw
00000000 eb 3c 90 6d 6b 66 73 2e 66 61 74 00 02 04 01 00 |.<.mkfs.fat.....|
00000010 02 00 02 70 00 f8 01 00 20 00 40 00 00 00 00 00 |...p.... .@.....|
00000020 00 00 00 00 80 00 29 73 4d 0a df 4e 4f 20 4e 41 |......)sM..NO NA|
00000030 4d 45 20 20 20 20 46 41 54 31 32 20 20 20 0e 1f |ME    FAT12   ..|
00000040 be 5b 7c ac 22 c0 74 0b 56 b4 0e bb 07 00 cd 10 |.[|.".t.V.......|
00000050 5e eb f0 32 e4 cd 16 cd 19 eb fe 54 68 69 73 20 |^..2.......This |
00000060 69 73 20 6e 6f 74 20 61 20 62 6f 6f 74 61 62 6c |is not a bootabl|
00000070 65 20 64 69 73 6b 2e 20 20 50 6c 65 61 73 65 20 |e disk.  Please |
00000080 69 6e 73 65 72 74 20 61 20 62 6f 6f 74 61 62 6c |insert a bootabl|
00000090 65 20 66 6c 6f 70 70 79 20 61 6e 64 0d 0a 70 72 |e floppy and..pr|
000000a0 65 73 73 20 61 6e 79 20 6b 65 79 20 74 6f 20 74 |ess any key to t|
000000b0 72 79 20 61 67 61 69 6e 20 2e 2e 2e 20 0d 0a 00 |ry again ... ...|
000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa |..............U.|
00000200 f8 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000400 f8 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000410 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0000e000
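To get a rough feel for how much of an image would remain once the
all-zero regions are left implicit, one can count its non-zero bytes.
This is only a sketch: nonzero_bytes is a helper name made up here, and
it reports the raw count of non-zero bytes, not the actual size of the
syz_mount_image encoding.

```shell
# Hypothetical helper (not part of syzkaller): print the number of
# non-zero bytes in the file given as $1. The fewer there are, the
# less data is left after all-zero regions are dropped.
nonzero_bytes() {
    tr -d '\000' < "$1" | wc -c
}

# Usage:
#   nonzero_bytes disk.raw    # bytes that are not 0
#   wc -c < disk.raw          # total size, for comparison
```

For the 56K FAT image above this should come out as a few hundred
bytes at most, since everything past the boot sector and the two FAT
headers is zero.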
mkfs also accepts some per-fs options, so it makes sense to generate a
number of different images.
E.g. here is what I used for hfsplus (I don't remember which of these
succeeded, most likely not all, but it gives an idea):
fallocate -l 64K disk.raw && mkfs.hfsplus disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -w disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -w disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -w -s disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -w -s disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -s disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -w -J disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -w -J 256K disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -w -J 1M disk.raw
fallocate -l 2M disk.raw && mkfs.hfsplus -w -J 1M disk.raw
fallocate -l 2M disk.raw && mkfs.hfsplus -w -b 512 disk.raw
fallocate -l 2M disk.raw && mkfs.hfsplus -w -i 17 -b 512 disk.raw
fallocate -l 2M disk.raw && mkfs.hfsplus -w -i 17 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 512 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 512 -n e=512,c=512,a=512 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=4096,c=4096,a=4096 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=512,c=4096,a=4096 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=4096,a=4096 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=512,a=4096 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=2048,a=4096 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=1024,a=4096 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=4096,a=4096 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=4096,a=512 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=4096,a=1024 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 4096 -n e=1024,c=4096,a=2048 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 8192 -n e=1024,c=4096,a=2048 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 8192 -n e=1024,c=2048,a=2048 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -i 17 -b 2048 -n e=1024,c=4096,a=2048 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -b 2048 -n e=1024,c=8192,a=2048 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -b 2048 -c a=128,b=3 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -c a=128,b=3 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -c a=128,b=3,c=17 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -w -c a=128,b=3,c=17,e=10000 disk.raw
fallocate -l 1M disk.raw && mkfs.hfsplus -h disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -h disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -h -s disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -h -s -w disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -s disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -h -s disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -h -w disk.raw
fallocate -l 512K disk.raw && mkfs.hfsplus -h -J 13 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v asd disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 5000 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 10000 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=1024,a=1024 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=2048,a=1024 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=4095,a=1024 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=4096,a=1024 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=4097,a=1024 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=4096,a=1024 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=4096,a=1024 -c a=128,b=3,c=17,e=10000 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -b 8192 -n e=1024,c=4096,a=1024 -c a=128,b=3,c=17,e=10000 disk.raw
fallocate -l 768K disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=4096,a=1024 disk.raw
fallocate -l 4M disk.raw && mkfs.hfsplus -h -v syz -b 8192 -n e=1024,c=4096,a=1024 disk.raw
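A batch like the one above can also be scripted rather than typed by
hand. The sketch below does it for FAT. gen_fat_images and the MKFS
override are names invented for this example, the particular sizes and
-s (sectors per cluster) values are arbitrary, and truncate is used in
place of fallocate only to create the empty file. Some combinations may
be rejected by mkfs.fat; those images are simply dropped.

```shell
# Hypothetical helper: create a set of small FAT images in directory $1,
# one per (size, sectors-per-cluster) combination.
# The mkfs binary can be overridden via MKFS; it defaults to mkfs.fat.
gen_fat_images() {
    outdir=$1
    mkfs=${MKFS:-mkfs.fat}
    mkdir -p "$outdir" || return 1
    for size in 56K 128K 1M; do
        for spc in 1 2 4; do
            img="$outdir/fat-$size-s$spc.raw"
            # Create (or reset to) an empty file of the requested size.
            truncate -s "$size" "$img" || return 1
            # Drop the image if mkfs rejects this combination.
            "$mkfs" -s "$spc" "$img" >/dev/null 2>&1 || rm -f "$img"
        done
    done
}

# Usage:
#   gen_fat_images ./images && ls -l ./images
```

The same loop structure should carry over to other filesystems by
swapping the mkfs binary and the option list, though which options make
sense is specific to each mkfs.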
> Thirdly, you mentioned that having syzkaller generate random images would be a better and "long-term" approach. I'm wondering how different this would be from creating a "parameterized" version of the create-image.sh script, and just using that script with random parameters to generate new images.
See above. We can use mkfs to generate a few variants.
But this won't replace full random generation because the number of
different valid images is effectively infinite; we can't pre-generate
them all. mkfs can't even produce them all.
> I also need some help understanding the importance of mutating images: would some bug occur only given a specific image?
Totally.
There are some "major" features/options of images that have a very
significant effect on the behavior of all fs operations. On top of
that, there is an infinitely long tail of various details that may
have some effect.