On Mon, Jul 30, 2018 at 2:12 PM, shankarapailoor <shankar...@gmail.com> wrote:
> Sorry forgot mailing list.
>
> On Mon, Jul 30, 2018 at 5:00 AM, shankarapailoor <shankar...@gmail.com> wrote:
>>
>> Thanks for the feedback. I've also made some updates since Friday to the
>> source tree. Please pull again.
>>
>> > 1. I did not find distill.json anywhere.
Right, I did not pull. Found it now.
>> It is under getting-started/ directory.
>>
>> > 2. moonshine does not create the serialized dir, so it fails with
>>
>> Also should be fixed. Now it is 'deserialized/'.
Still crashes for me:
moonshine$ bin/moonshine -dir sampletraces -distill
./getting-started/distill.json
Total Number of Files: 346
Parsing File 1/346: ltp_accept4_01
Parsing File 2/346: ltp_accept_01
...
Parsing File 345/346: ltp_utimensat01
Parsing File 346/346: ltp_writev05
Total number of seeds: 43438
Performing implicit distillation with 43438 calls contributing coverage
TOTAL HEAVY HITTERS: 639
Total Distilled Progs: 391
Average Program Length: 10
Total Contributing calls: 639 out of 43438, in 388
implicitly-distilled programs that consist of: 3250 calls
panic: failed to output file: [open deserialized/distill0: no such file or directory]
goroutine 1 [running]:
github.com/shankarapailoor/moonshine/logging.Failf(0x603787, 0x19, 0xc4449ac850, 0x1, 0x1)
	src/github.com/shankarapailoor/moonshine/logging/logger.go:6 +0xfe
main.ParseTraces(0x1a00040, 0x5, 0x5c0b41, 0x5)
	src/github.com/shankarapailoor/moonshine/main.go:126 +0xda8
main.main()
	src/github.com/shankarapailoor/moonshine/main.go:45 +0x1a8
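The panic suggests that the output directory is not created before the first program is written. A minimal sketch of creating it on demand follows. It assumes a hypothetical helper named writeProg, which is not a function from the moonshine tree, and the demo writes into a temporary directory rather than the real 'deserialized/'.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeProg is a hypothetical helper (not taken from the moonshine sources).
// It creates dir and any missing parents before writing the program file,
// so a fresh checkout without the output directory does not fail with ENOENT.
func writeProg(dir, name string, data []byte) error {
	if err := os.MkdirAll(dir, 0755); err != nil {
		return fmt.Errorf("failed to create %v: %v", dir, err)
	}
	return os.WriteFile(filepath.Join(dir, name), data, 0644)
}

func main() {
	// Use a temporary directory so the demo does not touch the source tree.
	tmp, err := os.MkdirTemp("", "moonshine-demo")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer os.RemoveAll(tmp)
	out := filepath.Join(tmp, "deserialized")
	if err := writeProg(out, "distill0", []byte("getpid()\n")); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote", filepath.Join(out, "distill0"))
}
```

os.MkdirAll returns nil if the directory already exists, so calling it on every write is harmless.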
>> > 3. It seems to produce too much output by default. I understand that
>> > it may be useful to get more detailed information about a particular
>> > program, so I think it needs a verbosity flag (-v). By default it
>> > should print just a few lines about overall progress and results. On
>> > higher verbosity levels it can print some per-program info.
>>
>> Yeah I agree. Will add a more flexible logger similar to Syzkaller.
ack
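For illustration, a minimal sketch of what a -v flag could look like. The Logger type and its fields are hypothetical names, not an existing moonshine or syzkaller API; the only idea taken from the discussion is that each message carries a level and is printed only when the configured verbosity is high enough.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// Logger is a hypothetical leveled logger; all names are illustrative.
type Logger struct {
	Verbosity int       // messages with a level above this are suppressed
	Out       io.Writer // destination; os.Stderr is used if nil
}

// Logf prints the message if its level v does not exceed the configured
// verbosity. It reports whether the message was printed.
func (l *Logger) Logf(v int, format string, args ...interface{}) bool {
	if v > l.Verbosity {
		return false
	}
	out := l.Out
	if out == nil {
		out = os.Stderr
	}
	fmt.Fprintf(out, format+"\n", args...)
	return true
}

func main() {
	v := flag.Int("v", 0, "verbosity level")
	flag.Parse()
	log := &Logger{Verbosity: *v}
	// Level 0: overall progress, always printed.
	log.Logf(0, "Total Number of Files: %v", 346)
	// Level 1: per-file details, printed only with -v=1 or higher.
	log.Logf(1, "Parsing File %v/%v: %v", 1, 346, "ltp_accept4_01")
}
```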
>> > 4. I've noticed some common calls in each program that come from libc:
>>
>> > r0 = open(&(0x7f0000000000)='/etc/ld.so.cache\x00', 0x80000, 0x0)
>> > r1 = open(&(0x7f0000000055)='/lib/x86_64-linux-gnu/libc.so.6\x00',
>> > 0x80000, 0x0)
>> > I think we need to strip this common prefix. The programs will be
>> > shorter, faster and lots of them will also run on gVisor, freebsd, akaros,
>> > etc.
>>
>> Agreed.
>>
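A rough sketch of how the loader prefix could be stripped. The Call type is a hypothetical, simplified stand-in for a parsed call and not the moonshine or syzkaller representation, and the path check is an assumption based only on the two paths shown above. The sketch also drops calls that use a dropped fd, since removing only the open would leave dangling resource references.

```go
package main

import (
	"fmt"
	"strings"
)

// Call is a hypothetical, simplified view of one parsed call.
type Call struct {
	Name string   // syscall name, e.g. "open"
	Path string   // path argument, if any
	Ret  string   // result variable such as "r0"; empty if none
	Uses []string // result variables passed as arguments
}

// isLoaderPath is an assumed heuristic covering the dynamic loader cache
// and shared objects under /lib/.
func isLoaderPath(p string) bool {
	return p == "/etc/ld.so.cache" ||
		(strings.HasPrefix(p, "/lib/") && strings.Contains(p, ".so"))
}

// stripLoaderCalls removes opens of loader files and every call that
// depends on a resource produced by a removed call.
func stripLoaderCalls(calls []Call) []Call {
	dropped := make(map[string]bool) // result variables of removed calls
	var out []Call
	for _, c := range calls {
		drop := (c.Name == "open" || c.Name == "openat") && isLoaderPath(c.Path)
		for _, u := range c.Uses {
			if dropped[u] {
				drop = true
			}
		}
		if drop {
			if c.Ret != "" {
				dropped[c.Ret] = true
			}
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	prog := []Call{
		{Name: "open", Path: "/etc/ld.so.cache", Ret: "r0"},
		{Name: "fstat", Uses: []string{"r0"}},
		{Name: "close", Uses: []string{"r0"}},
		{Name: "open", Path: "/lib/x86_64-linux-gnu/libc.so.6", Ret: "r1"},
		{Name: "read", Uses: []string{"r1"}},
		{Name: "socket", Ret: "r2"},
		{Name: "bind", Uses: []string{"r2"}},
	}
	for _, c := range stripLoaderCalls(prog) {
		fmt.Println(c.Name)
	}
}
```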
>> > 5. I've noticed lots of mmap's that look excessive. Today executor
>> > always maps whole 512MB data region, so there is no need to do mmaps just to
>> > map memory for arguments. However, if an fd is mapped into memory, that
>> > of course needs to be preserved.
>>
>> Good point. Our static memory tracker validates that mmap'd regions cannot
>> exceed that amount because that will create crashes at runtime, but we might
>> be wasting some fuzzing time by including uninteresting ones. However, don't
>> you think it might be interesting for complicated mlock->mmap->mremap
>> sequences? Perhaps limit those to a fixed (but small) page size and do
>> best-effort tracking of downstream dependencies?
Yes, I agree that some mmaps are useful. Perhaps we could also look
for any syscalls accepting vma type (e.g. mlock, mremap).
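A minimal sketch of that filter, under stated assumptions: the Call type is a hypothetical, simplified representation, and the vmaCalls set is an illustrative list rather than something derived from the syzkaller descriptions. An mmap is kept if it is file-backed, or if a later vma-taking call touches an overlapping address range; other anonymous mmaps are dropped.

```go
package main

import "fmt"

// Call is a hypothetical, simplified view of one parsed call.
type Call struct {
	Name string
	Addr uint64 // start of the address range argument, if any
	Len  uint64 // length of the address range argument, if any
	FD   int    // only meaningful for mmap; -1 means anonymous mapping
}

// vmaCalls is an illustrative, incomplete set of calls that take a vma.
var vmaCalls = map[string]bool{
	"mlock": true, "munlock": true, "mremap": true,
	"mprotect": true, "madvise": true, "munmap": true,
}

// overlaps reports whether the address ranges of a and b intersect.
func overlaps(a, b Call) bool {
	return a.Addr < b.Addr+b.Len && b.Addr < a.Addr+a.Len
}

// referencedLater reports whether a later vma-taking call touches m's range.
func referencedLater(m Call, later []Call) bool {
	for _, c := range later {
		if vmaCalls[c.Name] && overlaps(m, c) {
			return true
		}
	}
	return false
}

// filterMmaps drops anonymous mmaps that no later vma-taking call uses.
func filterMmaps(calls []Call) []Call {
	var out []Call
	for i, c := range calls {
		if c.Name == "mmap" && c.FD == -1 && !referencedLater(c, calls[i+1:]) {
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	prog := []Call{
		{Name: "mmap", Addr: 0x20000000, Len: 0x1000, FD: -1}, // dropped
		{Name: "mmap", Addr: 0x20001000, Len: 0x2000, FD: -1}, // kept: mlock'd below
		{Name: "mlock", Addr: 0x20001000, Len: 0x1000},
		{Name: "mmap", Addr: 0x20004000, Len: 0x1000, FD: 3}, // kept: file-backed
	}
	for _, c := range filterMmaps(prog) {
		fmt.Printf("%v(0x%x, 0x%x)\n", c.Name, c.Addr, c.Len)
	}
}
```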
>> > 6. I think we need some tests that check that it can parse canned strace
>> > output and produce a valid syzkaller program. For a few cases it can
>> > make sense to check input/output verbatim.
>>
>> Yes, I agree. Right now I've just been converting the deserialized programs
>> (seen in 'deserialized/') and running them under strace (without coverage)
>> to make sure the same calls appear with the same return values. I plan to add
>> some automated tests.
>>
>> > 7. How is distillation different from the fuzzer minimization procedure?
>> > At first glance they do the same: take a large program and produce one
>> > or several smaller programs preserving coverage.
>>
>> I think there are two main benefits. (Apologies if I'm wrong)
>>
>> 1) It allows us to serialize very large traces from real programs into
>> compact Syzkaller ones. A trace of even a few seconds of, say, Chrome is
>> far too large to even be serialized into a valid Syzkaller program.
>> Distillation allows one to create compact and interesting seeds upfront.
>>
>> 2) The existing fuzzer minimization works well because Syzkaller naturally
>> generates compact programs; however, in my experience this isn't able to
>> scale effectively for programs with even 100 calls (such as these). For
>> example, it seems that Syzkaller tries to minimize the starting seeds
>> upfront, correct? I remember providing only a handful of these converted
>> programs, and the minimization was still going for a day and a half before I
>> killed it.
This makes sense. Yes, fuzzer minimization can't handle huge programs.
I am trying to run it on a syz-manager trace, using the current strace
head (cb3a15fe21b49c4b11f307882a828bfe99a37c4e).
The trace produces a bunch of parsing errors:
error: syntax error
Error parsing line: 93489 <... newfstatat resumed> 0xc4204ab3b8, 0) =
-1 ENOENT (No such file or directory)
error: syntax error
Error parsing line: 93503 <... read resumed> 0xc420787000, 4096) = -1
EAGAIN (Resource temporarily unavailable)
error: syntax error
Error parsing line: 93538 <... read resumed> 0xc420787080, 3968) = -1
EAGAIN (Resource temporarily unavailable)
error: syntax error
Error parsing line: 93526 <... read resumed> 0xc4207870b4, 3916) = -1
EAGAIN (Resource temporarily unavailable)
error: syntax error
Error parsing line: 93489 <... accept4 resumed> 0xc423092bd8, [112],
SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily
unavailable)
and then it crashes with:
error: syntax error
Error parsing line: 93558 ioctl(0, SNDCTL_TMR_START or TCSETS,
{c_iflags=0, c_oflags=0x1, c_cflags=0x30, c_lflags=0, c_line=0,
c_cc[VMIN]=1, c_cc[VTIME]=0,
c_cc="\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"})
= -1 ENOTTY (Inappropriate ioctl for device)
panic: Failed to parse line: [93558 ioctl(0, SNDCTL_TMR_START or
TCSETS, {c_iflags=0, c_oflags=0x1, c_cflags=0x30, c_lflags=0,
c_line=0, c_cc[VMIN]=1, c_cc[VTIME]=0,
c_cc="\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"})
= -1 ENOTTY (Inappropriate ioctl for device)]
goroutine 1 [running]:
github.com/shankarapailoor/moonshine/logging.Failf(0x602288, 0x19, 0xc427588ab0, 0x1, 0x1)
	src/github.com/shankarapailoor/moonshine/logging/logger.go:6 +0xfe
github.com/shankarapailoor/moonshine/scanner.parseLoop(0xc4200adbf8, 0xc43d952000)
	src/github.com/shankarapailoor/moonshine/scanner/scanner.go:69 +0x3d2
github.com/shankarapailoor/moonshine/scanner.Parse(0xc4201040a0, 0x12, 0xc4200adec8)
	src/github.com/shankarapailoor/moonshine/scanner/scanner.go:93 +0x1e0
main.ParseTraces(0x1a00040, 0x5, 0x5c0b41, 0x5)
	src/github.com/shankarapailoor/moonshine/main.go:78 +0x7cd
main.main()
	src/github.com/shankarapailoor/moonshine/main.go:45 +0x1a8
Here are some of these lines:
93489:
93565 <... rt_sigtimedwait resumed> 0x7f7cef7fd720, {tv_sec=0,
tv_nsec=0}, 8) = -1 EAGAIN (Resource temporarily unavailable)
93489:
93565 <... rt_sigtimedwait resumed> 0x7f7cef7fd720, {tv_sec=0,
tv_nsec=0}, 8) = -1 EAGAIN (Resource temporarily unavailable)
93558:
93564 ioctl(20, KVM_RUN, 0 <unfinished ...>
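Since many of the failing lines are the second half of a call that strace -f split into "<unfinished ...>" and "<... name resumed>" parts, one option is to stitch the halves back together per pid before parsing. The sketch below is only an illustration of that idea and makes no claim about how moonshine's scanner handles these lines; mergeSplitCalls is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	unfinishedMark = "<unfinished ...>"
	resumedMark    = "resumed>"
)

// mergeSplitCalls joins the two halves of calls split by strace -f.
// The merged line is emitted at the position of the resumed half.
// Resumed lines without a recorded first half are passed through unchanged.
func mergeSplitCalls(lines []string) []string {
	pending := make(map[string]string) // pid -> first half of the call
	var out []string
	for _, line := range lines {
		pid, rest, ok := strings.Cut(strings.TrimSpace(line), " ")
		if !ok {
			out = append(out, line)
			continue
		}
		rest = strings.TrimSpace(rest)
		switch {
		case strings.HasSuffix(rest, unfinishedMark):
			pending[pid] = strings.TrimSpace(strings.TrimSuffix(rest, unfinishedMark))
		case strings.HasPrefix(rest, "<... "):
			idx := strings.Index(rest, resumedMark)
			head, found := pending[pid]
			if idx < 0 || !found {
				out = append(out, line)
				continue
			}
			delete(pending, pid)
			tail := strings.TrimSpace(rest[idx+len(resumedMark):])
			sep := " "
			if strings.HasPrefix(tail, ")") || strings.HasSuffix(head, "(") {
				sep = "" // avoid a stray space next to a parenthesis
			}
			out = append(out, pid+" "+head+sep+tail)
		default:
			out = append(out, line)
		}
	}
	return out
}

func main() {
	trace := []string{
		"93564 ioctl(20, KVM_RUN, 0 <unfinished ...>",
		"93565 getpid() = 93565",
		"93564 <... ioctl resumed> ) = 0",
	}
	for _, l := range mergeSplitCalls(trace) {
		fmt.Println(l)
	}
}
```

The sample trace in main is made up for the demo, apart from the unfinished KVM_RUN line quoted above.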