Fwd: Adding Syzkaller Tool


shankarapailoor

Jul 30, 2018, 7:12:16 AM
to Dmitry Vyukov, syzkaller
++ Syzkaller mailing list

No problem!
---------- Forwarded message ----------
From: Dmitry Vyukov <dvy...@google.com>
Date: Mon, Jul 30, 2018 at 1:58 AM
Subject: Re: Adding Syzkaller Tool
To: shankarapailoor <shankar...@gmail.com>


Do you mind adding syzkaller mailing list to CC on these conversations?

On Mon, Jul 30, 2018 at 8:16 AM, shankarapailoor
<shankar...@gmail.com> wrote:
> Hi Dmitry,
>
> I've added more detailed instructions for building and running MoonShine in
> my repository. The step-by-step guide for collecting traces (LTP or
> selftests) is still in progress; however, I have provided a collection of
> ~350 traces that can be used to see how MoonShine works. A link to those can
> be found in the README.md.
>
> One thing to note is that this approach has been tested up to Syzkaller
> commit f48c20b8f9b2a6c26629f11cc15e1c9c316572c8 (May 19, 2018)
> and so I don't know yet if it will work with more recent versions.
>
> Regards,
> Shankara
>
> On Fri, Jul 27, 2018 at 3:48 AM, shankarapailoor <shankar...@gmail.com>
> wrote:
>>
>> Thanks for letting me know Dmitry. I will add instructions to my
>> github repository in the next day or so and email you once I'm done.
>>
>> On Fri, Jul 27, 2018 at 2:32 AM, Dmitry Vyukov <dvy...@google.com> wrote:
>> > On Fri, Jul 27, 2018 at 5:02 AM, shankarapailoor
>> > <shankar...@gmail.com> wrote:
>> >> Hi Dmitry,
>> >>
>> >> I'm not sure if you remember, but we talked last year about our research
>> >> project MoonShine which generates seed programs for OS fuzzers from traces
>> >> of real programs. Our prototype implementation generates a starting corpus
>> >> for Syzkaller and can be found here:
>> >>
>> >> https://github.com/shankarapailoor/moonshine.
>> >>
>> >> We have found over 17 new bugs by collecting traces from LTP, Posix
>> >> testsuite, Glibc tests, kernel selftest programs, etc. We also perform
>> >> call level distillation to generate compact, yet diverse seeds.
>> >>
>> >> Could we add our work as a Syzkaller "tool"? If not, then would you be
>> >> willing to include a link to our repository so that others who may be
>> >> interested can take a look?
>> >
>> >
>> > Hi Shankara,
>> >
>> > Yes, I remember. It seems that you got some great results with it.
>> > I don't mind adding it to tools/ dir. But first I would like to try it
>> > and see how it actually works, and for this we need step-by-step usage
>> > instructions. I think that would be useful in either case. Can you
>> > provide such instructions for, say, selftests?
>>
>>
>>
>> --
>> Regards,
>> Shankara Pailoor
>
>
>
>
> --
> Regards,
> Shankara Pailoor



--
Regards,
Shankara Pailoor

Dmitry Vyukov

Jul 30, 2018, 7:24:08 AM
to shankarapailoor, syzkaller
Great!

The capability of transforming C tests into syzkaller programs is
definitely very interesting and useful.

I've tried to follow the instructions and here are some comments.

1. I did not find distill.json anywhere.

2. moonshine does not create the serialized dir, so it fails with:

panic: failed to output file: [open serialized/ltp_accept4_011: no
such file or directory]

3. It seems to produce too much output by default. I understand that
it may be useful to get more detailed information about a particular
program, so I think it needs a verbosity flag (-v). By default it
should print just a few lines about overall progress and results. On
higher verbosity levels it can print some per-program info.

4. I've noticed some common calls in each program that come from libc:

r0 = open(&(0x7f0000000000)='/etc/ld.so.cache\x00', 0x80000, 0x0)
r1 = open(&(0x7f0000000055)='/lib/x86_64-linux-gnu/libc.so.6\x00', 0x80000, 0x0)

I think we need to strip this common prefix. The programs will be
shorter, faster and lots of them will also run on gVisor, freebsd,
akaros, etc.

5. I've noticed lots of mmap's that look excessive. Today the executor
always maps the whole 512MB data region, so there is no need to do mmaps
just to map memory for arguments. However, if an fd is mapped into
memory, that of course needs to be preserved.

6. Tests don't build:

$ go test -short ./...
# github.com/shankarapailoor/moonshine/tracker
tracker/memory_tracker.go:205: Errorf format %d reads arg #2, but call
has only 1 arg
# github.com/shankarapailoor/moonshine/distiller
distiller/distiller-metadata.go:152: Printf format %v arg err.Error is
a func value, not called
# github.com/shankarapailoor/moonshine
./main.go:88: Fprintln arg list ends with redundant newline
./main.go:92: Fprintln arg list ends with redundant newline

I think we need some tests that check that it can parse canned strace
output and produce a valid syzkaller program. For a few cases it can
make sense to check input/output verbatim.

7. How is distillation different from the fuzzer minimization procedure?
At first glance they do the same: take a large program and produce one
or several smaller programs preserving coverage.
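
The prefix stripping suggested in item 4 could be sketched as below. This is only an illustration under stated assumptions: `loaderPaths`, `isLoaderOpen` and `stripLoaderPrefix` are hypothetical names, the path list is just the two files quoted above, and later uses of the dropped results (e.g. a `close(r0)`) are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// loaderPaths lists files the dynamic loader opens before main() runs.
// The list is an assumption for illustration; a real tool would extend it.
var loaderPaths = []string{
	"/etc/ld.so.cache",
	"/lib/x86_64-linux-gnu/libc.so.6",
}

// isLoaderOpen reports whether call is an open() of one of loaderPaths.
func isLoaderOpen(call string) bool {
	if !strings.Contains(call, "open(") {
		return false
	}
	for _, p := range loaderPaths {
		if strings.Contains(call, p) {
			return true
		}
	}
	return false
}

// stripLoaderPrefix drops the leading run of loader-related open calls
// and keeps everything from the first unrelated call onwards.
func stripLoaderPrefix(calls []string) []string {
	i := 0
	for i < len(calls) && isLoaderOpen(calls[i]) {
		i++
	}
	return calls[i:]
}

func main() {
	prog := []string{
		`r0 = open(&(0x7f0000000000)='/etc/ld.so.cache\x00', 0x80000, 0x0)`,
		`r1 = open(&(0x7f0000000055)='/lib/x86_64-linux-gnu/libc.so.6\x00', 0x80000, 0x0)`,
		`r2 = socket(0x2, 0x1, 0x0)`,
	}
	for _, c := range stripLoaderPrefix(prog) {
		fmt.Println(c)
	}
}
```

Only the leading run is removed, so an identical open that appears later in the test body is kept.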

Thanks



On Mon, Jul 30, 2018 at 1:12 PM, shankarapailoor
<shankar...@gmail.com> wrote:
> ++ Syzkaller mailing list
>
> No problem!
> ---------- Forwarded message ----------
> From: Dmitry Vyukov
> Date: Mon, Jul 30, 2018 at 1:58 AM
> Subject: Re: Adding Syzkaller Tool
> To: shankarapailoor
>
>

shankarapailoor

Jul 30, 2018, 8:12:45 AM
to Dmitry Vyukov, syzkaller
Sorry forgot mailing list.

On Mon, Jul 30, 2018 at 5:00 AM, shankarapailoor <shankar...@gmail.com> wrote:
Thanks for the feedback. I've also made some updates since Friday to the source tree. Please pull again.

> 1. I did not find distill.json anywhere.

It is under getting-started/ directory. 

> 2. moonshine does not create the serialized dir, so it fails with

Also should be fixed. Now it is 'deserialized/'.

> 3. It seems to produce too much output by default. I understand that
> it may be useful to get more detailed information about a particular
> program, so I think it needs a verbosity flag (-v). By default it
> should print just a few lines about overall progress and results. On
> higher verbosity levels it can print some per-program info.

Yeah I agree. Will add a more flexible logger similar to Syzkaller.

> 4. I've noticed some common calls in each program that come from libc:

> r0 = open(&(0x7f0000000000)='/etc/ld.so.cache\x00', 0x80000, 0x0)
> r1 = open(&(0x7f0000000055)='/lib/x86_64-linux-gnu/libc.so.6\x00', 0x80000, 0x0)
> I think we need to strip this common prefix. The programs will be shorter, faster and lots of them will also run on gVisor, freebsd, akaros, etc.

Agreed. 

> 5. I've noticed lots of mmap's that look excessive. Today the executor always maps the whole 512MB data region, so there is no need to do mmaps just to map memory for arguments. However, if an fd is mapped into memory, that of course needs to be preserved.

Good point. Our static memory tracker validates that mmap'd regions cannot exceed that amount because that will create crashes at runtime, but we might be wasting some fuzzing time by including uninteresting ones. However, don't you think it might be interesting for complicated mlock->mmap->mremap? Perhaps limit those to a fixed (but small) page size and do best-effort tracking of downstream dependencies?


> 6. I think we need some tests that check that it can parse canned strace
> output and produce a valid syzkaller program. For a few cases it can
> make sense to check input/output verbatim.

Yes I agree. Right now I've just been converting the deserialized programs (seen in 'deserialized/') and running them under strace (without coverage) to make sure the same calls appear with the same return values. I plan to add some automated tests.

> 7. How is distillation different from the fuzzer minimization procedure?
> At first glance they do the same: take a large program and produce one
> or several smaller programs preserving coverage.

I think there are two main benefits. (Apologies if I'm wrong)

1) It allows us to serialize very large traces from real programs into compact Syzkaller ones. A trace of even a few seconds of (say Chrome?) is far too large to even be serialized into a valid Syzkaller program. Distillation allows one to create compact and interesting seeds upfront.

2) The existing fuzzer minimization works well because Syzkaller naturally generates compact programs; however, in my experience this isn't able to scale effectively for programs with even 100 calls (such as these). For example, it seems that Syzkaller tries to minimize the starting seeds upfront, correct? I remember providing only a handful of these converted programs, and the minimization was still going for a day and a half before I killed it.
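
To make "keep only calls that contribute coverage" concrete, here is a rough sketch of a single greedy pass. It is an assumption-laden illustration, not MoonShine's algorithm: the `call` type and `distill` function are hypothetical, coverage is modelled as a plain list of PCs, and dependencies between calls (for example the open that produced an fd) are ignored.

```go
package main

import "fmt"

// call is a traced system call together with the coverage (e.g. kernel
// PCs) observed while it ran. Both fields are illustrative.
type call struct {
	name  string
	cover []uint64
}

// distill keeps a call only if it contributes at least one PC that no
// earlier kept call has already contributed.
func distill(calls []call) []call {
	seen := map[uint64]bool{}
	var kept []call
	for _, c := range calls {
		isNew := false
		for _, pc := range c.cover {
			if !seen[pc] {
				seen[pc] = true
				isNew = true
			}
		}
		if isNew {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	trace := []call{
		{"open", []uint64{1, 2}},
		{"read", []uint64{2}},
		{"write", []uint64{2, 3}},
	}
	for _, c := range distill(trace) {
		fmt.Println(c.name)
	}
}
```

A single linear pass like this stays cheap even for traces with tens of thousands of calls, which is the scaling concern raised in point 2.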




--
Regards,
Shankara Pailoor

Dmitry Vyukov

Jul 31, 2018, 5:23:45 AM
to shankarapailoor, syzkaller
On Mon, Jul 30, 2018 at 2:12 PM, shankarapailoor
<shankar...@gmail.com> wrote:
> Sorry forgot mailing list.
>
> On Mon, Jul 30, 2018 at 5:00 AM, shankarapailoor <shankar...@gmail.com>
> wrote:
>>
>> Thanks for the feedback. I've also made some updates since Friday to the
>> source tree. Please pull again.
>>
>> > 1. I did not find distill.json anywhere.

Right, I did not pull. Found it now.

>> It is under getting-started/ directory.
>>
>> > 2. moonshine does not create the serialized dir, so it fails with
>>
>> Also should be fixed. Now it is 'deserialized/'.

Still crashes for me:

moonshine$ bin/moonshine -dir sampletraces -distill \
    ./getting-started/distill.json
Total Number of Files: 346
Parsing File 1/346: ltp_accept4_01
Parsing File 2/346: ltp_accept_01
...
Parsing File 345/346: ltp_utimensat01
Parsing File 346/346: ltp_writev05
Total number of seeds: 43438
Performing implicit distillation with 43438 calls contributing coverage
TOTAL HEAVY HITTERS: 639
Total Distilled Progs: 391
Average Program Length: 10
Total Contributing calls: 639 out of 43438, in 388
implicitly-distilled programs that consist of: 3250 calls
panic: failed to output file: [open deserialized/distill0: no such
file or directory]
goroutine 1 [running]:
github.com/shankarapailoor/moonshine/logging.Failf(0x603787, 0x19,
0xc4449ac850, 0x1, 0x1)
src/github.com/shankarapailoor/moonshine/logging/logger.go:6 +0xfe
main.ParseTraces(0x1a00040, 0x5, 0x5c0b41, 0x5)
src/github.com/shankarapailoor/moonshine/main.go:126 +0xda8
main.main()
src/github.com/shankarapailoor/moonshine/main.go:45 +0x1a8



>> > 3. It seems to produce too much output by default. I understand that
>> > it may be useful to get more detailed information about a particular
>> > program, so I think it needs a verbosity flag (-v). By default it
>> > should print just a few lines about overall progress and results. On
>> > higher verbosity levels it can print some per-program info.
>>
>> Yeah I agree. Will add a more flexible logger similar to Syzkaller.

ack

>> > 4. I've noticed some common calls in each program that come from libc:
>>
>> > r0 = open(&(0x7f0000000000)='/etc/ld.so.cache\x00', 0x80000, 0x0)
>> > r1 = open(&(0x7f0000000055)='/lib/x86_64-linux-gnu/libc.so.6\x00',
>> > 0x80000, 0x0)
>> > I think we need to strip this common prefix. The programs will be
>> > shorter, faster and lots of them will also run on gVisor, freebsd, akaros,
>> > etc.
>>
>> Agreed.
>>
>> > 5. I've noticed lots of mmap's that look excessive. Today the executor
>> > always maps the whole 512MB data region, so there is no need to do mmaps
>> > just to map memory for arguments. However, if an fd is mapped into
>> > memory, that of course needs to be preserved.
>>
>> Good point. Our static memory tracker validates that mmap'd regions cannot
>> exceed that amount because that will create crashes at runtime, but we might
>> be wasting some fuzzing time by including uninteresting ones. However, don't
>> you think it might be interesting for complicated mlock->mmap->mremap?
>> Perhaps limit those to a fixed (but small) page size and do best-effort
>> tracking of downstream dependencies?

Yes, I agree that some mmaps are useful. Perhaps we could also look
for any syscalls accepting vma type (e.g. mlock, mremap).


>> > 6. I think we need some tests that check that it can parse canned strace
>> > output and produce a valid syzkaller program. For a few cases it can
>> > make sense to check input/output verbatim.
>>
>> Yes I agree. Right now I've just been converting the deserialized programs
>> (seen in 'deserialized/') and running them under strace (without coverage)
>> to make sure the same calls appear with the same return values. I plan to
>> add some automated tests.
>>
>> > 7. How is distillation different from the fuzzer minimization procedure?
>> > At first glance they do the same: take a large program and produce one
>> > or several smaller programs preserving coverage.
>>
>> I think there are two main benefits. (Apologies if I'm wrong)
>>
>> 1) It allows us to serialize very large traces from real programs into
>> compact Syzkaller ones. A trace of even a few seconds of (say Chrome?) is
>> far too large to even be serialized into a valid Syzkaller program.
>> Distillation allows one to create compact and interesting seeds upfront.
>>
>> 2) The existing fuzzer minimization works well because Syzkaller naturally
>> generates compact programs; however, in my experience this isn't able to
>> scale effectively for programs with even 100 calls (such as these). For
>> example, it seems that Syzkaller tries to minimize the starting seeds
>> upfront, correct? I remember providing only a handful of these converted
>> programs, and the minimization was still going for a day and a half before I
>> killed it.

This makes sense. Yes, fuzzer minimization can't handle huge programs.


I am trying to run it on a syz-manager trace, using the current strace
head (cb3a15fe21b49c4b11f307882a828bfe99a37c4e).

The trace produces a bunch of parsing errors:

error: syntax error
Error parsing line: 93489 <... newfstatat resumed> 0xc4204ab3b8, 0) =
-1 ENOENT (No such file or directory)
error: syntax error
Error parsing line: 93503 <... read resumed> 0xc420787000, 4096) = -1
EAGAIN (Resource temporarily unavailable)
error: syntax error
Error parsing line: 93538 <... read resumed> 0xc420787080, 3968) = -1
EAGAIN (Resource temporarily unavailable)
error: syntax error
Error parsing line: 93526 <... read resumed> 0xc4207870b4, 3916) = -1
EAGAIN (Resource temporarily unavailable)
error: syntax error
Error parsing line: 93489 <... accept4 resumed> 0xc423092bd8, [112],
SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily
unavailable)


and then it crashes with:

error: syntax error
Error parsing line: 93558 ioctl(0, SNDCTL_TMR_START or TCSETS,
{c_iflags=0, c_oflags=0x1, c_cflags=0x30, c_lflags=0, c_line=0,
c_cc[VMIN]=1, c_cc[VTIME]=0,
c_cc="\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"})
= -1 ENOTTY (Inappropriate ioctl for device)
panic: Failed to parse line: [93558 ioctl(0, SNDCTL_TMR_START or
TCSETS, {c_iflags=0, c_oflags=0x1, c_cflags=0x30, c_lflags=0,
c_line=0, c_cc[VMIN]=1, c_cc[VTIME]=0,
c_cc="\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"})
= -1 ENOTTY (Inappropriate ioctl for device)]

goroutine 1 [running]:
github.com/shankarapailoor/moonshine/logging.Failf(0x602288, 0x19,
0xc427588ab0, 0x1, 0x1)
src/github.com/shankarapailoor/moonshine/logging/logger.go:6 +0xfe
github.com/shankarapailoor/moonshine/scanner.parseLoop(0xc4200adbf8,
0xc43d952000)
src/github.com/shankarapailoor/moonshine/scanner/scanner.go:69 +0x3d2
github.com/shankarapailoor/moonshine/scanner.Parse(0xc4201040a0, 0x12,
0xc4200adec8)
src/github.com/shankarapailoor/moonshine/scanner/scanner.go:93 +0x1e0
main.ParseTraces(0x1a00040, 0x5, 0x5c0b41, 0x5)
src/github.com/shankarapailoor/moonshine/main.go:78 +0x7cd
main.main()
src/github.com/shankarapailoor/moonshine/main.go:45 +0x1a8


Here are some of these lines:

93489:
93565 <... rt_sigtimedwait resumed> 0x7f7cef7fd720, {tv_sec=0,
tv_nsec=0}, 8) = -1 EAGAIN (Resource temporarily unavailable)

93489:
93565 <... rt_sigtimedwait resumed> 0x7f7cef7fd720, {tv_sec=0,
tv_nsec=0}, 8) = -1 EAGAIN (Resource temporarily unavailable)

93558:
93564 ioctl(20, KVM_RUN, 0 <unfinished ...>
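
The failing lines are the two halves that `strace -f` prints when a call from one thread is interrupted by output from another: a `<unfinished ...>` line and a later `<... NAME resumed>` line with the same pid. One way a parser could cope is to stitch the halves together before parsing. The sketch below assumes the `-o`/`-f` layout seen above (pid, a single space, then the call) and uses a hypothetical function `mergeResumed`; it is not the MoonShine scanner.

```go
package main

import (
	"fmt"
	"strings"
)

const unfinished = " <unfinished ...>"

// mergeResumed joins each "<unfinished ...>" line with the matching
// "<... NAME resumed>" line of the same pid. The merged call is emitted
// at the position of the resumed half; all other lines pass through.
func mergeResumed(lines []string) []string {
	pending := map[string]string{} // pid -> call text awaiting its second half
	var out []string
	for _, line := range lines {
		pid, rest, ok := strings.Cut(line, " ")
		if !ok {
			out = append(out, line)
			continue
		}
		switch {
		case strings.HasSuffix(rest, unfinished):
			pending[pid] = strings.TrimSuffix(rest, unfinished)
		case strings.HasPrefix(rest, "<... "):
			_, tail, found := strings.Cut(rest, " resumed>")
			prefix, have := pending[pid]
			if !found || !have {
				out = append(out, line) // orphaned half: keep it verbatim
				continue
			}
			delete(pending, pid)
			out = append(out, pid+" "+prefix+tail)
		default:
			out = append(out, line)
		}
	}
	return out
}

func main() {
	trace := []string{
		"93503 read(5, <unfinished ...>",
		"93489 close(3) = 0",
		"93503 <... read resumed> 0xc420787000, 4096) = -1 EAGAIN (Resource temporarily unavailable)",
	}
	for _, l := range mergeResumed(trace) {
		fmt.Println(l)
	}
}
```

Calls that never resume (such as a blocked `KVM_RUN` at the end of a trace) stay in `pending` and are dropped by this sketch; a real tool would need to decide whether to keep them.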

Dmitry Vyukov

Jul 31, 2018, 5:39:38 AM
to shankarapailoor, syzkaller
How was implicit-dependencies/implicit_dependencies.json generated? We
need to be able to regenerate it because the set of calls is changing,
and there are other OSes, and we can improve the analysis logic. Or
ideally just extract this info on the fly if it does not take too long.

Overall, I think that the tool needs to be split into 2 separate tools:
one converts strace to syzkaller programs, and another minimizes
syzkaller programs. This way it will be much easier to understand and
review. These parts also have vastly different dependencies and deal
with different aspects. We can start with merging the part that
converts strace to syzkaller programs.

Also, the code needs some additional work. I've run gometalinter with
the current syzkaller config (which is already somewhat relaxed) and
this gives a good start:
https://gist.githubusercontent.com/dvyukov/4aba3220e29913dd2f25620da46e4a80/raw/b01bf54c17f9d5dceca63095912ee182d3d11064/gistfile1.txt

shankarapailoor

Jul 31, 2018, 6:49:11 AM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

>Still crashes for me:
Man that's embarrassing. I had a few others run through the instructions end-to-end and it worked for them so my apologies.

>How was implicit-dependencies/implicit_dependencies.json generated?
This was generated with Smatch, a static analysis framework. I will put up instructions on how to reproduce the generation in a couple days time. There are a number of ways this analysis can be improved. 

>Parsing errors :(
Could you do me a favor and upload a large trace to Google drive so I can systematically go through and deal with these errors?

>Also, the code needs some additional work. I've run gometalinter with the current syzkaller config (which is already somewhat relaxed) and this gives a good start:
Very nice! I'll take a look at gometalinter along with what you have provided. 

>Overall, I think that the tool needs to be split into 2 separate tools:
>one converts strace to syzkaller programs, and another minimizes
>syzkaller programs.

Makes sense. I think that this should be feasible.

Regards,
Shankara


--
Regards,
Shankara Pailoor

shankarapailoor

Jul 31, 2018, 1:48:35 PM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

From our above conversation it seems like the steps going forward are the following:

1. Add unit tests for the conversion.
2. Separate the program minimization components from the strace->syz conversion.
3. Use gometalinter to clean up the code.
4. Support a verbosity flag for logging.

The strace->syzkaller conversion definitely needs to be more robust so I will start with adding the tests.  

Regards,
Shankara

On Tue, Jul 31, 2018 at 4:05 AM, Dmitry Vyukov <dvy...@google.com> wrote:
On Tue, Jul 31, 2018 at 12:49 PM, shankarapailoor
<shankar...@gmail.com> wrote:
> Hi Dmitry,
>
>>Still crashes for me:
> Man that's embarrassing. I had a few others run through the instructions
> end-to-end and it worked for them so my apologies.
>
>>How was implicit-dependencies/implicit_dependencies.json generated?
> This was generated with Smatch, a static analysis framework. I will put up
> instructions on how to reproduce the generation in a couple days time. There
> are a number of ways this analysis can be improved.
>
>>Parsing errors :(
> Could you do me a favor and upload a large trace to Google drive so I can
> systematically go through and deal with these errors?


I've attached a smaller file. This one crashes with:

panic: Failed to eval flag: SIGINT

But what I am doing is simply:

$ /src/strace/install/bin/strace -o tracefile -s 65500 -v -xx -f \
    syzkaller/bin/syz-manager -config my.cfg -v 1 -debug

I would expect that this will produce similar traces if you run it locally.



--
Regards,
Shankara Pailoor

Dmitry Vyukov

Aug 1, 2018, 5:48:30 AM
to shankarapailoor, syzkaller
On Tue, Jul 31, 2018 at 7:48 PM, shankarapailoor
<shankar...@gmail.com> wrote:
> Hi Dmitry,
>
> From our above conversation it seems like the steps going forward are the
> following:
>
> 1. Add unit tests for the conversion.
> 2. Separate the program minimization components from the strace->syz
> conversion.
> 3. Use gometalinter to clean up the code.
> 4. Support a verbosity flag for logging.
>
> The strace->syzkaller conversion definitely needs to be more robust so I
> will start with adding the tests.

Yes, sounds good to me.
Tests and cleanup can be done in parts too, i.e. the
strace->syzkaller part first. We can do this part end-to-end first and
commit the code, and then switch to the distillation part. It may be more
productive this way.
Also I did not really look at the code in depth. I don't want to nitpick
too much until gometalinter warnings are fixed, and also it will be
easier to do deeper review once the tool is split into 2 parts.

shankarapailoor

Aug 21, 2018, 9:23:07 PM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

Sorry for the late response; been away for a while. I've grouped the trace->syz components into one stand-alone tool which is in the refactor branch here: https://github.com/shankarapailoor/moonshine/tree/refactor under the trace2syz/ directory. I've worked on the following items:

1. Cleanup
       The code under trace2syz/ shouldn't have any gometalinter warnings and I've been using the same metalinter config as Syzkaller.

2. Tests
       I've been adding tests (slowly) but mainly focusing on the parsing of the traces and the conversion. There are still some end-to-end tests I would like to add like "does the system call trace of the Syzkaller program match the original trace?". The latter might take some more time, but I'm currently doing this manually.

3. Logging
      I've decided to just use the Syzkaller log package which does exactly what we want.

There are certainly a lot more improvements to be made, but I plan to maintain this tool long term. Once this part is ok'd, I'll switch over to refactoring the minimization components.

Let me know your thoughts!

Regards,
Shankara
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Sep 4, 2018, 12:50:28 AM
to shankarapailoor, syzkaller
Hi Shankara,

I should be looking at the refactor branch, right?

I can't build it:

moonshine$ go install ./trace2syz/
# github.com/shankarapailoor/moonshine/trace2syz/trace2syz
trace2syz/trace2syz/scanner.go:46:10: undefined: newStraceLexer
trace2syz/trace2syz/scanner.go:47:10: undefined: StraceParse

I've just checked out latest head d9be2d8a297d578bb365472f62ae9bce8e17751f.

I can't find this function anywhere in source:

moonshine$ find -name "*.go" -exec grep newStraceLexer {} \; -print
lex := newStraceLexer(scanner.Bytes())
./trace2syz/trace2syz/scanner.go



On Wed, Aug 22, 2018 at 3:22 AM, shankarapailoor

shankarapailoor

Sep 4, 2018, 12:53:24 AM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

This code gets generated when you call `make` in trace2syz/ directory.

Regards,
Shankara
--
Regards,
Shankara Pailoor

shankarapailoor

Sep 4, 2018, 12:54:31 AM
to Dmitry Vyukov, syzkaller
The instructions are the same as in the README, just without the minimization steps.
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Sep 4, 2018, 1:09:12 AM
to shankarapailoor, syzkaller
Can build it now.

I had to do the following so that generated files get proper relative
paths (prevents gometalinter warnings about missing files):

diff --git a/trace2syz/Makefile b/trace2syz/Makefile
index 56e52ec..9ce92d8 100644
--- a/trace2syz/Makefile
+++ b/trace2syz/Makefile
@@ -1,6 +1,6 @@
-default:
- ragel -Z -G2 -o trace2syz/lex.go trace2syz/straceLex.rl
- goyacc -o trace2syz/strace.go -p Strace trace2syz/strace.y
+adefault:
+ (cd trace2syz; ragel -Z -G2 -o lex.go straceLex.rl)
+ (cd trace2syz; goyacc -o strace.go -p Strace strace.y)
mkdir -p bin deserialized
go build -o ./bin/moonshine main.go
clean:


I've synched .gometalinter.json with current syzkaller version and
still see a bunch of warnings:

trace2syz$ gometalinter.v2 ./...
trace2syz/fileTracker.go:9:5:warning: var commonRootPaths is unused
(U1000) (megacheck)
trace2syz/fileTracker.go:9:5:warning: var commonRootPaths is unused
(U1000) (unused)
trace2syz/fileTracker.go:10:5:warning: var systemPaths is unused
(U1000) (megacheck)
trace2syz/fileTracker.go:10:5:warning: var systemPaths is unused
(U1000) (unused)
trace2syz/fileTracker.go:33:23:warning: func (*FileTracker).chdir is
unused (U1000) (megacheck)
trace2syz/fileTracker.go:33:23:warning: func (*FileTracker).chdir is
unused (U1000) (unused)
trace2syz/fileTracker.go:52:23:warning: func (*FileTracker).sanitize
is unused (U1000) (unused)
trace2syz/fileTracker.go:52:23:warning: func (*FileTracker).sanitize
is unused (U1000) (megacheck)
trace2syz/fileTracker.go:70:6:warning: func isSystemFile is unused
(U1000) (unused)
trace2syz/fileTracker.go:70:6:warning: func isSystemFile is unused
(U1000) (megacheck)
trace2syz/fileTracker.go:84:1:warning: isTempFile is unused (deadcode)
trace2syz/fileTracker.go:84:6:warning: func isTempFile is unused
(U1000) (megacheck)
trace2syz/fileTracker.go:84:6:warning: func isTempFile is unused
(U1000) (unused)
trace2syz/fileTracker.go:106:6:warning: func isAbsPath is unused
(U1000) (megacheck)
trace2syz/fileTracker.go:106:6:warning: func isAbsPath is unused
(U1000) (unused)
trace2syz/fileTracker.go:107:2:warning: should use 'return <expr>'
instead of 'if <expr> { return <bool> }; return <bool>' (S1008)
(gosimple)
trace2syz/fileTracker.go:107:2:warning: should use 'return <expr>'
instead of 'if <expr> { return <bool> }; return <bool>' (S1008)
(megacheck)
trace2syz/preprocessHooks.go:144::warning: declaration of "pair"
shadows declaration at trace2syz/util.go:3 (vetshadow)
trace2syz/scanner_test.go:139::warning: cyclomatic complexity 26 of
function TestParseLoopIrTypes() is high (> 24) (gocyclo)
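
The S1008 (gosimple) warning above is a purely mechanical rewrite. A sketch of what it asks for follows, assuming `isAbsPath` tests for a leading slash; the real body in fileTracker.go is not shown in this thread, so that detail is a guess.

```go
package main

import (
	"fmt"
	"strings"
)

// Form flagged by S1008:
//
//	if strings.HasPrefix(path, "/") {
//		return true
//	}
//	return false

// isAbsPath reports whether path starts at the filesystem root.
// The boolean expression is returned directly, as S1008 suggests.
func isAbsPath(path string) bool {
	return strings.HasPrefix(path, "/")
}

func main() {
	fmt.Println(isAbsPath("/etc/passwd"), isAbsPath("tmp/file"))
}
```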

shankarapailoor

Sep 4, 2018, 1:12:28 AM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

Sorry, most of these are in the fileTracker.go file, which is fairly new. I added it to support some newer validation but haven't finished it, and it isn't integrated into the main tool, which is why there are so many linter warnings.

Regards,
Shankara
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Sep 4, 2018, 1:25:55 AM
to shankarapailoor, syzkaller
Looking at the code I have quite a few comments for code style. I've
created https://github.com/google/syzkaller/pull/706 for review
purposes.

On Tue, Sep 4, 2018 at 7:12 AM, shankarapailoor

shankarapailoor

Sep 4, 2018, 1:27:41 AM
to Dmitry Vyukov, syzkaller
Ok thanks for letting me know. It is a bit late for me so I will probably respond tomorrow.
--
Regards,
Shankara Pailoor

shankarapailoor

Oct 8, 2018, 9:30:55 PM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

Sorry for the super late reply. I have been quite busy with other tasks. I have created a new github repo for the strace->syz component just because it can be completely stand alone. How should I update the Pull Request? For now, I have moved this component into its own project which you can find here: https://github.com/shankarapailoor/trace2syz

I refactored this component into two packages: parser and proggen. parser just converts strace output to an intermediate representation and proggen converts the IR to syzkaller programs. 

I expect it needs more work before you accept but hopefully there are no more style issues.

Regards,
Shankara
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Oct 10, 2018, 3:37:22 PM
to shankarapailoor, syzkaller
On Tue, Oct 9, 2018 at 3:30 AM, shankarapailoor
<shankar...@gmail.com> wrote:
> Hi Dmitry,
>
> Sorry for the super late reply. I have been quite busy with other tasks. I
> have created a new github repo for the strace->syz component just because it
> can be completely stand alone. How should I update the Pull Request? For


I think you need to fork syzkaller repo on github, and then do this
work in a branch in your fork of syzkaller.
This will make several things simpler:
- you will be able to send pull requests to the main syzkaller repo
- you will be able to change syscall descriptions (like what we
discussed for capability consts, etc); these changes can be made in
separate commits that can also go into pull requests
- you will be able to lay out files/dirs in their final destination
in the syzkaller tree

I think the main tool should go into something like tools/syz-trace2syz
And the 2 packages into... let's start with tools/syz-trace2syz
subdirs, i.e. tools/syz-trace2syz/{parser,proggen}.

And the utils package needs to be renamed/merged into another package:
https://blog.golang.org/package-names
https://rakyll.org/style-packages/
https://twitter.com/davecheney/status/697551191752855552

And for file names the convention is cpp_style_with_underscores.go
rather than javaCamelCase.go.

shankarapailoor

Oct 10, 2018, 9:01:06 PM
to Dmitry Vyukov, syzkaller
Sounds good.

So as I mentioned before, the latest Syzkaller's program validation checks to see if any filepaths escape the sandbox. Currently, our tool doesn't handle this case. I have some thoughts on fixing this, but it will take more time. Are you ok with me sending a PR for code review without this feature or should I wait to finish handling this case before sending a PR? I am asking because it might be nice to get some feedback iteratively about what is there now. 


--
Regards,
Shankara Pailoor

Dmitry Vyukov

Oct 11, 2018, 4:01:46 AM
to shankarapailoor, syzkaller
On Thu, Oct 11, 2018 at 3:00 AM, shankarapailoor
<shankar...@gmail.com> wrote:
> Sounds good.
>
> So as I mentioned before, the latest Syzkaller's program validation checks
> to see if any filepaths escape the sandbox. Currently, our tool doesn't
> handle this case. I have some thoughts on fixing this, but it will take more
> time. Are you ok with me sending a PR for code review without this feature
> or should I wait to finish handling this case before sending a PR? I am
> asking because it might be nice to get some feedback iteratively about what
> is there now.


Yes, sure, send the pull request as early as possible. This will help
to shake out more things in parallel faster.

shankarapailoor

Oct 14, 2018, 11:32:54 AM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

I'm looking at memAlloc because I think it is better for us to use than the memory_tracker in trace2syz and I would like to clarify my understanding. If memAlloc doesn't have any space left will it free all previously allocated objects (bankruptcy)? Also, is it ok if I make this an exported struct?

Regards,
Shankara
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Oct 15, 2018, 5:55:59 AM
to shankarapailoor, syzkaller
On Sun, Oct 14, 2018 at 5:32 PM, shankarapailoor
<shankar...@gmail.com> wrote:
> Hi Dmitry,
>
> I'm looking at memAlloc because I think it is better for us to use than the
> memory_tracker in trace2syz and I would like to clarify my understanding. If
> memAlloc doesn't have any space left will it free all previously allocated
> objects (bankruptcy)? Also, is it ok if I make this an exported struct?


Yes, it would be good to reuse memAlloc.
And, yes, it declares bankruptcy on overflow. The idea is that memory for
arguments is mostly used only during the syscall itself, so if we used
some range a long time ago, then most likely it's fine to reuse it now.
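
As a rough illustration of the reset-on-overflow idea described above, here is a minimal sketch. It is not syzkaller's memAlloc: the `arenaAlloc` type, its fields, and the simple bump-pointer strategy are assumptions made for the example.

```go
package main

import "fmt"

// arenaAlloc is a hypothetical bump allocator over a fixed-size range.
// It is NOT syzkaller's memAlloc; it only illustrates "bankruptcy".
type arenaAlloc struct {
	size uint64 // total bytes available
	next uint64 // offset of the first unused byte
}

// alloc returns the offset of a region of n bytes. When the arena cannot
// fit the request, it declares bankruptcy: all earlier allocations are
// forgotten and allocation restarts from offset 0.
func (a *arenaAlloc) alloc(n uint64) (uint64, error) {
	if n > a.size {
		return 0, fmt.Errorf("request of %d bytes exceeds arena of %d bytes", n, a.size)
	}
	if a.next+n > a.size {
		a.next = 0 // bankruptcy: old ranges become reusable
	}
	addr := a.next
	a.next += n
	return addr, nil
}

func main() {
	a := &arenaAlloc{size: 64}
	for i := 0; i < 4; i++ {
		addr, err := a.alloc(24)
		if err != nil {
			panic(err)
		}
		fmt.Printf("allocation %d at offset %d\n", i, addr)
	}
}
```

The third request no longer fits, so it reuses the range handed out first, which is acceptable under the assumption that old argument memory is no longer live.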

Re exporting memAlloc: this feels like too low-level an interface to
export, and it uses randGen, which would then need to be exported as well.
I think a better long-term approach would be to export a higher-level
interface that is tailored to building custom programs and hides
details like memAlloc underneath. In fact, we already have something
similar: take a look at the Gen type in prog/target.go, which already has
an Alloc method that does what you want.
I am thinking about something along the following lines:

builder := prog.NewBuilder(target, randSource)
// While building the call we can use some helper methods, e.g.
// builder.Alloc, which allocates an addr.
builder.AddCall(call)
builder.AddCall(call)
builder.AddCall(call)
// This does AssignSizes, Sanitize for each call and Validate, so we
// don't need to export all these functions separately.
p, err := builder.Build()

It would be nice to eventually unify the Gen and Builder types.
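
To make the shape of that proposal concrete, here is a self-contained toy version. Everything in it is a stand-in: `Call`, `Prog`, `Builder`, and the passes run inside `Build` are hypothetical and far simpler than the real prog package, and the toy `NewBuilder` takes no target or random source.

```go
package main

import "fmt"

// Call and Prog are toy stand-ins, not the real prog.Call and prog.Prog.
type Call struct {
	Name string
	Addr uint64 // address of the call's argument memory
	Data []byte // argument payload
	Size uint64 // filled in by Build
}

type Prog struct {
	Calls []*Call
}

// Builder is a hypothetical helper that hides address allocation and the
// post-processing passes behind one small interface.
type Builder struct {
	calls []*Call
	next  uint64
}

func NewBuilder() *Builder { return &Builder{} }

// Alloc hands out a fresh address for n bytes of argument memory.
func (b *Builder) Alloc(n uint64) uint64 {
	addr := b.next
	b.next += n
	return addr
}

// AddCall appends a call to the program under construction.
func (b *Builder) AddCall(c *Call) { b.calls = append(b.calls, c) }

// Build runs the per-call passes and a final validation in one place, so
// callers never have to invoke them individually.
func (b *Builder) Build() (*Prog, error) {
	for i, c := range b.calls {
		// Stand-in for a size-assignment pass.
		c.Size = uint64(len(c.Data))
		// Stand-in for validation.
		if c.Name == "" {
			return nil, fmt.Errorf("call %d has no name", i)
		}
	}
	return &Prog{Calls: b.calls}, nil
}

func main() {
	b := NewBuilder()
	path := []byte("./file0\x00")
	b.AddCall(&Call{Name: "open", Addr: b.Alloc(uint64(len(path))), Data: path})
	p, err := b.Build()
	if err != nil {
		panic(err)
	}
	fmt.Printf("built %d call(s); first call size %d\n", len(p.Calls), p.Calls[0].Size)
}
```

The point of the design is that a trace converter only sees `Alloc`, `AddCall`, and `Build`, while the allocator and the individual passes stay unexported.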

shankarapailoor

Oct 15, 2018, 11:32:05 AM
to Dmitry Vyukov, syzkaller
I like the idea of a builder to do this.

On a side note, I had to delete the following line (https://github.com/google/syzkaller/blob/master/prog/target.go#L111) in order to evaluate strace flags. Will that cause a problem and if so, is there a better way?

Regards,
Shankara
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Oct 15, 2018, 12:52:30 PM
to shankarapailoor, syzkaller
On Mon, Oct 15, 2018 at 5:31 PM, shankarapailoor
<shankar...@gmail.com> wrote:
> I like the idea of a builder to do this.
>
> On a side note, I had to delete the following line
> (https://github.com/google/syzkaller/blob/master/prog/target.go#L111) in
> order to evaluate strace flags. Will that cause a problem and if so, is
> there a better way?

This map would consume memory in production use, so it would be better
to build your own const map like this:
https://github.com/google/syzkaller/blob/master/prog/target.go#L125
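
A tool-local map of that kind could look roughly like the sketch below. The `constValue` type and both helper functions are hypothetical names for illustration, and the flag values shown are the Linux x86-64 ones; a real converter would fill the list from the target's constants.

```go
package main

import (
	"fmt"
	"strings"
)

// constValue is a hypothetical (name, value) pair standing in for one
// entry of a target's constant list.
type constValue struct {
	Name  string
	Value uint64
}

// buildConstMap builds a name-to-value lookup table owned by the tool, so
// nothing has to be kept alive inside the shared target structure.
func buildConstMap(consts []constValue) map[string]uint64 {
	m := make(map[string]uint64, len(consts))
	for _, c := range consts {
		m[c.Name] = c.Value
	}
	return m
}

// evalFlags evaluates an strace-style flag expression such as
// "O_RDWR|O_CREAT" by OR-ing the values of the named constants.
func evalFlags(expr string, consts map[string]uint64) (uint64, error) {
	var val uint64
	for _, name := range strings.Split(expr, "|") {
		name = strings.TrimSpace(name)
		v, ok := consts[name]
		if !ok {
			return 0, fmt.Errorf("unknown constant %q", name)
		}
		val |= v
	}
	return val, nil
}

func main() {
	consts := buildConstMap([]constValue{
		{"O_RDONLY", 0x0},
		{"O_WRONLY", 0x1},
		{"O_RDWR", 0x2},
		{"O_CREAT", 0x40},
	})
	val, err := evalFlags("O_RDWR|O_CREAT", consts)
	if err != nil {
		panic(err)
	}
	fmt.Printf("O_RDWR|O_CREAT = %#x\n", val)
}
```

Returning an error for unknown names makes missing constants visible during conversion instead of silently producing a wrong value.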

shankarapailoor

Oct 22, 2018, 9:48:23 AM
to Dmitry Vyukov, syzk...@googlegroups.com
Hi Dmitry,

Could you let me know the high-level things we need to do with the tool going forward? I am currently trying to make things more robust by converting lots of traces of syzkaller programs back to syz programs. Funnily enough, most of the issues have been missing constants, where the constant value is not explicitly included in the constant map, e.g. IP6T_SO_SET_REPLACE.

Regards,
Shankara
--
Regards,
Shankara Pailoor

shankarapailoor

Oct 29, 2018, 10:10:41 AM
to Dmitry Vyukov, syzk...@googlegroups.com
Hi Dmitry,

I'm sure you are busy with other things, but could you let me know when you plan on reviewing the latest PR? Thanks!

Regards,
Shankara
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Oct 29, 2018, 10:23:01 AM
to shankarapailoor, syzkaller
I was at a conference and travelling for the past 2 weeks, sorry for
the delays. Just submitted a new review.

On Mon, Oct 29, 2018 at 3:10 PM, shankarapailoor

shankarapailoor

Oct 29, 2018, 11:09:06 AM
to Dmitry Vyukov, syzk...@googlegroups.com
No problem. Thanks for the review
--
Regards,
Shankara Pailoor

shankarapailoor

Nov 1, 2018, 2:08:33 AM
to Dmitry Vyukov, syzk...@googlegroups.com
Just letting you know that the PR is ready for a review whenever you have time. Thanks!
--
Regards,
Shankara Pailoor

shankarapailoor

Nov 1, 2018, 12:13:32 PM
to Dmitry Vyukov, syzk...@googlegroups.com
Hi Dmitry,

Thanks for the recent comments on the PR. What do you feel are the next steps moving forward? Back when I first submitted the PR, you mentioned that the tool had a long way to go. Do you still feel the same?

Regards,
Shankara
--
Regards,
Shankara Pailoor

Dmitry Vyukov

Nov 2, 2018, 9:12:16 AM
to shankarapailoor, syzk...@googlegroups.com
On Thu, Nov 1, 2018 at 5:13 PM, shankarapailoor
<shankar...@gmail.com> wrote:
> Hi Dmitry,
>
> Thanks for the recent comments on the PR. What do you feel are the next
> steps moving forward? Back when I first submitted you mentioned that the
> tool has a long way to go. Do you feel the same?

It has definitely become much better now. I still have not actually
reviewed the core logic, because I was distracted by all the small things:
constants, strace parsing and program generation mixed in the same
package, etc.
I will try to go over the intermediate_types.go interfaces and proggen
logic more closely now.

shankarapailoor

Nov 4, 2018, 6:02:13 PM
to Dmitry Vyukov, syzk...@googlegroups.com
Hi Dmitry,

I updated the PR with changes from the last review. When you have time please take a look. Thanks!

Regards,
Shankara
--
Regards,
Shankara Pailoor