[Proposal] Generating usages slice for enhanced testing/fuzzing


Prabhu Subramanian

Sep 10, 2023, 8:06:07 AM
to v8-dev
Hello,

I am a newbie here. Apologies if this is not the right group for the below message.

I am one of the developers of a static program analysis/slicer tool called atom (Apache-2.0). Atom uses the popular joern library (which internally uses Eclipse CDT for C/C++).

https://github.com/AppThreat/atom

With atom, it is possible to generate an intermediate representation for a project and then slice it in two modes: usages and data-flow. Both are described in the document below.

https://github.com/AppThreat/atom/blob/main/specification/docs/slices.md

We recently improved the performance of atom generation to support large codebases like v8. The usages slice can be produced in around 18 minutes using the commands below.

## Prerequisites

- Ensure Java >= 17 is installed.
- Download atom from https://github.com/AppThreat/atom/releases

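```shell
# Unpack the release archive and switch to the directory containing the launcher
unzip atom.zip
cd atom-1.0.0/bin

# Generate the atom (app.atom) and the usages slice (usages.json) with a 40 GB JVM heap
./atom -J-Xms40g -J-Xmx40g usages --slice-outfile usages.json -o app.atom --language c <path to v8>/src
```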

## Proposal

The information in the usages slice, such as locations, signature, and type, can be used to improve testing and fuzzing of projects like v8. I am unsure if this is an area actively explored here, but we would love to discuss further if this is useful.

For convenience, the link below points to a gzipped copy of the usages slice JSON generated today.


Best,
Prabhu

Jakob Kummerow

Sep 11, 2023, 10:50:53 AM
to v8-...@googlegroups.com
On Sun, Sep 10, 2023 at 2:06 PM Prabhu Subramanian <pra...@appthreat.dev> wrote:
> The information in the usages slice, such as locations, signature, and type can be used to improve testing and fuzzing of projects like v8.

Can you be a bit more specific? How exactly can this information be used for better testing or fuzzing?

> I am unsure if this is an area actively explored here,

It certainly is.

> but we would love to discuss further if this is useful.

Note that a lot of the "interesting" things that V8 does have to do with JIT code generation and custom heap management, so usually tools that rely on C++ code analysis can only gain very limited insight into everything that's happening.
If you believe that your analysis tool does provide useful input for fuzzers, then one thing you could do is run your own fuzzer over V8, and submit any relevant issues it discovers to the Chrome VRP to make it worth your while.
 

Prabhu Subramanian

Sep 11, 2023, 5:24:06 PM
to v8-...@googlegroups.com
Thanks, Jakob.

With the usages slice, the idea I had in mind was to identify hotspots and then compare them against test coverage to find any gaps. With the data-flow slice, the idea is to precompute paths based on criteria, such as the use of a particular memory or Unicode operation, and then guide the fuzzer to test those particular flows.
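
As a rough illustration of the hotspot idea, the sketch below counts call sites per file in a usages slice and marks which of those files appear in a coverage list. The field names (`objectSlices`, `fileName`, `usages`, `invokedCalls`, `argToCalls`) are assumed from the slices document linked earlier and may not match the real output exactly. The `rank_hotspots` helper, the file names, and the covered-file set are made up for illustration and are not real v8 data.

```python
from collections import Counter


def rank_hotspots(usages_slice, covered_files):
    """Rank files by number of call sites and note whether each is covered.

    ASSUMPTION: the slice is a dict with an "objectSlices" list whose entries
    carry "fileName" and a "usages" list with "invokedCalls"/"argToCalls".
    """
    calls_per_file = Counter()
    for obj in usages_slice.get("objectSlices", []):
        file_name = obj.get("fileName", "<unknown>")
        for usage in obj.get("usages", []):
            calls_per_file[file_name] += len(usage.get("invokedCalls", []))
            calls_per_file[file_name] += len(usage.get("argToCalls", []))
    # Busiest files first; the third field says whether tests touch the file.
    return [
        (file_name, count, file_name in covered_files)
        for file_name, count in calls_per_file.most_common()
    ]


# Toy input standing in for a parsed usages.json (hypothetical values).
sample_slice = {
    "objectSlices": [
        {"fileName": "heap_example.cc",
         "usages": [{"invokedCalls": [{}, {}], "argToCalls": [{}]}]},
        {"fileName": "unicode_example.cc",
         "usages": [{"invokedCalls": [{}], "argToCalls": []}]},
        {"fileName": "heap_example.cc",
         "usages": [{"invokedCalls": [{}]}]},
    ]
}
covered = {"unicode_example.cc"}  # hypothetical list from a coverage report

hotspots = rank_hotspots(sample_slice, covered)
gaps = [name for name, _count, is_covered in hotspots if not is_covered]
print(hotspots)
print(gaps)
```

In this toy input, `heap_example.cc` has the most call sites and no coverage, so it is the first candidate for new tests. A real run would load `usages.json` with `json.load` and take the covered-file set from whatever coverage tooling is in use.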

I will spend more time with the codebase to produce a working PoC and come back with an update.
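
The data-flow idea mentioned above could start from something like the following sketch, which keeps only the flows that touch an operation of interest and reports where each one ends, so a fuzzer could be pointed at those locations. The flow representation here (a list of steps with `code`, `file`, and `line`) is hypothetical and is not atom's actual data-flow slice schema. The `select_flows` helper, the keywords, and the sample flows are likewise invented for illustration.

```python
def select_flows(flows, keywords):
    """Return the flows in which at least one step mentions a keyword.

    ASSUMPTION: each flow is a list of dicts with "code", "file", "line";
    this is a simplified stand-in for a real data-flow slice.
    """
    selected = []
    for flow in flows:
        if any(k in step.get("code", "") for step in flow for k in keywords):
            selected.append(flow)
    return selected


# Hypothetical flows; a real run would derive these from the data-flow slice.
sample_flows = [
    [
        {"code": "len = ReadLength(input)", "file": "parser_example.cc", "line": 10},
        {"code": "memcpy(dst, src, len)", "file": "buffer_example.cc", "line": 42},
    ],
    [
        {"code": "x = a + b", "file": "math_example.cc", "line": 7},
    ],
]

selected = select_flows(sample_flows, keywords=("memcpy", "Utf8"))
# The final step of each selected flow is treated as the location to target.
targets = [(flow[-1]["file"], flow[-1]["line"]) for flow in selected]
print(targets)
```

Plain substring matching is the simplest possible criterion; matching on resolved call names or types from the slice would likely be more precise.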

Best,
Prabhu

