[llvm-dev] Building LLVM's fuzzers

662 views
Skip to first unread message

Justin Bogner via llvm-dev

unread,
Aug 24, 2017, 2:30:05 PM8/24/17
to llvm-dev
(kcc, george: sorry for the re-send, the first was from a non-list email
address)

My configuration for building the fuzzers in the LLVM tree doesn't seem to
work any more (possibly as of moving libFuzzer to compiler-rt, but there
have been a few other changes in the last week or so that may be related).

I'm building with a fresh top-of-tree clang and setting
-DLLVM_USE_SANITIZER=Address and -DLLVM_USE_SANITIZE_COVERAGE=On, which
was working before:

% cmake -GNinja \
-DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On \
-DLLVM_ENABLE_WERROR=On \
-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On \
-DCMAKE_C_COMPILER=$HOME/llvm-lkgc/bin/clang \
$HOME/code/llvm-src

But when I run any of the fuzzers, it looks like the sanitizer coverage
hasn't been set up correctly:

% ./bin/llvm-as-fuzzer 2017-08-24 11:14:33
INFO: Seed: 4089166883
INFO: Loaded 1 modules (50607 guards): 50607 [0x10e14ef80, 0x10e18063c),
INFO: Loaded 1 PC tables (0 PCs): 0 [0x10e2870a8,0x10e2870a8),
ERROR: The size of coverage PC tables does not match the number of instrumented PCs. This might be a bug in the compiler, please contact the libFuzzer developers.

From the build logs, it looks like we're now building objects with these
sanitizer flags:

-fsanitize=address
-fsanitize-address-use-after-scope
-fsanitize=fuzzer-no-link

We're then linking the fuzzer binaries with these:

-fsanitize=address
-fsanitize-address-use-after-scope
-fsanitize=fuzzer-no-link
-fsanitize=fuzzer

Any idea what's wrong or where to start looking?
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

George Karpenkov via llvm-dev

unread,
Aug 24, 2017, 2:32:07 PM8/24/17
to Justin Bogner, llvm-dev
Should -DCMAKE_CXX_COMPILER be also specified?

Justin Bogner via llvm-dev

unread,
Aug 24, 2017, 2:39:47 PM8/24/17
to George Karpenkov, llvm-dev
George Karpenkov <ekarp...@apple.com> writes:
> Should -DCMAKE_CXX_COMPILER be also specified?

CMake is smart enough to infer that from C_COMPILER:

% grep CMAKE_CXX_COMPILER CMakeCache.txt
CMAKE_CXX_COMPILER:FILEPATH=/Users/bogner/llvm-lkgc/bin/clang++

George Karpenkov via llvm-dev

unread,
Aug 24, 2017, 2:41:26 PM8/24/17
to Justin Bogner, llvm-dev
Right.
One difference is that I was previously building a debug build, trying release one now.

George Karpenkov via llvm-dev

unread,
Aug 24, 2017, 2:47:39 PM8/24/17
to Justin Bogner, llvm-dev
Indeed, I can reproduce in release build. Looking into it.
As a workaround, for now you could try to use debug build of fuzzers.

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 2:55:51 PM8/24/17
to George Karpenkov, llvm-dev, Mathew Morehouse
+mascasa@ FYI

Also, I am going to file a bug to implement some improvements in the way we build and use fuzz targets in LLVM.

And, take a look at the last night's trophies: https://bugs.chromium.org/p/oss-fuzz/issues/list?q=label:Proj-llvm

George Karpenkov via llvm-dev

unread,
Aug 24, 2017, 5:37:45 PM8/24/17
to Justin Bogner, llvm-dev
OK so with Kuba’s help I’ve found the error: with optimization, dead stripping of produced libraries is enabled,
which removes coverage instrumentation.

However, this has nothing to do with the move to compiler-rt, so I’m quite skeptical on whether it has worked
beforehand.

A trivial fix is to do:

diff --git a/cmake/modules/HandleLLVMOptions.cmake b/cmake/modules/HandleLLVMOptions.cmake
index 04596a6ff63..5465d8d95ba 100644
--- a/cmake/modules/HandleLLVMOptions.cmake
+++ b/cmake/modules/HandleLLVMOptions.cmake
@@ -665,6 +665,9 @@ if(LLVM_USE_SANITIZER)
endif()
if (LLVM_USE_SANITIZE_COVERAGE)
append("-fsanitize=fuzzer-no-link" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+
+ # Dead stripping messes up coverage instrumentation.
+ set(LLVM_NO_DEAD_STRIP ON)
endif()
endif()

Any arguments against that?

Apparently, a better way is to follow ASAN instrumentation pass,
which uses some magic to protect against dead-stripping.

> On Aug 24, 2017, at 11:29 AM, Justin Bogner <ma...@justinbogner.com> wrote:
>

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 5:40:59 PM8/24/17
to George Karpenkov, llvm-dev
This is quite unexpected. 
Do you have a minimized example of dead stripping eliminating the coverage instrumentation? 

--kcc 

George Karpenkov via llvm-dev

unread,
Aug 24, 2017, 5:43:37 PM8/24/17
to Kostya Serebryany, llvm-dev
Yeah, at least on mac basically nothing works:

george@/Volumes/Transcend/code/llvm (master)./release-build/bin/clang -O3 -isysroot (xcrun --show-sdk-path) -fsanitize=fuzzer,address -Wl,-dead_strip projects/compiler-rt/test/fuzzer/StrcmpTest.cpp
george@/Volumes/Transcend/code/llvm (master)≻ ./a.out
INFO: Seed: 3036650336
INFO: Loaded 1 modules   (8 guards): 8 [0x106c2a440, 0x106c2a460),
INFO: Loaded 1 PC tables (0 PCs): 0 [0x106c2a5d0,0x106c2a5d0),
ERROR: The size of coverage PC tables does not match the number of instrumented PCs. This might be a bug in the compiler, please contact the libFuzzer developers.
numguards=8, NumPCsInPCTables=0, NumInline8bitCounters=0

Justin Bogner via llvm-dev

unread,
Aug 24, 2017, 5:49:45 PM8/24/17
to George Karpenkov, llvm-dev
George Karpenkov <ekarp...@apple.com> writes:
> OK so with Kuba’s help I’ve found the error: with optimization, dead
> stripping of produced libraries is enabled,
> which removes coverage instrumentation.
>
> However, this has nothing to do with the move to compiler-rt, so I’m
> quite skeptical on whether it has worked
> beforehand.
>
> A trivial fix is to do:
>
> diff --git a/cmake/modules/HandleLLVMOptions.cmake b/cmake/modules/HandleLLVMOptions.cmake
> index 04596a6ff63..5465d8d95ba 100644
> --- a/cmake/modules/HandleLLVMOptions.cmake
> +++ b/cmake/modules/HandleLLVMOptions.cmake
> @@ -665,6 +665,9 @@ if(LLVM_USE_SANITIZER)
> endif()
> if (LLVM_USE_SANITIZE_COVERAGE)
> append("-fsanitize=fuzzer-no-link" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
> +
> + # Dead stripping messes up coverage instrumentation.
> + set(LLVM_NO_DEAD_STRIP ON)
> endif()
> endif()
>
> Any arguments against that?

We shouldn't do this. We really only want to prevent dead stripping of
the counters themselves - disabling it completely isn't very nice.

> Apparently, a better way is to follow ASAN instrumentation pass,
> which uses some magic to protect against dead-stripping.

I thought this was already being done - how else did it work before?

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 5:55:50 PM8/24/17
to Justin Bogner, llvm-dev
Interesting. 
This is a relatively new addition (fsanitize-coverage=pc-tables, which is now a part of -fsanitize=fuzzer). 
The tests worked (did they? On Mac?) so I thought everything is ok. 
Yea, we need to make sure the pc-tables are not stripped (this is a separate section with globals). 
(I still haven't documented pc-tables, will do soon)

Do you know what's the analog of Wl,-dead_strip on Linux?

--kcc 
 

George Karpenkov via llvm-dev

unread,
Aug 24, 2017, 6:07:52 PM8/24/17
to Kostya Serebryany, llvm-dev
On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <k...@google.com> wrote:

Interesting. 
This is a relatively new addition (fsanitize-coverage=pc-tables, which is now a part of -fsanitize=fuzzer). 
The tests worked (did they? On Mac?) so I thought everything is ok. 

For tests we never compile the tested target with -O3 (and that wouldn’t be sufficient),
and for testing fuzzers I was always building them in debug 

Yea, we need to make sure the pc-tables are not stripped (this is a separate section with globals). 
(I still haven't documented pc-tables, will do soon)

Do you know what's the analog of Wl,-dead_strip on Linux?

Apparently -Wl,—gc-sections.
For some reason LLVM does not do it for gold, even though it seems to support this flag as well.
(that could be another reason why you don’t see the failure on Linux)

 1 if(NOT LLVM_NO_DEAD_STRIP)
 2   if(${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
 3     # ld64's implementation of -dead_strip breaks tools that use plugins.
 4     set_property(TARGET ${target_name} APPEND_STRING PROPERTY
 5                  LINK_FLAGS " -Wl,-dead_strip")
 6   elseif(${CMAKE_SYSTEM_NAME} MATCHES "SunOS")
 7     set_property(TARGET ${target_name} APPEND_STRING PROPERTY
 8                  LINK_FLAGS " -Wl,-z -Wl,discard-unused=sections")
 9   elseif(NOT WIN32 AND NOT LLVM_LINKER_IS_GOLD)
10     # Object files are compiled with -ffunction-data-sections.
11     # Versions of bfd ld < 2.23.1 have a bug in --gc-sections that breaks
12     # tools that use plugins. Always pass --gc-sections once we require
13     # a newer linker.
14     set_property(TARGET ${target_name} APPEND_STRING PROPERTY
15                  LINK_FLAGS " -Wl,--gc-sections")
16   endif()
17 endif()

Justin Bogner via llvm-dev

unread,
Aug 24, 2017, 6:10:24 PM8/24/17
to Kostya Serebryany, llvm-dev
Kostya Serebryany <k...@google.com> writes:
> Interesting.
> This is a relatively new addition (fsanitize-coverage=pc-tables, which is
> now a part of -fsanitize=fuzzer).
> The tests worked (did they? On Mac?) so I thought everything is ok.

It looks like the tests were passing, and I *think* I'd tried my fuzzer
since that change was in place. Maybe something about how we're linking
in compiler-rt makes it more obvious to the linker that these symbols
look dead?

> Yea, we need to make sure the pc-tables are not stripped (this is a
> separate section with globals).
> (I still haven't documented pc-tables, will do soon)
>
> Do you know what's the analog of Wl,-dead_strip on Linux?

I think the closest thing is -Wl,--gc-sections but that might be less
aggressive about it than macOS's linker is.

George Karpenkov via llvm-dev

unread,
Aug 24, 2017, 6:11:49 PM8/24/17
to Justin Bogner, llvm-dev

> On Aug 24, 2017, at 3:10 PM, Justin Bogner <ma...@justinbogner.com> wrote:
>
> Kostya Serebryany <k...@google.com> writes:
>> Interesting.
>> This is a relatively new addition (fsanitize-coverage=pc-tables, which is
>> now a part of -fsanitize=fuzzer).
>> The tests worked (did they? On Mac?) so I thought everything is ok.
>
> It looks like the tests were passing, and I *think* I'd tried my fuzzer
> since that change was in place.

that wouldn’t have helped: it was specifying coverage flags explicitly,
which did not include the new flag in question.

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 6:17:23 PM8/24/17
to George Karpenkov, llvm-dev
With -Wl,-gc-sections I get this: 
SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x1b): undefined reference to `__start___sancov_pcs'
SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x20): undefined reference to `__stop___sancov_pcs'


Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 6:20:17 PM8/24/17
to George Karpenkov, llvm-dev
On Thu, Aug 24, 2017 at 3:16 PM, Kostya Serebryany <k...@google.com> wrote:
With -Wl,-gc-sections I get this: 
SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x1b): undefined reference to `__start___sancov_pcs'
SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x20): undefined reference to `__stop___sancov_pcs'

This happens only with 'ld'. 
lld and gold are fine. 

Justin Bogner via llvm-dev

unread,
Aug 24, 2017, 6:20:54 PM8/24/17
to Kostya Serebryany, llvm-dev
I think the simplest fix is something like this:

diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
nullptr, getSectionEnd(Section));
SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+ appendToUsed(M, {SecStart, SecEnd});

return std::make_pair(SecStart, SecEnd);
}

I'm trying it out now.

Kostya Serebryany <k...@google.com> writes:
> With -Wl,-gc-sections I get this:
> SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x1b):
> undefined reference to `__start___sancov_pcs'
> SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x20):
> undefined reference to `__stop___sancov_pcs'
>
>
>
> On Thu, Aug 24, 2017 at 3:07 PM, George Karpenkov <ekarp...@apple.com>
> wrote:
>
>>
>> On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <k...@google.com> wrote:
>>
>> Interesting.
>> This is a relatively new addition (fsanitize-coverage=pc-tables, which is
>> now a part of -fsanitize=fuzzer).
>> The tests worked (did they? On Mac?) so I thought everything is ok.
>>
>>
>> For tests we never compile the tested target with -O3 (and that wouldn’t
>> be sufficient),
>> and for testing fuzzers I was always building them in debug
>>
>> Yea, we need to make sure the pc-tables are not stripped (this is a
>> separate section with globals).
>> (I still haven't documented pc-tables, will do soon)
>>
>>
>> Do you know what's the analog of Wl,-dead_strip on Linux?
>>
>>
>> Apparently -Wl,—gc-sections.
>> For some reason LLVM does not do it for gold, even though it seems to
>> support this flag as well.
>> (that could be another reason why you don’t see the failure on Linux)
>>

>> 1 *if*(NOT LLVM_NO_DEAD_STRIP)
>> 2 *if*(${CMAKE_SYSTEM_NAME} MATCHES "Darwin")


>> 3 # ld64's implementation of -dead_strip breaks tools that use
>> plugins.
>> 4 set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> 5 LINK_FLAGS " -Wl,-dead_strip")

>> 6 *elseif*(${CMAKE_SYSTEM_NAME} MATCHES "SunOS")


>> 7 set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> 8 LINK_FLAGS " -Wl,-z -Wl,discard-unused=sections")

>> 9 *elseif*(NOT WIN32 AND NOT LLVM_LINKER_IS_GOLD)


>> 10 # Object files are compiled with -ffunction-data-sections.
>> 11 # Versions of bfd ld < 2.23.1 have a bug in --gc-sections that
>> breaks
>> 12 # tools that use plugins. Always pass --gc-sections once we require
>> 13 # a newer linker.
>> 14 set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> 15 LINK_FLAGS " -Wl,--gc-sections")

>> 16 *endif*()
>> 17 *endif*()

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 6:21:28 PM8/24/17
to George Karpenkov, llvm-dev, Peter Collingbourne
+pcc (for the discussion on how to preserve a section when -Wl,-gc-sections is used)

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 6:22:20 PM8/24/17
to Justin Bogner, llvm-dev, Peter Collingbourne
On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com> wrote:
I think the simplest fix is something like this:

diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
       new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
                          nullptr, getSectionEnd(Section));
   SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+  appendToUsed(M, {SecStart, SecEnd});

   return std::make_pair(SecStart, SecEnd);
 }

I'm trying it out now.

LGTM (if this works), thanks! 

Peter Collingbourne via llvm-dev

unread,
Aug 24, 2017, 6:35:58 PM8/24/17
to Kostya Serebryany, llvm-dev
On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <llvm...@lists.llvm.org> wrote:


On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com> wrote:
I think the simplest fix is something like this:

diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
       new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
                          nullptr, getSectionEnd(Section));
   SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+  appendToUsed(M, {SecStart, SecEnd});

   return std::make_pair(SecStart, SecEnd);
 }

I'm trying it out now.

LGTM (if this works), thanks! 

I wouldn't expect that to work because for ELF targets llvm.used has no effect on the object file (only on the optimizer).

Is there a simple way to reproduce the link failure?

Peter

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev




--
-- 
Peter

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 6:38:38 PM8/24/17
to Peter Collingbourne, llvm-dev
On Thu, Aug 24, 2017 at 3:35 PM, Peter Collingbourne <pe...@pcc.me.uk> wrote:
On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <llvm...@lists.llvm.org> wrote:


On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com> wrote:
I think the simplest fix is something like this:

diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
       new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
                          nullptr, getSectionEnd(Section));
   SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+  appendToUsed(M, {SecStart, SecEnd});

   return std::make_pair(SecStart, SecEnd);
 }

I'm trying it out now.

LGTM (if this works), thanks! 

I wouldn't expect that to work because for ELF targets llvm.used has no effect on the object file (only on the optimizer).

Is there a simple way to reproduce the link failure?


ninja compiler-rt 
echo 'extern "C" int LLVMFuzzerTestOneInput(const unsigned char *a, unsigned long b){return 0; } ' > test.cc
clang -O3 test.cc   -fsanitize=fuzzer # works
clang -O3 test.cc  -Wl,-gc-sections -fsanitize=fuzzer # fails

Kostya Serebryany via llvm-dev

unread,
Aug 24, 2017, 6:43:59 PM8/24/17
to Peter Collingbourne, llvm-dev
FTR: 
r311719 adds the docs for -fsanitize-coverage=pc-table and -fsanitize-coverage=inline-8bit-counters

Justin Bogner via llvm-dev

unread,
Aug 24, 2017, 7:22:44 PM8/24/17
to Peter Collingbourne, llvm-dev
Peter Collingbourne <pe...@pcc.me.uk> writes:
> On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <
> llvm...@lists.llvm.org> wrote:
>
>>
>>
>> On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com>
>> wrote:
>>
>>> I think the simplest fix is something like this:
>>>
>>> diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>>> b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>>> index c6f0d17f8fe..e81957ab80a 100644
>>> --- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>>> +++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>>> @@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module
>>> &M, const char *Section,
>>> new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
>>> nullptr, getSectionEnd(Section));
>>> SecEnd->setVisibility(GlobalValue::HiddenVisibility);
>>> + appendToUsed(M, {SecStart, SecEnd});
>>>
>>> return std::make_pair(SecStart, SecEnd);
>>> }
>>>
>>> I'm trying it out now.
>>>
>>
>> LGTM (if this works), thanks!
>
> I wouldn't expect that to work because for ELF targets llvm.used has no
> effect on the object file (only on the optimizer).

Interesting. Appending to llvm.used is the only thing that's done to
keep variables alive in the PGO instrumentation, and it seems to work
in practice.

In any case, the first patch handled the wrong variables - those section
start and end variables aren't stripped. The symbol that's being
stripped is actually a global array with private linkage inside the
section, and the following patch works on macOS:

diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..fdf265143fd 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -557,6 +557,10 @@ void SanitizerCoverageModule::CreatePCArray(Function &F,
FunctionPCsArray->setInitializer(
ConstantArray::get(ArrayType::get(Int8PtrTy, N), PCs));
FunctionPCsArray->setConstant(true);
+
+ // We don't reference the PCs array in any of our runtime functions, so we
+ // need to prevent it from being dead stripped.
+ appendToUsed(*F.getParent(), {FunctionPCsArray});
}

void SanitizerCoverageModule::CreateFunctionLocalArrays(

I'm building on linux now to see what happens.

Peter Collingbourne via llvm-dev

unread,
Aug 24, 2017, 7:24:58 PM8/24/17
to Kostya Serebryany, llvm-dev
On Thu, Aug 24, 2017 at 3:38 PM, Kostya Serebryany <k...@google.com> wrote:


On Thu, Aug 24, 2017 at 3:35 PM, Peter Collingbourne <pe...@pcc.me.uk> wrote:
On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <llvm...@lists.llvm.org> wrote:


On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com> wrote:
I think the simplest fix is something like this:

diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
       new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
                          nullptr, getSectionEnd(Section));
   SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+  appendToUsed(M, {SecStart, SecEnd});

   return std::make_pair(SecStart, SecEnd);
 }

I'm trying it out now.

LGTM (if this works), thanks! 

I wouldn't expect that to work because for ELF targets llvm.used has no effect on the object file (only on the optimizer).

Is there a simple way to reproduce the link failure?


ninja compiler-rt 
echo 'extern "C" int LLVMFuzzerTestOneInput(const unsigned char *a, unsigned long b){return 0; } ' > test.cc
clang -O3 test.cc   -fsanitize=fuzzer # works
clang -O3 test.cc  -Wl,-gc-sections -fsanitize=fuzzer # fails

It seems that the issue is that older versions of ld.bfd have a bug which causes it not to define __start_ and __stop_ symbols if the only reference to those symbols is from a constructor.

If I add an artificial reference to the start symbol from libfuzzer's main function, the program links correctly.

diff --git a/compiler-rt/lib/fuzzer/FuzzerMain.cpp b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
index af8657200be2..c41e28e012db 100644
--- a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
+++ b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
@@ -16,6 +16,10 @@ extern "C" {
 int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
 }  // extern "C"
 
+__attribute__((weak)) void nop(void *p) {}
+extern void *__start___sancov_pcs;
+
 int main(int argc, char **argv) {
+  nop(__start___sancov_pcs);
   return fuzzer::FuzzerDriver(&argc, &argv, LLVMFuzzerTestOneInput);
 }

The problem also goes away if I use "GNU ld (GNU Binutils) 2.28.51.20170105".

Peter



--
-- 
Peter

Justin Bogner via llvm-dev

unread,
Aug 24, 2017, 9:31:01 PM8/24/17
to Peter Collingbourne, llvm-dev

It looks like this is a different problem from the one on macOS (and I
wasn't able to reproduce it with any bfd ld I had available, they were
all too new)

I've gone ahead and fixed the issue on macOS in r311742.

> If I add an artificial reference to the start symbol from libfuzzer's main
> function, the program links correctly.
>
> diff --git a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> index af8657200be2..c41e28e012db 100644
> --- a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> +++ b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> @@ -16,6 +16,10 @@ extern "C" {
> int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
> } // extern "C"
>
> +__attribute__((weak)) void nop(void *p) {}
> +extern void *__start___sancov_pcs;
> +
> int main(int argc, char **argv) {
> + nop(__start___sancov_pcs);
> return fuzzer::FuzzerDriver(&argc, &argv, LLVMFuzzerTestOneInput);
> }

If we were to do this, we'd have to guard it appropriately - not all
platforms name the __start symbols like this.

> The problem also goes away if I use "GNU ld (GNU Binutils)
> 2.28.51.20170105".

2.27 also doesn't have the issue. I don't know what our minimum version
of binutils is, and I'm under the impression most people use gold or lld
to link LLVM these days, so it isn't clear to me how big of a problem
this is.

>>>>> >>> >> INFO: Seed: 4089166883 <(408)%20916-6883>

Peter Collingbourne via llvm-dev

unread,
Aug 24, 2017, 9:43:28 PM8/24/17
to Justin Bogner, llvm-dev
Of course. There's also the issue of how to keep the symbols alive in DSOs.

> The problem also goes away if I use "GNU ld (GNU Binutils)
> 2.28.51.20170105".

2.27 also doesn't have the issue. I don't know what our minimum version
of binutils is, and I'm under the impression most people use gold or lld
to link LLVM these days, so it isn't clear to me how big of a problem
this is.

For the record, the problem reproduces under 2.24, which is shipped by Ubuntu 14.04 LTS, which isn't that old. My view is that if we can find an unintrusive enough workaround, we should deploy it (with a comment to remove it after N years).

Peter



--
-- 
Peter

Ed Maste via llvm-dev

unread,
Aug 26, 2017, 10:54:19 AM8/26/17
to Justin Bogner, llvm-dev
On 24 August 2017 at 21:30, Justin Bogner via llvm-dev

<llvm...@lists.llvm.org> wrote:
>
> 2.27 also doesn't have the issue. I don't know what our minimum version
> of binutils is, and I'm under the impression most people use gold or lld
> to link LLVM these days, so it isn't clear to me how big of a problem
> this is.

I don't know that we specify a minimum. On FreeBSD at least we still
have GNU ld from binutils 2.17.50 on all platforms except arm64 (where
we've switched to lld as the system linker).

Kamil Rytarowski via llvm-dev

unread,
Aug 26, 2017, 11:29:40 AM8/26/17
to Ed Maste, Justin Bogner, llvm-dev
On 26.08.2017 16:53, Ed Maste via llvm-dev wrote:
> On 24 August 2017 at 21:30, Justin Bogner via llvm-dev
> <llvm...@lists.llvm.org> wrote:
>>
>> 2.27 also doesn't have the issue. I don't know what our minimum version
>> of binutils is, and I'm under the impression most people use gold or lld
>> to link LLVM these days, so it isn't clear to me how big of a problem
>> this is.
>
> I don't know that we specify a minimum. On FreeBSD at least we still
> have GNU ld from binutils 2.17.50 on all platforms except arm64 (where
> we've switched to lld as the system linker).

NetBSD uses GNU ld(1).

$ ld -v
GNU ld (NetBSD Binutils nb1) 2.27

I have a local patches for libfuzzer and I was passing most of the
tests... but not all useful sanitizers are already aboard.

signature.asc

Kostya Serebryany via llvm-dev

unread,
Sep 9, 2017, 1:36:44 AM9/9/17
to Peter Collingbourne, llvm-dev
Justin, 
Calling appendToUsed has horrible complexity and if we call it in every function clang
consumes tons of memory (6Gb when compiling one of the clang's source files).
This killed my machine today :) 

The solution is to call appendToUsed once per module, instead of once per function. 
Also, since this does not seem to be required for linux, I've put this under if TargetTriple.isOSBinFormatMachO
Submitted r312855, I'll see if this breaks Mac (there seem to be no llvm tests for this, only compiler-rt tests)
but please also check if this looks ok. 

But this all still sounds bad on linux at least:
  * with the old bfd linker and -ffunction-sections -Wl,-gc-sections these arrays get removed (as discussed here)  
  * with newer linkers the sanitizer coverage essentially disables gc-sections 

--kcc 

Justin Bogner via llvm-dev

unread,
Sep 11, 2017, 2:53:28 PM9/11/17
to Kostya Serebryany, llvm-dev
Kostya Serebryany <k...@google.com> writes:
> Justin,

> Calling appendToUsed has horrible complexity and if we call it in
> every function clang consumes tons of memory (6Gb when compiling one
> of the clang's source files). This killed my machine today :)
>
> The solution is to call appendToUsed once per module, instead of once
> per function.

Oh right, updating lists in metadata is O(n), so doing it per function
is quadratic. This slipped my mind - sorry!

> Also, since this does not seem to be required for linux, I've put this
> under if TargetTriple.isOSBinFormatMachO Submitted r312855, I'll see
> if this breaks Mac (there seem to be no llvm tests for this, only
> compiler-rt tests) but please also check if this looks ok.

This looks fine, though I'd rather if we just did it on all platforms
for consistency / clear semantic intent. Running appendToUsed once
should be cheap, and we do it unguarded in our other instrumentation
(like InstrProfiling.cpp).

> But this all still sounds bad on linux at least:
> * with the old bfd linker and -ffunction-sections -Wl,-gc-sections these
> arrays get removed (as discussed here)

This is sad, and I don't think we have any particularly good ideas to
fix this.

> * with newer linkers the sanitizer coverage essentially disables
> gc-sections

I'm not sure I follow. We're making sure the global arrays /
instrumentation data doesn't get dead stripped here, but dead stripping
of functions should work as normal.

Kostya Serebryany via llvm-dev

unread,
Sep 11, 2017, 9:57:53 PM9/11/17
to Justin Bogner, llvm-dev
On Mon, Sep 11, 2017 at 11:52 AM, Justin Bogner <ma...@justinbogner.com> wrote:
Kostya Serebryany <k...@google.com> writes:
> Justin,

> Calling appendToUsed has horrible complexity and if we call it in
> every function clang consumes tons of memory (6Gb when compiling one
> of the clang's source files).  This killed my machine today :)
>
> The solution is to call appendToUsed once per module, instead of once
> per function.

Oh right, updating lists in metadata is O(n), so doing it per function
is quadratic. This slipped my mind - sorry!

Yea. Very unexpected (although I've stumbled on it a few times, I still don't remember about it)
 

> Also, since this does not seem to be required for linux, I've put this
> under if TargetTriple.isOSBinFormatMachO Submitted r312855, I'll see
> if this breaks Mac (there seem to be no llvm tests for this, only
> compiler-rt tests) but please also check if this looks ok.

This looks fine, though I'd rather if we just did it on all platforms
for consistency / clear semantic intent. Running appendToUsed once
should be cheap, and we do it unguarded in our other instrumentation
(like InstrProfiling.cpp).

It's unclear if it's going to hurt things on linux. 
If I see it doesn't -- I'll remove the if TargetTriple.isOSBinFormatMachO
 

> But this all still sounds bad on linux at least:
>   * with the old bfd linker and -ffunction-sections -Wl,-gc-sections these
>     arrays get removed (as discussed here)

This is sad, and I don't think we have any particularly good ideas to
fix this.

>   * with newer linkers the sanitizer coverage essentially disables
>     gc-sections

I'm not sure I follow. We're making sure the global arrays /
instrumentation data doesn't get dead stripped here, but dead stripping
of functions should work as normal.

Ah, that's now my confusion. 
inline-8bit-counters and trace-pc-guard seems to work just fine. 
pc-table actually does break gc-sections as there is no a reference to every function in the pc-table section. 

+eugenis, who dealt with a similar issue in asan (although it's probably a separate topic now). 

--kcc 

Kostya Serebryany via llvm-dev

unread,
Sep 15, 2017, 7:26:45 PM9/15/17
to Justin Bogner, llvm-dev, Mathew Morehouse
Two wrap up this discussion I've filed https://bugs.llvm.org/show_bug.cgi?id=34636 (for Matt)

Saleem Abdulrasool via llvm-dev

unread,
Sep 16, 2017, 5:28:45 PM9/16/17
to Justin Bogner, llvm-dev
It can be made as aggressive.  You need to compile with `-ffunction-sections` `-fdata-sections` and then link with `-Xlinker --gc-sections`.  ld64 is more aggressive due to `.subsections_via_symbols` (aka the two compile flags).
--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org
Reply all
Reply to author
Forward
0 new messages