My configuration for building the fuzzers in the LLVM tree doesn't seem to
work any more (possibly as of moving libFuzzer to compiler-rt, but there
have been a few other changes in the last week or so that may be related).
I'm building with a fresh top-of-tree clang and setting
-DLLVM_USE_SANITIZER=Address and -DLLVM_USE_SANITIZE_COVERAGE=On, which
was working before:
% cmake -GNinja \
-DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On \
-DLLVM_ENABLE_WERROR=On \
-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On \
-DCMAKE_C_COMPILER=$HOME/llvm-lkgc/bin/clang \
$HOME/code/llvm-src
But when I run any of the fuzzers, it looks like the sanitizer coverage
hasn't been set up correctly:
% ./bin/llvm-as-fuzzer 2017-08-24 11:14:33
INFO: Seed: 4089166883
INFO: Loaded 1 modules (50607 guards): 50607 [0x10e14ef80, 0x10e18063c),
INFO: Loaded 1 PC tables (0 PCs): 0 [0x10e2870a8,0x10e2870a8),
ERROR: The size of coverage PC tables does not match the number of instrumented PCs. This might be a bug in the compiler, please contact the libFuzzer developers.
From the build logs, it looks like we're now building objects with these
sanitizer flags:
-fsanitize=address
-fsanitize-address-use-after-scope
-fsanitize=fuzzer-no-link
We're then linking the fuzzer binaries with these:
-fsanitize=address
-fsanitize-address-use-after-scope
-fsanitize=fuzzer-no-link
-fsanitize=fuzzer
Any idea what's wrong or where to start looking?
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
CMake is smart enough to infer that from C_COMPILER:
% grep CMAKE_CXX_COMPILER CMakeCache.txt
CMAKE_CXX_COMPILER:FILEPATH=/Users/bogner/llvm-lkgc/bin/clang++
However, this has nothing to do with the move to compiler-rt, so I’m quite skeptical on whether it has worked
beforehand.
A trivial fix is to do:
diff --git a/cmake/modules/HandleLLVMOptions.cmake b/cmake/modules/HandleLLVMOptions.cmake
index 04596a6ff63..5465d8d95ba 100644
--- a/cmake/modules/HandleLLVMOptions.cmake
+++ b/cmake/modules/HandleLLVMOptions.cmake
@@ -665,6 +665,9 @@ if(LLVM_USE_SANITIZER)
endif()
if (LLVM_USE_SANITIZE_COVERAGE)
append("-fsanitize=fuzzer-no-link" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+
+ # Dead stripping messes up coverage instrumentation.
+ set(LLVM_NO_DEAD_STRIP ON)
endif()
endif()
Any arguments against that?
Apparently, a better way is to follow ASAN instrumentation pass,
which uses some magic to protect against dead-stripping.
> On Aug 24, 2017, at 11:29 AM, Justin Bogner <ma...@justinbogner.com> wrote:
>
We shouldn't do this. We really only want to prevent dead stripping of
the counters themselves - disabling it completely isn't very nice.
> Apparently, a better way is to follow ASAN instrumentation pass,
> which uses some magic to protect against dead-stripping.
I thought this was already being done - how else did it work before?
On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <k...@google.com> wrote:Interesting.This is a relatively new addition (fsanitize-coverage=pc-tables, which is now a part of -fsanitize=fuzzer).The tests worked (did they? On Mac?) so I thought everything is ok.
Yea, we need to make sure the pc-tables are not stripped (this is a separate section with globals).(I still haven't documented pc-tables, will do soon)
Do you know what's the analog of Wl,-dead_strip on Linux?
It looks like the tests were passing, and I *think* I'd tried my fuzzer
since that change was in place. Maybe something about how we're linking
in compiler-rt makes it more obvious to the linker that these symbols
look dead?
> Yea, we need to make sure the pc-tables are not stripped (this is a
> separate section with globals).
> (I still haven't documented pc-tables, will do soon)
>
> Do you know what's the analog of Wl,-dead_strip on Linux?
I think the closest thing is -Wl,--gc-sections but that might be less
aggressive about it than macOS's linker is.
that wouldn’t have helped: it was specifying coverage flags explicitly,
which did not include the new flag in question.
With -Wl,-gc-sections I get this:SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x1b): undefined reference to `__start___sancov_pcs'SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x20): undefined reference to `__stop___sancov_pcs'
diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
nullptr, getSectionEnd(Section));
SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+ appendToUsed(M, {SecStart, SecEnd});
return std::make_pair(SecStart, SecEnd);
}
I'm trying it out now.
Kostya Serebryany <k...@google.com> writes:
> With -Wl,-gc-sections I get this:
> SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x1b):
> undefined reference to `__start___sancov_pcs'
> SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x20):
> undefined reference to `__stop___sancov_pcs'
>
>
>
> On Thu, Aug 24, 2017 at 3:07 PM, George Karpenkov <ekarp...@apple.com>
> wrote:
>
>>
>> On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <k...@google.com> wrote:
>>
>> Interesting.
>> This is a relatively new addition (fsanitize-coverage=pc-tables, which is
>> now a part of -fsanitize=fuzzer).
>> The tests worked (did they? On Mac?) so I thought everything is ok.
>>
>>
>> For tests we never compile the tested target with -O3 (and that wouldn’t
>> be sufficient),
>> and for testing fuzzers I was always building them in debug
>>
>> Yea, we need to make sure the pc-tables are not stripped (this is a
>> separate section with globals).
>> (I still haven't documented pc-tables, will do soon)
>>
>>
>> Do you know what's the analog of Wl,-dead_strip on Linux?
>>
>>
>> Apparently -Wl,—gc-sections.
>> For some reason LLVM does not do it for gold, even though it seems to
>> support this flag as well.
>> (that could be another reason why you don’t see the failure on Linux)
>>
>> 1 *if*(NOT LLVM_NO_DEAD_STRIP)
>> 2 *if*(${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
>> 3 # ld64's implementation of -dead_strip breaks tools that use
>> plugins.
>> 4 set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> 5 LINK_FLAGS " -Wl,-dead_strip")
>> 6 *elseif*(${CMAKE_SYSTEM_NAME} MATCHES "SunOS")
>> 7 set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> 8 LINK_FLAGS " -Wl,-z -Wl,discard-unused=sections")
>> 9 *elseif*(NOT WIN32 AND NOT LLVM_LINKER_IS_GOLD)
>> 10 # Object files are compiled with -ffunction-data-sections.
>> 11 # Versions of bfd ld < 2.23.1 have a bug in --gc-sections that
>> breaks
>> 12 # tools that use plugins. Always pass --gc-sections once we require
>> 13 # a newer linker.
>> 14 set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> 15 LINK_FLAGS " -Wl,--gc-sections")
>> 16 *endif*()
>> 17 *endif*()
I think the simplest fix is something like this:
diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
nullptr, getSectionEnd(Section));
SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+ appendToUsed(M, {SecStart, SecEnd});
return std::make_pair(SecStart, SecEnd);
}
I'm trying it out now.
On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com> wrote:I think the simplest fix is something like this:
diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
nullptr, getSectionEnd(Section));
SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+ appendToUsed(M, {SecStart, SecEnd});
return std::make_pair(SecStart, SecEnd);
}
I'm trying it out now.LGTM (if this works), thanks!
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <llvm...@lists.llvm.org> wrote:On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com> wrote:I think the simplest fix is something like this:
diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
nullptr, getSectionEnd(Section));
SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+ appendToUsed(M, {SecStart, SecEnd});
return std::make_pair(SecStart, SecEnd);
}
I'm trying it out now.LGTM (if this works), thanks!I wouldn't expect that to work because for ELF targets llvm.used has no effect on the object file (only on the optimizer).Is there a simple way to reproduce the link failure?
Interesting. Appending to llvm.used is the only thing that's done to
keep variables alive in the PGO instrumentation, and it seems to work
in practice.
In any case, the first patch handled the wrong variables - those section
start and end variables aren't stripped. The symbol that's being
stripped is actually a global array with private linkage inside the
section, and the following patch works on macOS:
diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..fdf265143fd 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -557,6 +557,10 @@ void SanitizerCoverageModule::CreatePCArray(Function &F,
FunctionPCsArray->setInitializer(
ConstantArray::get(ArrayType::get(Int8PtrTy, N), PCs));
FunctionPCsArray->setConstant(true);
+
+ // We don't reference the PCs array in any of our runtime functions, so we
+ // need to prevent it from being dead stripped.
+ appendToUsed(*F.getParent(), {FunctionPCsArray});
}
void SanitizerCoverageModule::CreateFunctionLocalArrays(
I'm building on linux now to see what happens.
On Thu, Aug 24, 2017 at 3:35 PM, Peter Collingbourne <pe...@pcc.me.uk> wrote:On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <llvm...@lists.llvm.org> wrote:On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <ma...@justinbogner.com> wrote:I think the simplest fix is something like this:
diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index c6f0d17f8fe..e81957ab80a 100644
--- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,
new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
nullptr, getSectionEnd(Section));
SecEnd->setVisibility(GlobalValue::HiddenVisibility);
+ appendToUsed(M, {SecStart, SecEnd});
return std::make_pair(SecStart, SecEnd);
}
I'm trying it out now.LGTM (if this works), thanks!I wouldn't expect that to work because for ELF targets llvm.used has no effect on the object file (only on the optimizer).Is there a simple way to reproduce the link failure?ninja compiler-rtecho 'extern "C" int LLVMFuzzerTestOneInput(const unsigned char *a, unsigned long b){return 0; } ' > test.ccclang -O3 test.cc -fsanitize=fuzzer # worksclang -O3 test.cc -Wl,-gc-sections -fsanitize=fuzzer # fails
It looks like this is a different problem from the one on macOS (and I
wasn't able to reproduce it with any bfd ld I had available, they were
all too new)
I've gone ahead and fixed the issue on macOS in r311742.
> If I add an artificial reference to the start symbol from libfuzzer's main
> function, the program links correctly.
>
> diff --git a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> index af8657200be2..c41e28e012db 100644
> --- a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> +++ b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
> @@ -16,6 +16,10 @@ extern "C" {
> int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
> } // extern "C"
>
> +__attribute__((weak)) void nop(void *p) {}
> +extern void *__start___sancov_pcs;
> +
> int main(int argc, char **argv) {
> + nop(__start___sancov_pcs);
> return fuzzer::FuzzerDriver(&argc, &argv, LLVMFuzzerTestOneInput);
> }
If we were to do this, we'd have to guard it appropriately - not all
platforms name the __start symbols like this.
> The problem also goes away if I use "GNU ld (GNU Binutils)
> 2.28.51.20170105".
2.27 also doesn't have the issue. I don't know what our minimum version
of binutils is, and I'm under the impression most people use gold or lld
to link LLVM these days, so it isn't clear to me how big of a problem
this is.
>>>>> >>> >> INFO: Seed: 4089166883 <(408)%20916-6883>
> The problem also goes away if I use "GNU ld (GNU Binutils)
> 2.28.51.20170105".
2.27 also doesn't have the issue. I don't know what our minimum version
of binutils is, and I'm under the impression most people use gold or lld
to link LLVM these days, so it isn't clear to me how big of a problem
this is.
I don't know that we specify a minimum. On FreeBSD at least we still
have GNU ld from binutils 2.17.50 on all platforms except arm64 (where
we've switched to lld as the system linker).
> Calling appendToUsed has horrible complexity and if we call it in
> every function clang consumes tons of memory (6Gb when compiling one
> of the clang's source files). This killed my machine today :)
>
> The solution is to call appendToUsed once per module, instead of once
> per function.
Oh right, updating lists in metadata is O(n), so doing it per function
is quadratic. This slipped my mind - sorry!
> Also, since this does not seem to be required for linux, I've put this
> under if TargetTriple.isOSBinFormatMachO Submitted r312855, I'll see
> if this breaks Mac (there seem to be no llvm tests for this, only
> compiler-rt tests) but please also check if this looks ok.
This looks fine, though I'd rather if we just did it on all platforms
for consistency / clear semantic intent. Running appendToUsed once
should be cheap, and we do it unguarded in our other instrumentation
(like InstrProfiling.cpp).
> But this all still sounds bad on linux at least:
> * with the old bfd linker and -ffunction-sections -Wl,-gc-sections these
> arrays get removed (as discussed here)
This is sad, and I don't think we have any particularly good ideas to
fix this.
> * with newer linkers the sanitizer coverage essentially disables
> gc-sections
I'm not sure I follow. We're making sure the global arrays /
instrumentation data doesn't get dead stripped here, but dead stripping
of functions should work as normal.
Kostya Serebryany <k...@google.com> writes:
> Justin,
> Calling appendToUsed has horrible complexity and if we call it in
> every function clang consumes tons of memory (6Gb when compiling one
> of the clang's source files). This killed my machine today :)
>
> The solution is to call appendToUsed once per module, instead of once
> per function.
Oh right, updating lists in metadata is O(n), so doing it per function
is quadratic. This slipped my mind - sorry!
> Also, since this does not seem to be required for linux, I've put this
> under if TargetTriple.isOSBinFormatMachO Submitted r312855, I'll see
> if this breaks Mac (there seem to be no llvm tests for this, only
> compiler-rt tests) but please also check if this looks ok.
This looks fine, though I'd rather if we just did it on all platforms
for consistency / clear semantic intent. Running appendToUsed once
should be cheap, and we do it unguarded in our other instrumentation
(like InstrProfiling.cpp).
> But this all still sounds bad on linux at least:
> * with the old bfd linker and -ffunction-sections -Wl,-gc-sections these
> arrays get removed (as discussed here)
This is sad, and I don't think we have any particularly good ideas to
fix this.
> * with newer linkers the sanitizer coverage essentially disables
> gc-sections
I'm not sure I follow. We're making sure the global arrays /
instrumentation data doesn't get dead stripped here, but dead stripping
of functions should work as normal.