This is needed in case the users of libFuzzer use libc++ in their code, which the fuzz target (libFuzzer) will be linked against. When libc++ source is available, we build a private version of it and link it against libFuzzer which allows using the same static library against codebases which use both libc++ and libstdc++.
Details
- Reviewers
kcc george.karpenkov EricWF vitalybuka beanz morehouse - Commits
- rGeac2b47b9fcd: Reland "[libFuzzer] Support using libc++"
rGa1b57e694ee9: [libFuzzer] Support using libc++
rCRT322755: Reland "[libFuzzer] Support using libc++"
rL322755: Reland "[libFuzzer] Support using libc++"
rL322604: [libFuzzer] Support using libc++
rCRT322604: [libFuzzer] Support using libc++
Diff Detail
- Repository
- rCRT Compiler Runtime
Event Timeline
Minor nitpick: maybe print a message if user ask for LIBCXX but that's not available?
Hm. Shouldn't we build two versions of libFuzzer by default?
(Oh my, this is one more trouble with using STL in a low-level library)
CMakeLists.txt | ||
---|---|---|
143 ↗ | (On Diff #114392) | Why do we need it here? |
@kcc terrible suggestion: can we just copy'n paste vector/set/string implementations into libFuzzer from libc++? IIRC the license should be fine with it, and header libraries are recompiled with the source anyway.
Copy-paste -- no.
But if we can completely privatize the use of STL with some trick, that would be great (e.g. #define the namespace name to be something other than std).
This may work with STL containers, likely to work with STL algorithms, and unlikely to work with STL streams. If streams are the only problem, we could probably get rid of them.
Building two versions would work. We would need to alter the naming scheme and modify the driver to pick up the right version depending on the value -stdlib. We'll need some platform specific code to handle Darwin and Fuchsia which only support libc++.
If we can privatize STL this would be a much more preferable solution -- no driver changes and multiple versions.
This may also solve libFuzzer+msan (right now we have to recompile libFuzzer for msan).
I managed to "internalize" the names using _LIBCPP_ABI_VERSION, but now libFuzzer is breaking at link time because of undefined references (as expected). To solve that, we would probably have recompile libc++ using the internalized names and then package all the object files (or at least the necessary subset) into libFuzzer archive.
We can probably enforce by adding an explicit instantiations inside libFuzzer akin to https://github.com/llvm-mirror/libcxx/blob/master/src/vector.cpp#L14 and https://github.com/llvm-mirror/libcxx/blob/master/src/string.cpp#L22. I'll give it a try.
Hmm, it does seem to work, but there are more things that need to be pulled in so I'm somewhat concerned that if I keep pulling the thread, I'll eventually pull the entire libc++.
Here is the full list: https://reviews.llvm.org/P8035. Mostly streams, some system stuff, mutex and thread.
If we can privatize vector, string, the other containers, and algorithms,
Will we be able to privatize sort? (I can see it in your output)
I'm ok to rewrite thread/mutex/stream/clock using C library (I can even do part of that). Will this be enough?
I think it's possible but it won't be pretty. It's also complicated by the fact that libFuzzer is built as static library. What we would have to do is to build custom libc++ using a custom namespace, then build libFuzzer against that library, and finally combine the two static libraries together similarly to what libc++ is doing when merging libc++abi (https://github.com/llvm-mirror/libcxx/blob/master/lib/CMakeLists.txt#L268). I can give it a try if you're fine with that solution.
finally combine the two static libraries
libFuzzer.a is an implementation detail of -fsanitize=fuzzer
so why not just have two .a libs (libFuzzer.a and libFuzzer-private-libc++.a) and use both of them at link time?
This sounds reasonable to me. Libc++ could add an option that allows you to control the inline namespace it uses, hence giving you a version of libc++ that isn't necessarily *private* but is, at least, local to libFuzzer. After merging the archives, you shouldn't depend or conflict with external libc++ versions (**except for non-versioned types like exceptions, i'm not sure how relevant that is)
This does seem to work. Right now it's only used if COMPILER_RT_ENABLE_LIBCXX is set, but we can make it the default whenever libc++ source is available if that's fine with you.
I would rather require libc++ for libFuzzer to build. Isn't it simpler?
Vitaly, please help me review the cmake code.
We'll also need tests for any such change.
Can we just rely on LLVM_ENABLE_LIBCXX instead of introducing COMPILER_RT_ENABLE_LIBCXX?
Those fulfill different purpose, LLVM_ENABLE_LIBCXX enables building LLVM itself against libc++ (i.e. it passes -stdlib=libc++ to the host compiler), but it doesn't necessarily mean that the libc++ source is available. I think we should just rely on COMPILER_RT_HAS_LIBCXX_SOURCES, i.e. use libc++ anytime its sources are available (and we're building with Clang).
After thinking about it a bit more, I think LLVM_ENABLE_LIBCXX is fine for this purpose. The only question is whether you want to have separate toggle, e.g. for testing? Maybe we could keep the separate option but set LLVM_ENABLE_LIBCXX as its default value?
There still the case when you're building compiler-rt without having libc++ around in which case you need to fallback onto whatever C++ library is available is on host. Using libc++ whenever the source is available and we're building with Clang, independently of other options, would be fine with me.
We'll also need tests for any such change.
I was thinking about the right testing strategy. The lit setup already links explicitly against libstdc++, so if we use libc++ whenever possible, the existing tests should be sufficient to check the two libraries work together. We may consider adding additional lit configuration which also uses libc++.
So actually if LLVM_ENABLE_LIBCXX=NO it already builds libfuzzer with libc++. There are issues with tests. We need to add --stdlib=libc++ there and maybe rpath.
After thinking about it a bit more, I think LLVM_ENABLE_LIBCXX is fine for this purpose. The only question is whether you want to have separate toggle, e.g. for testing? Maybe we could keep the separate option but set LLVM_ENABLE_LIBCXX as its default value?
We should not test different variations of cmake flags and environments using cmake itself. This can get complicated very quickly.
Unless we are going to ship both fuzzer versions. Are we?
We have tests and we can just make them work with any value of LLVM_ENABLE_LIBCXX.
Then we just add another step into build bot config.
WDYT?
The problem is that this only covers the case when compiler-rt is in llvm/projects, when built as part of llvm/runtimes we currently don't set LLVM_ENABLE_LIBCXX (although maybe we should when it's available).
After thinking about it a bit more, I think LLVM_ENABLE_LIBCXX is fine for this purpose. The only question is whether you want to have separate toggle, e.g. for testing? Maybe we could keep the separate option but set LLVM_ENABLE_LIBCXX as its default value?
We should not test different variations of cmake flags and environments using cmake itself. This can get complicated very quickly.
Unless we are going to ship both fuzzer versions. Are we?
Ideally no.
We have tests and we can just make them work with any value of LLVM_ENABLE_LIBCXX.
Then we just add another step into build bot config.
WDYT?
Yeah, that' what I was thinking. I can update the lit configuration to use -lc++ even on Linux if LLVM_ENABLE_LIBCXX is set so we get coverage on bots. Is that fine with you?
Yeah, that' what I was thinking. I can update the lit configuration to use -lc++ even on Linux if LLVM_ENABLE_LIBCXX is set so we get coverage on bots. Is that fine with you?
SGTM
but is any reason to use -lc++ instead of --stdlib=libc++ ?
One issue with LLVM_ENABLE_LIBCXX that I just ran into is that it uses libc++ from the host compiler, but when building and running tests, we're using the just built Clang which might have a different version (e.g. in my case the host has 4.0.0 while target is 6.0.0dev) which causes issues. This is not a problem with runtimes/ build where compiler-rt is always built using just built Clang, but it's a problem when compiler-rt is built in projects/. I don't think we can rely on LLVM_ENABLE_LIBCXX or just -stdlib=libc++ for that reason, we always have explicitly point out to the right libc++ which should be used.
Seems LLVM_ENABLE_LIBCXX is not helpful here.
I am not sure that I understand your reply.
I think Kostya suggests to remove COMPILER_RT_ENABLE_LIBCXX and go with just:
if(COMPILER_RT_HAS_LIBCXX_SOURCES) use libc++ else whatever is on the host endif
Why is this bad?
I apologize for the delay, I was stuck working on other things but I finally I have to finish this. I've updated the patch to always use libc++ when available and I've also updated the fuzzer unit tests.
Matt, please also take a look.
CMakeLists.txt | ||
---|---|---|
355 ↗ | (On Diff #124848) | Hm... Also, is this related to libFuzzer? |
lib/fuzzer/CMakeLists.txt | ||
36 | Why is this guarded with "CMAKE_CXX_COMPILER_ID MATCHES Clang"? | |
lib/fuzzer/tests/CMakeLists.txt | ||
22 | Why do we need stdlib=libc++ for unit tests? I'd rather have a separate lit test in test/fuzzer that uses stdlib=libc++. |
CMakeLists.txt | ||
---|---|---|
355 ↗ | (On Diff #124848) | That's an alternate location for libc++ when cross-compiling runtimes (e.g. this is used by Fuchsia's toolchain build). Another one would be for monorepo layout. I'm fine splitting this into a separate change. |
lib/fuzzer/CMakeLists.txt | ||
36 | This is so we can use -stdlib=libc++ with Clang so it automatically includes the correct path to libc++ headers. Alternative would be to use -nostdinc++ and manually add the path to libc++ headers to CFLAGS. That would work for GCC as well. | |
lib/fuzzer/tests/CMakeLists.txt | ||
22 | This is the same as in the other CMakeLists.txt file, the solution mentioned there would work here as well. |
could you please also add a lit test that uses -stdlib=libc++?
CMakeLists.txt | ||
---|---|---|
355 ↗ | (On Diff #124848) | Yes, please. |
lib/fuzzer/CMakeLists.txt | ||
36 | I see... | |
lib/fuzzer/tests/CMakeLists.txt | ||
22 | Ah, yea, the unit tests are linked with the guts of libFuzzer and so they need -stdlib=libc++. Got it. |
I've added the test but I don't think this is sufficient, we also need to check if the host clang supports it, so I need to add an extra CMake check.
LGTM for recent clang. Not sure how long ago -stdlib=libc++ support was added. If it was long enough ago, maybe the lit config is OK? Otherwise might need to check clang version or something.
test/fuzzer/lit.cfg | ||
---|---|---|
58 | What is the difference between -stdlib=libc++ and -lc++? Why does the original code use -lc++ instead? |
I managed to get this also working with GCC, but while testing the change and I noticed that 3 unit tests are currently failing, all with the same segfault: P8047, P8048, P8049. I'm debugging this now trying to figure what's the problem.
test/fuzzer/lit.cfg | ||
---|---|---|
58 | -stdlib=libc++ is handled by the driver which sets up the header include paths (i.e. the C++ library header path which is different for lib++ and stdlibc++) appropriately and adds all necessary libraries which could be more than just -lc++, e.g. -lc++abi or -lunwind. I don't know why the original code is setting the library explicitly since this should normally be handled by the driver appropriately for each host. |
Turned out the problem was an incompatibility between libc++abi (which libc++ assumes as the default C++ ABI) and libstdc++ (which is what is being used when linking against libstdc++). Setting -DLIBCXX_CXX_ABI=none seems to be the solution. I'm still trying to figure what's the best way to tests against libc++, simply linking them isn't enough because libc++.so.1 is not in the default dynamic linker search path.
I assume because of this the patch is not finished yet?
If I apply the patch, check-fuzzer does not work for me. I see output like following and I don't see test command-line has path to libc++
FAIL: LLVMFuzzer :: fuzzer-customcrossover.test (19 of 147) ******************** TEST 'LLVMFuzzer :: fuzzer-customcrossover.test' FAILED ******************** Script: -- ./bin/clang -std=c++11 -lstdc++ -gline-tables-only -fsanitize=address,fuzzer -Icompiler-rt/lib/fuzzer compiler-rt/test/fuzzer/CustomCrossOverTest.cpp -o projects/compiler-rt/test/fuzzer/Output/fuzzer-customcrossover.test.tmp-CustomCrossOverTest not projects/compiler-rt/test/fuzzer/Output/fuzzer-customcrossover.test.tmp-CustomCrossOverTest -seed=1 -runs=1000000 2>&1 | FileCheck compiler-rt/test/fuzzer/fuzzer-customcrossover.test --check-prefix=CHECK_CO projects/compiler-rt/test/fuzzer/Output/fuzzer-customcrossover.test.tmp-CustomCrossOverTest -seed=1 -runs=1000000 -cross_over=0 2>&1 | FileCheck compiler-rt/test/fuzzer/fuzzer-customcrossover.test --check-prefix=CHECK_NO_CO -- Exit Code: 1 Command Output (stderr): -- compiler-rt/test/fuzzer/fuzzer-customcrossover.test:8:11: error: expected string not found in input CHECK_CO: BINGO ^ <stdin>:9:30: note: scanning from here In LLVMFuzzerCustomCrossover uncaught_exceptions not yet implemented ^ <stdin>:17:142: note: possible intended match here #6 0x45a425 in std::uncaught_exceptions() (projects/compiler-rt/test/fuzzer/Output/fuzzer-customcrossover.test.tmp-CustomCrossOverTest+0x45a425) ^
lib/fuzzer/CMakeLists.txt | ||
---|---|---|
94 | it would be nice if add_custom_libcxx set some variable with include dir | |
98 | alignment of arguments is very unusual. could you make it more consistent with existing code? | |
test/fuzzer/CMakeLists.txt | ||
24–25 | could you please remove loop until we have more than one time? |
Yes, I'm still debugging the test failures. I see the same error, but only when using GCC as a host compiler. There's another issue when using Clang with libc++ as the host C++ library. These errors are related to exceptions implementation which uses non-versioned types unlike the rest of libc++'s implementation which is something that @EricWF mentioned in his comment. I'm trying to figure if I could disable exceptions altogether from the private libc++ since those shouldn't be needed for libFuzzer.
Yes, I'm still debugging the test failures. I see the same error, but only when using GCC as a host compiler. There's another issue when using Clang with libc++ as the host C++ library.
Mine host compiler was clang. libcxx sources were there but I didn't add them into include path. Something like this:
cmake -GNinja -DCMAKE_C_COMPILER=build0/bin/clang -DCMAKE_CXX_COMPILER=build0/bin/clang++ -DLLVM_ENABLE_LLD=ON -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_WERROR=ON -DLLVM_INCLUDE_GO_TESTS=OFF -DLLVM_CCACHE_BUILD=ON -DLLVM_BINUTILS_INCDIR=/usr/include '-DLLVM_ENABLE_PROJECTS=clang;lld;compiler-rt;libcxx;libcxxabi' /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/llvm
So I finally ended up with something that works reliably with both libstdc++ and libc++. I plan on doing a bit more cleanup, but a question is what to do on Darwin and Windows. I'm relying on partial linking. That appears to be supported by ld64, but there doesn't seem be the --whole-archive option that we need as well (unless we upgrade CMake to 3.9). On Windows, I don't know if partial linking is supported at all. However, I also don't know if we even need this on Darwin and Windows rather than simply relying on the host C++ library?
I don't think we need this on OSX.
libFuzzer is not supported on Windows currently at all.
@phosek On darwin ld64 supports the flags -all_load and -force_load which work similarly to --whole-archive. The ld64 manpage has the documentation for those options.
Thanks, that's useful. Do you know what would be an equivalent of objcopy --localize-hidden on Darwin?
I've added another test case to check this work with both statically and dynamically linked libc++ (only if built as part of LLVM). Everything seems to be working fine, PTAL.
I see a lot of these on "ninja check-fuzzer"
******************** TEST 'LLVMFuzzer :: value-profile-set.test' FAILED ******************** Script: -- /usr/local/google/home/vitalybuka/src/llvm.git/out/build-clang-none/./bin/clang -std=c++11 -stdlib=libc++ -lc++ -static-libstdc++ -gline-tables-only -fsanitize=address,fuzzer -I/usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/fuzzer /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/test/fuzzer/FourIndependentBranchesTest.cpp -o /usr/local/google/home/vitalybuka/src/llvm.git/out/build-clang-none/projects/compiler-rt/test/fuzzer/Output/value-profile-set.test.tmp-FourIndependentBranchesTest not /usr/local/google/home/vitalybuka/src/llvm.git/out/build-clang-none/projects/compiler-rt/test/fuzzer/Output/value-profile-set.test.tmp-FourIndependentBranchesTest -seed=1 -use_cmp=0 -use_value_profile=1 -runs=100000000 2>&1 | FileCheck /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/test/fuzzer/value-profile-set.test -- Exit Code: 1 Command Output (stderr): -- clang-6.0: warning: argument unused during compilation: '-static-libstdc++' [-Wunused-command-line-argument] /usr/bin/ld: cannot find -lc++ /usr/bin/ld: cannot find -lc++ clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)
lib/fuzzer/FuzzerInterface.h | ||
---|---|---|
25 | this is header so it may have side effects. could you please use attribute instead of pragma? |
It was configured with:
cmake -GNinja -DCMAKE_C_COMPILER=/usr/local/google/home/vitalybuka/src/llvm.git/out/build/bin//clang -DCMAKE_CXX_COMPILER=/usr/local/google/home/vitalybuka/src/llvm.git/out/build/bin//clang++ -DLLVM_ENABLE_LLD=ON -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_WERROR=OFF -DLLVM_INCLUDE_GO_TESTS=OFF -DLLVM_CCACHE_BUILD=ON -DLLVM_BINUTILS_INCDIR=/usr/include '-DLLVM_ENABLE_PROJECTS=clang;lld;compiler-rt;libcxx;libcxxabi' /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/llvm
test/fuzzer/lit.cfg | ||
---|---|---|
54 | Is this line necessary considering config.environment['LD_LIBRARY_PATH'] is already set above? |
I forgot to include a dependency on libc++ in tests using it, that should be fixed now.
check-fuzzer tests clean for me now with both clang and gcc.
lib/fuzzer/FuzzerInterface.h | ||
---|---|---|
66 | AFAICT, all libFuzzer code puts attributes before the function name. Could you follow that convention please? | |
test/fuzzer/lit.cfg | ||
58 | Now we are specifying both -stdlib=libc++ and -lc++. Why is -lc++ necessary now? |
test/fuzzer/lit.cfg | ||
---|---|---|
58 | Turned the problem is that we're always C compiler even when building C++ code so -stdlib=libc++ isn't sufficient. I've updated the script to also add --driver-mode=g++ in C++ mode so this is no longer needed. |
Vitaly, does this test clean on your end now?
test/fuzzer/lit.cfg | ||
---|---|---|
59 | What does -static-libstdc++ do here? Does it mix libstdc++ with libc++? |
test/fuzzer/lit.cfg | ||
---|---|---|
59 | static-libstdc++ works with both libc++ and libstdc++ (confusingly), the choice of C++ library is controlled by -stdlib= flag. This branch is used to check that libFuzzer can be used with statically linked C++ library (i.e. to ensure that the symbols from the internalized C++ library are properly hidden and localized to don't cause collision). |
It works for me now.
lib/fuzzer/FuzzerMain.cpp | ||
---|---|---|
19 | please remove space after "visibility" |
This change cause cross-compiles to fail if the target compiler can't run on the host machine -- in my case, I'm compiling Linux executables on Darwin, and the new relink step fails.
Perhaps it should be guarded by COMPILER_RT_CAN_EXECUTE_TESTS.
One disadvantage of this change is that the libc++ variant is reconfigured on every invocation of make / ninja, even if there is nothing to do:
[1/3] No force-reconfigure step for 'libcxx_fuzzer_x86_64' [2/3] Performing configure step for 'libcxx_fuzzer_x86_64' -- libcxx_fuzzer_x86_64 configure command succeeded. See also <...>/projects/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/src/libcxx_fuzzer_x86_64-stamp/libcxx_fuzzer_x86_64-configure-*.log [3/3] Performing build step for 'libcxx_fuzzer_x86_64' -- libcxx_fuzzer_x86_64 build command succeeded. See also <...>/projects/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/src/libcxx_fuzzer_x86_64-stamp/libcxx_fuzzer_x86_64-build-*.log
Can we avoid this and have ninja say that there is no work to do?
That's because add_custom_libcxx() calls ExternalProject_Add() directly instead of llvm_ExternalProject_Add(), which already has logic to handle this.
In fact, it looks like this is partial reimplementation of code in llvm/runtimes/CMakeLists.txt. Perhaps that code, mostly runtime_register_target(), could be moved to LLVMExternalProjectUtils.cmake and reused here?
Why is this guarded with "CMAKE_CXX_COMPILER_ID MATCHES Clang"?