This is an archive of the discontinued LLVM Phabricator instance.

[LibFuzzer] Fix some unit test crashes on OSX.
ClosedPublic

Authored by delcypher on Jun 6 2016, 5:47 PM.

Details

Summary

[LibFuzzer] Fix some unit test crashes on OSX.

This fixes the following unit tests:

FuzzerDictionary.ParseOneDictionaryEntry
FuzzerDictionary.ParseDictionaryFile

The issue appears to be mixing non-ASan-ified code (LibFuzzer) and
ASan-ified code (the unittest) as the tests would pass fine if
everything was built with ASan enabled.

I believe the issue is that different implementations of std::vector<>
are being used in LibFuzzer and outside LibFuzzer (in the unittests).
For Libcxx (I've not seen the issue manifest for libstdc++) we can disable
the ASanified std::vector<> by definining the `_LIBCPP_HAS_NO_ASAN` macro.
Doing this fixes the tests on OSX.

Diff Detail

Repository
rL LLVM

Event Timeline

delcypher updated this revision to Diff 59812.Jun 6 2016, 5:47 PM
delcypher retitled this revision from to [LibFuzzer] Fix some unit test crashes on OSX..
delcypher updated this object.
delcypher added reviewers: kcc, aizatsky.
delcypher added subscribers: kcc, aizatsky, zaks.anna and 3 others.
kcc edited edge metadata.Jun 6 2016, 6:03 PM

This is too complex. Please find a simpler solution.

delcypher edited edge metadata.Jun 6 2016, 6:21 PM
delcypher added a subscriber: beanz.
In D21049#450505, @kcc wrote:

This is too complex. Please find a simpler solution.

I could just use if(APPLE) but that isn't quite right as it's really a libcxx issue. It would be a lot simpler though.

@beanz as the resident LLVM CMake expert is there a cleaner way for me to detect in the target compiler is using libcxx? There is the HAVE_CXXABI_H in cmake/config-ix.cmake but I don't think that's really a proper indicator that the target compiler is using libcxx.

kcc added a comment.Jun 6 2016, 6:29 PM

How exactly does it crash?

>>! In D21049#450537, @kcc wrote:

How exactly does it crash?

When running `lib/Fuzzer/test/LLVMFuzzer-Unittest --gtest_filter=FuzzerDictionary.ParseOneDictionaryEntry`

Note: Google Test filter = FuzzerDictionary.ParseOneDictionaryEntry
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from FuzzerDictionary
[ RUN      ] FuzzerDictionary.ParseOneDictionaryEntry
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:357: Failure
Value of: ParseOneDictionaryEntry("\"abc\"", &U)
  Actual: false
Expected: true
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:358: Failure
Value of: Unit({'a', 'b', 'c'})
  Actual: { 'a' (97, 0x61), 'b' (98, 0x62), 'c' (99, 0x63) }
Expected: U
Which is: { '\0' }
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:359: Failure
Value of: ParseOneDictionaryEntry("abc=\"abc\"", &U)
  Actual: false
Expected: true
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:360: Failure
Value of: Unit({'a', 'b', 'c'})
  Actual: { 'a' (97, 0x61), 'b' (98, 0x62), 'c' (99, 0x63) }
Expected: U
Which is: { '\0' }
ASAN:DEADLYSIGNAL
=================================================================
==35158==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000006 (pc 0x0001051f212d bp 0x7fff5ab31e10 sp 0x7fff5ab31db0 T0)
==35158==The signal is caused by a READ memory access.
==35158==Hint: address points to the zero page.
    #0 0x1051f212c in fuzzer::ParseOneDictionaryEntry(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >*) string:1829
    #1 0x1050ffff8 in FuzzerDictionary_ParseOneDictionaryEntry_Test::TestBody() FuzzerUnittest.cpp:366
    #2 0x105198a82 in testing::Test::Run() gtest.cc:2161
    #3 0x10519a1ab in testing::TestInfo::Run() gtest.cc:2309
    #4 0x10519b968 in testing::TestCase::Run() gtest.cc:2416
    #5 0x1051a8658 in testing::internal::UnitTestImpl::RunAllTests() gtest.cc:4207
    #6 0x1051a7ef0 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) gtest.cc:2142
    #7 0x1051a7def in testing::UnitTest::Run() gtest.cc:3841
    #8 0x1051dfc36 in main TestMain.cpp:47
    #9 0x7fff8ce7a5ac in start (libdyld.dylib+0x35ac)
    #10 0x1  (<unknown module>)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV string:1829 in fuzzer::ParseOneDictionaryEntry(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >*)
==35158==ABORTING

In lldb

Process 35225 stopped
* thread #1: tid = 0x785871, 0x000000010012812d LLVMFuzzer-Unittest`fuzzer::ParseOneDictionaryEntry(Str=<unavailable>, U=<unavailable>) + 413 at FuzzerUtil.cpp:194, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x6)
    frame #0: 0x000000010012812d LLVMFuzzer-Unittest`fuzzer::ParseOneDictionaryEntry(Str=<unavailable>, U=<unavailable>) + 413 at FuzzerUtil.cpp:194
   191    assert(L <= R);
   192    for (size_t Pos = L; Pos <= R; Pos++) {
   193      uint8_t V = (uint8_t)Str[Pos];
-> 194      if (!isprint(V) && !isspace(V)) return false;
   195      if (V =='\\') {
   196        // Handle '\\'
   197        if (Pos + 1 <= R && (Str[Pos + 1] == '\\' || Str[Pos + 1] == '"')) {

and attached disassembly. In this case the crash occurs when executing

0x10012812d <+413>: movzbl (%rax,%r13), %edi

At the point of crashing

(lldb) register read rax
     rax = 0x0000000000000001
(lldb) register read r13
     r13 = 0x0000000000000005

@kcc: What I showed is not particularly illuminating (at least to me) but the important things I noticed are that

  • In the other failing unit test (lib/Fuzzer/test/LLVMFuzzer-Unittest --gtest_filter=FuzzerDictionary.ParseDictionaryFile) in that test also the crash is different and the this pointer inside the AssertionResult constructor inside GTest has become 0x0 for some reason.
Note: Google Test filter = FuzzerDictionary.ParseDictionaryFile
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from FuzzerDictionary
[ RUN      ] FuzzerDictionary.ParseDictionaryFile
ParseDictionaryFile: error in line 1
                zzz
ParseDictionaryFile: file does not exist or is empty
ParseDictionaryFile: error in line 4
                abc="abc"
ASAN:DEADLYSIGNAL
=================================================================
==41539==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000107cc5631 bp 0x7fff57f952e0 sp 0x7fff57f952b0 T0)
==41539==The signal is caused by a WRITE memory access.
==41539==Hint: address points to the zero page.
    #0 0x107cc5630 in testing::AssertionResult::AssertionResult(bool) gtest.h:271
    #1 0x107ca17a3 in testing::AssertionResult::AssertionResult(bool) gtest.h:271
    #2 0x107ca75f4 in FuzzerDictionary_ParseDictionaryFile_Test::TestBody() FuzzerUnittest.cpp:388
    #3 0x107d36a82 in testing::Test::Run() gtest.cc:2161
    #4 0x107d381ab in testing::TestInfo::Run() gtest.cc:2309
    #5 0x107d39968 in testing::TestCase::Run() gtest.cc:2416
    #6 0x107d46658 in testing::internal::UnitTestImpl::RunAllTests() gtest.cc:4207
    #7 0x107d45ef0 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) gtest.cc:2142
    #8 0x107d45def in testing::UnitTest::Run() gtest.cc:3841
    #9 0x107d7dc36 in main TestMain.cpp:47
    #10 0x7fff8ce7a5ac in start (libdyld.dylib+0x35ac)
    #11 0x1  (<unknown module>)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV gtest.h:271 in testing::AssertionResult::AssertionResult(bool)
==41539==ABORTING
Abort trap: 6
  • If I changed the CMake code to also build LibFuzzer with ASan the crash disappears
  • If I build the unit test with -D_LIBCPP_HAS_NO_ASAN the issue disappears. This makes me think the issue is likely related to different definitions (@zaks.anna tells me that libcxx has a ASan and non-ASan version std:vector<>) of std::vector<> (and maybe other libcxx data structures) being used inside and outside LibFuzzer.
  • If I force LibFuzzer to be built at -O0 (it's -O1 normally because the CMake code forces -O1 even when the build type is set to Debug) the crash disappears but the tests still fail with garbage data.
Note: Google Test filter = FuzzerDictionary.ParseOneDictionaryEntry
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from FuzzerDictionary
[ RUN      ] FuzzerDictionary.ParseOneDictionaryEntry
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:358: Failure
Value of: Unit({'a', 'b', 'c'})
  Actual: { 'a' (97, 0x61), 'b' (98, 0x62), 'c' (99, 0x63) }
Expected: U
Which is: { '\x90' (144), 'b' (98, 0x62), 'c' (99, 0x63) }
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:360: Failure
Value of: Unit({'a', 'b', 'c'})
  Actual: { 'a' (97, 0x61), 'b' (98, 0x62), 'c' (99, 0x63) }
Expected: U
Which is: { '0' (48, 0x30), '0' (48, 0x30), '0' (48, 0x30) }
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:363: Failure
Value of: Unit({'\\'})
  Actual: { '\\' (92, 0x5C) }
Expected: U
Which is: { '0' (48, 0x30) }
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:365: Failure
Value of: Unit({0xAB})
  Actual: { '\xAB' (171) }
Expected: U
Which is: { '0' (48, 0x30) }
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:367: Failure
Value of: Unit({0xAB, 'z', 0xDE})
  Actual: { '\xAB' (171), 'z' (122, 0x7A), '\xDE' (222) }
Expected: U
Which is: { '0' (48, 0x30), '0' (48, 0x30), '0' (48, 0x30) }
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:369: Failure
Value of: Unit({'#'})
  Actual: { '#' (35, 0x23) }
Expected: U
Which is: { '0' (48, 0x30) }
/Volumes/data/dev/libfuzzer/src/lib/Fuzzer/test/FuzzerUnittest.cpp:371: Failure
Value of: Unit({'"'})
  Actual: { '"' (34, 0x22) }
Expected: U
Which is: { '0' (48, 0x30) }
[  FAILED  ] FuzzerDictionary.ParseOneDictionaryEntry (3 ms)
[----------] 1 test from FuzzerDictionary (3 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (3 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] FuzzerDictionary.ParseOneDictionaryEntry

 1 FAILED TEST
delcypher updated this object.Jun 6 2016, 9:25 PM
kcc added a comment.Jun 6 2016, 9:47 PM

The problem with linking asan + no-asan is this:
https://github.com/google/sanitizers/wiki/AddressSanitizerContainerOverflow#false-positives

I am not aware of any other issue caused by linking asan and no-asan together.
Before fixing this problem I will need to understand it more.
And it is very unlikely that the correct fix is in cmake.
Remember -- the majority of the users are not going to build libFuzzer with cmake
and thus we should not be fixing bugs in cmake, unless they are test-specific.

It is quite likely that we'll need to get rid of STL in libFuzzer completely, for this and for other reasons.
It is even stated in the docs (search for stl in http://llvm.org/docs/LibFuzzer.html), although I am not quite ready to do that yet.

kcc added a comment.Jun 6 2016, 9:48 PM

and, btw, I don't think we should be building the unit tests with asan. Or should we?

The problem with linking asan + no-asan is this:
https://github.com/google/sanitizers/wiki/AddressSanitizerContainerOverflow#false-positives
I am not aware of any other issue caused by linking asan and no-asan together.
Before fixing this problem I will need to understand it more.

This problem is cased by container overflow. However, it does not always result in false positives as the page implies. We've seen several wired crashes due to this problem. These are much more rare and are very hard to reduce but they do exist. The only solution is not to build with container overflow instrumentation (the run time option does not hep). Here is an explanation of why they happen when we mix ASanified and non-ASanified code.

The container overflow instrumentation is all in the headers, guarded by the _LIBCPP_HAS_NO_ASAN macro. It's possible that the instrumented functions do not get fully inlined in both instrumented code and non-instrumented code. This means that the linker will see 2 copies of the vector functions - one is ASan version and one a vanilla version. It will pick one of them (the first one it sees on OS X) and use that one in both instrumented and non-instrumented code. If the one it picks comes from the ASan version of the header and is called from the non-instrumented code, wired things happen. This is what we see here. The only workaround is to set _LIBCPP_HAS_NO_ASAN when building.

I do not know why we saw it more than you, maybe it happens more often on the Mac for some reason. We have a plan for a complex solution that would involve linker changes to better diagnose this issue at build time. However, it's not going to land soon. Another option is to just disable container overflow by default either on the Mac or everywhere.

In D21049#450598, @kcc wrote:

The problem with linking asan + no-asan is this:
https://github.com/google/sanitizers/wiki/AddressSanitizerContainerOverflow#false-positives

I am not aware of any other issue caused by linking asan and no-asan together.
Before fixing this problem I will need to understand it more.
And it is very unlikely that the correct fix is in cmake.
Remember -- the majority of the users are not going to build libFuzzer with cmake
and thus we should not be fixing bugs in cmake, unless they are test-specific.

I think this bug is specific to the unittests. The unittests are calling into libFuzzer internals that use C++ standard library types in their interface. The public facing functions (LLVMFuzzerTestOneInput, LLVMFuzzerInitialize, and LLVMFuzzerCustomMutator) do not as they use plain C types.

In D21049#450599, @kcc wrote:

and, btw, I don't think we should be building the unit tests with asan. Or should we?

Unless in your unittests you need to interact with parts of the runtime provided by compiler-rt and/or need coverage to be tracked during execution of the unit tests then I would say you probably want ASan to ensure that LibFuzzer and the unit tests are built in the same way to avoid mixing ASan-ified and non-ASan-ified code.

I suspect that not building the unittests with ASan would fix this issue but I haven't tested that yet. I've only tried the other way (i.e building both LibFuzzer and the unittests with ASan).

delcypher added a comment.EditedJun 6 2016, 11:47 PM
In D21049#450599, @kcc wrote:

and, btw, I don't think we should be building the unit tests with asan. Or should we?

Unless in your unittests you need to interact with parts of the runtime provided by compiler-rt and/or need coverage to be tracked during execution of the unit tests then I would say you probably want ASan to ensure that LibFuzzer and the unit tests are built in the same way to avoid mixing ASan-ified and non-ASan-ified code.

I suspect that not building the unittests with ASan would fix this issue but I haven't tested that yet. I've only tried the other way (i.e building both LibFuzzer and the unittests with ASan).

Eurgh. Just tried this. It doesn't work because

  • libgtest and LLVMSupport (which libgtest depends on) are built with the ASan so the unittests will fail to link if we just pass -fno-sanitize=all to the compile and link command line for the unittests.
  • The coverage runtime functions are missing too if ASan is not available. These get referenced from libgtest and LLVMSupport.

Changing this would be a pain because LibFuzzer is coupled to LLVM's build system and building LibFuzzer is coupled with building LLVM with ASan and sanitizer coverage.

kcc added a comment.Jun 7 2016, 11:26 AM

Ok... if this is specific to unit tests (at least now) it does not affect the users and thus we can make a change in cmake rules for just the unit tests.
However the current change looks too big for that. Can you simply add _LIBCPP_HAS_NO_ASAN to the unit test code (and will that work?)

Another option is to just disable container overflow by default either on the Mac or everywhere.

Given that how much trouble it causes on Mac (much more than on Linux) I don't mind if we disable it on Mac (only).

In D21049#451336, @kcc wrote:

Ok... if this is specific to unit tests (at least now) it does not affect the users and thus we can make a change in cmake rules for just the unit tests.
However the current change looks too big for that. Can you simply add _LIBCPP_HAS_NO_ASAN to the unit test code (and will that work?)

That will work provided that the _LICPP_HAS_NO_ASAN define causes libstdc++ to behave weirdly. I'll revise this patch and test.

Side note. Using ASAN_OPTIONS=detect_container_overflow=0 when running the unit test does not fix the issue.

delcypher updated this revision to Diff 59971.Jun 7 2016, 4:46 PM
delcypher updated this object.
  • Always pass define when building unit test to simplify CMake code.
kcc added a comment.Jun 7 2016, 4:49 PM

That's way better, but can you try one more step?
Move this into test/FuzzerUnittest.cpp

In D21049#451718, @kcc wrote:

That's way better, but can you try one more step?
Move this into test/FuzzerUnittest.cpp

Putting

#define _LIBCPP_HAS_NO_ASAN

at the top of test/FuzzerUnittest.cpp seems to work. Unfortunately I can't guard that with #if defined(_LIBCPP_VERSION) because that doesn't seem to work unless I've included a header file from libc++ first.

kcc added a comment.Jun 7 2016, 5:02 PM
In D21049#451718, @kcc wrote:

That's way better, but can you try one more step?
Move this into test/FuzzerUnittest.cpp

Putting

#define _LIBCPP_HAS_NO_ASAN

at the top of test/FuzzerUnittest.cpp seems to work. Unfortunately I can't guard that with #if defined(_LIBCPP_VERSION) because that doesn't seem to work unless I've included a header file from libc++ first.

That's fine, I think. just put
#define _LIBCPP_HAS_NO_ASAN // with a comment

In D21049#451734, @kcc wrote:
In D21049#451718, @kcc wrote:

That's way better, but can you try one more step?
Move this into test/FuzzerUnittest.cpp

Putting

#define _LIBCPP_HAS_NO_ASAN

at the top of test/FuzzerUnittest.cpp seems to work. Unfortunately I can't guard that with #if defined(_LIBCPP_VERSION) because that doesn't seem to work unless I've included a header file from libc++ first.

That's fine, I think. just put
#define _LIBCPP_HAS_NO_ASAN // with a comment

Okay. Before going with that solution I need to check another test failure though. The FuzzerMutate.ShuffleBytes2 unit test is failing (and has been since I started working on LibFuzzer) and I haven't looked into why yet. Hopefully that won't be tricky to debug.

delcypher updated this revision to Diff 60306.Jun 9 2016, 9:42 PM
  • Put preprocessor define in FuzzerUnittest.cpp rather than in the CMakeLists.txt file

@kcc: Sorry for the delay. I've been investigating the other failing unit test and I've determined the cause and it is unrelated to the issue that this patch tries to solve so this patch is ready to go (modulo any changes you want to the comments).

kcc accepted this revision.Jun 9 2016, 10:12 PM
kcc edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jun 9 2016, 10:12 PM
This revision was automatically updated to reflect the committed changes.