This is an archive of the discontinued LLVM Phabricator instance.

[libFuzzer] Link libFuzzer's own interceptors when other compiler runtimes are not linked.
ClosedPublic

Authored by dokyungs on Jul 9 2020, 10:34 AM.

Details

Summary

libFuzzer intercepts certain library functions such as memcmp/strcmp by defining weak hooks. Weak hooks, however, are called only when other runtimes such as ASan is linked. This patch defines libFuzzer's own interceptors, which is linked into the libFuzzer executable when other runtimes are not linked, i.e., when -fsanitize=fuzzer is given, but not others.

The patch once landed but was reverted in 8ef9e2bf355d05bc81d8b0fe1e5333eec59a0a91 due to an assertion failure caused by calling an intercepted function, strncmp, while initializing the interceptors in fuzzerInit(). This issue is now fixed by calling internal_strncmp() when the fuzzer has not been initialized yet, instead of recursively calling fuzzerInit() again.

Diff Detail

Event Timeline

dokyungs created this revision.Jul 9 2020, 10:34 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 9 2020, 10:34 AM
Herald added subscribers: Restricted Project, cfe-commits, mgorny. · View Herald Transcript

Seems like the general approach we want.

Could you:

  • Fix the lint warnings
  • Find out why the unit tests failed
  • Add strcmp
  • Modify the memcmp/strcmp unit tests to show that we can solve them with/without ASan

After that I'll take a closer look.

dokyungs updated this revision to Diff 277462.Jul 13 2020, 9:31 AM

Add interceptors for all the functions libFuzzer has a weak interceptor for, and duplicate existing interceptor test cases with new compiler flags (-fno-sanitize=address).

Builtin libfunc optimizations may transform memcmp and strcmp-like functions. To disable such optimizations, -fno-builtin= flag was additionally added in compiling new test cases. FWIW, the original test cases didn't require such flags since other sanitizers including ASan disables those optimizations in their LLVM pass by dropping libfunc attribute in the call instructions.

Builtin libfunc optimizations may transform memcmp and strcmp-like functions. To disable such optimizations, -fno-builtin= flag was additionally added in compiling new test cases. FWIW, the original test cases didn't require such flags since other sanitizers including ASan disables those optimizations in their LLVM pass by dropping libfunc attribute in the call instructions.

It sounds like we need to add -fno-builtin in the clang driver when building with sancov as well. Otherwise, users won't get any benefit of this patch without doing clang++ -fsanitize=fuzzer my_fuzz_target.cpp -fno-builtin-strstr -fno-builtin-strncmp -fno-builtin-strcmp -fno-builtin-memcmp?

Builtin libfunc optimizations may transform memcmp and strcmp-like functions. To disable such optimizations, -fno-builtin= flag was additionally added in compiling new test cases. FWIW, the original test cases didn't require such flags since other sanitizers including ASan disables those optimizations in their LLVM pass by dropping libfunc attribute in the call instructions.

It sounds like we need to add -fno-builtin in the clang driver when building with sancov as well. Otherwise, users won't get any benefit of this patch without doing clang++ -fsanitize=fuzzer my_fuzz_target.cpp -fno-builtin-strstr -fno-builtin-strncmp -fno-builtin-strcmp -fno-builtin-memcmp?

Right. Apparently with -O2 many calls to memcmp-like functions are removed. I just wondered, though, what makes more sense: disabling such optimization when building (i) with sancov, or (ii) with -fsanitize=fuzzer? If we go for (i), would it make sense to do it in the SanitizerCoverage module pass like other sanitizers do? What do you think? Also, can it be addressed in a follow-up patch?

morehouse added a comment.EditedJul 13 2020, 12:12 PM

Builtin libfunc optimizations may transform memcmp and strcmp-like functions. To disable such optimizations, -fno-builtin= flag was additionally added in compiling new test cases. FWIW, the original test cases didn't require such flags since other sanitizers including ASan disables those optimizations in their LLVM pass by dropping libfunc attribute in the call instructions.

It sounds like we need to add -fno-builtin in the clang driver when building with sancov as well. Otherwise, users won't get any benefit of this patch without doing clang++ -fsanitize=fuzzer my_fuzz_target.cpp -fno-builtin-strstr -fno-builtin-strncmp -fno-builtin-strcmp -fno-builtin-memcmp?

Right. Apparently with -O2 many calls to memcmp-like functions are removed. I just wondered, though, what makes more sense: disabling such optimization when building (i) with sancov, or (ii) with -fsanitize=fuzzer? If we go for (i), would it make sense to do it in the SanitizerCoverage module pass like other sanitizers do? What do you think? Also, can it be addressed in a follow-up patch?

My opinion is to make it part of -fsanitize=fuzzer, or maybe disable parts of the builtin optimization pass for functions with the OptForFuzzing attribute.

And I think a follow-up patch is easier to review.

dokyungs updated this revision to Diff 277530.Jul 13 2020, 12:25 PM

Fixed a few LINT warnings by defining some macros that resemble the ones used in other sanitizer interception code.

Right. Apparently with -O2 many calls to memcmp-like functions are removed. I just wondered, though, what makes more sense: disabling such optimization when building (i) with sancov, or (ii) with -fsanitize=fuzzer? If we go for (i), would it make sense to do it in the SanitizerCoverage module pass like other sanitizers do? What do you think? Also, can it be addressed in a follow-up patch?

My opinion is to make it part of -fsanitize=fuzzer, or maybe disable parts of the builtin optimization pass for functions with the OptForFuzzing attribute.

And I think a follow-up patch is easier to review.

Agreed with Matt on all of the above.

morehouse added inline comments.Jul 13 2020, 3:23 PM
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
15

We should include FuzzerDefs.h to use this, not FuzzerBuiltins.h.

26

Any reason not to use uintptr_t instead?

42

Nit: I think we should prefer C++ constructs, so reinterpret_cast<uintptr_t> here.

117

Why extern "C++"? I don't think we want that here.

124

Also why extern "C++" here?

141

Nit: reinterpret_cast<uintptr_t>()

160

Nit: Please add this comment:

} // extern "C"
compiler-rt/test/fuzzer/no-asan-strstr.test
5 ↗(On Diff #277530)

Since these "no-asan" cases are pretty small, can we just add them to the existing *cmp tests?

dokyungs updated this revision to Diff 277620.Jul 13 2020, 5:09 PM

Addressed Matt's comments.

A major change in this round that needs explanation is introduction of FuzzerPlatform.h. Previously I defined strstr and strcasestr with extern "C++" to workaround conflicting definition errors resulting from including <string.h>. But since including it is not necessary when compiling this interceptor module, this patch now separates out platform related macros from FuzzerDef.h into FuzzerPlatform.h, and the module includes FuzzerPlatform.h, not FuzzerDef.h.

dokyungs marked 9 inline comments as done.Jul 13 2020, 5:13 PM
dokyungs added inline comments.
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
117

Removed by not including #include <string>.

124

Removed by not including #include <string>.

dokyungs marked 3 inline comments as done.Jul 13 2020, 5:17 PM
dokyungs updated this revision to Diff 277624.Jul 13 2020, 5:21 PM

strncmp test should include -fno-builtin-strncmp, not -fno-builtin-strcmp

dokyungs updated this revision to Diff 277625.Jul 13 2020, 5:36 PM

Use unique output file name for each subtest, and add no-asan subtest in memcmp64.test

Addressed Matt's comments.

A major change in this round that needs explanation is introduction of FuzzerPlatform.h. Previously I defined strstr and strcasestr with extern "C++" to workaround conflicting definition errors resulting from including <string.h>. But since including it is not necessary when compiling this interceptor module, this patch now separates out platform related macros from FuzzerDef.h into FuzzerPlatform.h, and the module includes FuzzerPlatform.h, not FuzzerDef.h.

What was the conflicting definition error? Does string.h have inline definitions for those functions?

compiler-rt/test/fuzzer/memcmp.test
16

CHECK1 and CHECK2 is unnecessary. Let's use the same CHECK for both.

hctim added inline comments.Jul 14 2020, 9:55 AM
clang/lib/Driver/SanitizerArgs.cpp
244

HWASan doesn't have interceptors - we actually want fuzzer interceptors under needsFuzzer() && needsHwasanRt() (just remove the !needsHwasanRt()).

compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
53

Why not #include <sanitizer/common_interface_defs.h> (and below)?

compiler-rt/lib/fuzzer/FuzzerPlatform.h
1 ↗(On Diff #277625)

Nit - name change

8 ↗(On Diff #277625)

Nit - remove blank comment

compiler-rt/test/fuzzer/memcmp.test
16

No need to have different CHECK's throughout this patchset (in both tests we're checking for the same thing.

UNSUPPORTED: freebsd
RUN: %cpp_compiler %S/MemcmpTest.cpp -o %t-MemcmpTest
RUN: not %run %t-MemcmpTest               -seed=1 -runs=10000000   2>&1 | FileCheck %s

RUN: %cpp_compiler -fno-sanitize=address -fno-builtin-memcmp %S/MemcmpTest.cpp -o %t-NoAsanMemcmpTest
RUN: not %run %t-MemcmpTest               -seed=1 -runs=10000000   2>&1 | FileCheck %s

CHECK: BINGO

Addressed Matt's comments.

A major change in this round that needs explanation is introduction of FuzzerPlatform.h. Previously I defined strstr and strcasestr with extern "C++" to workaround conflicting definition errors resulting from including <string.h>. But since including it is not necessary when compiling this interceptor module, this patch now separates out platform related macros from FuzzerDef.h into FuzzerPlatform.h, and the module includes FuzzerPlatform.h, not FuzzerDef.h.

What was the conflicting definition error? Does string.h have inline definitions for those functions?

I was misled; the error is actually ambiguating new "declarations", not definitions. The exact error message goes like:

error: ambiguating new declaration of ‘char* strcasestr(const char*, const char*)’
  104 | ATTRIBUTE_INTERFACE char *strcasestr(const char *s1, const char *s2) {
      |                           ^~~~~~~~~~
In file included from include/c++/v1/string.h:60,
                 from include/c++/v1/cstring:60,
                 from include/c++/v1/algorithm:641,
                 from include/c++/v1/__string:57,
                 from include/c++/v1/string_view:175,
                 from include/c++/v1/string:506,
                 from FuzzerInterceptors.cpp:14:
/usr/include/string.h:356:26: note: old declaration ‘const char* strcasestr(const char*, const char*)’
  356 | extern "C++" const char *strcasestr (const char *__haystack,
      |                          ^~~~~~~~~~

C++'s declarations of strstr/strcasestr each have two different versions (const v. non const), and neither of them matches C's strstr/strcasestr declarations. So I could either (i) make libFuzzer's declarations of strstr/strcasestr match one of C++ versions (for this reason there was "extern C++ ..."), or (ii) make them match C declarations of strstr/strcasestr and remove C++ declarations by not including string.h. I realized that (ii) is a simpler solution, so I changed the code that way.

dokyungs updated this revision to Diff 277901.Jul 14 2020, 10:54 AM

Addressed comments.

I was misled; the error is actually ambiguating new "declarations", not definitions. The exact error message goes like:

error: ambiguating new declaration of ‘char* strcasestr(const char*, const char*)’
  104 | ATTRIBUTE_INTERFACE char *strcasestr(const char *s1, const char *s2) {
      |                           ^~~~~~~~~~
In file included from include/c++/v1/string.h:60,
                 from include/c++/v1/cstring:60,
                 from include/c++/v1/algorithm:641,
                 from include/c++/v1/__string:57,
                 from include/c++/v1/string_view:175,
                 from include/c++/v1/string:506,
                 from FuzzerInterceptors.cpp:14:
/usr/include/string.h:356:26: note: old declaration ‘const char* strcasestr(const char*, const char*)’
  356 | extern "C++" const char *strcasestr (const char *__haystack,
      |                          ^~~~~~~~~~

C++'s declarations of strstr/strcasestr each have two different versions (const v. non const), and neither of them matches C's strstr/strcasestr declarations. So I could either (i) make libFuzzer's declarations of strstr/strcasestr match one of C++ versions (for this reason there was "extern C++ ..."), or (ii) make them match C declarations of strstr/strcasestr and remove C++ declarations by not including string.h. I realized that (ii) is a simpler solution, so I changed the code that way.

Got it. We only need to intercept the libc version of strstr. We don't care about libc++ since that one just forwards to libc. So I think this is the right approach.

compiler-rt/lib/fuzzer/FuzzerDefs.h
23 ↗(On Diff #277625)

Nit: we should really not include this here. Instead, we should include it directly in any libFuzzer file that needs the platform macros.

Of course, that will add some noise to this patch. Could you splice this change into a separate patch that we can submit first?

dokyungs updated this revision to Diff 277904.Jul 14 2020, 11:04 AM
dokyungs marked 5 inline comments as done.

Use one CHECK for two subtests

dokyungs updated this revision to Diff 277996.Jul 14 2020, 3:13 PM

Addressed comments.

This revision is now accepted and ready to land.Jul 14 2020, 3:20 PM
hctim accepted this revision.Jul 14 2020, 3:29 PM
hctim marked an inline comment as done.

LGTM

compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
53

The comment above is obsolete - please mark as done before closing out the patch.

dokyungs marked an inline comment as done.Jul 14 2020, 4:03 PM

I applied this patch locally and ran the fuzzer tests. I get a segfault:

$ clang++ -fsanitize=fuzzer -g -m32 SimpleHashTest.cpp
$ gdb --args ./a.out -seed=1
...
(gdb) run
...
Program received signal SIGSEGV, Segmentation fault. 
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
#1  0x080800b5 in strstr () at /usr/local/google/home/mascasa/code/llvm-project/compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp:96
#2  0xf7cc7bf5 in __pthread_initialize_minimal () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xf7cc7014 in _init () from /lib/i386-linux-gnu/libpthread.so.0
#4  0x00000055 in ?? ()
#5  0xf7fcc6a0 in ?? ()

It looks like the REAL(strstr) isn't set up before it's called. Could you please take a look?

dokyungs updated this revision to Diff 278341.EditedJul 15 2020, 5:34 PM

Ensure the fuzzer RT module is initialized at the beginning of the interceptors.

Interceptors can be called before __fuzzer_init is called. So I added a check at the beginning of the interceptors, which ensures that __fuzzer_init has been called before proceeding.

morehouse added inline comments.Jul 16 2020, 9:52 AM
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
54

These are in the global namespace, and have C mangling, which is unnecessary. Please either put them in a namespace or make them static.

56

This also doesn't need C mangling.

64

Let's prefer a function rather than a macro for this.

144

Let's use true/false rather than 1/0 for bools.

morehouse added inline comments.Jul 16 2020, 9:55 AM
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
18

Nit: Let's move this down with the other defines.

31

Nit: Can we move the other includes down by this one?

dokyungs updated this revision to Diff 278565.Jul 16 2020, 11:54 AM

Addressed comments.

dokyungs marked 6 inline comments as done.Jul 16 2020, 11:55 AM
morehouse added inline comments.Jul 16 2020, 12:11 PM
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
52

Sorry, one last nit:

If we're preferring LLVM style, let's capitalize variable names.

dokyungs updated this revision to Diff 278589.Jul 16 2020, 1:09 PM

Addressed comments.

dokyungs marked an inline comment as done.Jul 16 2020, 1:10 PM
morehouse accepted this revision.Jul 16 2020, 1:25 PM
This revision was automatically updated to reflect the committed changes.
dokyungs reopened this revision.Jul 17 2020, 1:30 PM
This revision is now accepted and ready to land.Jul 17 2020, 1:30 PM
dokyungs updated this revision to Diff 278892.Jul 17 2020, 1:33 PM

Introduce internal_(memcmp|strncmp|strstr) and use them before interceptors are fully initialized.

dokyungs edited the summary of this revision. (Show Details)Jul 17 2020, 1:44 PM
dokyungs edited the summary of this revision. (Show Details)
morehouse added inline comments.Jul 17 2020, 2:24 PM
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
63

Can we use size_t instead of uintptr_t?

78

Can we use uint8_t and static_cast?

120

I think ensureFuzzerInited is no longer useful here.

compiler-rt/test/fuzzer/CustomAllocatorTest.cpp
15 ↗(On Diff #278892)

Do we need this file? Can we use EmptyTest.cpp instead?

compiler-rt/test/fuzzer/custom-allocator.test
3

Why do we need each of these flags?

compiler-rt/test/fuzzer/memcmp.test
9

Why is the custom allocator test here useful?

dokyungs updated this revision to Diff 278923.Jul 17 2020, 4:03 PM

Addressed comments.

dokyungs updated this revision to Diff 278924.Jul 17 2020, 4:05 PM
dokyungs marked 2 inline comments as done.

Removed CustomAllocatorTest.cpp. Instead, use EmptyTest.cpp.

dokyungs marked 4 inline comments as done.Jul 17 2020, 4:26 PM
dokyungs added inline comments.
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
120

Fixed also in strncmp and strstr interceptors.

compiler-rt/test/fuzzer/custom-allocator.test
3

With all the flags, I designed this test for the recent failure scenario in which tcmalloc calls strncmp (+memcmp/strstr) when the fuzzer interceptor library is linked into the libFuzzer executable.

As such, we need to turn off ASan (-fno-sanitize=address) when building the executable to let the fuzzer interceptor library be linked.

As to the flags used to build the allocator shared library, I wanted to disable ASan and Fuzzer (via -fno-sanitize=all) because allocator libraries are typically not instrumented for OOB/UAF errors or coverage. I also wanted to prevent the compiler from optimizing out our calls to strncmp(+memcmp/strstr) by giving -fno-builtin; calls to these functions must go to the fuzzer interceptor library to comply with the scenario.

compiler-rt/test/fuzzer/memcmp.test
9

To make sure exercise the path where memcmp is called (i) in the calloc context, and (ii) then again in the LLVMFuzzerTestOneInput context. %t-NoAsanCustomAllocatorTest only tests (i), and %t-NoAsanMemcmpTest only tests (ii).

morehouse added inline comments.Jul 20 2020, 9:13 AM
compiler-rt/test/fuzzer/custom-allocator.test
3

Yes, those flags make sense. What about -fPIC %ld_flags_rpath_so1 -O0 -shared?

dokyungs updated this revision to Diff 279946.Jul 22 2020, 2:38 PM
dokyungs marked an inline comment as done.

Introduce internal_strcmp and update tests accordingly.

dokyungs marked 4 inline comments as done.Jul 22 2020, 2:40 PM
dokyungs added inline comments.
compiler-rt/test/fuzzer/custom-allocator.test
3

Removed unnecessary use of a shared library for testing custom allocator. Instead, compile and statically link the allocator into the test.

morehouse added inline comments.Jul 22 2020, 4:28 PM
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
87

Lot's of common code with internal_strncmp. Let's factor it out into a helper function.

dokyungs updated this revision to Diff 279982.Jul 22 2020, 5:33 PM

Introduced a helper function to reduce duplicated code.

dokyungs marked 2 inline comments as done.Jul 22 2020, 5:35 PM
dokyungs added inline comments.
compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
87

Factored it out into a new function: internal_strcmp_strncmp

This revision was automatically updated to reflect the committed changes.
dokyungs marked an inline comment as done.
dmajor added a subscriber: dmajor.Jul 23 2020, 3:19 PM

After this commit, several of our builds are failing with FuzzerInterceptors.cpp:30:10: fatal error: 'sanitizer/common_interface_defs.h' file not found. This is odd because the file certainly seems like it exists. Is there a missing include path somewhere, perhaps?

After this commit, several of our builds are failing with FuzzerInterceptors.cpp:30:10: fatal error: 'sanitizer/common_interface_defs.h' file not found. This is odd because the file certainly seems like it exists. Is there a missing include path somewhere, perhaps?

There seems to be a missing include_directories in CMakeLists.txt. I will send out a patch for review shortly.