This is an archive of the discontinued LLVM Phabricator instance.

[libc] Lay out framework for fuzzing libc functions.
ClosedPublic

Authored by PaulkaToast on Feb 5 2020, 2:12 PM.

Details

Summary

Added fuzzing test for strcpy and some documentation related to fuzzing.
This will be the first step in integrating this with oss-fuzz.

Event Timeline

PaulkaToast created this revision.Feb 5 2020, 2:12 PM
abrachet added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
8

Does oss-fuzz require this to not be mangled?

9–11

No brackets here or the for and its if and also the last if. I think !size might be more common but I don't have a big preference.

15–21

Maybe we will eventually add free standing function templates like those found in <algorithm> so things like this can become cpp::replace(data, data + size, 0, 'a')

16

Capitilize replace

26–28

Is this not assert(strcmp(dest, src)) because you think NDEBUG might be defined for this file?

sivachandra added inline comments.Feb 5 2020, 3:15 PM
libc/fuzzing/string/strcpy_fuzz.cpp
8

Just a few high level comments for now. Might have more later.

Avoid using malloc/memcpy/abort:

PaulkaToast marked 3 inline comments as done.
PaulkaToast added inline comments.Feb 6 2020, 11:45 AM
libc/fuzzing/string/strcpy_fuzz.cpp
8

Yes, LibFuzzer and indirectly oss-fuzz requires symbols to be unmangled.

8

Just to address the first comment. Non-zero returns are reserved by LibFuzzer. The usage to indicate fault is to crash the program.

26–28

oss-fuzz compiles with optimization -o3 enabled. Does NDEBUG get defined with that level of optimization? If it does then assert will not crash the fuzzer as expected.

abrachet added inline comments.Feb 6 2020, 12:07 PM
libc/fuzzing/string/strcpy_fuzz.cpp
26–28

@sivachandra said to avoid abort, so that would mean avoid assert too, we could change this abort to __builtin_trap.

FWIW https://godbolt.org/z/khdC4E

sivachandra added inline comments.Feb 6 2020, 12:18 PM
libc/fuzzing/string/strcpy_fuzz.cpp
26–28

I am not really against using abort but against including stdlib.h. So, we can create an abort wrapper in utils/CPP which allows us to avoid including stdlib.h.

But, if exit is disallowed in a fuzz target, is abort OK? FWIW, libcxx's fuzz tests seem to prefer assert as @abrachet suggested.

Also, asking for my own knowledge: Should one care about correctness in a fuzz test? Correctness is important of course, but is that the goal of a fuzz test?

sivachandra added inline comments.Feb 6 2020, 3:04 PM
libc/fuzzing/string/strcpy_fuzz.cpp
26–28

Just correcting myself: If we conclude we need abort/assert, then we should put their wrappers in a fuzzing specific util-library and not in utils/CPP.

MaskRay added inline comments.Feb 11 2020, 9:59 PM
libc/fuzzing/string/strcpy_fuzz.cpp
13

If malloc returns NULL, return 0, otherwise when the system is under high memory pressure, the code may incorrectly trigger a crash.

14

Placing malloc in the function LLVMFuzzerTestOneInput may make tests run slowly.

26

Braces around a single statement are not common in LLVM code. I think Google code tends to have more braces because:

% cat a.c
int main() {
  if (strcmp(dest, src) != 0)
    abort();
}
% clang-format --style=Google a.c
int main() {
  if (strcmp(dest, src) != 0) abort();
}

Many consider if (...) ... on the same line strange. LLVM style does not have the problem.

PaulkaToast marked 3 inline comments as done.
PaulkaToast added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
14

The test case is rather simple so it runs sufficiently fast about 150k+ executions per second on one of my machine's cores.

Since we cannot modify the fuzzer input data the only alternative would be using a static buffer, however that introduces a size constraint and we could miss a bug with bigger strings.

26

Ah, thank you!

abrachet added inline comments.Feb 13 2020, 2:01 PM
libc/fuzzing/string/strcpy_fuzz.cpp
14

nit: make data const uint8_t* then.

PaulkaToast marked an inline comment as done.Feb 13 2020, 2:07 PM
PaulkaToast added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
26–28

LibFuzzer's documentation makes use of __builtin_trap so I'm replacing abort calls.

As for correctness note, I believe that that there is a usefulness to having correctness also be a goal of the fuzz test. Here are some reason for this practice we might consider?

Removed most dependencies on system libc headers and integrated changes requested by Kostya.

sivachandra accepted this revision.Feb 21 2020, 2:55 PM

I terribly back-logged. I can fix up the target naming scheme issues later. Just one nit to fix and you can land it.

libc/docs/fuzzing.rst
2

Limit to 80-char line widths.

This revision is now accepted and ready to land.Feb 21 2020, 2:55 PM
abrachet added inline comments.Feb 21 2020, 3:18 PM
libc/fuzzing/string/strcpy_fuzz.cpp
2

We would probably want a header comment here? Also please run clang-format

6

validate input -> Validate input. The same for the rest of the comments

17

Couldn't this just be from i = 0 to size?

PaulkaToast marked 4 inline comments as done.
PaulkaToast added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
17

The length of the string that strcpy copies may not be the same as the size of the fuzzing input due to null-terminators appearing at random in data.

abrachet added inline comments.Feb 21 2020, 5:39 PM
libc/fuzzing/string/strcpy_fuzz.cpp
17

Then if it is completely random this if (data[size - 1] != '\0') return 0; will end the test 255/256 times, no?

Also without removing the 0's like before from the input then the average length will be just 256 then. Is this a problem? Or perhaps a better question is it a smaller problem than the previously raised concerned that the extra allocation was too costly?

23

Should we be failing when the system can't allocate memory this isn't __llvm_libc::strcpy's fault.

PaulkaToast marked an inline comment as done.Feb 21 2020, 5:52 PM
PaulkaToast added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
23

This was discussed with Kotsya in an offline meeting. It is extremely unlikely that the system would be out of memory for the relatively small sizes (by default under 4k bytes though oss-fuzz tests with larger inputs). So a failure here probably indicates something going wrong with the memory allocator which is worth catching as well.

PaulkaToast marked an inline comment as done.Feb 21 2020, 6:16 PM
PaulkaToast added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
17

Apologies, it wouldn't be completely random, the fuzzer is coverage guided so it'll learn that null-terminated strings are what we expect and it should then provide that more often. This was explained to me offline and it seems that modifying the input isn't necessarily needed and we should leave it up to the fuzzer.

abrachet accepted this revision.Feb 21 2020, 6:39 PM
abrachet added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
17

I see. Sorry about the ignorance on fuzzers!

PaulkaToast marked an inline comment as done.Feb 21 2020, 7:04 PM
PaulkaToast added inline comments.
libc/fuzzing/string/strcpy_fuzz.cpp
17

No worries, I was ignorant too until it was explained to me. (:

This revision was automatically updated to reflect the committed changes.
sivachandra added inline comments.Feb 21 2020, 9:11 PM
libc/fuzzing/string/strcpy_fuzz.cpp
2

Ah, thanks for catching the missing license header.