Added fuzzing test for strcpy and some documentation related to fuzzing.
This will be the first step in integrating this with oss-fuzz.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 47082 Build 49857: arc lint + arc unit
Event Timeline
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
8 | Does oss-fuzz require this to not be mangled? | |
9–11 | No brackets here or the for and its if and also the last if. I think !size might be more common but I don't have a big preference. | |
15–21 | Maybe we will eventually add free standing function templates like those found in <algorithm> so things like this can become cpp::replace(data, data + size, 0, 'a') | |
16 | Capitilize replace | |
26–28 | Is this not assert(strcmp(dest, src)) because you think NDEBUG might be defined for this file? |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
8 | Just a few high level comments for now. Might have more later. Avoid using malloc/memcpy/abort:
|
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
8 | Yes, LibFuzzer and indirectly oss-fuzz requires symbols to be unmangled. | |
8 | Just to address the first comment. Non-zero returns are reserved by LibFuzzer. The usage to indicate fault is to crash the program. | |
26–28 | oss-fuzz compiles with optimization -o3 enabled. Does NDEBUG get defined with that level of optimization? If it does then assert will not crash the fuzzer as expected. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
26–28 | @sivachandra said to avoid abort, so that would mean avoid assert too, we could change this abort to __builtin_trap. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
26–28 | I am not really against using abort but against including stdlib.h. So, we can create an abort wrapper in utils/CPP which allows us to avoid including stdlib.h. But, if exit is disallowed in a fuzz target, is abort OK? FWIW, libcxx's fuzz tests seem to prefer assert as @abrachet suggested. Also, asking for my own knowledge: Should one care about correctness in a fuzz test? Correctness is important of course, but is that the goal of a fuzz test? |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
26–28 | Just correcting myself: If we conclude we need abort/assert, then we should put their wrappers in a fuzzing specific util-library and not in utils/CPP. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
13 | If malloc returns NULL, return 0, otherwise when the system is under high memory pressure, the code may incorrectly trigger a crash. | |
14 | Placing malloc in the function LLVMFuzzerTestOneInput may make tests run slowly. | |
26 | Braces around a single statement are not common in LLVM code. I think Google code tends to have more braces because: % cat a.c int main() { if (strcmp(dest, src) != 0) abort(); } % clang-format --style=Google a.c int main() { if (strcmp(dest, src) != 0) abort(); } Many consider if (...) ... on the same line strange. LLVM style does not have the problem. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
14 | The test case is rather simple so it runs sufficiently fast about 150k+ executions per second on one of my machine's cores. Since we cannot modify the fuzzer input data the only alternative would be using a static buffer, however that introduces a size constraint and we could miss a bug with bigger strings. | |
26 | Ah, thank you! |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
14 | nit: make data const uint8_t* then. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
26–28 | LibFuzzer's documentation makes use of __builtin_trap so I'm replacing abort calls. As for correctness note, I believe that that there is a usefulness to having correctness also be a goal of the fuzz test. Here are some reason for this practice we might consider? |
Removed most dependencies on system libc headers and integrated changes requested by Kostya.
I terribly back-logged. I can fix up the target naming scheme issues later. Just one nit to fix and you can land it.
libc/docs/fuzzing.rst | ||
---|---|---|
2 | Limit to 80-char line widths. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
17 | The length of the string that strcpy copies may not be the same as the size of the fuzzing input due to null-terminators appearing at random in data. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
17 | Then if it is completely random this if (data[size - 1] != '\0') return 0; will end the test 255/256 times, no? Also without removing the 0's like before from the input then the average length will be just 256 then. Is this a problem? Or perhaps a better question is it a smaller problem than the previously raised concerned that the extra allocation was too costly? | |
22 | Should we be failing when the system can't allocate memory this isn't __llvm_libc::strcpy's fault. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
22 | This was discussed with Kotsya in an offline meeting. It is extremely unlikely that the system would be out of memory for the relatively small sizes (by default under 4k bytes though oss-fuzz tests with larger inputs). So a failure here probably indicates something going wrong with the memory allocator which is worth catching as well. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
17 | Apologies, it wouldn't be completely random, the fuzzer is coverage guided so it'll learn that null-terminated strings are what we expect and it should then provide that more often. This was explained to me offline and it seems that modifying the input isn't necessarily needed and we should leave it up to the fuzzer. |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
17 | I see. Sorry about the ignorance on fuzzers! |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
17 | No worries, I was ignorant too until it was explained to me. (: |
libc/fuzzing/string/strcpy_fuzz.cpp | ||
---|---|---|
2 | Ah, thanks for catching the missing license header. |
Limit to 80-char line widths.