This change allows FuzzerMerge to process additional "FT_REL" and "COV_REL" markers for module-relative features and coverage. This allows merging even if the module load order may change between runs.
|1,440 ms||x64 debian > libFuzzer.libFuzzer::merge-control-file.test|
Script: -- : 'RUN: at line 2'; mkdir -p /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/merge-control-file.test.tmp
|1,760 ms||x64 debian > libFuzzer.libFuzzer::merge.test|
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/mnt/disks/ssd0/agent/llvm-project/compiler-rt/lib/fuzzer -m64 /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/fuzzer/FullCoverageSetTest.cpp -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/merge.test.tmp-FullCoverageSetTest
Does this patch affect performance noticeably?
|427 ↗||(On Diff #321773)|
How confident are we that features won't overflow uint32_t and wraparound?
|435 ↗||(On Diff #321773)|
Please explain this calculation with a comment.
|446 ↗||(On Diff #321773)|
I think the bitmask is superfluous, since the cast will drop the top 32 bits anyway
|483 ↗||(On Diff #321773)|
Please update this comment.
Bitmask is superfluous.
This only compares Len bytes, not Len uint32_ts.
The parsing here is confusing/tricky. Please add comments to clarify what's happening.
Aren't both of these cases also "too short"?
How are these "unknown markers"?
Addressed (most of) morehouse's comments. I still need to measure performance impact.
Whoops! Thanks, good catch!
I think Past Me was looking at it as "the header says it's longer than it is", but that's confusing. I've changed it to "Incomplete DSO section" and "DSO length mismatch" and added tests to the latter for both too many and too few features.
Distinguished between unexpected and unknown.
Regarding performance: I added some microbenchmarks in Merger::Parse and Fuzzer::CrashResistantMergeInternalStep around the parts that use ModuleRelativeValues objects. I then ran 1000 iterations of a test based on the first non-empty merge in compiler-rt/test/fuzzer/merge.test.
Roughly speaking, this change did increase the time spent in those loops by around 2x: from an average of 18.2 microseconds to 37.8 for merging, and from 20.9 to 45.7 microseconds for parsing. This seems intuitively right, as this change is adding a map lookup to each set insertion. Overall, for the whole -merge=1 command I measured a slowdown of approximately 7.3%, although I'm less confident of my methodology there. I ran time on a whole for i in $(seq 1000); do ... done kind of thing, and the standard deviation was higher than I'd like. Based on the microbenchmarks, I would have expected something closer to 3% (based on the 166.7 extra microseconds from 6 merges and two parses performed in the test).
All in all, it's a bit higher than I had hoped. Let me know if you feel this is getting to be too expensive. I think the module-relative values are definitely needed (since multiple remote processes may be started and have their module load order interleaved), but maybe it's worth only using them for remote fuzzers, and keep using the current approach for normal fuzzers. Supporting two different formats has some obvious maintenance drawbacks, but I could understand if it were necessary.
I'm playing a bit with a version that introduces new markers: "FT_REL" and "COV_REL". I'll finish that up soon, post it, and collect some more perf numbers.
As for Fork; I'm *really* tempted to simply drop the fork-related changes, and say -fork=1 doesn't work with -remote=1, which would be enforced in D94522. WDYT?
|427 ↗||(On Diff #321773)|
This is now handled by D97992.
I've tried to rewrite these checks using more of C++ style than C. Let me know if it's still confusing.
Fine with me if it reduces performance issues.
Okay; I've reworked this. It doesn't have any dependency on FuzzerModuleRelative now, and distinguishes between "FT/COV" and "FT_REL/COV_REL". It also doesn't touch FuzzerFork, as discussed.
In terms of performance, the smallest of microbenchmarks are kinda of apples-to-oranges due to various loop refactors, etc. Still most steps are +/- 10 microseconds. All told, it adds up to about 199 microseconds added to the average 27 *milliseconds* spent in CrashResistantMerge over 10k iterations (excluding measurements more than 10x the stdev; the outer loop's syscalls led to some aggressive outliers both before and after the change). This represents a performance increase of about 0.7%. I have a few ideas how to possibly squeeze even more; but I suspect we're in the realm of good enough.
Actually, this is wrong: I can't look up a module by a relative feature! I'll upload a new diff to D94512 that includes a ModuleInfoByHash and use that here. I'll also add a comment, and more importantly, add the tests that would have caught this.
Yes, and I'm already guaranteeing that in the unit tests. Thanks. +1