This is an archive of the discontinued LLVM Phabricator instance.

[SanitizerCoverage] Prevent /OPT:REF from stripping constructors
ClosedPublic

Authored by metzman on Sep 14 2018, 1:03 PM.

Details

Summary

Linking with the /OPT:REF linker flag when building COFF files causes
the linker to strip SanitizerCoverage's constructors. Prevent this by
giving the constructors WeakODR linkage and by passing the linker a
directive to include sancov.module_ctor.

Include a test in compiler-rt to verify libFuzzer can be linked using
/OPT:REF

Diff Detail

Repository
rL LLVM

Event Timeline

metzman created this revision.Sep 14 2018, 1:03 PM
metzman retitled this revision from Don't make sancov module constructor a comdat. Doing so causes the constructor to be stripped if the linker is told to strip unreferenced functions (/OPT:REF) to Don't make sancov module constructor comdat..Sep 14 2018, 1:07 PM
metzman edited the summary of this revision. (Show Details)
metzman edited the summary of this revision. (Show Details)
metzman updated this revision to Diff 165606.Sep 14 2018, 4:00 PM
  • Improve comments.
  • Add test to compiler-rt
metzman updated this revision to Diff 165609.Sep 14 2018, 4:04 PM
  • fix spacing
metzman retitled this revision from Don't make sancov module constructor comdat. to [SanitizerCoverage] Don't make sancov module constructor comdat.Sep 16 2018, 3:12 PM
metzman edited the summary of this revision. (Show Details)
metzman added subscribers: morehouse, rnk.

@morehouse @rnk
Please take a look.
This patch allows libFuzzer targets to be compiled with /OPT:REF and adds a test to ensure this.

morehouse accepted this revision.Sep 17 2018, 9:53 AM

Change is fine with me since Linux behavior is unchanged. @rnk can comment on whether this is the best approach for Windows or not.

This revision is now accepted and ready to land.Sep 17 2018, 9:53 AM

@rnk what do you think of this?
This change does cause many constructor calls to happen (thousands of times for some fuzzers) but there doesn't seem to be a problem with this as libFuzzer ignores the redundant calls.
Also, I'm not worried about a performance penalty from the redundant calls since they are cheap and only happen on program startup.

rnk added a comment.Sep 17 2018, 7:15 PM

I think we need to back up and think about what we're doing where and why.

If the application were fully statically linked, we wouldn't need to do anything in instrumentation code at all, we could just reference __start_*/__stop_* directly from the libfuzzer runtime, and that would contain all of sanitizer coverage data. The reason we have these module constructor calls is to deal with the case where the application is composed of multiple DSOs/DLLs. Each DSO will contain its own guards, counters, and pc arrays, and we can't arrange for them all to be contiguous.

To deal with this, during instrumentation, we inject an initializer (these sancov.module_ctor functions) into every module compiled with coverage instrumentation. We do it instrumentation and not in a compiler-rt library, because we only link the fuzzer RTL into the main executable (I think). By adding this initializer in instrumentation, we can discover every .SCOV table in the whole application.

We only need one registration call per DSO, so the existing code uses comdats to deduplicate them. However, we are using comdats in a non-portable way. We give the symbol internal linkage, and then put it in a comdat group. For that to really be meaningful on COFF, the function needs to have some externally visible linkage, like weak_odr or linkonce_odr. I remember that doesn't work, but can you remind me why? It would require adding /include:sancov.module_ctor (and maybe renaming it to remove the .) to ensure it doesn't get stripped.

But, our whole fancy instrumentation-based comdat-deduplicating registration thing doesn't work because we only provide start_sancov_guards etc in the libfuzzer runtime. We can probably fix that by emitting the __start/__stop symbols in the instrumentation pass, but that starts to feel like a lot of complexity.

If we only care about statically linked apps, I think it would be much simpler to bail out of SanitizerCoverageModule::CreateInitCallsForSections early on COFF targets, and put an initializer in libfuzzer that calls __sanitizer_cov_trace_pc_guard_init(&__start___sancov_guards, ...). Then we don't need to worry about /OPT:REF, comdats, and we can remove the COFF-only logic to add sizeof(uint64_t) __start symbols.

Basically, move the complexity to the runtime, and then we have looser coupling between instrumentation and runtime. The runtime will know the __start symbol is a uint64_t because that's where we define it.

Does that make sense?

metzman planned changes to this revision.Sep 21 2018, 9:09 AM
In D52119#1237797, @rnk wrote:

I think we need to back up and think about what we're doing where and why.

If the application were fully statically linked, we wouldn't need to do anything in instrumentation code at all, we could just reference __start_*/__stop_* directly from the libfuzzer runtime, and that would contain all of sanitizer coverage data. The reason we have these module constructor calls is to deal with the case where the application is composed of multiple DSOs/DLLs. Each DSO will contain its own guards, counters, and pc arrays, and we can't arrange for them all to be contiguous.

Thank you for pointing this out. I didn't think of the benefits for dynamic linking.

To deal with this, during instrumentation, we inject an initializer (these sancov.module_ctor functions) into every module compiled with coverage instrumentation. We do it instrumentation and not in a compiler-rt library, because we only link the fuzzer RTL into the main executable (I think). By adding this initializer in instrumentation, we can discover every .SCOV table in the whole application.

I'm not sure the part about "the main executable" is correct.
I think we link the fuzzer RTL against everything.
Isn't this is how the __sanitizer_coverage_trace_* functions (such as [[ https://cs.chromium.org/chromium/src/third_party/libFuzzer/src/FuzzerTracePC.cpp?q=__sanitizer&sq=package:chromium&g=0&l=473 | __sanitizer_cov_trace_pc_guard ]]) which are defined in the fuzzer RTL, are used in all DSO/modules?
I also think this is why -fsanitize=fuzzer-no-link (which I think is confusingly/incorrectly named, since even when using this, we link against the fuzzer RTL, just not main) exists, without it the linker would be upset because 2 main functions exist, one defined by libFuzzer and one by the application.

We only need one registration call per DSO, so the existing code uses comdats to deduplicate them. However, we are using comdats in a non-portable way. We give the symbol internal linkage, and then put it in a comdat group. For that to really be meaningful on COFF, the function needs to have some externally visible linkage, like weak_odr or linkonce_odr. I remember that doesn't work, but can you remind me why? It would require adding /include:sancov.module_ctor (and maybe renaming it to remove the .) to ensure it doesn't get stripped.

I tried changing the linkage to every other possible value, but each time I compiled a target with /OPT:REF the constructors didn't execute.
I did also separately try adding a #pragma to include sancov.module_ctor in sanitizer_coverage_win_sections.cc but this caused a link failure.

If I recall correctly, I think manually adding !{!"/include:sancov.module_ctor"} to llvm.linker.options in the LLVM IR file of the target I was compiling worked. I assume that injecting this in SanitizerCoverage.cpp would work too.

But, our whole fancy instrumentation-based comdat-deduplicating registration thing doesn't work because we only provide start_sancov_guards etc in the libfuzzer runtime. We can probably fix that by emitting the __start/__stop symbols in the instrumentation pass, but that starts to feel like a lot of complexity.

Based on my possibly wrong understanding, I don't think we can emit these symbols in the pass. I think by having the linker define __start_* and __stop_* we don't need to keep PCs in the PC table for functions that have been stripped by the linker (ie: when the linker strips a function, it removes the elements associated with the function from the PC table and counter array).

If we only care about statically linked apps, I think it would be much simpler to bail out of SanitizerCoverageModule::CreateInitCallsForSections early on COFF targets, and put an initializer in libfuzzer that calls __sanitizer_cov_trace_pc_guard_init(&__start___sancov_guards, ...). Then we don't need to worry about /OPT:REF, comdats, and we can remove the COFF-only logic to add sizeof(uint64_t) __start symbols.

Now that I think about it, I probably do want to support for non-statically linked apps in the future (I think even in static chrome builds some libraries, such as libEGL which provides WebGL, are loaded dynamically).
I think it is fine to ignore them for now, but I'm not sure we should use solutions that will never work for them.

Basically, move the complexity to the runtime, and then we have looser coupling between instrumentation and runtime. The runtime will know the __start symbol is a uint64_t because that's where we define it.

Does that make sense?

I think I understand what you want me to do (eliminate constructors for msvc, emit the symbols in the pass, and then reference them directly from libFuzzer), but I think there are some alternatives that still allow the symbols to be linker defined and don't break dynamic linking further:

  1. My current solution
  2. A solution that allows deduplication but still forces the linker to not strip all of the constructor, possibly as my solution with llvm.linker.options (which I'm actually not sure would allow duplication even if it does make the constructor comdat).

Do you prefer one or the other (or is my understanding wrong and I should make your suggested change of moving the complexity to the runtime)?

rnk added a comment.Oct 2 2018, 10:18 AM

I think I understand what you want me to do (eliminate constructors for msvc, emit the symbols in the pass, and then reference them directly from libFuzzer), but I think there are some alternatives that still allow the symbols to be linker defined and don't break dynamic linking further:

  1. My current solution
  2. A solution that allows deduplication but still forces the linker to not strip all of the constructor, possibly as my solution with llvm.linker.options (which I'm actually not sure would allow duplication even if it does make the constructor comdat).

Do you prefer one or the other (or is my understanding wrong and I should make your suggested change of moving the complexity to the runtime)?

I think we should aim for 2. Adding /include: to llvm.linker.options should work if we also change the linkage of sancov.module_ctor to anything other than internal. The most reasonable linkage is probably weak_odr.

My concern with the first solution, making the constructor not comdat, is that it's going to slow down process startup, which mattered enough on ELF to make it worth doing the constructor deduplication.

If we are linking the sanitizer runtime into every DLL that uses sanitizer coverage, then we could do it in the runtime, but it would diverge significantly from the Linux implementation. If we could make 2 work it's probably best.

metzman updated this revision to Diff 169220.Oct 11 2018, 8:49 AM
  • Improve comment.
  • Add test to compiler-rt
  • improve comment
  • fix spacing
  • Use COMDAT but force include of constructors and give them weak ODR linking
This revision is now accepted and ready to land.Oct 11 2018, 8:49 AM
metzman updated this revision to Diff 169222.Oct 11 2018, 8:54 AM
  • clang-format
metzman updated this revision to Diff 169223.Oct 11 2018, 9:02 AM
  • clang-format

@rnk
Please take another look.

I think we should aim for 2. Adding /include: to llvm.linker.options should work if we also change the linkage of sancov.module_ctor to anything other than internal. The most reasonable linkage is probably weak_odr.

I've changed the code to go with "2". You were right that the linkage needed to change for this to work so I changed it to WeakODR

My concern with the first solution, making the constructor not comdat, is that it's going to slow down process startup, which mattered enough on ELF to make it worth doing the constructor deduplication.

I don't think this is a huge concern (I think startup difference wasn't noticeable to human), but I think my current solution does not have this issue even though the previous one did. I confirmed using print statements that there is only one call to the sancov.module_ctor in targets that had thousands with the previous solution.

If we are linking the sanitizer runtime into every DLL that uses sanitizer coverage, then we could do it in the runtime, but it would diverge significantly from the Linux implementation. If we could make 2 work it's probably best.

I don't actually know what affect this has on dynamic linking, since we are only using static in Chrome right now. I definitely want to support it soon so that others can use LibFuzzer on Windows, but it's certainly a lower priority than getting static builds working perfectly.
I observed the problems with OPT:REF (that this patch fixes) in static builds.

rnk added a comment.Oct 11 2018, 1:15 PM

Thanks! I think this will work well.

llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
310 ↗(On Diff #169223)

Unfortunately, I think the names passed to /include: on 32-bit need to be underscore prefixed. I'd use this code to deal with it:

SmallString<20> IncDirective = "/include:";
Mangler().getNameWithPrefix(PrefixedName, CtorFunc, true);
...MDString::get(*C, IncDirective)..
metzman updated this revision to Diff 169428.Oct 12 2018, 9:27 AM
  • Add support for i386.
metzman marked an inline comment as done.Oct 12 2018, 10:08 AM
metzman added inline comments.
llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
310 ↗(On Diff #169223)

I believe I have fixed this, please let me know what you think.

I haven't tested it because libFuzzer doesn't support builds for i386 (not really sure why).

metzman marked an inline comment as done.Oct 12 2018, 10:10 AM
metzman added inline comments.
llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
310 ↗(On Diff #169223)

I haven't tested it

By this I mean, I haven't tested it works for x86, I did test that it still works for x64.

rnk accepted this revision.Oct 12 2018, 10:58 AM

lgtm

llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
310 ↗(On Diff #169223)

That's fine.

metzman retitled this revision from [SanitizerCoverage] Don't make sancov module constructor comdat to [SanitizerCoverage] Prevent /OPT:REF from stripping constructor.Oct 12 2018, 11:08 AM
metzman edited the summary of this revision. (Show Details)
metzman edited the summary of this revision. (Show Details)
metzman retitled this revision from [SanitizerCoverage] Prevent /OPT:REF from stripping constructor to [SanitizerCoverage] Prevent /OPT:REF from stripping constructors.Oct 12 2018, 11:11 AM
metzman edited the summary of this revision. (Show Details)
This revision was automatically updated to reflect the committed changes.
pcc added a subscriber: pcc.Dec 14 2018, 10:59 PM
pcc added inline comments.
llvm/trunk/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
311

Instead of this you could have done appendToUsed(M, CtorFunc);, we automatically emit /include directives for symbols in llvm.used.
http://llvm-cs.pcc.me.uk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#1553

metzman marked an inline comment as done.Dec 17 2018, 3:11 PM
metzman added inline comments.
llvm/trunk/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
311

That definitely seems cleaner. Thanks!

I'll try this out when I get a chance.