This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO, NewPM] Register sanitizers with OptimizerLastPassBuilderHook
AbandonedPublic

Authored by vitalybuka on Feb 10 2021, 2:49 PM.

Details

Summary

Clang needs to register sanitizer at OptimizerLastEPCallback.
However those callbacks are executed on the ThinLTOPostLink and we had
no way to configure that.

OptimizerLastPassBuilderHook let clang to configure PassBuilder in ThinLTO backed.

This fixes sanitizers with distributed ThinLTO. Non distributed is not fixed yet.

Diff Detail

Event Timeline

vitalybuka created this revision.Feb 10 2021, 2:49 PM
vitalybuka requested review of this revision.Feb 10 2021, 2:50 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 10 2021, 2:50 PM
vitalybuka edited the summary of this revision. (Show Details)Feb 10 2021, 2:51 PM
vitalybuka added reviewers: aeubanks, tejohnson.
vitalybuka retitled this revision from [ThinLTO] Add Config::OptPassBuilderHook to [ThinLTO, NewPM] Add Config::OptPassBuilderHook.
vitalybuka edited the summary of this revision. (Show Details)Feb 10 2021, 2:54 PM

Test will follow if you agree with approach.

tejohnson added inline comments.Feb 10 2021, 4:01 PM
clang/lib/CodeGen/BackendUtil.cpp
1542

I can't find where this is defined.

llvm/include/llvm/LTO/Config.h
56

Is this essentially the new PM equivalent to PreCodeGenPassesHook above (since this new hook runs at the end of opt)?
Also, might be good to name this like OptimizerLastPassBuilderHook or something like that to indicate its position within the opt pipeline.

This makes sense to me, we need to add the PassBuilder callbacks in both the prelink/normal and postlink steps, and those are separated.

llvm/include/llvm/LTO/Config.h
56

I think the idea is that this callback adds PassBuilder callbacks in various parts of the the pipeline. Perhaps just PassBuilderHook? It's weird that we only allow one, but I guess we can make it a vector in the future if necessary.

llvm/lib/LTO/LTOBackend.cpp
330–331

this should go in runNewPMPasses(), not runNewPMCustomPasses().

vitalybuka marked an inline comment as done.

update

vitalybuka added inline comments.Feb 11 2021, 3:09 AM
clang/lib/CodeGen/BackendUtil.cpp
1542

to avoid noise I extracted into a separate NFC patch, to avoid, line 1061

llvm/include/llvm/LTO/Config.h
56

Kind of. PreCodeGenPassesHook is codegen and it's Legacy PM only.

56

I guess the point of OptimizerLastPassBuilderHook name is to show that this is not any PassBuilder, but the one used in ::opt()

tejohnson added inline comments.Feb 11 2021, 8:17 AM
clang/lib/CodeGen/BackendUtil.cpp
1066

Some comments around the cases being checked in this if-then-else would be helpful.

1328

I don't understand what you mean by "O0 called optimized"?
Also maybe make it clear that the first sentence goes with the second check?

Also, it isn't clear to me how this is preventing the sanitizers from being added in the ThinLTO pre-link compiles for optimized compiles, as shown in your tests. Unlike regular LTO pre-link optimized builds, which are still getting the sanitizers.

1541

This will add sanitizers for the ThinLTO post-link compiles in a distributed build environment. Does something need to set these up for ThinLTO post-link compiles in the in-process ThinLTO case? In that case thinBackend() is invoked via LTO.cpp through the linker.

llvm/include/llvm/LTO/Config.h
51–53

This one should say NewPM only.

53

This one should say legacy PM only

56

My point was just that they are effectively at the same place (the end of opt is pretty much the beginning of codegen), so I was wondering if it was worth making the legacy and new PM hooks be at the same place. But it probably doesn't matter much.

vitalybuka marked 5 inline comments as done.

update

vitalybuka retitled this revision from [ThinLTO, NewPM] Add Config::OptPassBuilderHook to [ThinLTO, NewPM] Register sanitizers with OptimizerLastPassBuilderHook.Feb 11 2021, 6:33 PM
vitalybuka edited the summary of this revision. (Show Details)
vitalybuka added inline comments.
clang/lib/CodeGen/BackendUtil.cpp
1328

Reworded:
OptimizerLastEPCallbacks is good for anything that not SkipThinLTOPostLink
We need protection only for OptimizerLastEPCallbacks+O0

1541

Yes. And I don't have good idea how to handle that.
Maybe addSanitizer() needs to be moved into
PassBuilder, but not sure how to get CodeGenOpts and LangOpts there.

For now this patch does not brake that case as it's already broken. Maybe we can resolve this later if needed.

llvm/include/llvm/LTO/Config.h
53

Codegen one is invoked on codegen PM which is different from opt PM
For now codegen PM is always legacy

vitalybuka added inline comments.Feb 11 2021, 6:35 PM
clang/lib/CodeGen/BackendUtil.cpp
1328

Reworded:
OptimizerLastEPCallbacks is good for anything that not SkipThinLTOPostLink
We need protection only for OptimizerLastEPCallbacks+O0

We need special protection only against (SkipThinLTOPostLink && O0)

aeubanks added inline comments.Feb 11 2021, 6:39 PM
clang/lib/CodeGen/BackendUtil.cpp
1069–1070

We could fix this. It doesn't make sense for only -O0 to run OptimizerLastEPCallbacks prelink. Would that help simplify this?

vitalybuka added inline comments.Feb 12 2021, 2:03 AM
clang/lib/CodeGen/BackendUtil.cpp
1069–1070

Do you mean move optimizer of -O0 into PostLink as well?
Yes, this would be more consistent.
Still I guess it should a separate patch.

tejohnson added inline comments.Feb 12 2021, 9:48 PM
clang/lib/CodeGen/BackendUtil.cpp
1068

nit: s/provideds/provided/
Also the line wrapping is off.

1069–1070

Or does it make sense to go the other way, i.e. find a way to add the sanitizer passes to the end of the pre-link of ThinLTO like for regular LTO? The ThinLTO pre-link doesn't execute the module optimization pipeline, but presumably at the end of buildThinLTOPreLinkDefaultPipeline we could invoke the OptimizerLastEPCallbacks? That avoids the issue of trying to get this working separately for in-process ThinLTO. And would be more consistent with regular LTO.

1078

You should only have to check PrepareForThinLTO to see if we are in the pre-link.

1080

nit: s/do not triggers/does not trigger/
Also the line wrapping is off.

1330

nit: s/allow to instromen module/allow instrumenting the module/

vitalybuka marked 4 inline comments as done.

fix language

undo CodeGenOpts.ThinLTOIndexFile.empty()

clang/lib/CodeGen/BackendUtil.cpp
1069–1070

I started from PreLink approach but @aeubanks advised the PostLink. At this point I slightly prefer it over PreLink.

Sanitizers In PostLink pros:
+ instrumentation of optimized code produces smaller binary
+ optimizer called after instrumentation may removes some checks and we miss some bugs (It's the case for non-LTO for at least msan. It stops reporting some bugs. Maybe just a bug in msan.)
+ StackSafetyAnalysis supports ThinLTO and can be used to optimize sanitizers.
+ consistent with non-LTO pipeline
+ OptimizerLastEPCallbacks called once per module (I guess not true for full LTO).

Sanitizers In PreLink pros:
+ in-process ThinLTO. does not need special handling
+ Simpler patch
+ Can keep inconsistent -O0 pipeline (not sure why)
+ consistent with regular LTO (but why regular LTO is not consistent with ThinLTO here?)

WDYT if ThinLTO PreLink with sanitizer use buildO0DefaultPipeline?

1078

It needed for clang/test/CodeGen/cspgo-instrumentation_thinlto.c:26:53

vitalybuka added inline comments.Feb 17 2021, 2:17 PM
clang/lib/CodeGen/BackendUtil.cpp
1069–1070

Sanitizers In PostLink pros:
+ instrumentation of optimized code produces smaller binary

Strangely it's an opposite. On our "internal large binary" I see smaller Asan/Msan sizes if we instrument PreLink.
So this pro can be discarded.

At this point I don't have strong preference.

How can we agree which approach to use?

tejohnson added inline comments.Feb 18 2021, 8:32 AM
clang/lib/CodeGen/BackendUtil.cpp
1069–1070

I like the approach of doing the instrumentation in the prelink for now. The consistency with O0 and with regular LTO is a bonus, and the ability to handle in-process ThinLTO is a big benefit.

It may have some benefits as you found with the binary size because we can apply ThinLTO optimizations to the instrumented code.

I'm curious - what is the issue with StackSafetyAnalysis when this is done in the pre link?

vitalybuka added inline comments.Feb 18 2021, 12:42 PM
clang/lib/CodeGen/BackendUtil.cpp
1069–1070

I'm curious - what is the issue with StackSafetyAnalysis when this is done in the pre link?

It's just about efficiency. StackSafetyAnalysis supports ThinLTO and in PostLink it can prove more safe allocas including even cross-module calls. It's good to have, but realistically we may never need this. We don't even yet integrated StackSafetyAnalysis into sanitizers (so far it is used only by SafeStack and MemoryTagging which are codegen passes).

In-process support looks more important.

vitalybuka abandoned this revision.Feb 23 2021, 2:01 PM

in favor of D96320