This is an archive of the discontinued LLVM Phabricator instance.

Use PassGate from LLVMContext if any otherwise global one
ClosedPublic

Authored by ebrevnov on Nov 1 2022, 3:10 AM.

Details

Summary

While specific PassGate can already be associated with LLVMContext it is simply ignored by the callback. As a result global one is always used. So the "meat" of that change is at StandardInstrumentations.cpp:771. All other changes are consequences of this change. In particular:

  1. Changed type of the first parameter of 'shouldRunPass' from "Pass *" to "StringRef". The problem is that "Pass" exists for legacy pass manager only. Thus we are unable to call it from new pass manager (via intstrumentation callbacks mechanism).
  2. LLVMContext has to be passed to registerCallbacks now
  3. OptBisect is just a customization of OptPassGate abstraction. Thus getGlobalPassGate has OptPassGate return type, while still returns an instance OptBisect (which is current default pass gate)

Diff Detail

Event Timeline

ebrevnov created this revision.Nov 1 2022, 3:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2022, 3:10 AM
ebrevnov requested review of this revision.Nov 1 2022, 3:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2022, 3:10 AM
ebrevnov edited the summary of this revision. (Show Details)Nov 1 2022, 10:31 PM
nhaehnle accepted this revision.Nov 2 2022, 8:03 AM

Seems pretty reasonable to me.

This revision is now accepted and ready to land.Nov 2 2022, 8:03 AM

ideally we wouldn't have to pass LLVMContext if possible, it makes it easier for users of this API to mess up. the optional FAM parameter to StandardInstrumentations already bothers me

I'd like to understand your motivation for this change to see if there's a nicer way of doing this (maybe there isn't and this patch is the way to go). is it to make -opt-bisect-limit work per-LLVMContext? or a custom downstream OptPassGate?

also the Pass* -> StringRef part of this patch looks like it can be split out into a separate patch

ideally we wouldn't have to pass LLVMContext if possible, it makes it easier for users of this API to mess up. the optional FAM parameter to StandardInstrumentations already bothers me

I agree. Let's see if we can find something better...

I'd like to understand your motivation for this change to see if there's a nicer way of doing this (maybe there isn't and this patch is the way to go). is it to make -opt-bisect-limit work per-LLVMContext? or a custom downstream OptPassGate?

The motivation is to make -opt-bisect-limit work per-LLVMContext' (as it was supposed to work I guess). In our environment, multiple threads are doing compilations in parallel and we want to be able to bisect only some particular ones.

also the Pass* -> StringRef part of this patch looks like it can be split out into a separate patch

Sure, that can be done if requested

It's sort of strange to have a pass gate reference in the LLVMContext and not use it. But you could just have a StandardInstrumentations::setPassGate method to store a reference locally.

It's sort of strange to have a pass gate reference in the LLVMContext and not use it. But you could just have a StandardInstrumentations::setPassGate method to store a reference locally.

IMHO, we should make StandardInstrumentations context aware... Alternative to StandardInstrumentations::setPassGate could be passing LLVMContext during StandardInstrumentations construction.

It's sort of strange to have a pass gate reference in the LLVMContext and not use it. But you could just have a StandardInstrumentations::setPassGate method to store a reference locally.

IMHO, we should make StandardInstrumentations context aware... Alternative to StandardInstrumentations::setPassGate could be passing LLVMContext during StandardInstrumentations construction.

Yeah, the use cases I'm aware of are pretty much tied to an LLVMContext already anyway, so that seems fine to me too.

thanks for the context, agreed that using the LLVMContext's pass gate is what's intended

the cleanest way IMO is to get the IR's LLVMContext (e.g. via unwrapModule(IR, /*Force*/ true).getContext()) and use the pass gate from there, rather than collect the pass gate at the beginning. WDYT of something like https://reviews.llvm.org/D137358?

ebrevnov updated this revision to Diff 473168.Nov 4 2022, 2:10 AM

Integreated "unwrapModule" approach

thanks for the context, agreed that using the LLVMContext's pass gate is what's intended

the cleanest way IMO is to get the IR's LLVMContext (e.g. via unwrapModule(IR, /*Force*/ true).getContext()) and use the pass gate from there, rather than collect the pass gate at the beginning. WDYT of something like https://reviews.llvm.org/D137358?

Totally agree. Just didn't realize we already can access the module. Integrated suggested approach except caching part. I think caching is unnecessary complication in this case.

thanks for the context, agreed that using the LLVMContext's pass gate is what's intended

the cleanest way IMO is to get the IR's LLVMContext (e.g. via unwrapModule(IR, /*Force*/ true).getContext()) and use the pass gate from there, rather than collect the pass gate at the beginning. WDYT of something like https://reviews.llvm.org/D137358?

Totally agree. Just didn't realize we already can access the module. Integrated suggested approach except caching part. I think caching is unnecessary complication in this case.

then you start running into compile time issues: https://llvm-compile-time-tracker.com/compare.php?from=66b830889d9bc4bfc58aa1ea47d1b3a7cbee7c5c&to=b7972184a59c0aa0b634de653be1c1363e980f4c&stat=instructions:u
previously we didn't call OptPassGateInstrumentation::registerCallbacks and run these callbacks before every pass if -opt-bisect-limit was enabled, with this patch we do. caching mitigates a lot of that. but even better is if you can somehow go back to not calling OptPassGateInstrumentation::registerCallbacks when we're not using opt-bisect. With your per-LLVMContext opt-bisect, do you still respect -opt-bisect-limit, or are you manually setting pass gates? if you're still using -opt-bisect-limit, we can go back to checking if it's set and skipping OptPassGateInstrumentation::registerCallbacks and we don't have to worry about caching.

thanks for the context, agreed that using the LLVMContext's pass gate is what's intended

the cleanest way IMO is to get the IR's LLVMContext (e.g. via unwrapModule(IR, /*Force*/ true).getContext()) and use the pass gate from there, rather than collect the pass gate at the beginning. WDYT of something like https://reviews.llvm.org/D137358?

Totally agree. Just didn't realize we already can access the module. Integrated suggested approach except caching part. I think caching is unnecessary complication in this case.

then you start running into compile time issues: https://llvm-compile-time-tracker.com/compare.php?from=66b830889d9bc4bfc58aa1ea47d1b3a7cbee7c5c&to=b7972184a59c0aa0b634de653be1c1363e980f4c&stat=instructions:u
previously we didn't call OptPassGateInstrumentation::registerCallbacks and run these callbacks before every pass if -opt-bisect-limit was enabled, with this patch we do.

That's kind of unexpected (at least for me). Do we known why such lightweight operation noticeably impacts compile time? Is it getting a module from a loop or something else?

caching mitigates a lot of that.

It does but it opens legality concerns. Do we have a guarantee that a module passed to the callback (through IR parameter) hasn't been changed between the invocations? If it can't change then why passing it to each callback invocation (as opposite to setting it once at creation time or later)? Same concerns apply to immutability of OptPassGate itself and it's IsEnabled field.

but even better is if you can somehow go back to not calling OptPassGateInstrumentation::registerCallbacks when we're not using opt-bisect. With your per-LLVMContext opt-bisect, do you still respect -opt-bisect-limit, or > are you manually setting pass gates?

Currently I'm setting it manually/programmatically to LLVMContext which has preference over global on if set.

if you're still using -opt-bisect-limit, we can go back to checking if it's set and skipping OptPassGateInstrumentation::registerCallbacks and we don't have to worry about caching.

In theory, we can introduce another global which will be set to 'true' only if any pass gate in the system is enabled and use that as a filter..... but that looks overcomplication I would really like to avoid.

aeubanks requested changes to this revision.Nov 7 2022, 1:34 PM

chasing pointers can be expensive, and the current compile time regressions are unacceptable so requesting changes in phabricator

perhaps we can only register the callback if either -opt-bisect-limit is set or if some other new cl::opt (-force-pass-gate-instrumentation?), then you'd specify that option when debugging?

This revision now requires changes to proceed.Nov 7 2022, 1:34 PM

chasing pointers can be expensive, and the current compile time regressions are unacceptable so requesting changes in phabricator

perhaps we can only register the callback if either -opt-bisect-limit is set or if some other new cl::opt (-force-pass-gate-instrumentation?), then you'd specify that option when debugging?

That looks a really hacky to me ... I would choose that option as a last resort.
Once realized we can't afford requesting pass gate from the module on each invocation of callback (due to CT) I'm inclined to think that passing LLVMContext (and FAM as well) during construction of StandardInstrumentations is the right way to go. This way we explicitly set expectations that StandardInstrumentations is tied to a single LLVMContext. Are you OK with such approach? If not then I will go with the caching approach (but I think it's less expressive and more complicated).

Gentle ping.

chasing pointers can be expensive, and the current compile time regressions are unacceptable so requesting changes in phabricator

perhaps we can only register the callback if either -opt-bisect-limit is set or if some other new cl::opt (-force-pass-gate-instrumentation?), then you'd specify that option when debugging?

That looks a really hacky to me ... I would choose that option as a last resort.
Once realized we can't afford requesting pass gate from the module on each invocation of callback (due to CT) I'm inclined to think that passing LLVMContext (and FAM as well) during construction of StandardInstrumentations is the right way to go. This way we explicitly set expectations that StandardInstrumentations is tied to a single LLVMContext. Are you OK with such approach? If not then I will go with the caching approach (but I think it's less expressive and more complicated).

I think I'd stlil slightly prefer caching (could cache the results of PassGate.isEnabled instead), but I think I'd be ok with adding LLVMContext as a parameter. We should add it as a required parameter to the StandardInstrumentations constructor though so people don't forget about it. (we should also move the FAM parameter into the constructor, but that's a separate issue)

ebrevnov updated this revision to Diff 475700.Nov 15 2022, 11:50 PM

Pass LLVMContext to the StandardInstrumentations constructor

lg other than one comment + formatting

llvm/include/llvm/Passes/StandardInstrumentations.h
79–80

clang-format?

llvm/lib/Passes/StandardInstrumentations.cpp
770

this should be in OptPassGateInstrumentation::registerCallbacks gating registerShouldRunOptionalPassCallback to avoid an extra callback per pass run

ebrevnov added inline comments.Nov 22 2022, 1:11 AM
llvm/include/llvm/Passes/StandardInstrumentations.h
79–80

clang-format works this way for me...what problem do you see here?

llvm/lib/Passes/StandardInstrumentations.cpp
770

PassGate can be set/remove/changed to LLVMContext at any time using LLVMContext::setOptPassGate() API. Thus it would be invalid to "cache" the state in the registerCallbacks.

aeubanks added inline comments.Nov 22 2022, 11:41 AM
llvm/include/llvm/Passes/StandardInstrumentations.h
79–80

there should be a space in Context(Context){};

I was confused why clang-format wasn't adding the space, eventually figured out it's the unnecessary semicolon haha

llvm/lib/Passes/StandardInstrumentations.cpp
770

again, this causes compile time issues: https://llvm-compile-time-tracker.com/compare.php?from=66b830889d9bc4bfc58aa1ea47d1b3a7cbee7c5c&to=c0e4d153eecc11f1bed41f242e72197c0d5c7baa&stat=instructions:u

I don't think it's reasonable to change the gate while passes are running

(also I had to update BackendUtil.cpp in clang to get it building, you can see that in the precommit bots)

ebrevnov marked an inline comment as done.Nov 22 2022, 10:13 PM
ebrevnov added inline comments.
llvm/include/llvm/Passes/StandardInstrumentations.h
79–80

Ough... fixed.

llvm/lib/Passes/StandardInstrumentations.cpp
770

I thought getting module from "IR" slows down things.... looks like that's something else.
Ok I've moved isEnabled to registerCallbacks.

I don't think it's reasonable to change the gate while passes are running

I don't think either but current API allows doing that...IMHO, we better change the API to prohibit such scenario.

(also I had to update BackendUtil.cpp in clang to get it building, you can see that in the precommit bots)

I see. Fixed. Thanks.

ebrevnov updated this revision to Diff 477446.Nov 23 2022, 3:49 AM
ebrevnov marked an inline comment as done.

Updated

Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 3:49 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
aeubanks accepted this revision.Nov 23 2022, 9:15 AM

the precommit bot is still complaining that flang/lib/Frontend/FrontendActions.cpp needs to be updated

lgtm if the precommit bot doesn't show anything related to this patch

This revision is now accepted and ready to land.Nov 23 2022, 9:15 AM
ebrevnov updated this revision to Diff 477706.Nov 24 2022, 1:55 AM

Fixed build error in flang/lib/Frontend/FrontendActions.cpp

Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2022, 1:55 AM
This revision was landed with ongoing or failed builds.Nov 25 2022, 12:15 AM
This revision was automatically updated to reflect the committed changes.

Late to the party here, but passing an LLVMContext seems really strange from a JIT perspective.
We construct a pipeline that may be run over many modules each of them maybe using a different LLVM Context.

we don't currently support reusing a pipeline so I'm surprised that you're able to share/reuse pipelines without running into any issues

we don't currently support reusing a pipeline so I'm surprised that you're able to share/reuse pipelines without running into any issues

The way I see it, any pipeline state that can't be reused between runs is encapsulated within either the module or the analysis managers, with the analysis managers containing derived state that's expensive to compute about the module. Under that interpretation, the optimization passes are essentially stateless, so there's no reason they can't be run on multiple IR units (and CGSCC/Function/Loop passes must deal with this fact already since there's possibly more than one present per module). I don't know how much penalty there is for requiring passes to clean up after themselves, but I can't imagine that it's a very high cost.

we don't currently support reusing a pipeline so I'm surprised that you're able to share/reuse pipelines without running into any issues

In addition to @pchintalapudi's comment. Reuse of pipelines is something that we (JuliaLang) had come to expect from old PM. This is why opt had a -run-twice option to help flush
out bugs that arose out of the idea that passes would only be used once. @loladiro might remember those discussions.

So when we ported Julia to NewPM I didn't think twice and @pchintalapudi implemented our NewPM usage such that only the AnalysisManager
would be created fresh.

we don't currently support reusing a pipeline so I'm surprised that you're able to share/reuse pipelines without running into any issues

In addition to @pchintalapudi's comment. Reuse of pipelines is something that we (JuliaLang) had come to expect from old PM. This is why opt had a -run-twice option to help flush
out bugs that arose out of the idea that passes would only be used once. @loladiro might remember those discussions.

So when we ported Julia to NewPM I didn't think twice and @pchintalapudi implemented our NewPM usage such that only the AnalysisManager
would be created fresh.

Passes can store state that might break between runs (especially module passes since they typically only run once). I'm not saying it's impossible to reuse a pipeline, just that it's not tested, e.g. -run-twice isn't hooked up to the new PM. There was a discussion about this before somewhere, can't remember where. We'd need a lot more testing before we can claim to support reusing pipelines.

We'd need a lot more testing before we can claim to support reusing pipelines.

Sure, but moving pass pipeline state into the context makes it much harder to do this kind of testing.