This is an archive of the discontinued LLVM Phabricator instance.

[Attr] Add "noipa" function attribute
Needs ReviewPublic

Authored by dblaikie on Apr 21 2021, 7:25 PM.

Details

Summary

This is mostly inspired by the patch that added willreturn as a
reference point ( 3b77583e95236761d8741fd6df375975a8ca5d83 ).

Inspired by this thread:
https://lists.llvm.org/pipermail/llvm-dev/2021-April/149960.html

Solves PR41474

Diff Detail

Event Timeline

dblaikie created this revision.Apr 21 2021, 7:25 PM
dblaikie requested review of this revision.Apr 21 2021, 7:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2021, 7:25 PM

GCC docs: This attribute implies noinline, noclone and no_icf attributes. So for example:

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 6b966e7ca133..a13b1755cedf 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1757,6 +1757,10 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
     // Naked implies noinline: we should not be inlining such functions.
     B.addAttribute(llvm::Attribute::Naked);
     B.addAttribute(llvm::Attribute::NoInline);
+  } else if (D->hasAttr<NoIPAAttr>()) {
+    // NoIPA implies noinline: we should not be inlining such functions.
+    B.addAttribute(llvm::Attribute::NoIPA);
+    B.addAttribute(llvm::Attribute::NoInline);
   } else if (D->hasAttr<NoDuplicateAttr>()) {
     B.addAttribute(llvm::Attribute::NoDuplicate);
   } else if (D->hasAttr<NoInlineAttr>() && !F->hasFnAttribute(llvm::Attribute::AlwaysInline)) {

(just PoC, not tested)

GCC docs: This attribute implies noinline, noclone and no_icf attributes. So for example:

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 6b966e7ca133..a13b1755cedf 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1757,6 +1757,10 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
     // Naked implies noinline: we should not be inlining such functions.
     B.addAttribute(llvm::Attribute::Naked);
     B.addAttribute(llvm::Attribute::NoInline);
+  } else if (D->hasAttr<NoIPAAttr>()) {
+    // NoIPA implies noinline: we should not be inlining such functions.
+    B.addAttribute(llvm::Attribute::NoIPA);
+    B.addAttribute(llvm::Attribute::NoInline);
   } else if (D->hasAttr<NoDuplicateAttr>()) {
     B.addAttribute(llvm::Attribute::NoDuplicate);
   } else if (D->hasAttr<NoInlineAttr>() && !F->hasFnAttribute(llvm::Attribute::AlwaysInline)) {

(just PoC, not tested)

I think there's a reasonable argument to be made for keeping the attributes orthogonal - to implement the GCC compatible support in Clang we can always add both attributes in Clang's IRGen.

xbolva00 added a comment.EditedApr 22 2021, 11:59 AM

Check leaf attribute: https://reviews.llvm.org/D90275

I think you miss similar changes in SemaDeclAttr.cpp and CGCall.cpp and some testcases with

__attribute__((noipa))

GCC docs: This attribute implies noinline, noclone and no_icf attributes. So for example:

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 6b966e7ca133..a13b1755cedf 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1757,6 +1757,10 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
     // Naked implies noinline: we should not be inlining such functions.
     B.addAttribute(llvm::Attribute::Naked);
     B.addAttribute(llvm::Attribute::NoInline);
+  } else if (D->hasAttr<NoIPAAttr>()) {
+    // NoIPA implies noinline: we should not be inlining such functions.
+    B.addAttribute(llvm::Attribute::NoIPA);
+    B.addAttribute(llvm::Attribute::NoInline);
   } else if (D->hasAttr<NoDuplicateAttr>()) {
     B.addAttribute(llvm::Attribute::NoDuplicate);
   } else if (D->hasAttr<NoInlineAttr>() && !F->hasFnAttribute(llvm::Attribute::AlwaysInline)) {

(just PoC, not tested)

I think there's a reasonable argument to be made for keeping the attributes orthogonal - to implement the GCC compatible support in Clang we can always add both attributes in Clang's IRGen.

This also avoids the awkwardness of the optnone-requires-noinline situation (where adding optnone means validation failures until you add noinline too) - or if we made it implied like your patch does - then things get weird on roundtrip (the attribute gets added when parsing the IR? so the output IR is different from the input IR).

Hmm, I guess the naked-implies-noinline code above is a pretty good existence proof if we went that route, though. So probably not the worst design choice. Oh, hmm - if we only add the "implies" when parsing - what happens if someone makes an IR module in-memory via the C++ API? Looks like Clang has to intentionally add both attributes... not especially ergonomic.

(though that does mean if we later wanted to separate these ideas it would be difficult - because there could be code only adding Naked without noinline and now we'd be changing the behavior of that. (at least with the optnone-requires-noinline if we do remove that constraint existing users won't be adversely effected because they have to add both... well, in theory - I guess that's probably only enforced on reading, so if someone makes only in-memory IR they wouldn't see the constraint and they'd have problems)

ugh. Yeah, more reasons not to tie attributes together like this, I suspect.

Check leaf attribute: https://reviews.llvm.org/D90275

I think you miss similar changes in SemaDeclAttr.cpp and CGCall.cpp and some testcases with

__attribute__((noipa))

I'm not planning on adding the C noipa attribute to Clang (or at least not planning on doing it in this patch) - generally LLVM and Clang changes should be separated when possible, as they can in this case - the implementation and testing of the LLVM IR attribute can be done without changes to Clang, and should be done that way. Then Clang functionality can be built on top of that work in independent patches.

xbolva00 added a comment.EditedApr 22 2021, 12:10 PM

so the output IR is different from the input IR

No I dont mean input IR. I mean

int foo(int x) __attribute__((noipa)) {
    return x * 2;
}

int a(int x) {
    return foo(x);
}

Clang would codegen:

define dso_local i32 @foo(i32 %0) #0 {
  %2 = shl nsw i32 %0, 1
  ret i32 %2
}

attributes #0 = { noipa noinline }

WDYT?

Check leaf attribute: https://reviews.llvm.org/D90275

I think you miss similar changes in SemaDeclAttr.cpp and CGCall.cpp and some testcases with

__attribute__((noipa))

I'm not planning on adding the C noipa attribute to Clang (or at least not planning on doing it in this patch) - generally LLVM and Clang changes should be separated when possible, as they can in this case - the implementation and testing of the LLVM IR attribute can be done without changes to Clang, and should be done that way. Then Clang functionality can be built on top of that work in independent patches.

Oh, okay. I can take it then.

so the output IR is different from the input IR

No I dont mean input IR. I mean

int foo(int x) __attribute__((noipa)) {
    return x * 2;
}

int a(int x) {
    return foo(x);
}

Clang would codegen:

define dso_local i32 @foo(i32 %0) #0 {
  %2 = shl nsw i32 %0, 1
  ret i32 %2
}

attributes #0 = { noipa noinline }

WDYT?

Oh, sure - that'll happen in Clang (& I agree it should be done - both to match the GCC behavior, and because it seems like good behavior/likely what the user expects regardless), not in this LLVM patch,.

Great! (Sorry for small confusion)

The patch itself looks fine

jdoerfert added inline comments.Apr 22 2021, 1:14 PM
llvm/lib/IR/Globals.cpp
352

The only thing I'm not 100% convinced by is this part. I can see the appeal, but I can also imagine problems in the future. Not everything looking at the linkage might care about noipa. I'd rather introduce a helper that checks noipa and linkage, and maybe also the alwaysinline attribute. Basically,

/// Determine whether the function \p F is IPO amendable
///
/// If a function is exactly defined or it has alwaysinline attribute
/// and is viable to be inlined, we say it is IPO amendable
bool isFunctionIPOAmendable(const Function &F) {
┊ return !F->hasNoIPA() && (F.hasExactDefinition() || F.isAlwaysInlined());
}

whereas the "always inlined" function does not exist yet.

dblaikie added inline comments.Apr 22 2021, 1:25 PM
llvm/lib/IR/Globals.cpp
352

Yep - I mean this is the guts of it, so the bit that's interesting/debatable from a design perspective (but glad to hear the rest of it/the mechanical stuff is about right)

I worry about implementing this as a separate property and having to update every pass/place that queries this sort of thing. That'll mean likely changing every call to hasExactDefinition (admittedly there are fewer calls than I was expecting, which also sort of worries me a bit - about 20, most of them in FunctionAttrs) and adding new test coverage for every use? & all the future uses that might find hasExactDefinition and think that's enough.

Perhaps we could rename hasExactDefinition to something that would capture this new use case and reduce the chance it'd be used for something unsuitable later?

jdoerfert added inline comments.Apr 22 2021, 1:42 PM
llvm/lib/IR/Globals.cpp
352

I think "isFunctionIPOAmendable", or similar, makes it much clearer to the observer what is happening. We can also put more things in there, the always_inline logic, we can check for naked, etc.
Since any new name/function requires to go to all callers, I don't think there is much to gain/loose (assuming we don't overload hasExactDefinition).

dblaikie added inline comments.Apr 22 2021, 2:26 PM
llvm/lib/IR/Globals.cpp
352

You mean not much to gain/lose regarding whether the function is renamed V a new function is added?

Yeah, it's a bit of a stretch on my side, but I feel like renaming and adding some functionality to the function probably doesn't merit adding testing to al callers - but adding a new function and porting the passes over to it, I'd feel compelled to test each pass. How do you feel about either of those perspectives, or some other perspective on the change and testing of it?

jdoerfert added inline comments.
llvm/lib/IR/Globals.cpp
352

yes, renaming vs new function seems to be little different.

I favor a new function as there might be unknown uses downstream, but I'm not opposing renaming the existing one so we can make it do something else/more.

I would create a new helper, select the passes we know should deal with noipa explicitly, move them over to the new helper, and add tests. It is unclear how we could get away with less tests if we do something else, I mean without just not testing if a pass honors noipa now.

Function-attrs (and I can port them to the Attributor) is the top consumer. From reading through the uses of "exact definition" I'd say IPSCCP (via canTrackReturnsInterprocedurally) is another one. @arsenm might want to look at AMDGPUAnnotateKernelFeatures.cpp. PruneEH and DeadArgElim need updates and tests, though that should be easy. I have no idea what ObjCARCAPElim.cpp does, we probably should ping someone. That should cover everything in-tree, I think.

That said, I'm sure we have some volunteers to do some of the porting testing so it's not all on you.
(Looking at @xbolva00 ;) )

dblaikie added inline comments.Apr 22 2021, 2:48 PM
llvm/lib/IR/Globals.cpp
352

Yeah, I worry about keeping the existing one in place due to the risk of it being misused out of momentum/existing familiarity, rendering noipa less accurate.

Admittedly partly inspired by the GCC implementation ( https://gcc.gnu.org/legacy-ml/gcc/2016-12/msg00064.html ) that implemented noipa by way of marking such functions as interposable - which I tend to agree with. The idea that we have an existing property that this maps to (though there's reasonable disagreement about how accurately) - now we're adding a new way that that property can be expressed. That doesn't necessitate testing all functionality that depends on the property - only testing that the new expression of the property is working correctly.

jdoerfert added inline comments.Apr 22 2021, 5:18 PM
llvm/lib/IR/Globals.cpp
352

Yeah, I worry about keeping the existing one in place due to the risk of it being misused out of momentum/existing familiarity, rendering noipa less accurate

Fair. Maybe we rename it and introduce a new helper ;)

Renaming so downstream users notice, new helper to encapsulate all the "can we do IPA across this call edge" logic without polutting the "is this derefinable" code path.

I generally would value clear naming/design over the desire to "auto-port" existing users to what might be the right thing. While the latter has short term advantages it often is long term painful, and at some point someone will come along and split the linkage logic and the rest apart, or introduce an argument with default value, rendering all these thoughts mood. [Site note: I imagine we will actually have to find a way to split it to update our internalization optimization in which we want to work around derefinable linkage through a TU internal copy but not around noipa.]

That said, if people really think hiding noipa behind this function is the way to go, I'm not going to block that.

dblaikie added inline comments.Apr 22 2021, 7:08 PM
llvm/lib/IR/Globals.cpp
352

Actually, looking at this further - maybe it makes sense to actually move it further /down/ rather than up. Closer to how GCC modeled this:

one layer down, at GlobalValue::isInterposable I found a test for the getParent() && getParent()->getSemanticInterposition(), the latter uses module metadata to flag the module as semantically interposable.

The documentation for "isInterposable" seems more accurate to the semantics I want to implement here:

/// Whether the definition of this global may be replaced by something
/// non-equivalent at link time. For example, if a function has weak linkage
/// then the code defining it may be replaced by different code.

(where's isExactDefinition says "Inlining is okay across non-exact linkage types as long as they're not interposable" and mayBeDerefined says "Returns true if the definition of this global may be replaced by a differently optimized variant of the same source level function at link time." - while the latter is less precise, it still has that worrying "of the same source level function" - which could suggest that a function that may be derefined could still be inlined)

In fact, adding this noipa attribute could supersede the SemanticInterposition module metadata entirely - frontends could put noipa on every function (as they do with optnone at -O0 today). Should we call it something different than noipa, then? Should it be called semantic_interposition?

jdoerfert added inline comments.Apr 22 2021, 10:01 PM
llvm/lib/IR/Globals.cpp
352

On first thought I can see us adding an attribute to replace that metadata, it depends on how it is supposed to work in an LTO setting at the end of the day. That said, noipa should exist on its own.

I also still think we should not intertwine two things that are similar but different: "do not perform ipa/ipo" and "the semantics do not allow you to do ipa/ipo, among other things".

Long story short, I'll continue to be in favor of a new canIdoIPO(CallBase/Function) helper which we employ in our passes, though others should chime in if you prefer a solution in which noipa is checked in existing lookup functions.

@MaskRay @serge-sans-paille - you folks have any thoughts on this (see also the specific discussion thread in this review with @JDevlieghere). It looks like this attribute could allow per-function support for "-fsemantic-interposition" that would potentially replace the existing module metadata support for Semantic Interposition, perhaps? Is that feasible, would this be the right behavior? the right design/direction?

(also, I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

@MaskRay @serge-sans-paille - you folks have any thoughts on this (see also the specific discussion thread in this review with @JDevlieghere). It looks like this attribute could allow per-function support for "-fsemantic-interposition" that would potentially replace the existing module metadata support for Semantic Interposition, perhaps? Is that feasible, would this be the right behavior? the right design/direction?

(also, I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

(oh, asking because I came across the SemanticInterposition work done in D72829 while looking at where to implement this)

Wouldn't mind some thoughts from the other folks on the original thread, @rnk @mehdi_amini

xbolva00 added a comment.EditedApr 23 2021, 2:50 PM

I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

I dont think this is a good step. This would be a very invasive major change to rename optnone. You can always improve the documentation for the attributes to make it more clear..

I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

I dont think this is a good step. This would be a very invasive major change to rename optnone. You can always improve the documentation for the attributes to make it more clear..

Eh, mechanically it'd be a big patch, but not a lot of work I'd expect. But yeah - we'll figure that out in another patch if/when it comes to that.

MaskRay added a comment.EditedApr 25 2021, 3:03 PM

@MaskRay @serge-sans-paille - you folks have any thoughts on this (see also the specific discussion thread in this review with @JDevlieghere). It looks like this attribute could allow per-function support for "-fsemantic-interposition" that would potentially replace the existing module metadata support for Semantic Interposition, perhaps? Is that feasible, would this be the right behavior? the right design/direction?

(also, I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

I think the module flag metadata "SemanticInterposition" is more of a workaround for the existing (30+) uses for GlobalValue::isInterposable.
Many probably should respect the proposed LLVM IR noipa (subject to rename) by using a helper function (similar to isDefinitionExact) instead.

@MaskRay @serge-sans-paille - you folks have any thoughts on this (see also the specific discussion thread in this review with @JDevlieghere). It looks like this attribute could allow per-function support for "-fsemantic-interposition" that would potentially replace the existing module metadata support for Semantic Interposition, perhaps? Is that feasible, would this be the right behavior? the right design/direction?

(also, I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

I think the module flag metadata "SemanticInterposition" is more of a workaround for the existing (30+) uses for GlobalValue::isInterposable.
Many probably should respect the proposed LLVM IR noipa (subject to rename) by using a helper function (similar to isDefinitionExact).
If we simply make GlobalValue::isInterposable return false if the global value doesn't have dso_local, we may regress some optimization passes.
Ideally the frontend has set dso_local/dso_preemptable correctly so GlobalValue::isInterposable shouldn't need to check "SemanticInterposition".

Instrumentation passes can create functions on the fly. They are usually internal. If not (I don't know such a case), a module flag metadata serves as the purpose for setting the default dso_local/dso_preemptable.
I don't think such synthesized functions care about dso_local optimizations so this argument retaining "SemanticInterposition" is very weak.
If we simply make GlobalValue::isInterposable return false if the global value doesn't have dso_local, we may regress some optimization passes.
Ideally the frontend has set dso_local/dso_preemptable correctly so GlobalValue::isInterposable shouldn't need to check "SemanticInterposition".

llvm/lib/IR/Globals.cpp
329

Sent D101264 to refactor this function a bit.

MaskRay added inline comments.Apr 25 2021, 3:14 PM
llvm/lib/IR/Globals.cpp
352

+1 for adding a helper named isFunctionIPOAmendable or similar.

"SemanticInterposition" is more of a workaround to not regress existing GlobalValue::isInterposable call sites which might not be appropriate. If the call sites are fixed/confirmed ok, noipa can supersede "SemanticInterposition".

This attribute will be useful:) Thanks for working on it.

I agree with @jdoerfert (https://lists.llvm.org/pipermail/llvm-dev/2021-April/150062.html) that the proposed attribute, noinline, and optnone should be orthogonal: no one implies the other(s).
If my reading from the long mailing list thread is correct, it is still undecided how the clang attribute should behave.
(Personally I'd hope that the GCC attribute noipa ("This attribute is supported mainly for the purpose of testing the compiler.") were different from the GCC attribute noinline.)

If the clang attribute would end up combining two attributes, we should better make the IR/clang attribute names different.
nointeropt may be a better name anyway because analysis in ipa does not necessarily mean optimization? :)
If our clang attribute is likely to have different semantics (I'd favor orthogonal semantics), noipa would be inappropriate as it will conflict with the GCC semantics.
Naming it sometime different will be nice.
For the Linux kernel https://github.com/ClangBuiltLinux/linux/issues/1302 , the kernel can use the clang attribute based on __clang__

xbolva00 edited the summary of this revision. (Show Details)May 2 2021, 9:16 AM
ychen added a subscriber: ychen.May 15 2021, 11:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2022, 12:51 AM
Herald added a subscriber: ormris. · View Herald Transcript

This attribute will be useful:) Thanks for working on it.

I agree with @jdoerfert (https://lists.llvm.org/pipermail/llvm-dev/2021-April/150062.html) that the proposed attribute, noinline, and optnone should be orthogonal: no one implies the other(s).

At the IR level, I think that's fine/fair.

If my reading from the long mailing list thread is correct, it is still undecided how the clang attribute should behave.

I think Clang optnone -> IR optnone + noipa.

(Personally I'd hope that the GCC attribute noipa ("This attribute is supported mainly for the purpose of testing the compiler.") were different from the GCC attribute noinline.)

Sure - I forget the context, but I don't think there's any reason/need to have noipa and noinline alias each other, they are distinct features. (though, unfortunately, sometimes IPA produces the equivalent of inlining ("hey, this function always returns 5, we can just use 5 at the call site - oh, also the function has no side effects, so we can remove the now-unused call... and we've essentially inlined, even though we didn't use the inliner") - not sure if there are some IPA features we could intentionally break that'd stop LLVM doing "the equivalent of inlining, without using the inliner")

noipa will effectively imply the equivalent of noinline - because it'd be impossible to inline in the absence of a definition. But the inverse isn't true. I don't think we necessarily have to lower a clang-level noipa to an IR noipa+noinline, I think just using noipa would be sufficient to get the non-inlining semantics (again: how could you inline if you can't see the definition of a function).

If the clang attribute would end up combining two attributes, we should better make the IR/clang attribute names different.

Perhaps. Though probably the thing to rename would be optnone, rather than noipa. I think noipa is the right name. I think optnone as a name should/does include noipa. (I think this whole rabbit hole we got down to here is unfortunate and previous understanding of optnone did include noipa-like behavior (I think I've referenced previous patches that were consistent with that understanding & the original author (& myself as one of the reviewers of the feature) have stated that that was the intent of optnone, to also be noipa-like) - but I can appreciate that having distinct attributes can provide greater flexibility)

nointeropt may be a better name anyway because analysis in ipa does not necessarily mean optimization? :)

Eh, I don't think I'd want to go down that path - I think it's important to model this as noipa, as "imagine this function definition were not available at all" (as though it were defined in another translation unit entirely) - no analysis, no optimization, nothing - you can't see the definition at all. It's an easy to explain, robust concept.

If our clang attribute is likely to have different semantics (I'd favor orthogonal semantics), noipa would be inappropriate as it will conflict with the GCC semantics.

I don't think there's any need for our noipa to differ from GCC's semantics - their documentation that says noipa implies noinline can be seen as a description of behavior, not a constraint on how that behavior is achieved. noipa should logically disable inlining as a consequnce of disallowing interprodecural analysis/information.

Re-ping

Sorry I haven't picked this up again. I lost steam when folks suggested that to implement this we could/should revisit/rewrite/retest every call site that checks for interposability... that just doesn't seem productive to me/hard to justify the time to do all that work and doing it partially I think would be really harmful (leaving interposability unaligned/inconsistent with noipa in various subtle ways).

Happy to talk about how we could move this forward perhaps with more hands to do all that work, but I still just don't feel great about that as a direction, really.

xbolva00 added a comment.EditedMar 15 2022, 9:54 AM

I don't think there's any need for our noipa to differ from GCC's semantics - their documentation that says noipa implies noinline can be seen as a description of behavior, not a constraint on how that behavior is achieved. noipa should logically disable inlining as a consequnce of disallowing interprodecural analysis/information.

GCC’s attribute(noipa) (frontend) -> noipa, noinline, noclone, no_icf (midend).

https://github.com/gcc-mirror/gcc/commit/036ea39917b0ef6f07a7c3c3c06002c73fd238f5

Any good reason why not do it same way as well? Avoid hidden logical assumption that noipa is basically noinline + something.

And then llvm optimizers can check - if (attrs.contains (“noipa”) do not propagate constants, etc…

I lost steam when folks suggested that to implement this we could/should revisit/rewrite/retest every call site that checks for interposability... that just doesn't seem productive to me/hard to justify the time to do all that work and doing it partially I think would be really harmful (leaving interposability unaligned/inconsistent with noipa in various subtle ways).

:/ not productive approach to work on one perfect patch.

This work could be split easily. 1) Basic llvm support, 2) clang attribute, 3) complete llvm support.

rnk added a comment.Mar 15 2022, 10:17 AM

I agree, this is useful functionality, we should add it.

I would be OK with an all-in-one non-orthogonal noipa attribute. The only use case I can come up with for orthogonal attributes is for constructing fine-grained compiler test cases, or trying to carefully convince the optimizer to do one transform or another. The original use case seems more important.

I would also like to suggest another use case for the original attribute, which is that this feature supports hotpatching. If you block IPO across a particular function boundary, you can more reliably recompile and hotpatch in new code, without going to great lengths to break up modules into smaller object files and relinking that way.

I don't think there's any need for our noipa to differ from GCC's semantics - their documentation that says noipa implies noinline can be seen as a description of behavior, not a constraint on how that behavior is achieved. noipa should logically disable inlining as a consequnce of disallowing interprodecural analysis/information.

GCC’s attribute(noipa) (frontend) -> noipa, noinline, noclone, no_icf (midend).

That has some appeal, but what about future optimizations? This supports testability (you can block each transform individually), but it requires frontends to keep track of these unpacked attributes that block all interprocedural optimization. And there's a bitcode auto-upgrade problem: if the frontend intended to block all IPO, how to you upgrade that intention? If you ignore that, you can solve the frontend problem with a utility like AttrBuilder::addNoIpaAttrs() which adds all the relevant attributes (noinline, noicf, whatever).

Sorry I haven't picked this up again. I lost steam when folks suggested that to implement this we could/should revisit/rewrite/retest every call site that checks for interposability... that just doesn't seem productive to me/hard to justify the time to do all that work and doing it partially I think would be really harmful (leaving interposability unaligned/inconsistent with noipa in various subtle ways).

Happy to talk about how we could move this forward perhaps with more hands to do all that work, but I still just don't feel great about that as a direction, really.

I'm sympathetic. :) Are other reviewers mostly concerned with the naming of "derefined" here? I think I agree that the fewer states and special cases we allow, the better, so aligning interposability and noipa is appealing to me.

if llvm attribute noipa also adds logical assumption about noinline, we should not then audit all check of noinline whether they should be extended for noipa check, or?

If we clang’s noipa translates to noipa + noinline + …, we dont have this issue.

I don't think there's any need for our noipa to differ from GCC's semantics - their documentation that says noipa implies noinline can be seen as a description of behavior, not a constraint on how that behavior is achieved. noipa should logically disable inlining as a consequnce of disallowing interprodecural analysis/information.

GCC’s attribute(noipa) (frontend) -> noipa, noinline, noclone, no_icf (midend).

https://github.com/gcc-mirror/gcc/commit/036ea39917b0ef6f07a7c3c3c06002c73fd238f5

Any good reason why not do it same way as well? Avoid hidden logical assumption that noipa is basically noinline + something.

Because I don't think it should be necessary - noipa semantics should be strictly more powerful than noinline, so there shouldn't be a need to put noinline on the function.

If noinline ever makes a difference on a noipa function, I think that's a pretty serious bug.

I lost steam when folks suggested that to implement this we could/should revisit/rewrite/retest every call site that checks for interposability... that just doesn't seem productive to me/hard to justify the time to do all that work and doing it partially I think would be really harmful (leaving interposability unaligned/inconsistent with noipa in various subtle ways).

:/ not productive approach to work on one perfect patch.

I don't think the approach of revisiting/retesting all these call sites is the right way to go about things. So I stopped working on this because I couldn't think of/find a way to approach this that addressed both my perspectives and those of reviewers here.

This work could be split easily. 1) Basic llvm support, 2) clang attribute, 3) complete llvm support.

The intermediate state created by pieces of (3) I have two concerns with:

  1. I don't think it's a productive use of time to retest and rewrite every check for interposability
  2. It introduces (certainly initially) divergence between these two properties which I don't think should ever diverge

if llvm attribute noipa also adds logical assumption about noinline, we should not then audit all check of noinline whether they should be extended for noipa check, or?

No, I don't believe we do - if interposability is already well tested, that was the point of leveraging that definition/functionality - noipa should be very similar/the same as interposability. But if we have to revisit every check for interposability and rewrite the check - then, yes, we'd end up adding more test coverage and as a consequence we would end up adding testing for cases where noipa subsumes noinline, probably. (maybe the interposability testing is somewhere else/not in noinline directly - in which case we wouldn't add inlining testing, we'd test that other piece that happens to feed into inlining)

I agree, this is useful functionality, we should add it.

I would be OK with an all-in-one non-orthogonal noipa attribute. The only use case I can come up with for orthogonal attributes is for constructing fine-grained compiler test cases, or trying to carefully convince the optimizer to do one transform or another. The original use case seems more important.

I would also like to suggest another use case for the original attribute, which is that this feature supports hotpatching. If you block IPO across a particular function boundary, you can more reliably recompile and hotpatch in new code, without going to great lengths to break up modules into smaller object files and relinking that way.

Ah, indeed.

I don't think there's any need for our noipa to differ from GCC's semantics - their documentation that says noipa implies noinline can be seen as a description of behavior, not a constraint on how that behavior is achieved. noipa should logically disable inlining as a consequnce of disallowing interprodecural analysis/information.

GCC’s attribute(noipa) (frontend) -> noipa, noinline, noclone, no_icf (midend).

That has some appeal, but what about future optimizations? This supports testability (you can block each transform individually), but it requires frontends to keep track of these unpacked attributes that block all interprocedural optimization. And there's a bitcode auto-upgrade problem: if the frontend intended to block all IPO, how to you upgrade that intention? If you ignore that, you can solve the frontend problem with a utility like AttrBuilder::addNoIpaAttrs() which adds all the relevant attributes (noinline, noicf, whatever).

I'm just not sure it makes sense to decompose these - like I don't know what noipa without noinline or no_icf would mean - if we're saying it's invalid IR /not/ to combine them, then I'm still confused by how we could/would implement noipa that wouldn't implicitly subsume toinline/no_icf functionality anyway. (if there was an implementation of noipa that didn't implicitly subsume the functionality of noinline/no_icf, then I'd consider that feature brittle/buggy in a way I don't think it could/should be - if noipa is tested at the same level as interposable, it should flow through inlining and icf naturally without the need for specific attributes to disable them)

lebedev.ri resigned from this revision.Jan 12 2023, 5:30 PM

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.

dblaikie updated this revision to Diff 557848.Oct 23 2023, 11:50 AM

Rename mayBeDerefined to mayBeDerefinedOrNoIPA