
[Attr] Add "noipa" function attribute
Needs Review · Public

Authored by dblaikie on Wed, Apr 21, 7:25 PM.

Details

Summary

This is mostly inspired by the patch that added willreturn as a
reference point ( 3b77583e95236761d8741fd6df375975a8ca5d83 ).

Inspired by this thread:
https://lists.llvm.org/pipermail/llvm-dev/2021-April/149960.html

Solves PR41474

Diff Detail

Unit Tests: Failed

Time    Test
0 ms    x64 debian > libomptarget.mapping::declare_mapper_nested_default_mappers_array.cpp
Script: -- : 'RUN: at line 1'; echo ignored-command

Event Timeline

dblaikie created this revision. Wed, Apr 21, 7:25 PM
dblaikie requested review of this revision. Wed, Apr 21, 7:25 PM
Herald added a project: Restricted Project. Wed, Apr 21, 7:25 PM

GCC docs: This attribute implies noinline, noclone and no_icf attributes. So for example:

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 6b966e7ca133..a13b1755cedf 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1757,6 +1757,10 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
     // Naked implies noinline: we should not be inlining such functions.
     B.addAttribute(llvm::Attribute::Naked);
     B.addAttribute(llvm::Attribute::NoInline);
+  } else if (D->hasAttr<NoIPAAttr>()) {
+    // NoIPA implies noinline: we should not be inlining such functions.
+    B.addAttribute(llvm::Attribute::NoIPA);
+    B.addAttribute(llvm::Attribute::NoInline);
   } else if (D->hasAttr<NoDuplicateAttr>()) {
     B.addAttribute(llvm::Attribute::NoDuplicate);
   } else if (D->hasAttr<NoInlineAttr>() && !F->hasFnAttribute(llvm::Attribute::AlwaysInline)) {

(just PoC, not tested)

xbolva00 added a comment. Edited Thu, Apr 22, 11:59 AM

Check leaf attribute: https://reviews.llvm.org/D90275

I think you miss similar changes in SemaDeclAttr.cpp and CGCall.cpp and some testcases with

__attribute__((noipa))

I think there's a reasonable argument to be made for keeping the attributes orthogonal - to implement the GCC compatible support in Clang we can always add both attributes in Clang's IRGen.

This also avoids the awkwardness of the optnone-requires-noinline situation (where adding optnone means validation failures until you add noinline too) - or if we made it implied like your patch does - then things get weird on roundtrip (the attribute gets added when parsing the IR? so the output IR is different from the input IR).

Hmm, I guess the naked-implies-noinline code above is a pretty good existence proof if we went that route, though. So probably not the worst design choice. Oh, hmm - if we only add the "implies" when parsing - what happens if someone makes an IR module in-memory via the C++ API? Looks like Clang has to intentionally add both attributes... not especially ergonomic.

(Though that does mean that if we later wanted to separate these ideas it would be difficult, because there could be code adding only Naked without noinline, and we'd now be changing its behavior. At least with optnone-requires-noinline, if we do remove that constraint, existing users won't be adversely affected because they have to add both... well, in theory. I guess that's probably only enforced on reading, so someone who only makes in-memory IR wouldn't see the constraint and would have problems.)

ugh. Yeah, more reasons not to tie attributes together like this, I suspect.
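
To make the trade-off above concrete, here is a minimal self-contained sketch of the two designs. It models attributes as plain strings; `AttrSet`, `verifyRequires`, `parseWithImplies` and `buildInMemory` are hypothetical names for illustration only, not LLVM APIs.

```cpp
#include <set>
#include <string>

// Hypothetical model: a function's attributes as a set of names.
using AttrSet = std::set<std::string>;

// "Requires" design (like optnone-requires-noinline): validation fails
// when optnone is present without noinline.
inline bool verifyRequires(const AttrSet &Attrs) {
  return !Attrs.count("optnone") || Attrs.count("noinline");
}

// "Implies" design applied at parse time: noinline is silently added
// whenever noipa is present, so the printed IR can differ from the input.
inline AttrSet parseWithImplies(AttrSet Attrs) {
  if (Attrs.count("noipa"))
    Attrs.insert("noinline");
  return Attrs;
}

// In-memory construction bypasses the parser, so nothing adds the implied
// attribute; the caller has to remember to add both.
inline AttrSet buildInMemory(AttrSet Attrs) { return Attrs; }
```

In this model the roundtrip problem shows up as `parseWithImplies(X) != X`, and the ergonomics problem as `buildInMemory` returning a set without `noinline`.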

I'm not planning on adding the C noipa attribute to Clang (or at least not planning on doing it in this patch) - generally LLVM and Clang changes should be separated when possible, as they can in this case - the implementation and testing of the LLVM IR attribute can be done without changes to Clang, and should be done that way. Then Clang functionality can be built on top of that work in independent patches.

xbolva00 added a comment. Edited Thu, Apr 22, 12:10 PM

so the output IR is different from the input IR

No, I don't mean the input IR. I mean:

__attribute__((noipa)) int foo(int x) {
    return x * 2;
}

int a(int x) {
    return foo(x);
}

Clang would codegen:

define dso_local i32 @foo(i32 %0) #0 {
  %2 = shl nsw i32 %0, 1
  ret i32 %2
}

attributes #0 = { noipa noinline }

WDYT?

Oh, okay. I can take it then.

Oh, sure - that'll happen in Clang (& I agree it should be done - both to match the GCC behavior, and because it seems like good behavior/likely what the user expects regardless), not in this LLVM patch.
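
A rough sketch of what that frontend-side lowering could look like, kept independent of the LLVM patch. `lowerSourceAttribute` is a hypothetical helper operating on attribute names as strings; it is not Clang's actual IRGen code.

```cpp
#include <set>
#include <string>

// Hypothetical frontend lowering: map one source-level attribute name to
// the IR-level attribute names emitted for it.
inline std::set<std::string> lowerSourceAttribute(const std::string &SourceAttr) {
  if (SourceAttr == "noipa")
    return {"noipa", "noinline"}; // GCC-compatible: the frontend adds both
  return {SourceAttr};            // everything else maps one-to-one here
}
```

With this split the IR attributes stay orthogonal, and the GCC-compatible "noipa implies noinline" behavior lives entirely in the frontend.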

Great! (Sorry for the small confusion.)

The patch itself looks fine

jdoerfert added inline comments. Thu, Apr 22, 1:14 PM
llvm/lib/IR/Globals.cpp
352

The only thing I'm not 100% convinced by is this part. I can see the appeal, but I can also imagine problems in the future. Not everything looking at the linkage might care about noipa. I'd rather introduce a helper that checks noipa and linkage, and maybe also the alwaysinline attribute. Basically,

/// Determine whether the function \p F is IPO amendable.
///
/// If a function is exactly defined, or it has the alwaysinline attribute
/// and is viable to be inlined, we say it is IPO amendable. A noipa
/// function is never IPO amendable.
bool isFunctionIPOAmendable(const Function &F) {
  return !F.hasNoIPA() && (F.hasExactDefinition() || F.isAlwaysInlined());
}

whereas the "always inlined" function does not exist yet.
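
For anyone who wants to poke at the truth table, a self-contained sketch of the same predicate using a stand-in struct instead of llvm::Function. `FunctionModel` and its fields are assumptions for illustration; `hasNoIPA()` and `isAlwaysInlined()` are not existing LLVM methods.

```cpp
// Hypothetical stand-in for llvm::Function; not the real LLVM API.
struct FunctionModel {
  bool NoIPA = false;           // carries the proposed noipa attribute
  bool ExactDefinition = false; // models F.hasExactDefinition()
  bool AlwaysInline = false;    // models the not-yet-existing "always inlined" query
};

// noipa vetoes IPO outright; otherwise the function must be exactly
// defined or always inlined.
inline bool isFunctionIPOAmendable(const FunctionModel &F) {
  return !F.NoIPA && (F.ExactDefinition || F.AlwaysInline);
}
```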

dblaikie added inline comments. Thu, Apr 22, 1:25 PM
llvm/lib/IR/Globals.cpp
352

Yep - I mean this is the guts of it, so the bit that's interesting/debatable from a design perspective (but glad to hear the rest of it/the mechanical stuff is about right)

I worry about implementing this as a separate property and having to update every pass/place that queries this sort of thing. That'll mean likely changing every call to hasExactDefinition (admittedly there are fewer calls than I was expecting, which also sort of worries me a bit - about 20, most of them in FunctionAttrs) and adding new test coverage for every use? & all the future uses that might find hasExactDefinition and think that's enough.

Perhaps we could rename hasExactDefinition to something that would capture this new use case and reduce the chance it'd be used for something unsuitable later?

jdoerfert added inline comments. Thu, Apr 22, 1:42 PM
llvm/lib/IR/Globals.cpp
352

I think "isFunctionIPOAmendable", or similar, makes it much clearer to the observer what is happening. We can also put more things in there: the always_inline logic, a check for naked, etc.
Since any new name/function requires going to all callers, I don't think there is much to gain/lose (assuming we don't overload hasExactDefinition).

dblaikie added inline comments. Thu, Apr 22, 2:26 PM
llvm/lib/IR/Globals.cpp
352

You mean not much to gain/lose regarding whether the function is renamed vs. a new function being added?

Yeah, it's a bit of a stretch on my side, but I feel like renaming and adding some functionality to the function probably doesn't merit adding testing to all callers - but adding a new function and porting the passes over to it, I'd feel compelled to test each pass. How do you feel about either of those perspectives, or some other perspective on the change and testing of it?

jdoerfert added inline comments.
llvm/lib/IR/Globals.cpp
352

Yes, renaming vs. adding a new function seems to make little difference.

I favor a new function as there might be unknown uses downstream, but I'm not opposing renaming the existing one so we can make it do something else/more.

I would create a new helper, select the passes we know should deal with noipa explicitly, move them over to the new helper, and add tests. It is unclear how we could get away with fewer tests if we do something else, I mean without just not testing whether a pass honors noipa now.

Function-attrs (and I can port them to the Attributor) is the top consumer. From reading through the uses of "exact definition" I'd say IPSCCP (via canTrackReturnsInterprocedurally) is another one. @arsenm might want to look at AMDGPUAnnotateKernelFeatures.cpp. PruneEH and DeadArgElim need updates and tests, though that should be easy. I have no idea what ObjCARCAPElim.cpp does, we probably should ping someone. That should cover everything in-tree, I think.

That said, I'm sure we have some volunteers to do some of the porting and testing so it's not all on you.
(Looking at @xbolva00 ;) )

dblaikie added inline comments. Thu, Apr 22, 2:48 PM
llvm/lib/IR/Globals.cpp
352

Yeah, I worry about keeping the existing one in place due to the risk of it being misused out of momentum/existing familiarity, rendering noipa less accurate.

Admittedly partly inspired by the GCC implementation ( https://gcc.gnu.org/legacy-ml/gcc/2016-12/msg00064.html ) that implemented noipa by way of marking such functions as interposable - which I tend to agree with. The idea that we have an existing property that this maps to (though there's reasonable disagreement about how accurately) - now we're adding a new way that that property can be expressed. That doesn't necessitate testing all functionality that depends on the property - only testing that the new expression of the property is working correctly.
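
A simplified sketch of that "new expression of an existing property" idea, assuming a toy model with only three linkage kinds. `GlobalModel`, `Linkage` and both functions here are hypothetical stand-ins; real LLVM distinguishes more linkage types and separates interposable from derefinable.

```cpp
// Toy model with a reduced set of linkage kinds (assumption for brevity).
enum class Linkage { External, Weak, Internal };

struct GlobalModel {
  Linkage L = Linkage::External;
  bool NoIPA = false;         // the proposed attribute
  bool IsDeclaration = false; // true when there is no body
};

// Existing property, extended: noipa is one more way a definition can be
// treated as replaceable.
inline bool isInterposable(const GlobalModel &G) {
  return G.L == Linkage::Weak || G.NoIPA;
}

// Callers keep querying the same property and pick up noipa automatically.
inline bool hasExactDefinition(const GlobalModel &G) {
  return !G.IsDeclaration && !isInterposable(G);
}
```

Under this design only `isInterposable` needs new test coverage; callers of `hasExactDefinition` are unchanged.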

jdoerfert added inline comments. Thu, Apr 22, 5:18 PM
llvm/lib/IR/Globals.cpp
352

Yeah, I worry about keeping the existing one in place due to the risk of it being misused out of momentum/existing familiarity, rendering noipa less accurate

Fair. Maybe we rename it and introduce a new helper ;)

Renaming so downstream users notice, new helper to encapsulate all the "can we do IPA across this call edge" logic without polluting the "is this derefinable" code path.

I generally would value clear naming/design over the desire to "auto-port" existing users to what might be the right thing. While the latter has short-term advantages, it is often painful in the long term, and at some point someone will come along and split the linkage logic and the rest apart, or introduce an argument with a default value, rendering all these thoughts moot. [Side note: I imagine we will actually have to find a way to split it to update our internalization optimization, in which we want to work around derefinable linkage through a TU-internal copy but not around noipa.]

That said, if people really think hiding noipa behind this function is the way to go, I'm not going to block that.
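
The side note about internalization can be illustrated with a small sketch in which the linkage question and the noipa question are kept apart. `FnModel`, `mayBeDerefined`, `canDoIPOAcross` and `makeInternalCopy` are hypothetical names chosen for this example, not the in-tree implementation.

```cpp
// Hypothetical stand-in for a function's relevant properties.
struct FnModel {
  bool DerefinableLinkage = false; // linkage allows replacement at link time
  bool NoIPA = false;              // the proposed attribute
};

// Linkage-only question, kept free of noipa.
inline bool mayBeDerefined(const FnModel &F) { return F.DerefinableLinkage; }

// Call-edge question: both the linkage and noipa matter.
inline bool canDoIPOAcross(const FnModel &Callee) {
  return !mayBeDerefined(Callee) && !Callee.NoIPA;
}

// Internalization makes a TU-internal copy: the linkage restriction goes
// away, but attributes such as noipa are carried over unchanged.
inline FnModel makeInternalCopy(const FnModel &F) {
  FnModel Copy = F;
  Copy.DerefinableLinkage = false;
  return Copy;
}
```

If noipa were folded into the linkage query, the internal copy could not tell the two reasons apart, which is the split the side note anticipates.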

dblaikie added inline comments. Thu, Apr 22, 7:08 PM
llvm/lib/IR/Globals.cpp
352

Actually, looking at this further - maybe it makes sense to actually move it further /down/ rather than up. Closer to how GCC modeled this:

one layer down, at GlobalValue::isInterposable I found a test for the getParent() && getParent()->getSemanticInterposition(), the latter uses module metadata to flag the module as semantically interposable.

The documentation for "isInterposable" seems more accurate to the semantics I want to implement here:

/// Whether the definition of this global may be replaced by something
/// non-equivalent at link time. For example, if a function has weak linkage
/// then the code defining it may be replaced by different code.

(whereas isExactDefinition says "Inlining is okay across non-exact linkage types as long as they're not interposable" and mayBeDerefined says "Returns true if the definition of this global may be replaced by a differently optimized variant of the same source level function at link time." - while the latter is less precise, it still has that worrying "of the same source level function" - which could suggest that a function that may be derefined could still be inlined)

In fact, adding this noipa attribute could supersede the SemanticInterposition module metadata entirely - frontends could put noipa on every function (as they do with optnone at -O0 today). Should we call it something different than noipa, then? Should it be called semantic_interposition?
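
As a sketch of what "superseding the module metadata" could mean mechanically, assuming the simplest possible policy (every function gets the attribute). `ModuleModel`, `FnEntry` and `lowerModuleFlagToAttributes` are hypothetical; this ignores dso_local and LTO merging, both of which the real design would have to address.

```cpp
#include <string>
#include <vector>

// Hypothetical per-function record.
struct FnEntry {
  std::string Name;
  bool NoIPA = false;
};

// Hypothetical module: one module-wide flag plus its functions.
struct ModuleModel {
  bool SemanticInterposition = false; // models the module flag metadata
  std::vector<FnEntry> Functions;
};

// Replace the module-wide flag with a per-function attribute, the way a
// frontend puts optnone on every function at -O0.
inline void lowerModuleFlagToAttributes(ModuleModel &M) {
  if (!M.SemanticInterposition)
    return;
  for (FnEntry &F : M.Functions)
    F.NoIPA = true;
  M.SemanticInterposition = false;
}
```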

jdoerfert added inline comments. Thu, Apr 22, 10:01 PM
llvm/lib/IR/Globals.cpp
352

On first thought I can see us adding an attribute to replace that metadata, it depends on how it is supposed to work in an LTO setting at the end of the day. That said, noipa should exist on its own.

I also still think we should not intertwine two things that are similar but different: "do not perform ipa/ipo" and "the semantics do not allow you to do ipa/ipo, among other things".

Long story short, I'll continue to be in favor of a new canIdoIPO(CallBase/Function) helper which we employ in our passes, though others should chime in if you prefer a solution in which noipa is checked in existing lookup functions.

@MaskRay @serge-sans-paille - you folks have any thoughts on this (see also the specific discussion thread in this review with @JDevlieghere). It looks like this attribute could allow per-function support for "-fsemantic-interposition" that would potentially replace the existing module metadata support for Semantic Interposition, perhaps? Is that feasible, would this be the right behavior? the right design/direction?

(also, I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

(oh, asking because I came across the SemanticInterposition work done in D72829 while looking at where to implement this)

Wouldn't mind some thoughts from the other folks on the original thread, @rnk @mehdi_amini

xbolva00 added a comment. Edited Fri, Apr 23, 2:50 PM

I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

I don't think this is a good step. This would be a very invasive major change to rename optnone. You can always improve the documentation for the attributes to make it clearer.

Eh, mechanically it'd be a big patch, but not a lot of work I'd expect. But yeah - we'll figure that out in another patch if/when it comes to that.
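
For reference, the symmetric naming floated above amounts to two independent restriction bits, with clang's optnone as their union. The sketch below only illustrates that proposal; the names `NoInterOpt`, `NoIntraOpt` and `ClangOptNone` are taken from or modeled on the proposal and do not exist in LLVM.

```cpp
#include <cstdint>

// Hypothetical restriction bits for the proposed naming.
enum OptRestriction : std::uint8_t {
  None = 0,
  NoInterOpt = 1u << 0, // proposed name for noipa
  NoIntraOpt = 1u << 1, // proposed name for today's IR optnone
};

// Clang's optnone would then be spelled as both restrictions together.
constexpr std::uint8_t ClangOptNone = NoInterOpt | NoIntraOpt;

constexpr bool allowsInterproceduralOpt(std::uint8_t R) {
  return !(R & NoInterOpt);
}

constexpr bool allowsIntraproceduralOpt(std::uint8_t R) {
  return !(R & NoIntraOpt);
}
```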

MaskRay added a comment. Edited Sun, Apr 25, 3:03 PM

@MaskRay @serge-sans-paille - you folks have any thoughts on this (see also the specific discussion thread in this review with @JDevlieghere). It looks like this attribute could allow per-function support for "-fsemantic-interposition" that would potentially replace the existing module metadata support for Semantic Interposition, perhaps? Is that feasible, would this be the right behavior? the right design/direction?

(also, I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)

I think the module flag metadata "SemanticInterposition" is more of a workaround for the existing (30+) uses for GlobalValue::isInterposable.
Many probably should respect the proposed LLVM IR noipa (subject to rename) by using a helper function (similar to isDefinitionExact) instead.

If we simply make GlobalValue::isInterposable return false if the global value doesn't have dso_local, we may regress some optimization passes.
Ideally the frontend has set dso_local/dso_preemptable correctly so GlobalValue::isInterposable shouldn't need to check "SemanticInterposition".

Instrumentation passes can create functions on the fly. They are usually internal. If not (I don't know of such a case), a module flag metadata serves the purpose of setting the default dso_local/dso_preemptable.
I don't think such synthesized functions care about dso_local optimizations, so this argument for retaining "SemanticInterposition" is very weak.

llvm/lib/IR/Globals.cpp
329

Sent D101264 to refactor this function a bit.

MaskRay added inline comments. Sun, Apr 25, 3:14 PM
llvm/lib/IR/Globals.cpp
352

+1 for adding a helper named isFunctionIPOAmendable or similar.

"SemanticInterposition" is more of a workaround to not regress existing GlobalValue::isInterposable call sites which might not be appropriate. If the call sites are fixed/confirmed ok, noipa can supersede "SemanticInterposition".

This attribute will be useful :) Thanks for working on it.

I agree with @jdoerfert (https://lists.llvm.org/pipermail/llvm-dev/2021-April/150062.html) that the proposed attribute, noinline, and optnone should be orthogonal: none of them implies the other(s).
If my reading of the long mailing list thread is correct, it is still undecided how the clang attribute should behave.
(Personally I'd hope that the GCC attribute noipa ("This attribute is supported mainly for the purpose of testing the compiler.") were different from the GCC attribute noinline.)

If the clang attribute would end up combining two attributes, we had better make the IR/clang attribute names different.
nointeropt may be a better name anyway because analysis in ipa does not necessarily mean optimization? :)
If our clang attribute is likely to have different semantics (I'd favor orthogonal semantics), noipa would be inappropriate as it will conflict with the GCC semantics.
Naming it something different would be nice.
For the Linux kernel (https://github.com/ClangBuiltLinux/linux/issues/1302), the kernel can use the clang attribute based on __clang__.

xbolva00 edited the summary of this revision. Sun, May 2, 9:16 AM