This is mostly inspired by the patch that added willreturn as a
reference point (3b77583e95236761d8741fd6df375975a8ca5d83).
Inspired by this thread:
https://lists.llvm.org/pipermail/llvm-dev/2021-April/149960.html
Solves PR41474
Differential D101011
[Attr] Add "noipa" function attribute
Authored by dblaikie on Apr 21 2021, 7:25 PM.
Comment Actions
GCC docs: This attribute implies noinline, noclone and no_icf attributes.
So for example:
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 6b966e7ca133..a13b1755cedf 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1757,6 +1757,10 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
     // Naked implies noinline: we should not be inlining such functions.
     B.addAttribute(llvm::Attribute::Naked);
     B.addAttribute(llvm::Attribute::NoInline);
+  } else if (D->hasAttr<NoIPAAttr>()) {
+    // NoIPA implies noinline: we should not be inlining such functions.
+    B.addAttribute(llvm::Attribute::NoIPA);
+    B.addAttribute(llvm::Attribute::NoInline);
   } else if (D->hasAttr<NoDuplicateAttr>()) {
     B.addAttribute(llvm::Attribute::NoDuplicate);
   } else if (D->hasAttr<NoInlineAttr>() && !F->hasFnAttribute(llvm::Attribute::AlwaysInline)) {
(just PoC, not tested)
Comment Actions
I think there's a reasonable argument to be made for keeping the attributes orthogonal - to implement the GCC compatible support in Clang we can always add both attributes in Clang's IRGen.
Comment Actions
Check leaf attribute: https://reviews.llvm.org/D90275
I think you miss similar changes in SemaDeclAttr.cpp and CGCall.cpp and some testcases with __attribute__((noipa))
Comment Actions
This also avoids the awkwardness of the optnone-requires-noinline situation (where adding optnone means validation failures until you add noinline too) - or if we made it implied like your patch does - then things get weird on roundtrip (the attribute gets added when parsing the IR? so the output IR is different from the input IR). Hmm, I guess the naked-implies-noinline code above is a pretty good existence proof if we went that route, though. So probably not the worst design choice.
Oh, hmm - if we only add the "implies" when parsing - what happens if someone makes an IR module in-memory via the C++ API? Looks like Clang has to intentionally add both attributes... not especially ergonomic. (Though that does mean if we later wanted to separate these ideas it would be difficult - because there could be code only adding Naked without noinline and now we'd be changing the behavior of that.) (At least with the optnone-requires-noinline, if we do remove that constraint existing users won't be adversely affected because they have to add both... well, in theory - I guess that's probably only enforced on reading, so if someone makes only in-memory IR they wouldn't see the constraint and they'd have problems.) Ugh. Yeah, more reasons not to tie attributes together like this, I suspect.
Comment Actions
I'm not planning on adding the C noipa attribute to Clang (or at least not planning on doing it in this patch) - generally LLVM and Clang changes should be separated when possible, as they can in this case - the implementation and testing of the LLVM IR attribute can be done without changes to Clang, and should be done that way. Then Clang functionality can be built on top of that work in independent patches.
Comment Actions
No, I don't mean input IR. I mean:
int foo(int x) __attribute__((noipa)) { return x * 2; }
int a(int x) { return foo(x); }
Clang would codegen:
define dso_local i32 @foo(i32 %0) #0 {
  %2 = shl nsw i32 %0, 1
  ret i32 %2
}
attributes #0 = { noipa noinline }
WDYT?
Comment Actions
Oh, sure - that'll happen in Clang (& I agree it should be done - both to match the GCC behavior, and because it seems like good behavior/likely what the user expects regardless), not in this LLVM patch.
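For anyone wanting to try the source-level side of this example, here is a minimal sketch, assuming a GCC-compatible compiler. GCC 8 and later accept noipa; Clang did not at the time of this review. The NOIPA macro and its noinline fallback are assumptions of this sketch and not part of the patch, and the fallback is strictly weaker than noipa because it only blocks the inliner.

```cpp
// Sketch only: NOIPA is a hypothetical helper macro, not part of any patch.
#if defined(__has_attribute)
#  if __has_attribute(noipa)
#    define NOIPA __attribute__((noipa))
#  endif
#endif
#if !defined(NOIPA) && defined(__GNUC__)
// Fallback for GNU-style compilers without noipa: blocks inlining only.
#  define NOIPA __attribute__((noinline))
#endif
#ifndef NOIPA
// Unknown compiler: no attribute at all.
#  define NOIPA
#endif

// The attribute is placed before the declarator, a position that both GCC
// and Clang accept on a function definition.
NOIPA int foo(int x) { return x * 2; }

int a(int x) { return foo(x); }
```

Compiling this with optimizations on and inspecting the emitted code for a() is one way to see whether the call to foo() survives.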
Comment Actions
@MaskRay @serge-sans-paille - you folks have any thoughts on this (see also the specific discussion thread in this review with @JDevlieghere). It looks like this attribute could allow per-function support for "-fsemantic-interposition" that would potentially replace the existing module metadata support for Semantic Interposition, perhaps? Is that feasible, would this be the right behavior? the right design/direction?
(also, I'm considering renaming this to "nointeropt" and changing "optnone" to "nointraopt" for symmetry/clarity (& then implementing clang optnone as "nointeropt+nointraopt"), in case that helps make the names more general/useful for different use cases)
Comment Actions
(oh, asking because I came across the SemanticInterposition work done in D72829 while looking at where to implement this)
Wouldn't mind some thoughts from the other folks on the original thread, @rnk @mehdi_amini
Comment Actions
I don't think this is a good step. Renaming optnone would be a very invasive major change. You can always improve the documentation for the attributes to make it clearer.
Comment Actions
Eh, mechanically it'd be a big patch, but not a lot of work I'd expect. But yeah - we'll figure that out in another patch if/when it comes to that.
Comment Actions
I think the module flag metadata "SemanticInterposition" is more of a workaround for the existing (30+) uses for GlobalValue::isInterposable.
Instrumentation passes can create functions on the fly. They are usually internal. If not (I don't know such a case), a module flag metadata serves the purpose of setting the default dso_local/dso_preemptable.
Comment Actions
This attribute will be useful :) Thanks for working on it.
I agree with @jdoerfert (https://lists.llvm.org/pipermail/llvm-dev/2021-April/150062.html) that the proposed attribute, noinline, and optnone should be orthogonal: none implies the other(s).
If the clang attribute would end up combining two attributes, we had better make the IR/clang attribute names different.
Comment Actions
At the IR level, I think that's fine/fair.
I think Clang optnone -> IR optnone + noipa.
Sure - I forget the context, but I don't think there's any reason/need to have noipa and noinline alias each other, they are distinct features. (though, unfortunately, sometimes IPA produces the equivalent of inlining ("hey, this function always returns 5, we can just use 5 at the call site - oh, also the function has no side effects, so we can remove the now-unused call... and we've essentially inlined, even though we didn't use the inliner") - not sure if there are some IPA features we could intentionally break that'd stop LLVM doing "the equivalent of inlining, without using the inliner")
noipa will effectively imply the equivalent of noinline - because it'd be impossible to inline in the absence of a definition. But the inverse isn't true.
I don't think we necessarily have to lower a clang-level noipa to an IR noipa+noinline, I think just using noipa would be sufficient to get the non-inlining semantics (again: how could you inline if you can't see the definition of a function).
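The "equivalent of inlining, without using the inliner" case can be pictured with a small sketch, assuming a GCC-compatible compiler. The function names are illustrative and not taken from the patch, and whether a particular compiler actually folds the call depends on its version and flags.

```cpp
// Sketch only: names are illustrative, not taken from the patch.

// noinline stops the inliner, but the optimizer can still read this body.
__attribute__((noinline)) int always_five() { return 5; }

int caller() {
  // An interprocedural pass may prove that the callee always returns 5 and
  // has no side effects, use 5 at this call site, and then drop the call.
  // The net effect matches inlining even though the inliner never ran.
  return always_five() + 1;
}
```

Under the "imagine the definition were not available" model of noipa, the optimizer would have nothing to reason about at the call site, so neither the inliner nor this kind of propagation could apply.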
Perhaps. Though probably the thing to rename would be optnone, rather than noipa. I think noipa is the right name. I think optnone as a name should/does include noipa. (I think this whole rabbit hole we got down to here is unfortunate and previous understanding of optnone did include noipa-like behavior (I think I've referenced previous patches that were consistent with that understanding & the original author (& myself as one of the reviewers of the feature) have stated that that was the intent of optnone, to also be noipa-like) - but I can appreciate that having distinct attributes can provide greater flexibility)
Eh, I don't think I'd want to go down that path - I think it's important to model this as noipa, as "imagine this function definition were not available at all" (as though it were defined in another translation unit entirely) - no analysis, no optimization, nothing - you can't see the definition at all. It's an easy to explain, robust concept.
I don't think there's any need for our noipa to differ from GCC's semantics - their documentation that says noipa implies noinline can be seen as a description of behavior, not a constraint on how that behavior is achieved. noipa should logically disable inlining as a consequence of disallowing interprocedural analysis/information.
Sorry I haven't picked this up again. I lost steam when folks suggested that to implement this we could/should revisit/rewrite/retest every call site that checks for interposability... that just doesn't seem productive to me/hard to justify the time to do all that work and doing it partially I think would be really harmful (leaving interposability unaligned/inconsistent with noipa in various subtle ways). Happy to talk about how we could move this forward perhaps with more hands to do all that work, but I still just don't feel great about that as a direction, really.
Comment Actions
GCC’s attribute(noipa) (frontend) -> noipa, noinline, noclone, no_icf (midend).
https://github.com/gcc-mirror/gcc/commit/036ea39917b0ef6f07a7c3c3c06002c73fd238f5
Any good reason not to do it the same way? That avoids the hidden logical assumption that noipa is basically noinline + something. And then LLVM optimizers can check - if (attrs.contains("noipa")) do not propagate constants, etc…
Comment Actions
:/ Working toward one perfect patch is not a productive approach. This work could be split easily: 1) basic llvm support, 2) clang attribute, 3) complete llvm support.
Comment Actions
I agree, this is useful functionality, we should add it.
I would be OK with an all-in-one non-orthogonal noipa attribute. The only use case I can come up with for orthogonal attributes is for constructing fine-grained compiler test cases, or trying to carefully convince the optimizer to do one transform or another. The original use case seems more important.
I would also like to suggest another use case for the original attribute, which is that this feature supports hotpatching. If you block IPO across a particular function boundary, you can more reliably recompile and hotpatch in new code, without going to great lengths to break up modules into smaller object files and relinking that way.
That has some appeal, but what about future optimizations? This supports testability (you can block each transform individually), but it requires frontends to keep track of these unpacked attributes that block all interprocedural optimization. And there's a bitcode auto-upgrade problem: if the frontend intended to block all IPO, how do you upgrade that intention? If you ignore that, you can solve the frontend problem with a utility like AttrBuilder::addNoIpaAttrs() which adds all the relevant attributes (noinline, noicf, whatever).
I'm sympathetic. :) Are other reviewers mostly concerned with the naming of "derefined" here? I think I agree that the fewer states and special cases we allow, the better, so aligning interposability and noipa is appealing to me.
Comment Actions
If the llvm attribute noipa also adds a logical assumption about noinline, shouldn't we then audit every check of noinline to see whether it should be extended with a noipa check? If clang's noipa translates to noipa + noinline + …, we don't have this issue.
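The AttrBuilder::addNoIpaAttrs() utility suggested above does not exist in LLVM. As a rough sketch of the idea only, the following models it with a toy builder over attribute names; ToyAttrBuilder, its methods, and the attribute list are all assumptions of this sketch, and "noicf" is used only because the comment mentions it.

```cpp
#include <set>
#include <string>

// Toy stand-in for an attribute builder; this is NOT llvm::AttrBuilder.
struct ToyAttrBuilder {
  std::set<std::string> attrs;

  ToyAttrBuilder &add(const std::string &name) {
    attrs.insert(name);
    return *this;
  }

  // Hypothetical helper: unpack "block all IPO" into individual attributes.
  // The list is illustrative; "noicf" is not an existing LLVM attribute.
  ToyAttrBuilder &addNoIpaAttrs() {
    return add("noinline").add("noicf");
  }

  bool contains(const std::string &name) const {
    return attrs.count(name) != 0;
  }
};
```

A frontend would call addNoIpaAttrs() once instead of tracking each attribute itself. The sketch also makes the auto-upgrade concern visible: a function built before a new IPO-blocking attribute joins the list carries no record that all IPO was meant to be blocked.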
Comment Actions
Because I don't think it should be necessary - noipa semantics should be strictly more powerful than noinline, so there shouldn't be a need to put noinline on the function. If noinline ever makes a difference on a noipa function, I think that's a pretty serious bug.
I don't think the approach of revisiting/retesting all these call sites is the right way to go about things. So I stopped working on this because I couldn't think of/find a way to approach this that addressed both my perspectives and those of reviewers here.
The intermediate state created by pieces of (3) I have two concerns with:
Comment Actions
No, I don't believe we do - if interposability is already well tested, that was the point of leveraging that definition/functionality - noipa should be very similar/the same as interposability. But if we have to revisit every check for interposability and rewrite the check - then, yes, we'd end up adding more test coverage and as a consequence we would end up adding testing for cases where noipa subsumes noinline, probably. (maybe the interposability testing is somewhere else/not in noinline directly - in which case we wouldn't add inlining testing, we'd test that other piece that happens to feed into inlining)
Ah, indeed.
I'm just not sure it makes sense to decompose these - like I don't know what noipa without noinline or no_icf would mean - if we're saying it's invalid IR /not/ to combine them, then I'm still confused by how we could/would implement noipa that wouldn't implicitly subsume noinline/no_icf functionality anyway. (if there was an implementation of noipa that didn't implicitly subsume the functionality of noinline/no_icf, then I'd consider that feature brittle/buggy in a way I don't think it could/should be - if noipa is tested at the same level as interposable, it should flow through inlining and icf naturally without the need for specific attributes to disable them)
Comment Actions
This review may be stuck/dead, consider abandoning if no longer relevant.
Comment Actions
sync/update based on recent discussions
https://discourse.llvm.org/t/force-optimizations-even-when-optnone-is-present/74216/19
Sent D101264 to refactor this function a bit.