A @llvm.experimental.noalias.scope.decl is only useful if there is !alias.scope and !noalias metadata that uses the declared scope.
When that is not the case for at least one of the two, the intrinsic call can as well be removed.
Details
Diff Detail
Event Timeline
InstSimplify is not a good place to put this (it's mostly a testing pass -- you'll find that it's only scheduled once in a standard pipeline). InstCombine is a better place. I would collect scopes during initial worklist population and then delete when the declaration is visited.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
3753 | Why only MD_alias_scope and not also MD_noalias? |
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
3753 | MD_noalias is telling what scopes (variables) are 'active'/'visible' for the memory instruction, but to which it will not alias. A no-aliasing relationship between two instructions can be deduced by checking if the !alias.scope of one instruction is a subset of the !noalias of the other instruction (also taking into account the 'domain'). Once a scope is not any more used in !alias.scope (aka, those memory instructions have been optimized away, or they lost the metadata), the location of that scope declaration has no influence any more and can be omitted. Of course the same reasoning can be done for the !noalias metadata: once a declared scope is not used any more in !noalias you can come to the same conclusion and omit that declaration. hmm.. My reasoning was that we are more likely to hit this with the '!alias.scope'. But a more complete solution can also track the UsedNoAliasScopes. If the declared scope is not found in one of the two, it (and its declaration) become useless and we can omit it. Removing the declaration is easy and safe. I am not sure if removing the metadata associated with that scope is also easy and safe to do. Shared metadata between functions could make it tricky, although that should only happen in testcases as far as I am aware. So, the short answer is: focusing on MD_alias_scope is likely the most effective (and corresponds to how the full restrict patches handle it) But, extending the analysis to also track MD_noalias can provide a more complete solution, at a slightly higher cost. @hfinkel, do you agree with my reasoning ? @nikic, @jdoerfert shall I add the UsedNoAliasScopes tracking as well ? |
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
3753 | At least longer term, I would like that every scope use also has a corresponding decl. With that in mind, I don't really like the idea of leading behind "dangling" scope references, even if it is technically correct. If it doesn't lose much practical optimization power, I would prefer it if we only dropped decls for scopes that are used in neither alias.scope nor noalias. But if this is a significant restriction, then I think your current approach of looking at alias.scope is fine (as it is annotated to loads, and those are much more likely to be eliminated than stores). |
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
3753 | I understand. Again, the original idea is based on the full restrict version with the 'usage based' elimination of the scope declarations: when no actual pointer is based on the scope declaration associated with the restrict pointer, that declaration is not needed any more. A 'completely clean' version could combine this with going over _all_ instructions in the function and removing that scope from the metadata. With full restrict, that cost was considered to be too high. There, it is also possible to have multiple declarations using the same scope, but associated to different 'restrict pointers' sharing that scope. IMHO, it does not make sense to only remove the declaration if the scope is not used at all. Once it is not used by either all !alias.scope, or all !noalias, it can be removed. A further refinement can indeed spend time in actually removing the unused/ineffective scope from the metadata. But I would rather do that as a followup change. |
I noticed that D93040 added a decent 2% penalty for trampd3d-v4. In https://llvm-compile-time-tracker.com/index.php?config=O3&stat=instructions&branch=dobbelaj-snps/perf/D95141_20210123_01 you can see that the cleanup wins back most of this regression.
4% for LTO.
It would be good to revert D93040 maybe.. Or be sure this fix goes in before llvm 12.
It is kinda unfortunate to introduce new regressions few days before rc1.
Look at both !alias.scope and !noalias to decide when to omit a declaration.
During the analysis of !alias.scope and !nolias : when a 'list of scopes' was already analysed we do not need to go over it again.
See https://llvm-compile-time-tracker.com/index.php?config=O3&stat=instructions&branch=dobbelaj-snps/perf/D95141_20210123_02 for the compile time effect.
Logic seems sounds to me. Code as well, one minor nit below. I have no objections to this.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
3765 | Style: I would prefer early exits but feel free to ignore if you think its not better. const auto *MDScopeList = dyn_cast_or_null<MDNode>(ScopeList); if (!MDScopeList || !Container.insert(MDSCopeList).second) return; for (auto &MDOperand : MDScopeList->operands()) if (auto *MDScope = dyn_cast<MDNode>(MDOperand)) Container.insert(MDScope); Is it legal that the cast fails? If the verifier would complain we might just assume it is an MDNode. | |
llvm/test/Transforms/Coroutines/ex4.ll | ||
56 | I'm confused where these come from in the first place but that is unrelated. |
During the analysis of !alias.scope and !nolias : when a 'list of scopes' was already analysed we do not need to go over it again.
I would expect that analysing scope list multiple times isn't really what's causing the compile-time hit here. It's probably the fact that you're looking up two kinds of metadata for each instruction, and metadata lookup is pretty expensive. You might want to check what the impact is if you only analyze mayReadOrWriteMemory() instructions.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
3780 | I've just pushed a NoAliasScopeDeclInst class to make this kind of code nicer. auto *Decl = dyn_cast<NoAliasScopeDeclInst>(Inst); if (!Decl) return false; const MDNode *MD = Decl->getScopeList(); |
Though possibly that won't help, as it's exactly loads, stores and calls that typically carry metadata anyway. Maybe just skipping DbgInfoIntrinsic would help to avoid a larger hit for -g builds...
Thanks for the feedback ! I'll check some variants tomorrow (Sunday). First getting some sleep ;)
llvm/test/Transforms/Coroutines/ex4.ll | ||
---|---|---|
56 | I would assume the inlining of a function that returns a struct. Those typically have a noalias attribute for the return argument. |
Use NoAliasScopeDeclInst. Only analyze non-debug instructions with metadata. (*1)
This seems to be faster than analyzing non-debug instruction that are reading or writing memory. (*2)
See https://llvm-compile-time-tracker.com/index.php?config=O3&stat=instructions&branch=dobbelaj-snps/perf/D95141_20210124_01 for some comparisons:
- *1 corresponds to: 374c7514b58d963cbab82341ad8383fcc89f588d
- *2 corresponds to: 26ed00968ed9925ead692dfd8a75bdf8dc4a3492
Thanks. I'll still do the small refactoring recommended by @jdoerfert and then do the commits.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
3765 | ok. I don't think the verifier is already complaining. I am all for making this more strict, but rather do this in a followup patch. |
Why only MD_alias_scope and not also MD_noalias?