Instead of using VMap, which may include instructions from the caller as a result of simplification, iterate over the (FirstNewBlock, Caller->end()) range, which will only include new instructions.
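As a rough illustration (a hypothetical helper, not the actual patch), the range-based traversal could look like the sketch below. It assumes `FirstNewBlock` is the iterator to the first block the inliner spliced into the caller, as in InlineFunction.cpp:

```cpp
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instruction.h"

using namespace llvm;

// Hypothetical sketch: visit only instructions that were actually cloned
// into the caller. Unlike walking VMap, whose mapped values may be
// pre-existing caller instructions after simplification, the
// [FirstNewBlock, Caller->end()) range contains only newly inserted blocks.
static void forEachNewInstruction(Function *Caller,
                                  Function::iterator FirstNewBlock) {
  for (Function::iterator BB = FirstNewBlock, E = Caller->end(); BB != E;
       ++BB)
    for (Instruction &I : *BB) {
      // Metadata propagation / scope remapping would happen here.
      (void)I;
    }
}
```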
llvm/lib/Transforms/Utils/InlineFunction.cpp:913

I've left this as a TODO to be more confident that this is safe to backport to LLVM 12.
Looks good. Could you also add the other example as a test case?
I am wondering if we should extend the testcases with llvm.experimental.noalias.scope.decl. I assume they got lost while reducing the testcases?
Both examples are already included in the test: @caller tests PropagateCallSiteMetadata and @self_caller tests ScopedAliasMetadataDeepCloner::remap.
> I am wondering if we should extend the testcases with llvm.experimental.noalias.scope.decl. I assume they got lost while reducing the testcases?

Done! I hadn't included them originally because they're not strictly necessary, but it does make sense to have them for clarity.

Yes indeed. Not sure how I overlooked it.

> Done! I hadn't included them originally because they're not strictly necessary, but it does make sense to have them for clarity.

Thanks.
LGTM
llvm/lib/Transforms/Utils/InlineFunction.cpp:997

Just wondering: don't we have a similar issue here?
llvm/lib/Transforms/Utils/InlineFunction.cpp:997

Yes, unfortunately we do. It took me a while, but I found a suitable llvm.masked.load fold to exploit (with an all-zero mask, the masked load is simplified to its passthru operand):

```llvm
define <2 x i8> @callee(<2 x i8>* %ptr1, <2 x i8>* noalias %ptr2, <2 x i1> %mask, <2 x i8> %passthru) {
  %ret = call <2 x i8> @llvm.masked.load.v2i8(<2 x i8>* %ptr1, i32 1, <2 x i1> %mask, <2 x i8> %passthru)
  store <2 x i8> zeroinitializer, <2 x i8>* %ptr2
  ret <2 x i8> %ret
}

define void @caller(<2 x i8>* %ptr1, <2 x i8>* %ptr2) {
  %passthru = load <2 x i8>, <2 x i8>* %ptr2
  call <2 x i8> @callee(<2 x i8>* %ptr1, <2 x i8>* %ptr2, <2 x i1> zeroinitializer, <2 x i8> %passthru)
  ret void
}

declare <2 x i8> @llvm.masked.load.v2i8(<2 x i8>*, i32, <2 x i1>, <2 x i8>)
```

Results in:

```llvm
define <2 x i8> @callee(<2 x i8>* %ptr1, <2 x i8>* noalias %ptr2, <2 x i1> %mask, <2 x i8> %passthru) {
  %ret = call <2 x i8> @llvm.masked.load.v2i8.p0v2i8(<2 x i8>* %ptr1, i32 1, <2 x i1> %mask, <2 x i8> %passthru)
  store <2 x i8> zeroinitializer, <2 x i8>* %ptr2, align 2
  ret <2 x i8> %ret
}

define void @caller(<2 x i8>* %ptr1, <2 x i8>* %ptr2) {
  %passthru = load <2 x i8>, <2 x i8>* %ptr2, align 2, !noalias !0
  call void @llvm.experimental.noalias.scope.decl(metadata !0)
  store <2 x i8> zeroinitializer, <2 x i8>* %ptr2, align 2, !alias.scope !0
  ret void
}
```

In this case, we don't even need the incorrect annotation to go on an instruction from the caller; it can also happen with a callee instruction:

```llvm
define <2 x i8> @callee(<2 x i8>* %ptr1, <2 x i8>* noalias %ptr2, <2 x i1> %mask) {
  %passthru = load <2 x i8>, <2 x i8>* %ptr2
  %ret = call <2 x i8> @llvm.masked.load.v2i8(<2 x i8>* %ptr1, i32 1, <2 x i1> %mask, <2 x i8> %passthru)
  store <2 x i8> zeroinitializer, <2 x i8>* %ptr2
  ret <2 x i8> %ret
}

define void @caller(<2 x i8>* %ptr1, <2 x i8>* %ptr2) {
  call <2 x i8> @callee(<2 x i8>* %ptr1, <2 x i8>* %ptr2, <2 x i1> zeroinitializer)
  ret void
}

declare <2 x i8> @llvm.masked.load.v2i8(<2 x i8>*, i32, <2 x i1>, <2 x i8>)
```

Results in:

```llvm
define <2 x i8> @callee(<2 x i8>* %ptr1, <2 x i8>* noalias %ptr2, <2 x i1> %mask) {
  %passthru = load <2 x i8>, <2 x i8>* %ptr2, align 2
  %ret = call <2 x i8> @llvm.masked.load.v2i8.p0v2i8(<2 x i8>* %ptr1, i32 1, <2 x i1> %mask, <2 x i8> %passthru)
  store <2 x i8> zeroinitializer, <2 x i8>* %ptr2, align 2
  ret <2 x i8> %ret
}

define void @caller(<2 x i8>* %ptr1, <2 x i8>* %ptr2) {
  call void @llvm.experimental.noalias.scope.decl(metadata !0)
  %passthru.i = load <2 x i8>, <2 x i8>* %ptr2, align 2, !alias.scope !0, !noalias !0
  store <2 x i8> zeroinitializer, <2 x i8>* %ptr2, align 2, !alias.scope !0
  ret void
}
```

Note how the metadata claims that the load does not alias... with itself. So the problem here is a bit different from the other two cases: it is not enough to avoid annotating instructions from the caller; we should avoid annotating any instruction for which some kind of folding occurred.
llvm/lib/Transforms/Utils/InlineFunction.cpp:997

Anyone know how to fix this nicely? I don't think we have any information on whether an instruction was simplified or not at this point. The only thing that comes to mind is to go through all instructions and check which ones occur in the map multiple times. Do we need to change the cloning implementation to track simplification info in a side table?
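For what it's worth, a hedged sketch of the "occurs in the map multiple times" idea (hypothetical, untested code, not part of the patch): if two different callee values map to the same new value, at least one of them was simplified away, so that target should not receive callee-derived annotations. In the second example above, both the callee's %passthru load and its folded masked load would map to the cloned %passthru.i, which this check would catch.

```cpp
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/IR/Value.h"
#include "llvm/Transforms/Utils/ValueMapper.h"

using namespace llvm;

// Hypothetical sketch: collect VMap targets that more than one original
// value maps to. Such a target is the result of simplification/folding,
// so annotating it with metadata derived from any single original
// instruction may be incorrect.
static SmallPtrSet<const Value *, 16>
collectFoldedTargets(const ValueToValueMapTy &VMap) {
  SmallPtrSet<const Value *, 16> Seen;
  SmallPtrSet<const Value *, 16> Folded;
  for (const auto &Entry : VMap)
    if (const Value *Target = Entry.second)
      if (!Seen.insert(Target).second)
        Folded.insert(Target); // mapped to by more than one original value
  return Folded;
}
```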
llvm/lib/Transforms/Utils/InlineFunction.cpp:997

I submitted https://bugs.llvm.org/show_bug.cgi?id=50589 to keep track of this issue.