This is an archive of the discontinued LLVM Phabricator instance.

[MemorySSA] Skip phi translation for phis in cycles (WIP).
AbandonedPublic

Authored by fhahn on Jun 2 2020, 6:57 AM.

Details

Summary

Currently we use the phi-translated address when traversing the
definitions upwards. The translated address is then used to check for
clobbers along a path. If the Phi is part of a cycle, this is not
correct I think, as it means we only check for clobbers with the initial
value of the cycle. But it is possible that a definition along the path
clobbers an address other than the initial value of the cycle.

This patch just detects trivial cycles, as I want to make sure I am not
missing anything before generalizing.

Consider the example below. Without this patch, MemorySSA will set the
defining access of %lv = load i16, i16* %arrayidx, align 2, !tbaa !3
to LiveOnEntry, because it phi-translates the location of the load to
getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 1 in
%entry, but the loop body is executed twice and it loads from
getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 1 andj
getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 0. The
latter aliases the store in %entry.

define i32 @main(i32* noalias %ptr) {
entry:
  %s1.ptr = getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 0
  store i16 1, i16* %s1.ptr, align 2, !tbaa !3
  br label %for.body

for.body:
  %storemerge2 = phi i32 [ 1, %entry ], [ %dec, %for.body ]
  %idxprom1 = zext i32 %storemerge2 to i64
  %arrayidx = getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 %idxprom1
  %lv = load i16, i16* %arrayidx, align 2, !tbaa !3
  %conv = sext i16 %lv to i32
  store i32 %conv, i32* %ptr, align 4, !tbaa !7
  %dec = add nsw i32 %storemerge2, -1
  %cmp = icmp sgt i32 %storemerge2, 0
  br i1 %cmp, label %for.body, label %for.end

for.end:
  %s2.ptr = getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 0
  store i16 0, i16* %s2.ptr, align 2, !tbaa !3
  ret i32 0
}

Fixes PR46156.

Diff Detail

Event Timeline

fhahn created this revision.Jun 2 2020, 6:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2020, 6:57 AM
fhahn added a comment.Jun 10 2020, 8:16 AM

ping. Does the problem/reasoning make sense?

It looks like this is a regression from https://reviews.llvm.org/D79068 .

Yes, when we're doing PHI translation in MemorySSA, we have to consider if there are cycles. Not sure what the algorithm needs to look like off the top of my head.

fhahn added a comment.Jun 10 2020, 5:17 PM

It looks like this is a regression from https://reviews.llvm.org/D79068 .

I am not sure, I think the patch above fixes an issue with translating a phi to a wrong address. I think the issue in this patch is slightly different: the PHI is translated correctly, but the result is used in an unsound way.

Yes, when we're doing PHI translation in MemorySSA, we have to consider if there are cycles. Not sure what the algorithm needs to look like off the top of my head.

Great, I just wanted to make sure I am not missing anything obvious before digging too deep.

It looks like this is a regression from https://reviews.llvm.org/D79068 .

I am not sure, I think the patch above fixes an issue with translating a phi to a wrong address.

I just meant that your testcase passes on an LLVM build from April 29, but fails on one from April 30.

Thank you for the testcase, this does show the issue very well.
The solution I think it's incomplete. This can occur in any loop, not just for a trivial phi. For example with just adding an if/else in the loop.

define i32 @test1(i32* noalias %ptr) {
entry:
  %s1.ptr = getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 0
; 1 = MemoryDef(liveOnEntry)
  store i16 1, i16* %s1.ptr, align 2, !tbaa !0
  br label %for.body

for.body:                                         ; preds = %merge.body, %entry
; 4 = MemoryPhi({entry,1},{merge.body,2})
  %storemerge2 = phi i32 [ 1, %entry ], [ %dec, %merge.body ]
  %idxprom1 = zext i32 %storemerge2 to i64
  %arrayidx = getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 %idxprom1
; MemoryUse(liveOnEntry)
  %lv = load i16, i16* %arrayidx, align 2, !tbaa !0
  %conv = sext i16 %lv to i32
; 2 = MemoryDef(4)
  store i32 %conv, i32* %ptr, align 4, !tbaa !4
  %dec = add nsw i32 %storemerge2, -1
  %cmpif = icmp sgt i32 %storemerge2, 1
  br i1 %cmpif, label %if.body, label %else.body

if.body:                                          ; preds = %for.body
  br label %merge.body

else.body:                                        ; preds = %for.body
  br label %merge.body

merge.body:                                       ; preds = %else.body, %if.body
  %cmp = icmp sgt i32 %storemerge2, 0
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %merge.body
  %s2.ptr = getelementptr inbounds [2 x i16], [2 x i16]* @c, i64 0, i64 0
; 3 = MemoryDef(2)
  store i16 0, i16* %s2.ptr, align 2, !tbaa !0
  ret i32 0
}

A conservative approach would be: "if this block is part of a loop".
I'm not sure how to make this check cheap however.

A potential solution is passing LoopInfo if available and check if OriginalAccess is in the blocks of any Loop. But I expect this check can get expensive with a large number of loops/blocks, plus LI is generally available in loop passes, so when LI is available the answer is likely skip PhiTranslation without checking the blocks at all..

Checking LoopInfo is a bad idea; even if LoopInfo doesn't detect any loops, there can still be CFG cycles.

I think the solution need to be at the traversal level in MSSA. It's correct to do the phi translation for the predecessor entry and return NoAlias. It's also correct to go on the backedge and find no other aliasing accesses inside the loop (loop is defined here as finding the starting point of the walk, in this case the phi). It is not however correct to merge the two paths and claim NoAlias.
I'm still working out the proper solution.

fhahn abandoned this revision.Jul 30 2020, 10:53 AM

@asbirlea put up proper fix, which considers phi translation when exploring different paths: D84905. Thanks!