This is an archive of the discontinued LLVM Phabricator instance.

[GVN PRE] Patch the source for Phi node in PRE
ClosedPublic

Authored by skatkov on Nov 4 2017, 11:37 PM.

Details

Summary

If we plan to use instruction in Phi node during PRE we must
patch all int incoming values, otherwise it is possible that
we can see poison where program does not expect to see it.

This is the similar what GVN does.

The added test test/Transforms/GVN/PRE/pre-jt-add.ll shows an
example of wrong optimization done by jump threading due to
GVN PRE did not patch incoming value.

Diff Detail

Event Timeline

skatkov created this revision.Nov 4 2017, 11:37 PM

Please review this fix for functional issue.

dberlin edited edge metadata.Nov 6 2017, 8:36 PM

TL; DR I'm .. confused.

I don't feel like i understand why your statement is true, and i have trouble seeing why it would be.

It's definitely true that the end result of PRE (IE the resulting phi) should have patchreplacementvalue called when it's used to replace an instruction.
It does, AFAIK.

I don't understand either the comment you've added, or how it relates.
In fact, i literally don't understand the comment.
We aren't using V instead of CurInstr, we are using v in a phi we will later use to replace curinstr.
Those are not the same thing.

Your second test also passes right now without this patch.

It appears you are trying to say that use in a phi node should mean we have to change the nuw/nsw flags on the originals.

It's unclear to me why, and why, if it's not safe, it's okay to PRE things this way.
(IE either you have proven the flags or you have proven the value is different). It's a different case from replacing one value with another. When we straight replace one value with another, we are relying on value equality alone.

Here we are relying on path equality as well. Your first example is quite literally saying "along this path, it's this instruction and has nsw nuw flags. Along this other path, it does not have those flags". That is still true.
I don't see why that would break anything.

Further, right now, -gvn -enable-pre -O2 produces the same result for test1 whether you have the flags are on add.1 or not.

So i don't even see the brokenness you are claiming is happening.

I feel like you need a lot more explanation here.

Hi Daniel,

TL; DR I'm .. confused.

I don't feel like i understand why your statement is true, and i have trouble seeing why it would be.

Let me try to explain :)

It's definitely true that the end result of PRE (IE the resulting phi) should have patchreplacementvalue called when it's used to replace an instruction.
It does, AFAIK.

It does not (see code below my patched piece). I do not see an invocation of patchreplacementvalue till line #2305 where replacement actually happens. Moreover as I know phi cannot have nsw/nuw falgs...

I don't understand either the comment you've added, or how it relates.
In fact, i literally don't understand the comment.
We aren't using V instead of CurInstr, we are using v in a phi we will later use to replace curinstr.
Those are not the same thing.

Yes, I was a bit relaxed in comment saying this. Specifically I meant that we plan to use V value instead of CurInst if control flow will go on the corresponding path.
So, yes you are right, we use V in Phi node which will replace the CurInst.

Your second test also passes right now without this patch.

Yes, that is true, I just added the second test to emphasize that we do not need patch the original CurInst when we move it to predecessor. It seems to me it is a correct behavior due to it does not introduce new conditions for instruction which uses CurInst.

It appears you are trying to say that use in a phi node should mean we have to change the nuw/nsw flags on the originals.

Right, except CurInst which will be moved to predecessor.

It's unclear to me why, and why, if it's not safe, it's okay to PRE things this way.
(IE either you have proven the flags or you have proven the value is different). It's a different case from replacing one value with another. When we straight replace one value with another, we are relying on value equality alone.

Here we are relying on path equality as well. Your first example is quite literally saying "along this path, it's this instruction and has nsw nuw flags. Along this other path, it does not have those flags". That is still true.
I don't see why that would break anything.

Further, right now, -gvn -enable-pre -O2 produces the same result for test1 whether you have the flags are on add.1 or not.

So i don't even see the brokenness you are claiming is happening.

I feel like you need a lot more explanation here.

The problem is as follows. Let's consider the modified version of the first test.
Let return block ends up not by store and ret but with conditional branch, basing on %add.2.
Let add instructions look like:
%add.1 = add nuw nsw i32 %v, -1
%add.2 = add i32 %v, -1

in this case GVN PRE will create a Phi node
%add.2.pre-phi = phi i32 [ %.pre, %bb1 ], [ %add.1, %bb ]
and condition will be, say %add.2.pre-phi > 0

Now, for example JumpThreading pass basing on
%add.1 = add nuw nsw i32 %v, -1
can prove that %v == 0 (due to nuw) and condition is always false.
so it can eliminate the branch from bb -> return and use the branch bb -> branch corresponding to false in our conditions.

This is incorrect due to original program does not have such semantic.

I hope it helps. If there are other questions, I'll be happy to answer them.

And thanks for review this!

I think it would be a lot easier to understand your example if you could produce IR that, when run through -gvn -enable-pre -jump-threading, produces the wrong result without this patch.

I think it would be a lot easier to understand your example if you could produce IR that, when run through -gvn -enable-pre -jump-threading, produces the wrong result without this patch.

Ok, will do.

skatkov updated this revision to Diff 121855.Nov 7 2017, 2:13 AM
skatkov edited the summary of this revision. (Show Details)

Add a test showing the incorrect behavior of jump threading due to change done by GVN PRE.

Whether it looks better?

Your patch has convinced me that GVN is probably generally broken here, but just happens to not value number or create new phi nodes except for PRE, so it doesn't generate a lot of instances of this issue.
So yeah, i'm fine with the patch in general

I'd appreciate Eli's opinion on the following:

Right now, patchReplacementInstruction has this comment:

t more restrictive than the value
  // being replaced.
  // Note that if 'I' is a load being replaced by some operation,
  // for example, by an arithmetic operation, then andIRFlags()
  // would just erase all math flags from the original arithmetic
  // operation, which is clearly not wanted and not needed.

I'm unsure this is true.

If we take the example for jump threading, and make it go through a load (where the load has the value of the two adds), and then replace the load with a phi of the two values, is there a reason we don't have the same problem?

lib/Transforms/Scalar/GVN.cpp
2292

Please replace this comment with "If we use an existing value in this phi, we have to patch the original value because the phi will be used to replace a later value"

If we take the example for jump threading, and make it go through a load (where the load has the value of the two adds), and then replace the load with a phi of the two values, is there a reason we don't have the same problem?

GVN can do a few different kinds of replacements. It can replace a scalar operation with the value of a different scalar operation. It can replace a load with the value of a different load. Or it can replace the value of the load with an operand of a store.

The problem the call to andIRFlags solves is an issue with replacing a scalar operation with a different scalar operation: "nsw" isn't passed into the value numbering, so we sometimes end up replacing an operation which can't produce poison with an operation which can produce poison. For example, GVN can replace "add i32 %x, i32 %y" with "add nsw i32 %x, i32 %y". The latter produces poison in some cases, so it isn't precisely equivalent. So when we perform the replacement, we need to erase the nsw.

If we're replacing a load with the operand of a store, we do not want to call andIRFlags; the original IR flags are fine, and "and"ing together the flags for two different operations doesn't make sense. We check isa<LoadInst> to catch this case.

Of course, if you're trying to some other, more complicated replacement, this logic could break down.

In load PRE, it is replacing the value of a load with a phi of other values.

As you said, it does not consider nsw while value numbering.

Imagine a load of a phi that value numbering considers equivalent to

pred1:
1= add nsw, something

pred2:
2 = add something

Despite the difference in flags, it will consider these to be okay values to phi together, and use to replace the load.

So it will generate phi(1, 2) and use it to replace the load.

This is precisely the same thing happening with scalar pre, in this testcase (the insertion it performs is irrelevant).

Hence my question is whether this is magically more okay than what was happening in the scalar pre case.

I think the answer is no, it's not okay, and needs to be fixed, but am not sure :)

Oh, I see. That looks like a bug.

...

So yeah, i'm fine with the patch in general

...

May I land a patch?

skatkov added inline comments.Nov 8 2017, 8:16 PM
lib/Transforms/Scalar/GVN.cpp
2292

will do before landing or if I need to update a patch.

Can you please fix LoadPRE as well?
ConstructSSALoadSet needs to be patching instructions it materializes if they come from an existing value.

Here is a testcase that demonstrates the problem:

@H = common global i32 0
@G = common global i32 0

define i32 @test(i1 %cond, i32 %v) nounwind {
entry:
  br i1 %cond, label %bb, label %bb1

bb:
  %add.1 = add nuw nsw i32 %v, -1
  store i32 %add.1, i32* @G, align 4
  br label %merge

bb1:
  %add.2 = add i32 %v, -1
  store i32 %add.2, i32* @G, align 4
  br label %merge

merge:
  %foo = load i32, i32* @G, align 4
  %cmp = icmp sgt i32 %foo, 0
  br i1 %cmp, label %action, label %return

action:
  store i32 %foo, i32* @H, align 4
  br label %return

return:
  %p = phi i32 [0, %merge], [1, %action]
  ret i32 %p
}

Output after -gvn:

; Function Attrs: nounwind
define i32 @test(i1 %cond, i32 %v) #0 {
entry:
  br i1 %cond, label %bb, label %bb1

bb:                                               ; preds = %entry
  %add.1 = add nuw nsw i32 %v, -1
  store i32 %add.1, i32* @G, align 4
  br label %merge

bb1:                                              ; preds = %entry
  %add.2 = add i32 %v, -1
  store i32 %add.2, i32* @G, align 4
  br label %merge

merge:                                            ; preds = %bb1, %bb
  %foo = phi i32 [ %add.2, %bb1 ], [ %add.1, %bb ]
  %cmp = icmp sgt i32 %foo, 0
  br i1 %cmp, label %action, label %return

action:                                           ; preds = %merge
  store i32 %foo, i32* @H, align 4
  br label %return

return:                                           ; preds = %action, %merge
  %p = phi i32 [ 0, %merge ], [ 1, %action ]
  ret i32 %p
}

attributes #0 = { nounwind }

Output after jump threading:

; ModuleID = 'pre-jt-load.ll'
source_filename = "pre-jt-load.ll"

@H = common global i32 0
@G = common global i32 0

; Function Attrs: nounwind
define i32 @test(i1 %cond, i32 %v) #0 {
entry:
  br i1 %cond, label %merge.thread, label %merge

merge.thread:                                     ; preds = %entry
  %add.1 = add nuw nsw i32 %v, -1
  store i32 %add.1, i32* @G, align 4
  br label %return

merge:                                            ; preds = %entry
  %add.2 = add i32 %v, -1
  store i32 %add.2, i32* @G, align 4
  %cmp = icmp sgt i32 %add.2, 0
  br i1 %cmp, label %action, label %return

action:                                           ; preds = %merge
  store i32 %add.2, i32* @H, align 4
  br label %return

return:                                           ; preds = %merge.thread, %action, %merge
  %p = phi i32 [ 0, %merge ], [ 1, %action ], [ 0, %merge.thread ]
  ret i32 %p
}

attributes #0 = { nounwind }

May I do it as a separate patch? Land it and upload a new one for load?

dberlin accepted this revision.Nov 8 2017, 9:01 PM

That's fine too.

This revision is now accepted and ready to land.Nov 8 2017, 9:01 PM
This revision was automatically updated to reflect the committed changes.