This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Add phi-translate for scalarpre as a temporary solution
ClosedPublic

Authored by wmi on Apr 19 2017, 3:45 PM.

Details

Summary

Now phi-translate only exists in loadpre. scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis.

long a[100], b[100];
long g1, g2, g3;

attribute((pure))
long goo();

void foo(long a, long b, long c, long d) {

g1 = a * b;
if (__builtin_expect(g2 > 3, 0)) {
  a = c;
  b = d;
  g2 = a * b;
}
g3 = a * b;      // fully redundant.

}

The patch adds support for phi-translate in scalarpre. This is only a temporary solution before the newpre based on newgvn is available.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi created this revision.Apr 19 2017, 3:45 PM
dberlin edited edge metadata.Apr 19 2017, 4:04 PM

A few things.

  1. There is, strangely, a doPhiTranslation function that Value has, you can use that :)
  1. Newgvn gets this example case without PRE with my latest patch

It understand the equivalence between op of phis and phi of ops, and so comes up with:

define void @test1(i64 %a, i64 %b, i64 %c, i64 %d) {
entry:
  %mul = mul nsw i64 %b, %a
  store i64 %mul, i64* @g1, align 8
  %t0 = load i64, i64* @g2, align 8
  %cmp = icmp sgt i64 %t0, 3
  br i1 %cmp, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %mul2 = mul nsw i64 %d, %c
  store i64 %mul2, i64* @g2, align 8
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  %0 = phi i64 [ %mul2, %if.then ], [ %mul, %entry ]
  store i64 %0, i64* @g3, align 8
  ret void
}
  1. I suspect you will have issues with backedges with this approach

I'm pretty sure it will fail in some cases to fixpoint (but maybe gvn is not strong enough to make that happen)

At the very least, you don't want this to transform:

a = phi(0, a + 1)
use a
to
a = phi(0, a + 1)
b = phi(1, a + 2)

etc
It will peel loops ivs if you let it.
In gcc, we specifically check for this case.

This also needs compile time and performance measurements.

wmi added a comment.Apr 19 2017, 10:41 PM

Daniel, thanks for the comments.

There is, strangely, a doPhiTranslation function that Value has, you can use that :)

Yes, I can use it to simplify phiTranslateImpl a little.

Newgvn gets this example case without PRE with my latest patch. It understand the equivalence between op of phis and phi of ops

Good job for newgvn :-).

I suspect you will have issues with backedges with this approach. I'm pretty sure it will fail in some cases to fixpoint (but maybe gvn is not strong enough to make that happen)
At the very least, you don't want this to transform:

There is code in performScalarPRE checking whether the predecessor is the same as current bb and giving up if it is true. I tried small testcase and it seemed fine. I will pay attention to it when exercise it with more tests.

This also needs compile time and performance measurements.

Sure. I will do it.

I suspect you will have issues with backedges with this approach. I'm pretty sure it will fail in some cases to fixpoint (but maybe gvn is not strong enough to make that happen)
At the very least, you don't want this to transform:

There is code in performScalarPRE checking whether the predecessor is the same as current bb and giving up if it is true. I tried small testcase and it seemed fine. I will pay attention to it when exercise it with more tests.

This code will only work for the trivialest of cases :)

GVN computes the RPO ordering.
The easiest way to detect backedges is to densemap that.
An edge x->y is a backedge if RPO[x] >= RPO[y]

(This is because in a postorder traversal, you are guaranteed it will find the edges that lead to a block, before that block, since it will process all children to completion first
So in a PO, PO[x] <= PO[y], since it will process all such children of y before processing y.
In RPO, it's just reversed)

davide requested changes to this revision.Apr 25 2017, 1:31 PM

Marking request changes so it gets off of my queue while you're taking performance/compile time numbers

This revision now requires changes to proceed.Apr 25 2017, 1:31 PM
wmi updated this revision to Diff 97974.May 5 2017, 9:42 AM
wmi edited edge metadata.

Forgot to update the patch.

prevent phi-translate from going through back edge.

wmi updated this revision to Diff 98361.May 9 2017, 3:59 PM

I change the DenseMap numberingExpression (mapping from value numbering to GVN::Expression) to std::vector. Although there is still some minor increase but better than before.

400.perlbench 0.81%
401.bzip2 -0.11%
403.gcc -0.42%
429.mcf 1.58%
433.milc -0.33%
445.gobmk 0.85%
456.hmmer 0.07%
458.sjeng -0.42%
462.libquantum -0.75%
464.h264ref -1.24%
470.lbm -0.65%
482.sphinx3 0.44%
998.specrand 0.00%
999.specrand -1.74%
444.namd 0.28%
447.dealII 1.56%
450.soplex 0.96%
453.povray -0.02%
471.omnetpp 0.02%
473.astar 0.18%
483.xalancbmk 0.85%

In D32252#750282, @wmi wrote:

I change the DenseMap numberingExpression (mapping from value numbering to GVN::Expression) to std::vector. Although there is still some minor increase but better than before.

400.perlbench 0.81%
401.bzip2 -0.11%
403.gcc -0.42%
429.mcf 1.58%
433.milc -0.33%
445.gobmk 0.85%
456.hmmer 0.07%
458.sjeng -0.42%
462.libquantum -0.75%
464.h264ref -1.24%
470.lbm -0.65%
482.sphinx3 0.44%
998.specrand 0.00%
999.specrand -1.74%
444.namd 0.28%
447.dealII 1.56%
450.soplex 0.96%
453.povray -0.02%
471.omnetpp 0.02%
473.astar 0.18%
483.xalancbmk 0.85%

The compile time numbers don't look particularly bad. Thanks.
Can you try on large testcases/LTO (maybe of clang itself) to confirm that scales in a sane way?

I'm a little surprised that bootstrapping without LTO takes more time than bootstrapping with LTO, but assuming numbers are sane, the compile time impact is not terrible.

Pretty good at this point.
I don't understand why you changed the CHECK-NEXT lines to CHECK lines though.
Can you explain?

lib/Transforms/Scalar/GVN.cpp
239 ↗(On Diff #98361)

Just initialize it to false in the constructor

wmi updated this revision to Diff 99586.May 19 2017, 10:01 AM

Initialize commutative in Expression's constructor.
Fix a bug related with commutative: need to swap cmp predicate at the same time when we swap the operands.

dberlin accepted this revision.May 23 2017, 8:38 PM

At this point, i think this is reasonable. we know what the cost is, it seems reasonable given the benefits, and we know it's not long term.
the code seems clean etc.
i"m going to accept this, but i'd just wait a day to see if anyone else has comments before committing.

Thank you for working through this with folks :)

Fine by me. Thanks!

wmi added a comment.May 25 2017, 2:34 PM

No more comments seen so prepare to commit the patch. Thanks for the review!

This revision was automatically updated to reflect the committed changes.