This is an archive of the discontinued LLVM Phabricator instance.

SCEV: handle constant condition for select/branch
ClosedPublic

Authored by mehdi_amini on Oct 2 2015, 10:27 AM.

Details

Summary

This patch builds on top of D13378 to handle constant condition.

With this patch, clang -O3 optimizes correctly providing > 1000x speedup on this artificial benchmark):

for (a=0; a<n; a++)
    for (b=0; b<n; b++)
        for (c=0; c<n; c++)
            for (d=0; d<n; d++)
                for (e=0; e<n; e++)
                    for (f=0; f<n; f++)
                        x++;

From test-suite/SingleSource/Benchmarks/Shootout/nestedloop.c

Diff Detail

Event Timeline

mehdi_amini retitled this revision from to SCEV: handle constant condition for select/branch.
mehdi_amini added a reviewer: sanjoy.
mehdi_amini added a subscriber: llvm-commits.
mehdi_amini updated this object.Oct 2 2015, 10:31 AM
mehdi_amini added a subscriber: atrick.
atrick accepted this revision.Oct 2 2015, 12:59 PM
atrick added a reviewer: atrick.

Well, ideally the IR is canonical before SCEV, but this condition is simple enough to catch anyway. Thanks.

This revision is now accepted and ready to land.Oct 2 2015, 12:59 PM

I guess if this comes up when handling loops for instance, you may not necessarily be able to easily simplify the CFG in a loop pass.

manmanren added inline comments.
lib/Analysis/ScalarEvolution.cpp
3894

Please add a period at end of the comment.

3899

Should this be simplified to
return getSCEV(CI->isOne() ? TrueValue : FalseVal);

mehdi_amini updated this revision to Diff 36396.Oct 2 2015, 1:53 PM
mehdi_amini edited edge metadata.

Thanks Manman, I updated the revision accordingly.

sanjoy accepted this revision.Oct 3 2015, 2:25 PM
sanjoy edited edge metadata.
sanjoy added inline comments.
lib/Analysis/ScalarEvolution.cpp
3894

Minor stylistic issue, since you'll have to rebase anyway: instead of putting the check for ConstantInt inside the if block, I'd put it before the dyn_cast<ICmpInst>. Shouldn't make any semantic difference, but it will reduce the nesting on return getSCEV(CI->isOne() ? TrueVal : FalseVal);.

In D13390#258904, @joker.eph wrote:

I guess if this comes up when handling loops for instance, you may not necessarily be able to easily simplify the CFG in a loop pass.

Do we not currently catch this in SimplifyCFG (or similar)? I don't object to this change, but we really should catch this elsewhere too.

mehdi_amini closed this revision.Oct 27 2015, 6:48 PM

We surely catch it elsewhere. The problem is loop optimization like induction variable substitution that will operate on the inner loop, exposes this case, and then move the outer loop leaving the IR in this state. To solve this with simpifycfg I'm not sure that 1) it can be ran on a subpart of the CFG (you don't want to reprocess the full function between the two loops) 2) won't mess up the loop structure/canonicalization.

It seems to me that there is a kind of tension between optimizing for the "canonical IR" and being able to handle properly "non-canonical IR" because it is hard or expensive to always guarantee that your IR is canonical. I see this change as pushing a little bit SCEV toward the second case.

Any opinion?

Note: this landed a few week ago in r249431.

In D13390#276849, @joker.eph wrote:

We surely catch it elsewhere. The problem is loop optimization like induction variable substitution that will operate on the inner loop, exposes this case, and then move the outer loop leaving the IR in this state. To solve this with simpifycfg I'm not sure that 1) it can be ran on a subpart of the CFG (you don't want to reprocess the full function between the two loops) 2) won't mess up the loop structure/canonicalization.

It seems to me that there is a kind of tension between optimizing for the "canonical IR" and being able to handle properly "non-canonical IR" because it is hard or expensive to always guarantee that your IR is canonical. I see this change as pushing a little bit SCEV toward the second case.

Any opinion?

This is certainly true; there will always be these kinds of phase ordering problems. Especially since we don't have a LoopSimplifyCFG pass ;)

Note: this landed a few week ago in r249431.