This is an archive of the discontinued LLVM Phabricator instance.

[LV] Switch to using canonical induction variables.
ClosedPublic

Authored by jmolloy on Aug 24 2015, 8:59 AM.

Details

Summary

Vectorized loops only ever have one induction variable. All induction PHIs from the scalar loop are rewritten to be in terms of this single indvar.

We were trying very hard to pick an indvar that already existed, even if that indvar wasn't canonical (didn't start at zero). But trying so hard is really fruitless - creating a new, canonical, indvar only results in one extra add in the worst case and that add is trivially easy to push through the PHI out of the loop by instcombine.

If we try and be less clever here and instead let instcombine clean up our mess (as we do in many other places in LV), we can remove unneeded complexity.

Diff Detail

Repository
rL LLVM

Event Timeline

jmolloy updated this revision to Diff 32963.Aug 24 2015, 8:59 AM
jmolloy retitled this revision from to [LV] Switch to using canonical induction variables..
jmolloy updated this object.
jmolloy added reviewers: anemet, mzolotukhin.
jmolloy set the repository for this revision to rL LLVM.
jmolloy added a subscriber: llvm-commits.

I am OK with this but I'd like someone else to sign off on this one too. Alternatively showing that there is no relevant asm diff on the testsuite (including externals) is probably also a good indication that this case is over-engineered locally.

When you say InstCombine do you mean LSR?

We should probably also add a comment that OldInduction is now a canonical induction variable. And remove ExtendedIdx, which I think is unused now.

Hi Adam,

I ran the test-suite and got 66 files changed out of 1733 (0.05%). Of the ones that changed, I didn't see anything obvious but there was quite a bit of register allocation churn in many of them which made it very difficult to spot real differences.

I have a patch in my queue to remove ExtendedIdx later - I can merge that into this one.

Cheers,

James

Also, FWIW, I ran all our testing on this and got a slight improvement (2.5%) in ammp in spec2000. No other significant changes.

I have a patch in my queue to remove ExtendedIdx later - I can merge that into this one.

Up to you, just wanted to make sure it does not stay there.

Can you please also address my question regarding LSR vs. InstCombine?

Sure. If we have a single, non canonical phi, and we create a canonical
one, then we'll end up with one phi with a constant add inside the loop.

I would have expected it to be instcombine that would take that add and
thread it up before the loop.

Having said that, instcombine doesn't look at loop depths so yes, it would
probably be LSR (or indvars?)

James

jmolloy updated this revision to Diff 33524.Aug 30 2015, 3:04 AM
jmolloy edited edge metadata.

Hi Adam,

Rebased and uploaded with full context.

Would it be acceptable to commit this patch and monitor for performance regressions? Without a specific regression test that shows a pattern being pessimized, and with my own testing showing no regressions, to me it makes sense to proceed and watch performance numbers carefully for a regression.

Is this acceptable to you?

Cheers,

James

nadav edited edge metadata.Aug 31 2015, 5:08 PM

I think that this is a good change. In many other places in the vectorizer the design was that we let other passes (such as CSE, InstCombine and LSR) clean up after us. I am totally okay with letting LSR do the cleanup. If I remember correctly we have always relied on LSR to do the cleanup and I don't remember why we have the logic for searching other induction variables.

LGTM.

anemet accepted this revision.Aug 31 2015, 8:55 PM
anemet edited edge metadata.
This revision is now accepted and ready to land.Aug 31 2015, 8:55 PM
jmolloy closed this revision.Sep 2 2015, 5:00 AM

r246630.