Page MenuHomePhabricator

[InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)
ClosedPublic

Authored by spatel on Jun 22 2018, 7:04 AM.

Details

Summary

This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806

We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other direction because that's generally better of course). This allows us to remove the shuffle as we do in the regular opcodes-are-the-same cases.

This requires a small hack to make sure we don't mistakenly introduce any extra poison:
https://rise4fun.com/Alive/ZGv

The other examples of opcodes where this would work are add+sub and fadd+fsub, but we already canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. Are there other opcode pairs where we can do this kind of transform?

Note that there's a different fold needed if we've already managed to simplify away a binop as seen in the test based on PR37806, but we manage to get that one case here because the fold is positioned above the demanded elements fold currently.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Jun 22 2018, 7:04 AM
spatel accepted this revision.Jun 22 2018, 7:11 AM

Oops - just noticed a typo that makes this patch wrong.

This revision is now accepted and ready to land.Jun 22 2018, 7:11 AM
spatel planned changes to this revision.Jun 22 2018, 7:11 AM
spatel updated this revision to Diff 152473.Jun 22 2018, 7:18 AM

Patch updated:
In the last rev, I forgot to remove the use of the original opcode when we create the new binop.
So we could have shifts when we should have multiplies (and the tests showed that).

This revision is now accepted and ready to land.Jun 22 2018, 7:18 AM
spatel requested review of this revision.Jun 22 2018, 7:19 AM
RKSimon added inline comments.Jun 22 2018, 7:39 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1177 ↗(On Diff #152473)

Is this going to scale well? There's likely to be a lot of 'similar' cases (ADD x,x -> SHL x,1 etc.)

spatel added inline comments.Jun 22 2018, 9:22 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1177 ↗(On Diff #152473)

That's what I wasn't sure about. I was guessing that add/sub was the common case, and we already canonicalize those. Can you list others? We can make some kind of map if there are a lot, but each case requires its own constant adjustment, so we'd end up with a switch I think.

RKSimon added inline comments.Jun 22 2018, 9:58 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1177 ↗(On Diff #152473)

ADD x,x -> SHL x,1 (or MUL x, 2)

AND x,0 -> MUL x,0 might happen (not sure - it probably disappears too early)

Also, merging OR x, c1 and ADD x, c2 if the carry bits don't clash (sorry, I've forgotten what this is called....) - similarly for OR x,c1 and XOR x,c2

UDIV and LSHR maybe (tricky.....)

spatel added inline comments.Jun 22 2018, 10:13 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1177 ↗(On Diff #152473)

ADD x, x is...different. We won't have a constant operand. Not sure yet what that patch looks like.

The cases that simplify (only 1 binop) will be handled in another patch.

Cases where we reverse a logic op into add (no common bits set?) - yes, that should slot in here. Same with udiv (I'm assuming that's a rare case though).

Is this waiting for a review, or are there changes planned?

The logic here seems sound.

lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1202–1203 ↗(On Diff #152473)

This is going to be confusing later on, given that we already have Opc0 and Opc1.

Is this waiting for a review, or are there changes planned?

It's waiting for further review. (I screwed up the Phab state/history by clicking the wrong 'Add Action...' when I made the first revision.)
As Simon noted, there are other opcode pairs that we can handle. I think they can be added individually as follow-up patches, and the code will evolve into different shapes as needed.

lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1202–1203 ↗(On Diff #152473)

Is it just the abbreviated variable naming (damn that 80-col limit!)? If so, I could spell out 'Opcode' vs. 'Operand'. Or rearrange the logic some way?

Is this waiting for a review, or are there changes planned?

The logic here seems sound.

My concern is how much this will get refactored as more cases detailed in PR37806 are added

lebedev.ri added inline comments.Jun 27 2018, 5:55 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1202–1203 ↗(On Diff #152473)

damn that 80-col limit!

It's actually great :)

If so, I could spell out 'Opcode' vs. 'Operand'. Or rearrange the logic some way?

It is perfectly clear that Opc0 means op code 0.
What i was talking about is that we have two of them, and yet use the same one in both cases.

So maybe doing

assert(Opc0 == Opc1);
unsigned Opc = Opc0;

and using it would be cleanest.

spatel updated this revision to Diff 153116.Jun 27 2018, 10:21 AM
spatel marked an inline comment as done.

Patch updated:
This is just the readability improvement suggested by Roman - make it clear that the opcodes are the same when we do the transform.

The implementation could be substantially different to handle other opcodes. For example, we'll need to call value tracking to determine when 'or' can become 'add'.

Not sure how far we'll go in that direction, but I'll post a proposal that includes the add/or case as a separate patch, and we can decide if we want to build it up in pieces or add the generalization to allow more folds first.

RKSimon accepted this revision.Jun 28 2018, 8:12 AM

LGTM

This revision is now accepted and ready to land.Jun 28 2018, 8:12 AM
This revision was automatically updated to reflect the committed changes.