This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
ClosedPublic

Authored by spatel on Jun 27 2018, 3:05 PM.

Details

Summary

This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have 2 different variable values, then we shuffle (select) those lanes, shuffle (select) the constants, and then perform the binop. This eliminates a binop.

The new shuffle uses the same shuffle mask as the existing shuffle, so there's no danger of creating a difficult shuffle.

All of the earlier constraints still apply, but we also check for extra uses to avoid creating more instructions than we'll remove.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Jun 27 2018, 3:05 PM
lebedev.ri added inline comments.Jun 28 2018, 8:00 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1186 ↗(On Diff #153188)

What about getSafeVectorConstantForIntDivRem()?

RKSimon accepted this revision.Jun 28 2018, 8:07 AM

LGTM with one minor

lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1185 ↗(On Diff #153188)

Explicitly mention that the shuffle mask is safe?

This revision is now accepted and ready to land.Jun 28 2018, 8:07 AM
spatel added inline comments.Jun 28 2018, 8:09 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1186 ↗(On Diff #153188)

We still call that below here, and the translation from undef to 1 is covered by the "srem_2_vars" test.

Am I understanding the question?

lebedev.ri added inline comments.Jun 28 2018, 8:48 AM
test/Transforms/InstCombine/shuffle_select.ll
411–423 ↗(On Diff #153188)

%TMP1's second element is undef, since the mask is undef, no?
So we are srem'ing by the vector with one known undef.
That's ok?

spatel added inline comments.Jun 28 2018, 9:48 AM
test/Transforms/InstCombine/shuffle_select.ll
411–423 ↗(On Diff #153188)

Ah, good point. I don't think we currently have the analysis to cause trouble there, but we might in the future.

So we need to replace the undef in the shuffle mask with the proper lane that selects from one of the source vectors. That might be forbidden because we'd be creating a more specific shuffle mask than already existed. Should we just give up on trying to fold div/rem in this case?

lebedev.ri requested changes to this revision.Jun 28 2018, 9:59 AM
lebedev.ri added inline comments.
test/Transforms/InstCombine/shuffle_select.ll
411–423 ↗(On Diff #153188)

Should we just give up on trying to fold div/rem in this case?

I would think so, at least for now.

This revision now requires changes to proceed.Jun 28 2018, 9:59 AM
spatel updated this revision to Diff 153394.Jun 28 2018, 2:27 PM

Patch updated:

  1. Rebased to trunk now that D48485 is committed (r335888).
  2. Disallow the fold for any integer div/rem for now (see TODOs for follow-ups)
  3. Added a comment about the safety of the shuffle mask.
  4. Adjusted the tests - use 'xor' to demonstrate the multi-use limitation since we're not folding div/rem.
lebedev.ri accepted this revision.Jun 28 2018, 2:30 PM

Looks good, thank you!

This revision is now accepted and ready to land.Jun 28 2018, 2:30 PM
This revision was automatically updated to reflect the committed changes.