This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806)
ClosedPublic

Authored by spatel on Jun 20 2018, 3:51 PM.

Details

Summary

This is the simplest case from PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have a common variable operand used in a pair of binops with vector constants that are vector selected together, then we can constant shuffle the constant vectors to eliminate the shuffle instruction.

This has some tricky parts that are hopefully addressed in the tests and their respective comments:

  1. If the shuffle mask contains an undef element, then that lane of the result is undef:

http://llvm.org/docs/LangRef.html#shufflevector-instruction

Therefore, we can replace the constant in that lane with an undef value except for div/rem. With div/rem, an undef in the divisor would cause the whole op to be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'. (Making the code from that patch into an InstCombine utility function could be a preliminary NFC patch if that's desired.)

  1. Intersect the wrapping and FMF of the original binops for the new binop. There should be no extra poison or fast-math potential in the new binop that wasn't possible in the original code.
  1. Disregard other uses. Given that we're eliminating uses (shortening the dependency chain), I think that's always the right IR canonicalization. But I purposely chose the udiv test to demonstrate the scenario where both intermediate values have other uses because that seems likely worse for codegen with an expensive math op. This seems like a very rare possibility to me, so I don't think it requires a backend patch first, but if we must avoid that, then we can limit the transform based on uses.

Diff Detail

Event Timeline

spatel created this revision.Jun 20 2018, 3:51 PM
lebedev.ri added inline comments.Jun 20 2018, 11:45 PM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1152

Do you envision adding other top-level if's later?
Otherwise you could do early return.

1170–1171

Hm, you are sure there is a test for this in test/Transforms/InstCombine/shuffle_select.ll?

spatel added inline comments.Jun 21 2018, 5:25 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1152

Yes, there are 2 potential near-term follow-ups discussed in the bug report:

  1. Match 2 variable vectors rather than 1 repeated vector.
  2. Match different opcodes with special constants (example: mul and shl).

Assuming these are all good combines, they might be big enough that they each deserve their own helper, so I'll adjust this.

1170–1171

Yes - the sdiv and urem tests have undef elements in the shuffle mask, so they would both fail without this condition.

spatel updated this revision to Diff 152263.Jun 21 2018, 5:41 AM

Patch updated:
No functional change from the previous rev, but use early exits to minimize indents and add TODO comments for the planned enhancements.

lebedev.ri accepted this revision.Jun 21 2018, 5:50 AM

This doesn't look immediately broken to me,
but you probably want to wait a bit for more definitive review.

test/Transforms/InstCombine/shuffle_select.ll
160–199

Oh, somehow i did not notice this..
No udiv+undef test it seems though.

This revision is now accepted and ready to land.Jun 21 2018, 5:50 AM
RKSimon accepted this revision.Jun 21 2018, 6:46 AM

LGTM

lib/Transforms/InstCombine/InstructionCombining.cpp
1428

Might as well separate this off as an NFC pre-commit.

spatel updated this revision to Diff 152307.Jun 21 2018, 8:07 AM

Patch updated:
Committed the helper function change with rL335242, so this is only the shuffle fold now.

This revision was automatically updated to reflect the committed changes.