Page MenuHomePhabricator

[SDAG] try harder to fold casts into vector compare

Authored by spatel on Thu, May 27, 12:58 PM.



sext (vsetcc X, Y) --> vsetcc (zext X), (zext Y) -- (when the zexts are free and a bunch of other conditions)
We have a couple of similar folds to this already for vector selects, but this pattern slips through because it is only a setcc.

The tests are based on the motivating case from:
...but we need extra logic to get that example, so I've left that as a TODO for now.

Diff Detail

Event Timeline

spatel created this revision.Thu, May 27, 12:58 PM
spatel requested review of this revision.Thu, May 27, 12:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptThu, May 27, 12:58 PM
lebedev.ri added inline comments.Thu, May 27, 1:54 PM

You already have CC


SVT is the result type of comparison of values with N00VT type.
Shouldn't this be !TLI.isOperationLegalOrCustom(ISD::SETCC, N00VT) ?


You already have N00 and N01


I think we can also accept ZEXTLOAD, since we'd just increase the desired number of leading zeros?

pengfei added inline comments.Thu, May 27, 8:55 PM

I think this is used for preventing the legal cases of the result type of comparison, e.g. cmp_slt_load_const?

spatel added inline comments.Fri, May 28, 4:48 AM

This is a bit tricky. I missed adding a test for this predicate, but this is what I intended.
That check is there to prevent transforming a legal vector op into a wider op because that's not guaranteed to be a win. Let me add a test and explain further.

And I think we do want to check the SVT type because that is how target handling for setcc is specified. For example, with AVX512, we have:

for (auto VT : { MVT::v2i1, MVT::v4i1, MVT::v8i1, MVT::v16i1 }) {
  setOperationAction(ISD::SETCC,            VT, Custom);

Let me know if I'm misunderstanding it. I'm not finding an example where I can show a difference because other transforms always seem to change the type before we get here.

spatel updated this revision to Diff 348497.Fri, May 28, 5:06 AM
spatel marked 5 inline comments as done.

Patch updated:

  1. Removed redundant local variables.
  2. Added (currently negative) test for widening of a legal compare.
  3. Added TODO comment for allowing widening of an existing zext-load.
spatel added inline comments.Fri, May 28, 5:16 AM

Probably, but I don't have a test example yet for that pattern. So ok to add one more TODO?


If we allow widening of a legal type, this test will show it. We would convert to:

vpmovzxwd	(%rdi), %ymm0   
vpmovzxwd	(%rsi), %ymm1
vpcmpeqd	%ymm1, %ymm0, %ymm0
vpcmpeqd	%ymm1, %ymm1, %ymm1
vpxor	%ymm1, %ymm0, %ymm0

I don't think we can say with certainty that's a win for all targets (although I don't have any non-x86 examples to compare). It's probably ok, but if we want to allow it, then it can be a one-line follow-up patch?

One more random thought: should this be guarded with a check that getBooleanContents(N00VT) == ZeroOrNegativeOneBooleanContent ?

One more random thought: should this be guarded with a check that getBooleanContents(N00VT) == ZeroOrNegativeOneBooleanContent ?

Yes, we need that check. It's already there with some others at line 10913:

if (VT.isVector() && !LegalOperations &&
    TLI.getBooleanContents(N00VT) ==
        TargetLowering::ZeroOrNegativeOneBooleanContent) {

...this comment suggests that I should move this all into a helper because it's too big to read. That's why I had the redundant local variables in the earlier draft - I started by pulling it out, but got lazy. :)

lebedev.ri accepted this revision.Fri, May 28, 5:43 AM

TODOs sound fine to me.
LGTM assuming there are no other comments.


I strongly suspect you might want /*NoOpaques=*/true, and the default is wrong.

This revision is now accepted and ready to land.Fri, May 28, 5:43 AM
spatel updated this revision to Diff 348520.Fri, May 28, 7:07 AM
spatel marked an inline comment as done.

Patch updated:

  1. Restrict to non-opaque constants.
  2. Fix lambda name to be clang-tidy compliant.

Good catch! Not sure how to expose it in a test, but I'll add that.

This revision was automatically updated to reflect the committed changes.