This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] match vector compare and select sizes with extload operand (PR37427)
ClosedPublic

Authored by spatel on May 24 2018, 8:57 AM.

Details

Summary

This patch started off much more general and ambitious, but it's been a nightmare seeing all the ways x86 vector codegen can go wrong.

So the code is still structured to allow extending easily, but it's currently limited in several ways:

  1. Only handle cases with an extending load.
  2. Only handle cases with a zero constant compare.
  3. Ignore setcc with vector bitmask (SetCCWidth != 1) - so AVX512 should be unaffected.

The motivating case from PR37427:
https://bugs.llvm.org/show_bug.cgi?id=37427
...is the 1st test, and that shows the expected win - we eliminated the unnecessary intermediate cast.

There's a clear regression in the last test (sgt_zero_fp_select) because we longer recognize a 'SHRUNKBLEND' opportunity. I think that general problem is also present in sgt_zero, so I'm hoping we can fix that more generally in a follow-up. We need to match a sign-bit setcc from a sign-extended operand and remove it.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.May 24 2018, 8:57 AM
craig.topper added inline comments.May 24 2018, 12:30 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7271 ↗(On Diff #148423)

Shouldn't this be isLoadExtLegalOrCustom? I think ZEXTLOAD/SEXTLOAD will always be considered Legal for isOperationLegalOrCustom. Nothing makes them illegal in the OpActions array. isLoadExtLegalOrCustom uses a different array.

craig.topper added inline comments.May 24 2018, 12:34 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7271 ↗(On Diff #148423)

Maybe worse than that. ISD::SEXTLOAD/ISD::ZEXTLOAD aren't opcodes. They're in their own enum. I think this aliased to asking about EntryToken and TokenFactor?

spatel added inline comments.May 24 2018, 12:41 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7271 ↗(On Diff #148423)

Oops...lousy old C enum values auto-completed, and I didn't think to double-check. Let me correct that.

spatel updated this revision to Diff 148465.May 24 2018, 1:11 PM

Patch updated:
Use isLoadExtLegalOrCustom() to check if the load is allowed. This means less diffs for AVX1 because it has a smaller set of extending loads than AVX2.

I need to step back through some of these tests and see if the earlier rev of the patch was miscompiling or just producing weird sequences.

Patch updated:
Use isLoadExtLegalOrCustom() to check if the load is allowed. This means less diffs for AVX1 because it has a smaller set of extending loads than AVX2.

I need to step back through some of these tests and see if the earlier rev of the patch was miscompiling or just producing weird sequences.

I think the AVX1 changes in the old rev were actually improvements; what we see here now (ie, AVX1 unchanged tests, so either with or without this patch), is worse. But I'll just make that another potential follow-up. Hopefully, now I'm checking the extload availability correctly, so there won't be as many oddities to investigate all at once.

RKSimon accepted this revision.Jun 8 2018, 10:54 AM

LGTM (as long as you don't forget about the lost AVX1 improvements and the AVX2 slt_zero_fp_select regression!)

This revision is now accepted and ready to land.Jun 8 2018, 10:54 AM
This revision was automatically updated to reflect the committed changes.