This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX512] Improve lowering of AVX512 test intrinsics
ClosedPublic

Authored by uriel.k on Oct 10 2017, 7:28 AM.

Details

Summary

Added TESTM and TESTNM to the list of instructions that already zeroing unused upper bits and does not need the redundant shift left and shift right instructions afterwards.

Added a pattern for TESTM and TESTNM in iselLowering, so now icmp(neq,and(X,Y), 0) goes folds into TESTM and icmp(eq,and(X,Y), 0) goes folds into TESTNM

This commit is a preparation for lowering the test and testn X86 intrinsics to IR.

Diff Detail

Repository
rL LLVM

Event Timeline

uriel.k created this revision.Oct 10 2017, 7:28 AM
uriel.k added inline comments.Oct 10 2017, 7:32 AM
test/CodeGen/X86/avx512vl-vec-test-testn.ll
1 ↗(On Diff #118386)

This test also governs the case that D38689 revision talked about where in i386 machine it would crash.

RKSimon edited edge metadata.Oct 10 2017, 8:06 AM

Add the new test files to trunk with current codegen and then rebase to show the diff from this patch.

lib/Target/X86/X86ISelLowering.cpp
17191 ↗(On Diff #118386)

Drop this

17194 ↗(On Diff #118386)

Use DAG.getBitcast

RKSimon added inline comments.Oct 10 2017, 8:20 AM
lib/Target/X86/X86ISelDAGToDAG.cpp
457 ↗(On Diff #118386)

Pull out repeated N->getOpcode()

unsigned Opcode = N->getOpcode();
if (Opcode == X86ISD::PCMPEQM || Opcode == X86ISD::PCMPGTM ||
    Opcode == X86ISD::TESTM || Opcode == X86ISD::TESTNM ||
    Opcode == X86ISD::CMPM || Opcode == X86ISD::CMPMU) {
462 ↗(On Diff #118386)
EVT OpVT = N->getOperand(0).getValueType();
if (OpVT == MVT::v8i32 || OpVT == MVT::v8f32)
craig.topper added inline comments.Oct 10 2017, 10:19 PM
lib/Target/X86/X86ISelLowering.cpp
17188 ↗(On Diff #118386)

isBuildVectorAllZeros already peeks through bitcasts so you can just check on Op1 directly.

uriel.k marked 3 inline comments as done.EditedOct 11 2017, 12:35 AM

Add the new test files to trunk with current codegen and then rebase to show the diff from this patch.

You mean to show how the 'CHECK' lines would look like if this tests were to run on trunk and pass but without committing them?

thanks

uriel.k updated this revision to Diff 118566.Oct 11 2017, 2:16 AM
uriel.k marked 2 inline comments as done and 2 inline comments as done.

fixed requested changes by Simon and Craig and as you can see added the new test files to trunk as NFC commit and showed the diff.

RKSimon added inline comments.Oct 11 2017, 3:31 AM
lib/Target/X86/X86ISelDAGToDAG.cpp
457 ↗(On Diff #118386)

Cheers, you should be able to clang-format this on to fewer lines

lib/Target/X86/X86ISelLowering.cpp
17188 ↗(On Diff #118386)

Do we need to support X86ISD::FAND as well?

test/CodeGen/X86/setcc-lowering.ll
28 ↗(On Diff #118566)

Should the vptest instructions be in the Integer domain? That should force a vpbroadcastd.

Kind of a pity that the broadcast doesn't fold, but since you're messing with subregs it's not that surprising.

craig.topper added inline comments.Oct 11 2017, 2:28 PM
lib/Target/X86/X86ISelLowering.cpp
17186 ↗(On Diff #118566)

Add a space between "if" and opening paren.

uriel.k marked 4 inline comments as done.Oct 15 2017, 3:58 AM
uriel.k added inline comments.
test/CodeGen/X86/setcc-lowering.ll
28 ↗(On Diff #118566)

care to explain more what do you mean by messing with subregs? Isn't the vpbroadcast here is essential to create a <8 x i23> vector?

and yes, vptest is considered to be in the integer domain so I don't know why it chose vbroadcastss.

RKSimon added inline comments.Oct 16 2017, 5:47 AM
test/CodeGen/X86/setcc-lowering.ll
28 ↗(On Diff #118566)

The vptest is being performed on the zmm register, not just the ymm we care about. I guess because KNL doesn't support the VL variants? So usually this means there is ymm <-> zmm subreg manipulations going on that will interfere with other patterns such as broadcast folding (which we should be safe to do, although naturally full size load folding would be a no-no).

uriel.k updated this revision to Diff 120777.Oct 30 2017, 1:51 AM

Added the case of the pattern with X86::FAND. Hopefully this is the last update. Please let me know if there is something else I missed.

RKSimon accepted this revision.Nov 4 2017, 10:53 AM

LGTM with a couple of minors

lib/Target/X86/X86ISelDAGToDAG.cpp
462 ↗(On Diff #118386)

This is an NFC - commit it separately first.

lib/Target/X86/X86ISelLowering.cpp
17259 ↗(On Diff #120777)

Remove empty line

This revision is now accepted and ready to land.Nov 4 2017, 10:53 AM
This revision was automatically updated to reflect the committed changes.
uriel.k marked an inline comment as done.