This is an archive of the discontinued LLVM Phabricator instance.

[X86] Make memcmp() use PTEST if possible and also enable AVX1
ClosedPublic

Authored by davezarzycki on Oct 8 2019, 4:09 AM.

Details

Summary

Hi Craig – I'm not an expert in this area. Feedback would be appreciated. In particular, the way PTEST was wired up. Thanks!

Diff Detail

Event Timeline

davezarzycki created this revision.Oct 8 2019, 4:09 AM
craig.topper added inline comments.Oct 8 2019, 9:40 AM
lib/Target/X86/X86ISelLowering.cpp
20991

Remove this. It doesn't make sense.

42584

Why are these machine opcodes and not ISD::XOR?

42606

PTEST returns an i32 representing EFLAGS. You can't pass it to ISD::SETCC it doesn't mean anything. The DAG.getSetcc call needs to call the X86ISelLowering.cpp copy of getSetcc that creates an X86ISD::SETCC. You'll need to pass it the X86::COND_E/X86::COND_NE here. This will return an MVT::i8 so you'll need to emit an ISD::TRUNCATE to VT after it.

davezarzycki marked 2 inline comments as done.Oct 10 2019, 6:56 AM
davezarzycki added inline comments.
lib/Target/X86/X86ISelLowering.cpp
42584

My first attempt at this patch used ISD::XOR after the vector bitcasts, but scalar instructions were often (but not always) were emitted. I later noticed the comment at the top of the function which suggests/implies that explicit vector instructions are the right solution and indeed that worked reliably.

42606

Thanks for the PTEST feedback. I think I have it working the way you describe.

Incorporated PTEST feedback.

In general we don't use auto for anything other than casts or iterators - see the llvm coding style guidelines

lib/Target/X86/X86ISelLowering.cpp
42551

bool HasAVX = Subtarget.hasAVX();

42558

I'd prefer not to nest ternary operators like this - either nest them in brackets or if-else

42575

(style guide) Don't use auto - use unsigned

craig.topper added inline comments.Oct 10 2019, 9:53 AM
lib/Target/X86/X86ISelLowering.cpp
42606

80 columns

craig.topper added inline comments.Oct 10 2019, 10:19 AM
lib/Target/X86/X86ISelLowering.cpp
42605

You already used PT as a variable name earlier. The earlier one should probably be UsePTEST

craig.topper added inline comments.Oct 10 2019, 10:32 AM
test/CodeGen/X86/setcc-wide-types.ll
2–3

Need an sse4.1 command line since we changed that behavior too.

148

The AVX512 checks are missing here?

This should fix the scaliarization issue seen with ISD::XOR and ISD::OR

diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 377c608..92d1166 100644

  • a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -4344,7 +4344,9 @@ SDValue DAGCombiner::hoistLogicOpWithSameOpcodeHands(SDNode *N) {

if ((HandOpcode == ISD::BITCAST || HandOpcode == ISD::SCALAR_TO_VECTOR) &&
     Level <= AfterLegalizeTypes) {
  // Input types must be integer and the same.
  • if (XVT.isInteger() && XVT == Y.getValueType()) {

+ if (XVT.isInteger() && XVT == Y.getValueType() &&
+ !(VT.isVector() && TLI.isTypeLegal(VT) &&
+ !XVT.isVector() && !TLI.isTypeLegal(XVT))) {

  SDValue Logic = DAG.getNode(LogicOpcode, DL, XVT, X, Y);
  return DAG.getNode(HandOpcode, DL, VT, Logic);
}
davezarzycki marked an inline comment as done and 2 inline comments as not done.

I believe I've incorporated all of the feedback so far.

Hi @craig.topper – Ping. Is this ready to land? Any other requests?

RKSimon added inline comments.Oct 15 2019, 5:32 AM
lib/Target/X86/X86ISelLowering.cpp
42604

SDValue Mask = DAG.getAllOnesConstant(DL, MVT::i64)

42607

DAG.getAllOnesConstant(DL, MVT::i16)

Now using DAG.getAllOnesConstant()

This revision is now accepted and ready to land.Oct 15 2019, 10:21 AM