This is an archive of the discontinued LLVM Phabricator instance.

[X86] Improve (select carry C1+1 C1)
ClosedPublic

Authored by kazu on Feb 11 2023, 6:58 PM.

Details

Summary

Without this patch:

return X < 4 ? 3 : 2;

return X < 9 ? 7 : 6;

are compiled as:

31 c0                   xor    %eax,%eax
83 ff 04                cmp    $0x4,%edi
0f 93 c0                setae  %al
83 f0 03                xor    $0x3,%eax

31 c0                   xor    %eax,%eax
83 ff 09                cmp    $0x9,%edi
0f 92 c0                setb   %al
83 c8 06                or     $0x6,%eax

respectively. With this patch, we generate:

31 c0                   xor    %eax,%eax
83 ff 04                cmp    $0x4,%edi
83 d0 02                adc    $0x2,%eax

31 c0                   xor    %eax,%eax
83 ff 04                cmp    $0x4,%edi
83 d0 02                adc    $0x2,%eax

respectively, saving 3 bytes while reducing the tree height.

This patch recognizes the equivalence of OR and ADD
(if bits do not overlap) and delegates to combineAddOrSubToADCOrSBB
for further processing. The same applies to the equivalence of XOR
and SUB.

Diff Detail

Event Timeline

kazu created this revision.Feb 11 2023, 6:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 11 2023, 6:58 PM
kazu requested review of this revision.Feb 11 2023, 6:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 11 2023, 6:58 PM
craig.topper added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
50054

80 columns

kazu updated this revision to Diff 496735.Feb 11 2023, 7:14 PM

Fixed formatting.

kazu marked an inline comment as done.Feb 11 2023, 7:15 PM
goldstein.w.n added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
52553

This block of code is duplicated in CombineOr/CombineXor. Maybe add a helper?

static SDValue combineOrXorWithSETCC(SDNode *N, SDValue N0, SDValue N1) {
  // Delegate to combineAddOrSubToADCOrSBB if we have:
  //
  //   (xor/or (zero_extend (setcc)) imm)
  //
  // where imm has its LSB set, in which case the XOR is equivalent to SUB
  // with the operands swapped.
  SDLoc DL(N);
  if (N0.getOpcode() == ISD::ZERO_EXTEND && N0.hasOneUse() &&
      N0.getOperand(0).getOpcode() == X86ISD::SETCC) {
    if (ConstantSDNode *N1C = dyn_cast<ConstantSDNode>(N1)) {
      bool N1COdd = N1C->getZExtValue() & 1;
      if (N->getOpcode() == ISD::OR ? !N1COdd : N1COdd) {
        if (SDValue R = combineAddOrSubToADCOrSBB(N->getOpcode() == ISD::XOR,
                                                  DL, VT, N1, N0, DAG))
          return R;
      }
    }
  }
}
llvm/test/CodeGen/X86/select_const.ll
548

Two things regarding the tests.

  1. there should be some tests for the sbb case.
  2. can you split this into two patches, one for adding the tests (w.o the code change to isellowering) and another with the isellowering changes so we can see the changes more easily.
RKSimon added inline comments.Feb 12 2023, 2:37 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
49895

Instead of adding a forward declaration - could we not just move the 2 existing combineAddOrSubToADCOrSBB implementations up here as a NFC?

RKSimon added inline comments.Feb 12 2023, 3:33 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
50051

Move hasOneUse() to the end of the checks - its a slow function and we should avoid it if we can.

if (N0.getOpcode() == ISD::ZERO_EXTEND &&
    N0.getOperand(0).getOpcode() == X86ISD::SETCC &&
    N0.hasOneUse()) {
52543

Move hasOneUse() to the end of the checks

kazu updated this revision to Diff 496829.Feb 12 2023, 7:37 PM

Adjusted hasOneUsed.

Regenerated this patch relative to the pre-committed tests.

kazu updated this revision to Diff 496830.Feb 12 2023, 8:02 PM

Regenerated a patch after moving up combineAddOrSubToADCOrSBB.

kazu marked 4 inline comments as done.Feb 12 2023, 8:04 PM

Thank you for all the feedback. Please take a look.

RKSimon added inline comments.Feb 13 2023, 1:36 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
50105

assert((N->getOpcode() == ISD::OR || N->getOpcode() == ISD::XOR) && "Unexpected opcode");

50114

auto *N1C

kazu updated this revision to Diff 497027.Feb 13 2023, 9:47 AM

Added an assert expecting XOR or OR.

Replaced ConstantSDNode* with auto*.

kazu marked 3 inline comments as done.Feb 13 2023, 9:48 AM

Please take a look. Thanks!

kazu added a comment.Feb 14 2023, 9:01 PM

Please take a look. Thanks!

goldstein.w.n added inline comments.Feb 14 2023, 10:14 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
50119

Do you need to check that N0 is getting zero-extended to the same VT as N1?

Also can you do sign-extend as well and swap sbb/adc?

craig.topper added inline comments.Feb 14 2023, 10:52 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
50119

As long as N0 and N1 both are operands of the XOR/OR they should always have the same type. We don't allow mixed types on XOR/OR.

sign-extend wouldn't work because the X86ISD::SETCC node returns an i8 where bits 7:1 are 0. It corresponds to the SETcc instruction. So sign-extend would be sign extending bit 7 which isn't useful. It should have already been turned into zero-extend.

kazu marked 2 inline comments as done.Feb 18 2023, 1:59 PM

I think I've addressed all the feedback so far, but I am happy to address if there is more. Thanks!

RKSimon accepted this revision.Feb 20 2023, 1:57 PM

LGTM with one minor

llvm/test/CodeGen/X86/select_const.ll
571–572

ugt3 ?

This revision is now accepted and ready to land.Feb 20 2023, 1:57 PM
kazu updated this revision to Diff 498976.Feb 20 2023, 4:31 PM

Fixed a typo in the test.

kazu marked an inline comment as done.Feb 20 2023, 4:31 PM
This revision was landed with ongoing or failed builds.Feb 20 2023, 4:38 PM
This revision was automatically updated to reflect the committed changes.