This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Reuse known zeros/ones after zero-extension of i1.
Abandoned · Public

Authored by jonpa on Mar 18 2021, 4:02 PM.


Details

Reviewers

Summary

This is an optimization for zero extensions of i1 values, which resulted from looking into the perl regression against GCC. I noticed a lot of LHI 0, LHI 1, LOC sequences, which the GCC output did not seem to have.

Basically, after an ICMP NE 0 or an ICMP EQ 1, the compared register is already known to hold one of the two constants needed for the result (0 when the NE 0 test fails, 1 when the EQ 1 test succeeds), so there is no need to load that constant with LHI (LGHI).
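
The reuse idea can be checked outside of LLVM with a minimal sketch. The function names below are hypothetical and only model the select semantics in plain C++; they are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Baseline lowering: both constants are materialized and one is selected.
constexpr uint32_t setccNeZeroBaseline(uint32_t x) { return x != 0 ? 1u : 0u; }
constexpr uint32_t setccEqOneBaseline(uint32_t x) { return x == 1 ? 1u : 0u; }

// Reuse for NE 0: when the test fails, x itself is 0, so x can stand in
// for the zero operand of the select.
constexpr uint32_t setccNeZeroReuse(uint32_t x) { return x != 0 ? 1u : x; }

// Reuse for EQ 1: when the test succeeds, x itself is 1, so x can stand in
// for the one operand of the select.
constexpr uint32_t setccEqOneReuse(uint32_t x) { return x == 1 ? x : 0u; }

constexpr bool reuseMatchesBaseline(uint32_t x) {
  return setccNeZeroReuse(x) == setccNeZeroBaseline(x) &&
         setccEqOneReuse(x) == setccEqOneBaseline(x);
}

// Compile-time checks on the two interesting values.
static_assert(reuseMatchesBaseline(0u), "NE 0 / EQ 1 reuse must hold for 0");
static_assert(reuseMatchesBaseline(1u), "NE 0 / EQ 1 reuse must hold for 1");

// Runtime spot check over a few more values; asserts on any mismatch.
inline void checkSamples() {
  const uint32_t samples[] = {0u, 1u, 2u, 255u, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t x : samples)
    assert(reuseMatchesBaseline(x));
}
```

Note that the identity holds for any input value, not only for values already known to be 0 or 1, because the reused operand is only selected on the branch where its value is known.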

The i32 cases were quite straightforward, but then doing the same for i64 was a bit more of an effort. Since we always return i32 from getSetCCResultType(), these cases needed different handling depending on the user. If an i64 use is needed, then I chose to promote the setcc result in combineZERO_EXTEND(). For the i32 user (of i64 comparison), the reused operand was instead truncated.
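
A small sketch of why both i64 strategies preserve the result, again in plain C++ with hypothetical names rather than SelectionDAG nodes. It assumes the comparison itself is always done on the full 64-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of zext(icmp ne i64 x, 0) for an i32 and an i64 user.
constexpr uint32_t refNeZero32(uint64_t x) { return x != 0 ? 1u : 0u; }
constexpr uint64_t refNeZero64(uint64_t x) { return x != 0 ? 1u : 0u; }

// i32 user of an i64 comparison: the reused operand is truncated. The
// truncated value is only selected when the full 64-bit x is 0, so the
// truncation cannot drop any set bits.
constexpr uint32_t reuseTrunc32(uint64_t x) {
  return x != 0 ? 1u : static_cast<uint32_t>(x);
}

// i64 user: the select is done directly in i64 (promoted setcc), so no
// separate zero extension of an i32 result is needed afterwards.
constexpr uint64_t reusePromoted64(uint64_t x) {
  return x != 0 ? uint64_t{1} : x;
}

// A value whose low 32 bits are zero still compares as non-zero.
static_assert(reuseTrunc32(uint64_t{1} << 32) == 1u, "compare uses all 64 bits");
static_assert(reuseTrunc32(0) == refNeZero32(0), "zero case, i32 user");
static_assert(reusePromoted64(0) == refNeZero64(0), "zero case, i64 user");

// Runtime spot check; asserts on any mismatch with the reference.
inline void checkI64Samples() {
  const uint64_t samples[] = {0, 1, 2, uint64_t{1} << 32, ~uint64_t{0}};
  for (uint64_t x : samples) {
    assert(reuseTrunc32(x) == refNeZero32(x));
    assert(reusePromoted64(x) == refNeZero64(x));
  }
}
```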

For the case of i64 user, a lot of llgfr:s remained without the handling in combineZERO_EXTEND():

Optimized legalized selection DAG: %bb.0 'prototype_p:bb'
SelectionDAG has 15 nodes:
  t0: ch = EntryToken
  t5: i64,ch = load<(load 8 from `%0** undef`)> t0, undef:i64, undef:i64
        t25: i32 = truncate t5
        t24: i32 = SystemZISD::ICMP t5, Constant:i64<0>, TargetConstant:i32<0>
      t29: i32 = SystemZISD::SELECT_CCMASK Constant:i32<1>, t25, TargetConstant:i32<14>, TargetConstant:i32<6>, t24
    t20: i64 = zero_extend t29
  t12: ch,glue = CopyToReg t0, Register:i64 $r2d, t20
  t13: ch = SystemZISD::RET_FLAG t12, Register:i64 $r2d, t12:1

->
        ltg     %r0, 0(%r1)
        lochilh %r0, 1
        llgfr   %r2, %r0

The general effects on SPEC'17 output (columns: instruction, count before, count after, difference):

(1) Just the i32/i32 cases of NE 0:

lhi            :               225081               221486    -3595
lghi           :               445509               444420    -1089
locghilh       :                 3717                 2676    -1041
chsi           :                57297                56385     -912
lt             :                13672                14348     +676
lochilh        :                 8796                 9424     +628
llgfr          :                90010                90533     +523
risbgn         :               137540               137980     +440
tmll           :                53266                53693     +427
ltr            :                 6140                 6550     +410
lgr            :               849527               849890     +363
llc            :                39671                39994     +323
locrlh         :                 1492                 1807     +315
...

(2) Also the i64 cases, compared to (1):

lghi           :               444420               441828    -2592
lhi            :               221486               219782    -1704
lr             :                62223                62878     +655
cghsi          :                32665                32175     -490
tmll           :                53693                54181     +488
llgfr          :                90533                90066     -467
ltg            :               157760               158133     +373
risbgn         :               137980               138334     +354
jne            :                42684                42990     +306
lg             :               982786               982931     +145
je             :               335154               335281     +127
cije           :               107363               107237     -126
lgfr           :                91442                91565     +123
lgr            :               849890               850006     +116
ltgr           :                10951                11067     +116
...

(3) Also EQ 1 (both i32 and i64), compared to (2):

lochilh        :                 9492                 9787     +295
lhi            :               219782               219567     -215
lochie         :                14183                13975     -208
chi            :                53350                53448      +98
chsi           :                56337                56259      -78
lr             :                62878                62950      +72
tmll           :                54181                54243      +62
lghi           :               441828               441770      -58
locghie        :                 7174                 7116      -58
risbgn         :               138334               138383      +49
...

In total, master <> (3):

lhi            :               225081               219567    -5514
lghi           :               445509               441770    -3739
locghilh       :                 3717                 2673    -1044
chsi           :                57297                56259    -1038
lochilh        :                 8796                 9787     +991
tmll           :                53266                54243     +977
risbgn         :               137540               138383     +843
lr             :                62152                62950     +798
lt             :                13672                14343     +671
jne            :                42430                43028     +598
cghsi          :                32644                32170     -474
lgr            :               849527               849994     +467
ltr            :                 6140                 6557     +417
...

I see some more LR:s, which I think is when the reused constant also has another user. I did not manage to avoid these cases when working with the DAGs (local only), so some kind of pseudo-expander might be more powerful here. That probably requires more work, and I am not sure if trading an LHI for an LR is bad, since the comparison does not clobber the register...

There are fewer comparisons with memory - the value is now loaded, compared and reused (see fun9 below).

Does this seem like a good idea to try?

New tests, master <> patched (skipped functions identical):

fun0:                                   fun0:
        chi     %r2, 0                          chi     %r2, 0
        lhi     %r2, 0                <
        lochilh %r2, 1                          lochilh %r2, 1
                                      >
        br      %r14                            br      %r14


fun4:                                   fun4:
        cghi    %r2, 0                          cghi    %r2, 0
        lhi     %r2, 0                <
        lochilh %r2, 1                          lochilh %r2, 1
                                      >
        br      %r14                            br      %r14


fun5:                                   fun5:
        cghsi   0(%r1), 0             |         ltg     %r0, 0(%r1)
        lghi    %r0, 0                <
        locghilh        %r0, 1                  locghilh        %r0, 1
        stg     %r0, 0(%r1)                     stg     %r0, 0(%r1)
        br      %r14                            br      %r14


fun6:                                   fun6:
        chi     %r2, 1                          chi     %r2, 1
        lhi     %r2, 0                |         lochilh %r2, 0
        lochie  %r2, 1                |
        br      %r14                            br      %r14


fun8:                                   fun8:
        cghi    %r2, 1                          cghi    %r2, 1
        lhi     %r2, 0                |         lochilh %r2, 0
        lochie  %r2, 1                |
        br      %r14                            br      %r14


fun9:                                   fun9:
        cghsi   0(%r1), 1             |         lg      %r0, 0(%r1)
        lghi    %r0, 0                |         cghi    %r0, 1
        locghie %r0, 1                |         locghilh        %r0, 0
        stg     %r0, 0(%r1)                     stg     %r0, 0(%r1)
        br      %r14                            br      %r14
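
The fun0 and fun6 sequences above can be compared with a tiny emulator. This is only a sketch under stated assumptions: the Machine struct and helper names are hypothetical, a compare sets CC to 0 (equal), 1 (low) or 2 (high), LHI leaves CC unchanged, and the "lh" and "e" suffixes of LOCHI mean "CC is low or high" and "CC is equal" respectively.

```cpp
#include <cassert>
#include <cstdint>

enum CondCode { CC_EQ = 0, CC_LOW = 1, CC_HIGH = 2 };

// Minimal machine state: one 32-bit register and the condition code.
struct Machine {
  int32_t r2;
  CondCode cc;
};

// chi: signed compare of r2 with an immediate, sets CC only.
void chi(Machine &m, int32_t imm) {
  m.cc = m.r2 == imm ? CC_EQ : (m.r2 < imm ? CC_LOW : CC_HIGH);
}
// lhi: unconditional immediate load, CC is left untouched.
void lhi(Machine &m, int32_t imm) { m.r2 = imm; }
// lochilh: load the immediate only if CC is low or high (not equal).
void lochilh(Machine &m, int32_t imm) {
  if (m.cc == CC_LOW || m.cc == CC_HIGH) m.r2 = imm;
}
// lochie: load the immediate only if CC is equal.
void lochie(Machine &m, int32_t imm) {
  if (m.cc == CC_EQ) m.r2 = imm;
}

int32_t fun0Master(int32_t x) {
  Machine m{x, CC_EQ};
  chi(m, 0); lhi(m, 0); lochilh(m, 1);
  return m.r2;
}
int32_t fun0Patched(int32_t x) {
  Machine m{x, CC_EQ};
  chi(m, 0); lochilh(m, 1);
  return m.r2;
}
int32_t fun6Master(int32_t x) {
  Machine m{x, CC_EQ};
  chi(m, 1); lhi(m, 0); lochie(m, 1);
  return m.r2;
}
int32_t fun6Patched(int32_t x) {
  Machine m{x, CC_EQ};
  chi(m, 1); lochilh(m, 0);
  return m.r2;
}

// True if the master and patched sequences agree for input x.
bool sequencesAgree(int32_t x) {
  return fun0Master(x) == fun0Patched(x) && fun6Master(x) == fun6Patched(x);
}

// Check all zero-extended i8 inputs (0..255), as used by fun0 and fun6.
inline void checkAllBytes() {
  for (int32_t x = 0; x <= 255; ++x)
    assert(sequencesAgree(x));
}
```

In the patched fun6 the condition is inverted: instead of conditionally loading 1 on equality, the sequence conditionally loads 0 on inequality and keeps the register (already 1) otherwise.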

Diff Detail

Event Timeline

jonpa created this revision. · Mar 18 2021, 4:02 PM

Herald added a subscriber: hiraditya. · Mar 18 2021, 4:02 PM

jonpa requested review of this revision. · Mar 18 2021, 4:02 PM

Herald added a project: Restricted Project. · Mar 18 2021, 4:02 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B94578: Diff 331705. · Mar 18 2021, 4:55 PM

jonpa mentioned this in D100039: [SystemZ] Isel cleanup pass: Reuse known zeros/ones after zero-extension of i1. · Apr 7 2021, 8:20 AM

Patch improved:

Avoid transforming single-use loads - it is probably better to do a compare with memory.
Add an AssertZext node on the reused register so that it is known that this is an i1.
Avoid some cases where i32 setcc is zero-extended.
Add an option (setcc-multiple-users) to try the single-user case only.
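
The operand filter described by the first and last items can be sketched in isolation. The CmpOperand struct and the function name are hypothetical stand-ins for the SDValue queries; the logic is meant to mirror cmpOpHasOneUser() in the patch, with allowMultipleUsers playing the role of the setcc-multiple-users option.

```cpp
#include <cassert>

// Hypothetical stand-in for the properties queried on the compare operand.
struct CmpOperand {
  unsigned numUses; // number of uses of the operand value
  bool isLoad;      // true if the operand is produced by a load
};

// Returns true if the compare operand may be reused as the known 0 or 1.
bool mayReuseCmpOperand(const CmpOperand &op, bool allowMultipleUsers) {
  const bool oneUse = op.numUses == 1;
  // A single-use load is left alone so that a compare with memory can
  // still be selected.
  if (oneUse && op.isLoad)
    return false;
  // Otherwise accept single-use operands always, and multi-use operands
  // only when the option allows them.
  return oneUse || allowMultipleUsers;
}
```

With the option left at its default (multiple users allowed), only single-use loads are rejected; with it disabled, any operand with more than one use is rejected as well.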

Benchmarking (also showing the sibling patch "ISelCleanup.cpp"):

master <> SETCC (DAG patch)

lhi            :               225040               220226    -4814
lghi           :               445603               443127    -2476
lr             :                61869                62835     +966
lgr            :               853946               854446     +500
...

master <> SETCC (DAG patch), no multiple users

lhi            :               225040               221382    -3658
lghi           :               445603               443220    -2383
lgr            :               853946               854483     +537
lr             :                61869                62170     +301
...

master <> ISelCleanup pass

lhi            :               225040               220233    -4807
lghi           :               445603               443111    -2492
lr             :                61869                62836     +967
lgr            :               853946               854446     +500
...

master <> ISelCleanup pass, no multiple users

lhi            :               225040               221740    -3300
lghi           :               445603               443562    -2041
lgr            :               853946               854384     +438
lr             :                61869                61961      +92
...

The static results above show a reduced number of immediate loads and some extra register moves. Limiting the transformation to compare operands with a single user gives fewer new register moves, but still some... (not sure why).

This patch is now closer to the ISelCleanup pass in the output differences above.

Measurements (quick nightly run):
Not much change - I have seen some 1% improvements but also some regressions.

Xalancbmk:
Z14: regresses by a couple of percent except with "ISelCleanup, no multiple users", which is the least aggressive change.
Z15: improves by one percent with multiple users, regresses by one percent with no multiple users. Random effect, it seems...

Deepsjeng, Z15: regresses with "DAG patch", but only with "no multiple users", which indicates some random effect since that is a lesser change than with "multiple users"...

The only version so far that has no regressions on either machine (and is, if anything, a very slight improvement) is the last one, "ISelCleanup, no multiple users".

Conclusions:

There seems to be no clear direct benchmark improvement, although there are a few thousand fewer immediate loads. This is probably because the immediate loads should generally be invisible on an out-of-order (OOO) machine.
The patches are non-trivial, so they seem difficult to justify given these measurements.

Remaining ideas:

Perhaps the ISelCleanup code could be used as a basis to use with the PeepholeOptimizer. I will try that next.
Maybe look in more detail into where the remaining extra register moves come from.

Harbormaster completed remote builds in B97942: Diff 336398. · Apr 9 2021, 5:31 AM

jonpa mentioned this in D100242: [SystemZ / TII] Peephole optimization of zero-extension of i1. · Apr 10 2021, 3:12 AM

Seems better to do this with PeepholeOptimizer instead (https://reviews.llvm.org/D100242).

Revision Contents

Path                                              Size
llvm/lib/Target/SystemZ/SystemZISelLowering.h     1 line
llvm/lib/Target/SystemZ/SystemZISelLowering.cpp   125 lines
llvm/test/CodeGen/SystemZ/setcc-05.ll             130 lines

Diff 336398

llvm/lib/Target/SystemZ/SystemZISelLowering.h

Show First 20 Lines • Show All 630 Lines • ▼ Show 20 Lines	private:
SDValue lowerShift(SDValue Op, SelectionDAG &DAG, unsigned ByScalar) const;		SDValue lowerShift(SDValue Op, SelectionDAG &DAG, unsigned ByScalar) const;

bool canTreatAsByteVector(EVT VT) const;		bool canTreatAsByteVector(EVT VT) const;
SDValue combineExtract(const SDLoc &DL, EVT ElemVT, EVT VecVT, SDValue OrigOp,		SDValue combineExtract(const SDLoc &DL, EVT ElemVT, EVT VecVT, SDValue OrigOp,
unsigned Index, DAGCombinerInfo &DCI,		unsigned Index, DAGCombinerInfo &DCI,
bool Force) const;		bool Force) const;
SDValue combineTruncateExtract(const SDLoc &DL, EVT TruncVT, SDValue Op,		SDValue combineTruncateExtract(const SDLoc &DL, EVT TruncVT, SDValue Op,
DAGCombinerInfo &DCI) const;		DAGCombinerInfo &DCI) const;
		bool isSETCCWithReusable64BitOp(SDValue Op, SelectionDAG &DAG) const;
SDValue combineZERO_EXTEND(SDNode *N, DAGCombinerInfo &DCI) const;		SDValue combineZERO_EXTEND(SDNode *N, DAGCombinerInfo &DCI) const;
SDValue combineSIGN_EXTEND(SDNode *N, DAGCombinerInfo &DCI) const;		SDValue combineSIGN_EXTEND(SDNode *N, DAGCombinerInfo &DCI) const;
SDValue combineSIGN_EXTEND_INREG(SDNode *N, DAGCombinerInfo &DCI) const;		SDValue combineSIGN_EXTEND_INREG(SDNode *N, DAGCombinerInfo &DCI) const;
SDValue combineMERGE(SDNode *N, DAGCombinerInfo &DCI) const;		SDValue combineMERGE(SDNode *N, DAGCombinerInfo &DCI) const;
bool canLoadStoreByteSwapped(EVT VT) const;		bool canLoadStoreByteSwapped(EVT VT) const;
SDValue combineLOAD(SDNode *N, DAGCombinerInfo &DCI) const;		SDValue combineLOAD(SDNode *N, DAGCombinerInfo &DCI) const;
SDValue combineSTORE(SDNode *N, DAGCombinerInfo &DCI) const;		SDValue combineSTORE(SDNode *N, DAGCombinerInfo &DCI) const;
SDValue combineVECTOR_SHUFFLE(SDNode *N, DAGCombinerInfo &DCI) const;		SDValue combineVECTOR_SHUFFLE(SDNode *N, DAGCombinerInfo &DCI) const;

llvm/lib/Target/SystemZ/SystemZISelLowering.cpp


Show First 20 Lines • Show All 2,686 Lines • ▼ Show 20 Lines	static void lowerGR128Binary(SelectionDAG &DAG, const SDLoc &DL, EVT VT,
unsigned Opcode, SDValue Op0, SDValue Op1,		unsigned Opcode, SDValue Op0, SDValue Op1,
SDValue &Even, SDValue &Odd) {		SDValue &Even, SDValue &Odd) {
SDValue Result = DAG.getNode(Opcode, DL, MVT::Untyped, Op0, Op1);		SDValue Result = DAG.getNode(Opcode, DL, MVT::Untyped, Op0, Op1);
bool Is32Bit = is32Bit(VT);		bool Is32Bit = is32Bit(VT);
Even = DAG.getTargetExtractSubreg(SystemZ::even128(Is32Bit), DL, VT, Result);		Even = DAG.getTargetExtractSubreg(SystemZ::even128(Is32Bit), DL, VT, Result);
Odd = DAG.getTargetExtractSubreg(SystemZ::odd128(Is32Bit), DL, VT, Result);		Odd = DAG.getTargetExtractSubreg(SystemZ::odd128(Is32Bit), DL, VT, Result);
}		}

// Return an i32 value that is 1 if the CC value produced by CCReg is		// Return a value that is 1 if the CC value produced by CCReg is in the mask
// in the mask CCMask and 0 otherwise. CC is known to have a value		// CCMask and 0 otherwise. CC is known to have a value in CCValid, so other
// in CCValid, so other values can be ignored.		// values can be ignored.
static SDValue emitSETCC(SelectionDAG &DAG, const SDLoc &DL, SDValue CCReg,		static SDValue emitSETCC(SelectionDAG &DAG, const SDLoc &DL, SDValue CCReg,
unsigned CCValid, unsigned CCMask) {		unsigned CCValid, unsigned CCMask,
SDValue Ops[] = {DAG.getConstant(1, DL, MVT::i32),		SDValue ZeroOp = SDValue(), SDValue OneOp = SDValue()) {
DAG.getConstant(0, DL, MVT::i32),		EVT VT = ZeroOp != SDValue() ? ZeroOp->getValueType(0)
		: OneOp != SDValue() ? OneOp->getValueType(0)
		: MVT::i32;

		SDValue Ops[] = {OneOp == SDValue() ? DAG.getConstant(1, DL, VT)
		: OneOp,
		ZeroOp == SDValue() ? DAG.getConstant(0, DL, VT)
		: ZeroOp,
DAG.getTargetConstant(CCValid, DL, MVT::i32),		DAG.getTargetConstant(CCValid, DL, MVT::i32),
DAG.getTargetConstant(CCMask, DL, MVT::i32), CCReg};		DAG.getTargetConstant(CCMask, DL, MVT::i32), CCReg};
return DAG.getNode(SystemZISD::SELECT_CCMASK, DL, MVT::i32, Ops);		return DAG.getNode(SystemZISD::SELECT_CCMASK, DL, VT, Ops);
}		}

// Return the SystemISD vector comparison operation for CC, or 0 if it cannot		// Return the SystemISD vector comparison operation for CC, or 0 if it cannot
// be done directly. Mode is CmpMode::Int for integer comparisons, CmpMode::FP		// be done directly. Mode is CmpMode::Int for integer comparisons, CmpMode::FP
// for regular floating-point comparisons, CmpMode::StrictFP for strict (quiet)		// for regular floating-point comparisons, CmpMode::StrictFP for strict (quiet)
// floating-point comparisons, and CmpMode::SignalingFP for strict signaling		// floating-point comparisons, and CmpMode::SignalingFP for strict signaling
// floating-point comparisons.		// floating-point comparisons.
enum class CmpMode { Int, FP, StrictFP, SignalingFP };		enum class CmpMode { Int, FP, StrictFP, SignalingFP };
▲ Show 20 Lines • Show All 191 Lines • ▼ Show 20 Lines	SDValue SystemZTargetLowering::lowerVectorSETCC(SelectionDAG &DAG,
}		}
if (Chain && Chain.getNode() != Cmp.getNode()) {		if (Chain && Chain.getNode() != Cmp.getNode()) {
SDValue Ops[2] = { Cmp, Chain };		SDValue Ops[2] = { Cmp, Chain };
Cmp = DAG.getMergeValues(Ops, DL);		Cmp = DAG.getMergeValues(Ops, DL);
}		}
return Cmp;		return Cmp;
}		}

		static bool isZeroExtended(SDValue Op) {
		for (SDNode *Node : Op->uses())
		if (Node->getOpcode() == ISD::ZERO_EXTEND)
		return true;
		return false;
		}

		// Need to avoid truncating an AssertSext to a more narrow type.
		static SDValue getTruncatedCmpOp(SDValue Op, SelectionDAG &DAG) {
		if (Op->getOpcode() == ISD::AssertSext)
		Op = Op->getOperand(0);
		return DAG.getNode(ISD::TRUNCATE, SDLoc(Op), MVT::i32, Op);
		}

		// EXPERIMENTAL
		#include "llvm/Support/CommandLine.h"
		static cl::opt<bool> MultipleCmpOpUsers("setcc-multiple-users", cl::init(true));

		static bool cmpOpHasOneUser(SDValue Op) {
		if (Op.hasOneUse() && isa<LoadSDNode>(Op))
		return false;
		return (Op.hasOneUse() \|\| MultipleCmpOpUsers);
		}

SDValue SystemZTargetLowering::lowerSETCC(SDValue Op,		SDValue SystemZTargetLowering::lowerSETCC(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
SDValue CmpOp0 = Op.getOperand(0);		SDValue CmpOp0 = Op.getOperand(0);
SDValue CmpOp1 = Op.getOperand(1);		SDValue CmpOp1 = Op.getOperand(1);
ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(2))->get();		ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(2))->get();
		bool CmpOpOneUser = cmpOpHasOneUser(CmpOp0);
SDLoc DL(Op);		SDLoc DL(Op);
EVT VT = Op.getValueType();		EVT VT = Op.getValueType();
if (VT.isVector())		if (VT.isVector())
return lowerVectorSETCC(DAG, DL, VT, CC, CmpOp0, CmpOp1);		return lowerVectorSETCC(DAG, DL, VT, CC, CmpOp0, CmpOp1);

Comparison C(getCmp(DAG, CmpOp0, CmpOp1, CC, DL));		Comparison C(getCmp(DAG, CmpOp0, CmpOp1, CC, DL));
SDValue CCReg = emitCmp(DAG, DL, C);		SDValue CCReg = emitCmp(DAG, DL, C);
return emitSETCC(DAG, DL, CCReg, C.CCValid, C.CCMask);
		// Determine if either 0 or 1 is the known value after the comparison
		// and see if it can be reused instead of loading it again. For instance,
		// in the case of a comparison against 0 for inequality, there is no need
		// to load the zero constant into a register.
		SDValue ZeroOp = SDValue();
		SDValue OneOp = SDValue();
		auto *Imm = dyn_cast<ConstantSDNode>(CmpOp1);
		if (Imm != nullptr && C.Opcode == SystemZISD::ICMP && CmpOpOneUser) {
		EVT CmpVT = CmpOp0->getValueType(0);
		if (VT == MVT::i32 && !isZeroExtended(Op)) {
		if (CC == ISD::SETNE && Imm->getZExtValue() == 0) {
		if (CmpVT == MVT::i32)
		ZeroOp = CmpOp0;
		else if (CmpVT == MVT::i64)
		// i64 comparison: all users i32.
		ZeroOp = getTruncatedCmpOp(CmpOp0, DAG);
		} else if (CC == ISD::SETEQ && Imm->getZExtValue() == 1) {
		if (CmpVT == MVT::i32)
		OneOp = CmpOp0;
		else if (CmpVT == MVT::i64)
		// i64 comparison: all users i32.
		OneOp = getTruncatedCmpOp(CmpOp0, DAG);
		}
		} else if (VT == MVT::i64) {
		// i64 comparison: all users i64.
		if (CC == ISD::SETNE && Imm->getZExtValue() == 0)
		ZeroOp = CmpOp0;
		else if (CC == ISD::SETEQ && Imm->getZExtValue() == 1)
		OneOp = CmpOp0;
		}
		if (ZeroOp != SDValue())
		ZeroOp = DAG.getNode(ISD::AssertZext, DL, VT, ZeroOp,
		DAG.getValueType(MVT::i1));
		if (OneOp != SDValue())
		OneOp = DAG.getNode(ISD::AssertZext, DL, VT, OneOp,
		DAG.getValueType(MVT::i1));
		}
		assert((VT == MVT::i32 \|\|
		(VT == MVT::i64 && ((ZeroOp != SDValue()) != (OneOp != SDValue())))) &&
		"Unexpected i64 setcc");

		return emitSETCC(DAG, DL, CCReg, C.CCValid, C.CCMask, ZeroOp, OneOp);
}		}

SDValue SystemZTargetLowering::lowerSTRICT_FSETCC(SDValue Op,		SDValue SystemZTargetLowering::lowerSTRICT_FSETCC(SDValue Op,
SelectionDAG &DAG,		SelectionDAG &DAG,
bool IsSignaling) const {		bool IsSignaling) const {
SDValue Chain = Op.getOperand(0);		SDValue Chain = Op.getOperand(0);
SDValue CmpOp0 = Op.getOperand(1);		SDValue CmpOp0 = Op.getOperand(1);
SDValue CmpOp1 = Op.getOperand(2);		SDValue CmpOp1 = Op.getOperand(2);
▲ Show 20 Lines • Show All 2,922 Lines • ▼ Show 20 Lines	if (canTreatAsByteVector(VecVT)) {
return combineExtract(DL, ResVT, VecVT, Vec, NewIndex, DCI, true);		return combineExtract(DL, ResVT, VecVT, Vec, NewIndex, DCI, true);
}		}
}		}
}		}
}		}
return SDValue();		return SDValue();
}		}

		bool SystemZTargetLowering::isSETCCWithReusable64BitOp(SDValue Op,
		SelectionDAG &DAG) const {
		SDValue CmpOp0 = Op.getOperand(0);
		SDValue CmpOp1 = Op.getOperand(1);
		ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(2))->get();
		if (!cmpOpHasOneUser(CmpOp0))
		return false;
		SDLoc DL(Op);
		EVT CmpVT = CmpOp0->getValueType(0);
		if (CmpVT != MVT::i64)
		return false;
		auto *Imm = dyn_cast<ConstantSDNode>(CmpOp1);
		if (!Imm)
		return false;

		Comparison C(getCmp(DAG, CmpOp0, CmpOp1, CC, DL));
		if (C.Opcode != SystemZISD::ICMP \|\| !Op.hasOneUse())
		return false;

		if ((CC == ISD::SETNE && Imm->getZExtValue() == 0) \|\|
		(CC == ISD::SETEQ && Imm->getZExtValue() == 1))
		return true;

		return false;
		}

SDValue SystemZTargetLowering::combineZERO_EXTEND(		SDValue SystemZTargetLowering::combineZERO_EXTEND(
SDNode *N, DAGCombinerInfo &DCI) const {		SDNode *N, DAGCombinerInfo &DCI) const {
// Convert (zext (select_ccmask C1, C2)) into (select_ccmask C1', C2')		// Convert (zext (select_ccmask C1, C2)) into (select_ccmask C1', C2')
SelectionDAG &DAG = DCI.DAG;		SelectionDAG &DAG = DCI.DAG;
SDValue N0 = N->getOperand(0);		SDValue N0 = N->getOperand(0);
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
if (N0.getOpcode() == SystemZISD::SELECT_CCMASK) {		if (N0.getOpcode() == SystemZISD::SELECT_CCMASK) { // XXX still needed?
auto *TrueOp = dyn_cast<ConstantSDNode>(N0.getOperand(0));		auto *TrueOp = dyn_cast<ConstantSDNode>(N0.getOperand(0));
auto *FalseOp = dyn_cast<ConstantSDNode>(N0.getOperand(1));		auto *FalseOp = dyn_cast<ConstantSDNode>(N0.getOperand(1));
if (TrueOp && FalseOp) {		if (TrueOp && FalseOp) {
SDLoc DL(N0);		SDLoc DL(N0);
SDValue Ops[] = { DAG.getConstant(TrueOp->getZExtValue(), DL, VT),		SDValue Ops[] = { DAG.getConstant(TrueOp->getZExtValue(), DL, VT),
DAG.getConstant(FalseOp->getZExtValue(), DL, VT),		DAG.getConstant(FalseOp->getZExtValue(), DL, VT),
N0.getOperand(2), N0.getOperand(3), N0.getOperand(4) };		N0.getOperand(2), N0.getOperand(3), N0.getOperand(4) };
SDValue NewSelect = DAG.getNode(SystemZISD::SELECT_CCMASK, DL, VT, Ops);		SDValue NewSelect = DAG.getNode(SystemZISD::SELECT_CCMASK, DL, VT, Ops);
// If N0 has multiple uses, change other uses as well.		// If N0 has multiple uses, change other uses as well.
if (!N0.hasOneUse()) {		if (!N0.hasOneUse()) {
SDValue TruncSelect =		SDValue TruncSelect =
DAG.getNode(ISD::TRUNCATE, DL, N0.getValueType(), NewSelect);		DAG.getNode(ISD::TRUNCATE, DL, N0.getValueType(), NewSelect);
DCI.CombineTo(N0.getNode(), TruncSelect);		DCI.CombineTo(N0.getNode(), TruncSelect);
}		}
return NewSelect;		return NewSelect;
}		}
}		}

		if (N0.getOpcode() == ISD::SETCC &&
		VT == MVT::i64 && N0->getValueType(0) == MVT::i32 &&
		isSETCCWithReusable64BitOp(N0, DAG))
		return DAG.getNode(ISD::SETCC, SDLoc(N0), MVT::i64,
		N0->getOperand(0), N0->getOperand(1), N0->getOperand(2));

return SDValue();		return SDValue();
}		}

SDValue SystemZTargetLowering::combineSIGN_EXTEND_INREG(		SDValue SystemZTargetLowering::combineSIGN_EXTEND_INREG(
SDNode *N, DAGCombinerInfo &DCI) const {		SDNode *N, DAGCombinerInfo &DCI) const {
// Convert (sext_in_reg (setcc LHS, RHS, COND), i1)		// Convert (sext_in_reg (setcc LHS, RHS, COND), i1)
// and (sext_in_reg (any_extend (setcc LHS, RHS, COND)), i1)		// and (sext_in_reg (any_extend (setcc LHS, RHS, COND)), i1)
// into (select_cc LHS, RHS, -1, 0, COND)		// into (select_cc LHS, RHS, -1, 0, COND)

llvm/test/CodeGen/SystemZ/setcc-05.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; Test SETCC for an integer comparison against 0. The 0 does not need to be
				; loaded if the condition is NE.
				;
				; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 \| FileCheck %s

				; ICMP NE 0: no need to load 0.
				define i32 @fun0(i8 zeroext %b) {
				; CHECK-LABEL: fun0:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: chi %r2, 0
				; CHECK-NEXT: lochilh %r2, 1
				; CHECK-NEXT: # kill: def $r2l killed $r2l killed $r2d
				; CHECK-NEXT: br %r14
				entry:
				%cc = icmp ne i8 %b, 0
				%conv = zext i1 %cc to i32
				ret i32 %conv
				}

				; ICMP EQ 0: need to load 0.
				define i32 @fun2(i8 zeroext %b) {
				; CHECK-LABEL: fun2:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: chi %r2, 0
				; CHECK-NEXT: lhi %r2, 0
				; CHECK-NEXT: lochie %r2, 1
				; CHECK-NEXT: br %r14
				entry:
				%cc = icmp eq i8 %b, 0
				%conv = zext i1 %cc to i32
				ret i32 %conv
				}

				; ICMP NE 0: The whole register is not checked, so need to load 0.
				define i32 @fun3(i32 %b) {
				; CHECK-LABEL: fun3:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: tmll %r2, 255
				; CHECK-NEXT: lhi %r2, 0
				; CHECK-NEXT: lochine %r2, 1
				; CHECK-NEXT: br %r14
				entry:
				%t = trunc i32 %b to i8
				%cc = icmp ne i8 %t, 0
				%conv = zext i1 %cc to i32
				ret i32 %conv
				}

				; ICMP NE 0: i64 with i32 use
				define i32 @fun4(i64 %b) {
				; CHECK-LABEL: fun4:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: cghi %r2, 0
				; CHECK-NEXT: lochilh %r2, 1
				; CHECK-NEXT: # kill: def $r2l killed $r2l killed $r2d
				; CHECK-NEXT: br %r14
				entry:
				%cc = icmp ne i64 %b, 0
				%conv = zext i1 %cc to i32
				ret i32 %conv
				}

				; ICMP NE 0: i64 with i64 use.
				define i64 @fun5(i64 %b) {
				; CHECK-LABEL: fun5:
				; CHECK: # %bb.0: # %bb
				; CHECK-NEXT: cghi %r2, 0
				; CHECK-NEXT: locghilh %r2, 1
				; CHECK-NEXT: br %r14
				bb:
				%cc = icmp ne i64 %b, 0
				%conv = zext i1 %cc to i64
				ret i64 %conv
				}

				; ICMP EQ 1: no need to load 1.
				define i32 @fun6(i8 zeroext %b) {
				; CHECK-LABEL: fun6:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: chi %r2, 1
				; CHECK-NEXT: lochilh %r2, 0
				; CHECK-NEXT: # kill: def $r2l killed $r2l killed $r2d
				; CHECK-NEXT: br %r14
				entry:
				%cc = icmp eq i8 %b, 1
				%conv = zext i1 %cc to i32
				ret i32 %conv
				}

				; ICMP NE 1: need to load 1.
				define i32 @fun7(i8 zeroext %b) {
				; CHECK-LABEL: fun7:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: chi %r2, 1
				; CHECK-NEXT: lhi %r2, 0
				; CHECK-NEXT: lochilh %r2, 1
				; CHECK-NEXT: br %r14
				entry:
				%cc = icmp ne i8 %b, 1
				%conv = zext i1 %cc to i32
				ret i32 %conv
				}

				; ICMP EQ 1: i64 with i32 use
				define i32 @fun8(i64 %b) {
				; CHECK-LABEL: fun8:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: cghi %r2, 1
				; CHECK-NEXT: lochilh %r2, 0
				; CHECK-NEXT: # kill: def $r2l killed $r2l killed $r2d
				; CHECK-NEXT: br %r14
				entry:
				%cc = icmp eq i64 %b, 1
				%conv = zext i1 %cc to i32
				ret i32 %conv
				}

				; ICMP EQ 1: i64 with i64 use
				define i64 @fun9(i64 %b) {
				; CHECK-LABEL: fun9:
				; CHECK: # %bb.0: # %bb
				; CHECK-NEXT: cghi %r2, 1
				; CHECK-NEXT: locghilh %r2, 0
				; CHECK-NEXT: br %r14
				bb:
				%cc = icmp eq i64 %b, 1
				%conv = zext i1 %cc to i64
				ret i64 %conv
				}