This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes a User and OpIdx. Stop using it in AMDGPU target for simplifyI24.
Closed, Public

Authored by craig.topper on Dec 26 2018, 11:19 AM.

Details

Reviewers

RKSimon
spatel
arsenm
tstellar

Commits

rG826f44b55048: [TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes a…
rL350560: [TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes a…

Summary

As we saw in D56057 when we tried to use this function on X86, it's unsafe. It allows the operand node to have multiple users, but doesn't prevent recursing past the first node when it does have multiple users. This can cause other simplifications earlier in the graph without regard to what bits are needed by the other users of the first node. Ideally, all we should do to the first node if it has multiple uses is bypass it when it's not needed by the user we started from. Doing any other transformation that SimplifyDemandedBits can do, like turning ZEXT/SEXT into AEXT, would result in an increase in instructions.

Fortunately, we already have a function that can do just that, GetDemandedBits. It will only make transformations that involve bypassing a node.
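To make the "bypass only" idea concrete, here is a minimal standalone sketch. It is a toy model on plain integers, not LLVM code, and the helper names (maskedValue, canBypassAnd, observed) are made up for illustration. It shows why a user that only demands the low 24 bits can read through an AND with 0xffffff, while a second user of the same AND that demands all bits cannot.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not LLVM code: the operand is computed as (x & mask).
constexpr uint32_t maskedValue(uint32_t x, uint32_t mask) { return x & mask; }

// The AND can be bypassed for one user when it clears none of the bits that
// this user demands, i.e. every demanded bit is kept by the mask.
constexpr bool canBypassAnd(uint32_t mask, uint32_t demanded) {
  return (demanded & ~mask) == 0;
}

// What a user observes of an operand, given the bits it demands.
constexpr uint32_t observed(uint32_t operand, uint32_t demanded) {
  return operand & demanded;
}

// Self-check of the claim for one input; call it from a test or from main().
inline void checkBypassExample() {
  const uint32_t x = 0x12345678u, mask = 0xffffffu;
  // A user that demands only the low 24 bits cannot tell x from (x & mask).
  assert(canBypassAnd(mask, 0xffffffu));
  assert(observed(maskedValue(x, mask), 0xffffffu) == observed(x, 0xffffffu));
  // A second user that demands all 32 bits can tell them apart, so the AND
  // node itself has to stay in place for that user.
  assert(observed(maskedValue(x, mask), 0xffffffffu) !=
         observed(x, 0xffffffffu));
}
```

The second assertion pair is the multi-use case: rewriting the shared node itself would change what the other user sees, whereas re-pointing only the first user at x does not.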

This patch changes AMDGPU's simplifyI24 to first use GetDemandedBits to handle the multiple-use simplifications, and then to use the regular SimplifyDemandedBits on each operand to handle the simplifications that are allowed when the operand has only a single use. Unfortunately, GetDemandedBits simplifies constants more aggressively than SimplifyDemandedBits. This caused the -7 constant in the changed test to be simplified to remove the upper bits. I had to modify computeKnownBits to account for this by ignoring the upper 8 bits of the input.
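The effect on the -7 constant can be reproduced with plain integer arithmetic. The following is a small standalone sketch, not code from the patch, and the helper names low24 and signExtend24 are hypothetical. It shows that keeping only the low 24 bits of -7 gives 0xfffff9, that this value looks positive as a 32-bit integer, and that reading the sign from bit 23 (ignoring the upper 8 bits) still recovers -7.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 24 bits, as a 24-bit demanded mask would.
constexpr uint32_t low24(uint32_t v) { return v & 0xffffffu; }

// Interpret the low 24 bits as a signed value: bit 23 is the sign bit and
// the upper 8 bits of the input are ignored.
constexpr int32_t signExtend24(uint32_t v) {
  return (v & 0x800000u) ? static_cast<int32_t>(v & 0xffffffu) - 0x1000000
                         : static_cast<int32_t>(v & 0xffffffu);
}

// Self-check for the constant from the changed test.
inline void checkMinusSeven() {
  const uint32_t full = static_cast<uint32_t>(-7); // 0xfffffff9
  const uint32_t narrowed = low24(full);
  assert(narrowed == 0xfffff9u);
  // As a plain 32-bit value the narrowed constant is positive...
  assert(static_cast<int32_t>(narrowed) > 0);
  // ...but a 24-bit signed reading gives -7 for both spellings.
  assert(signExtend24(narrowed) == -7);
  assert(signExtend24(full) == -7);
}
```

This is why a known-bits computation that inspected all 32 bits of the operand would misjudge the sign after the constant has been narrowed.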

Diff Detail

Repository

rL LLVM

Build Status

Buildable 26260
Build 26259: arc lint + arc unit

Event Timeline

craig.topper created this revision. Dec 26 2018, 11:19 AM

Herald added subscribers: t-tye, tpr, dstuttard and 5 others. Dec 26 2018, 11:19 AM

Harbormaster completed remote builds in B26260: Diff 179517. Dec 26 2018, 11:19 AM

Adding @tstellar as IIRC he originally added that version of SimplifyDemandedBits

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2723	Worth putting this after the LHS/RHS so we don't call getOperand more than necessary?
4325	This looks (mostly) like an NFC? Just commit the bits that you can?

craig.topper marked 2 inline comments as done. Dec 26 2018, 2:52 PM

craig.topper added inline comments.

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2723	Will do
4325	The part I'm most concerned about is that MUL_U24 is now using countMinLeadingZeros. I couldn't figure out why it was using countMinSignBits for both signed and unsigned before
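For readers following the MUL_U24 question, this is a minimal standalone sketch of the unsigned bound in the new code: if each 24-bit operand has a known minimum number of leading zeros, the product needs at most the sum of the remaining value bits, and the rest of the 32-bit result is known zero. It works on concrete values rather than on LLVM's KnownBits, and the names leadingZeros24 and knownHighZerosMulU24 are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of leading zero bits of v when viewed as a 24-bit unsigned value.
inline unsigned leadingZeros24(uint32_t v) {
  v &= 0xffffffu;
  unsigned n = 0;
  for (uint32_t bit = 0x800000u; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// High bits of the 32-bit product that are known to be zero, given the
// minimum leading zeros of each 24-bit operand. Returns 0 when the bound
// says nothing (the patch breaks out of the case in that situation).
inline unsigned knownHighZerosMulU24(unsigned lhsLZ, unsigned rhsLZ) {
  assert(lhsLZ <= 24 && rhsLZ <= 24);
  const unsigned lhsValBits = 24 - lhsLZ;
  const unsigned rhsValBits = 24 - rhsLZ;
  const unsigned maxValBits = std::min(lhsValBits + rhsValBits, 32u);
  return 32 - maxValBits;
}
```

For example, two operands that each fit in 8 bits have 16 leading zeros in 24 bits, so the sketch reports 16 known-zero high bits, which matches 0xff * 0xff = 0xfe01.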

Reduce calls to getOperand

Harbormaster completed remote builds in B26270: Diff 179532. Dec 26 2018, 3:14 PM

LGTM but we need an AMDGPU guru to check it over as well.

@arsenm @tstellar Any comments?

Any AMDGPU comments? Else I think Craig can go ahead and commit.

LGTM. The immediate change is a little worse though, since the -7 is free and 0xfffff9 is not
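A rough way to see the cost difference mentioned here, as a sketch only: it assumes the commonly documented AMDGPU integer inline-constant range of -16 to 64, and the helper name fitsInlineIntRange is hypothetical. Values in that range can be encoded in the instruction itself, while anything else needs a separate 32-bit literal.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for illustration: integer operands in [-16, 64] can be encoded
// inline; any other value needs a separate 32-bit literal.
constexpr bool fitsInlineIntRange(int64_t v) { return v >= -16 && v <= 64; }

// The two spellings of the multiplier before and after the patch.
inline void checkImmediateCost() {
  assert(fitsInlineIntRange(-7));        // old operand: inline constant
  assert(!fitsInlineIntRange(0xfffff9)); // new operand: 16777209, a literal
}
```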

This revision is now accepted and ready to land. Jan 7 2019, 6:03 AM

arsenm added inline comments. Jan 7 2019, 6:05 AM

lib/Target/AMDGPU/AMDGPUISelLowering.cpp
4325	There's this comment that I've never fully understood:
// We need to use sext even for MUL_U24, because MUL_U24 is used
// for signed multiply of 8 and 16-bit types.
return DAG.getSExtOrTrunc(Mul, DL, VT);

In D56087#1348138, @arsenm wrote:

LGTM. The immediate change is a little worse though, since the -7 is free and 0xfffff9 is not

I'm not certain whether we need the constant simplification in GetDemandedBits at all to be honest, but that is a separate issue.

Closed by commit rL350560: [TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes a… (authored by ctopper). Jan 7 2019, 11:34 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
include/llvm/CodeGen/TargetLowering.h	10 lines
lib/CodeGen/SelectionDAG/TargetLowering.cpp	50 lines
lib/Target/AMDGPU/AMDGPUISelLowering.cpp	81 lines
test/CodeGen/AMDGPU/lshl64-to-32.ll	2 lines

Diff 179517

include/llvm/CodeGen/TargetLowering.h

[2,883 lines not shown; context: public:]
   }

   /// Convert x+y to (VT)((SmallVT)x+(SmallVT)y) if the casts are free. This
   /// uses isZExtFree and ZERO_EXTEND for the widening cast, but it could be
   /// generalized for targets with other types of implicit widening casts.
   bool ShrinkDemandedOp(SDValue Op, unsigned BitWidth, const APInt &Demanded,
                         TargetLoweringOpt &TLO) const;

-  /// Helper for SimplifyDemandedBits that can simplify an operation with
-  /// multiple uses. This function simplifies operand \p OpIdx of \p User and
-  /// then updates \p User with the simplified version. No other uses of
-  /// \p OpIdx are updated. If \p User is the only user of \p OpIdx, this
-  /// function behaves exactly like function SimplifyDemandedBits declared
-  /// below except that it also updates the DAG by calling
-  /// DCI.CommitTargetLoweringOpt.
-  bool SimplifyDemandedBits(SDNode *User, unsigned OpIdx, const APInt &Demanded,
-                            DAGCombinerInfo &DCI, TargetLoweringOpt &TLO) const;
-
   /// Look at Op. At this point, we know that only the DemandedBits bits of the
   /// result of Op are ever used downstream. If we can use this information to
   /// simplify Op, create a new simplified DAG node and return true, returning
   /// the original and new nodes in Old and New. Otherwise, analyze the
   /// expression and return a mask of KnownOne and KnownZero bits for the
   /// expression (used to simplify the caller). The KnownZero/One bits may only
   /// be accurate for those bits in the Demanded masks.
   /// \p AssumeSingleUse When this parameter is true, this function will
[1,002 lines not shown]

lib/CodeGen/SelectionDAG/TargetLowering.cpp

[425 lines not shown; context: if (TLI.isTruncateFree(Op.getValueType(), SmallVT) &&]
       assert(DemandedSize <= SmallVTBits && "Narrowed below demanded bits?");
       SDValue Z = DAG.getNode(ISD::ANY_EXTEND, dl, Op.getValueType(), X);
       return TLO.CombineTo(Op, Z);
     }
   }
   return false;
 }

-bool
-TargetLowering::SimplifyDemandedBits(SDNode *User, unsigned OpIdx,
-                                     const APInt &DemandedBits,
-                                     DAGCombinerInfo &DCI,
-                                     TargetLoweringOpt &TLO) const {
-  SDValue Op = User->getOperand(OpIdx);
-  KnownBits Known;
-
-  if (!SimplifyDemandedBits(Op, DemandedBits, Known, TLO, 0, true))
-    return false;
-
-
-  // Old will not always be the same as Op. For example:
-  //
-  // Demanded = 0xffffff
-  // Op = i64 truncate (i32 and x, 0xffffff)
-  // In this case simplify demand bits will want to replace the 'and' node
-  // with the value 'x', which will give us:
-  // Old = i32 and x, 0xffffff
-  // New = x
-  if (TLO.Old.hasOneUse()) {
-    // For the one use case, we just commit the change.
-    DCI.CommitTargetLoweringOpt(TLO);
-    return true;
-  }
-
-  // If Old has more than one use then it must be Op, because the
-  // AssumeSingleUse flag is not propogated to recursive calls of
-  // SimplifyDemanded bits, so the only node with multiple use that
-  // it will attempt to combine will be Op.
-  assert(TLO.Old == Op);
-
-  SmallVector <SDValue, 4> NewOps;
-  for (unsigned i = 0, e = User->getNumOperands(); i != e; ++i) {
-    if (i == OpIdx) {
-      NewOps.push_back(TLO.New);
-      continue;
-    }
-    NewOps.push_back(User->getOperand(i));
-  }
-  User = TLO.DAG.UpdateNodeOperands(User, NewOps);
-  // Op has less users now, so we may be able to perform additional combines
-  // with it.
-  DCI.AddToWorklist(Op.getNode());
-  // User's operands have been updated, so we may be able to do new combines
-  // with it.
-  DCI.AddToWorklist(User);
-  return true;
-}
-
 bool TargetLowering::SimplifyDemandedBits(SDValue Op, const APInt &DemandedBits,
                                           DAGCombinerInfo &DCI) const {
   SelectionDAG &DAG = DCI.DAG;
   TargetLoweringOpt TLO(DAG, !DCI.isBeforeLegalize(),
                         !DCI.isBeforeLegalizeOps());
   KnownBits Known;

   bool Simplified = SimplifyDemandedBits(Op, DemandedBits, Known, TLO);
[4,934 lines not shown]

lib/Target/AMDGPU/AMDGPUISelLowering.cpp

[2,711 lines not shown]

 static bool isI24(SDValue Op, SelectionDAG &DAG) {
   EVT VT = Op.getValueType();
   return VT.getSizeInBits() >= 24 && // Types less than 24-bit should be treated
                                      // as unsigned 24-bit values.
          AMDGPUTargetLowering::numBitsSigned(Op, DAG) < 24;
 }

-static bool simplifyI24(SDNode *Node24, unsigned OpIdx,
-                        TargetLowering::DAGCombinerInfo &DCI) {
+static SDValue simplifyI24(SDNode *Node24,
+                           TargetLowering::DAGCombinerInfo &DCI) {

   SelectionDAG &DAG = DCI.DAG;
-  SDValue Op = Node24->getOperand(OpIdx);
-  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
-  EVT VT = Op.getValueType();
+  EVT VT = Node24->getOperand(0).getValueType();
Inline comment (RKSimon): Worth putting this after the LHS/RHS so we don't call getOperand more than necessary?
Inline comment (craig.topper, author): Will do

   APInt Demanded = APInt::getLowBitsSet(VT.getSizeInBits(), 24);
-  APInt KnownZero, KnownOne;
-  TargetLowering::TargetLoweringOpt TLO(DAG, true, true);
-  if (TLI.SimplifyDemandedBits(Node24, OpIdx, Demanded, DCI, TLO))
-    return true;

-  return false;
+  // First try to simplify using GetDemandedBits which allows the operands to
+  // have other uses, but will only perform simplifications that involve
+  // bypassing some nodes for this user.
+  SDValue LHS = Node24->getOperand(0);
+  SDValue RHS = Node24->getOperand(1);
+  SDValue DemandedLHS = DAG.GetDemandedBits(LHS, Demanded);
+  SDValue DemandedRHS = DAG.GetDemandedBits(RHS, Demanded);
+  if (DemandedLHS || DemandedRHS)
+    return DAG.getNode(Node24->getOpcode(), SDLoc(Node24), Node24->getVTList(),
+                       DemandedLHS ? DemandedLHS : LHS,
+                       DemandedRHS ? DemandedRHS : RHS);
+
+  // Now try SimplifyDemandedBits which can simplify the nodes used by our
+  // operands if this node is the only user.
+  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+  if (TLI.SimplifyDemandedBits(LHS, Demanded, DCI))
+    return SDValue(Node24, 0);
+  if (TLI.SimplifyDemandedBits(RHS, Demanded, DCI))
+    return SDValue(Node24, 0);
+
+  return SDValue();
 }

 template <typename IntTy>
 static SDValue constantFoldBFE(SelectionDAG &DAG, IntTy Src0, uint32_t Offset,
                                uint32_t Width, const SDLoc &DL) {
   if (Width + Offset < 32) {
     uint32_t Shl = static_cast<uint32_t>(Src0) << (32 - Offset - Width);
     IntTy Result = static_cast<IntTy>(Shl) >> (32 - Width);
[531 lines not shown; context: SDValue AMDGPUTargetLowering::performMulhuCombine(SDNode *N,]
   return DAG.getZExtOrTrunc(Mulhi, DL, VT);
 }

 SDValue AMDGPUTargetLowering::performMulLoHi24Combine(
     SDNode *N, DAGCombinerInfo &DCI) const {
   SelectionDAG &DAG = DCI.DAG;

   // Simplify demanded bits before splitting into multiple users.
-  if (simplifyI24(N, 0, DCI) || simplifyI24(N, 1, DCI))
-    return SDValue();
+  if (SDValue V = simplifyI24(N, DCI))
+    return V;

   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);

   bool Signed = (N->getOpcode() == AMDGPUISD::MUL_LOHI_I24);

   unsigned MulLoOpc = Signed ? AMDGPUISD::MUL_I24 : AMDGPUISD::MUL_U24;
   unsigned MulHiOpc = Signed ? AMDGPUISD::MULHI_I24 : AMDGPUISD::MULHI_U24;
[569 lines not shown; context: SDValue AMDGPUTargetLowering::PerformDAGCombine(SDNode *N,]
   case ISD::MULHS:
     return performMulhsCombine(N, DCI);
   case ISD::MULHU:
     return performMulhuCombine(N, DCI);
   case AMDGPUISD::MUL_I24:
   case AMDGPUISD::MUL_U24:
   case AMDGPUISD::MULHI_I24:
   case AMDGPUISD::MULHI_U24: {
-    // If the first call to simplify is successfull, then N may end up being
-    // deleted, so we shouldn't call simplifyI24 again.
-    simplifyI24(N, 0, DCI) || simplifyI24(N, 1, DCI);
+    if (SDValue V = simplifyI24(N, DCI))
+      return V;
     return SDValue();
   }
   case AMDGPUISD::MUL_LOHI_I24:
   case AMDGPUISD::MUL_LOHI_U24:
     return performMulLoHi24Combine(N, DCI);
   case ISD::SELECT:
     return performSelectCombine(N, DCI);
   case ISD::FNEG:
[411 lines not shown; context: void AMDGPUTargetLowering::computeKnownBitsForTargetNode(]
   case AMDGPUISD::MUL_U24:
   case AMDGPUISD::MUL_I24: {
     KnownBits LHSKnown = DAG.computeKnownBits(Op.getOperand(0), Depth + 1);
     KnownBits RHSKnown = DAG.computeKnownBits(Op.getOperand(1), Depth + 1);
     unsigned TrailZ = LHSKnown.countMinTrailingZeros() +
                       RHSKnown.countMinTrailingZeros();
     Known.Zero.setLowBits(std::min(TrailZ, 32u));

-    unsigned LHSValBits = 32 - std::max(LHSKnown.countMinSignBits(), 8u);
-    unsigned RHSValBits = 32 - std::max(RHSKnown.countMinSignBits(), 8u);
-    unsigned MaxValBits = std::min(LHSValBits + RHSValBits, 32u);
-    if (MaxValBits >= 32)
-      break;
-    bool Negative = false;
-    if (Opc == AMDGPUISD::MUL_I24) {
-      bool LHSNegative = !!(LHSKnown.One & (1 << 23));
-      bool LHSPositive = !!(LHSKnown.Zero & (1 << 23));
-      bool RHSNegative = !!(RHSKnown.One & (1 << 23));
-      bool RHSPositive = !!(RHSKnown.Zero & (1 << 23));
-      if ((!LHSNegative && !LHSPositive) || (!RHSNegative && !RHSPositive))
-        break;
-      Negative = (LHSNegative && RHSPositive) || (LHSPositive && RHSNegative);
-    }
-    if (Negative)
-      Known.One.setHighBits(32 - MaxValBits);
-    else
-      Known.Zero.setHighBits(32 - MaxValBits);
+    // Truncate to 24 bits.
+    LHSKnown = LHSKnown.trunc(24);
+    RHSKnown = RHSKnown.trunc(24);
+
+    bool Negative = false;
+    if (Opc == AMDGPUISD::MUL_I24) {
+      unsigned LHSValBits = 24 - LHSKnown.countMinSignBits();
+      unsigned RHSValBits = 24 - RHSKnown.countMinSignBits();
+      unsigned MaxValBits = std::min(LHSValBits + RHSValBits, 32u);
+      if (MaxValBits >= 32)
+        break;
+      bool LHSNegative = LHSKnown.isNegative();
+      bool LHSPositive = LHSKnown.isNonNegative();
+      bool RHSNegative = RHSKnown.isNegative();
+      bool RHSPositive = RHSKnown.isNonNegative();
Inline comment (RKSimon): This looks (mostly) like an NFC? Just commit the bits that you can?
Inline comment (craig.topper, author): The part I'm most concerned about is that MUL_U24 is now using countMinLeadingZeros. I couldn't figure out why it was using countMinSignBits for both signed and unsigned before
Inline comment (arsenm): There's this comment that I've never fully understood:
    // We need to use sext even for MUL_U24, because MUL_U24 is used
    // for signed multiply of 8 and 16-bit types.
    return DAG.getSExtOrTrunc(Mul, DL, VT);
+      if ((!LHSNegative && !LHSPositive) || (!RHSNegative && !RHSPositive))
+        break;
+      Negative = (LHSNegative && RHSPositive) || (LHSPositive && RHSNegative);
+      if (Negative)
+        Known.One.setHighBits(32 - MaxValBits);
+      else
+        Known.Zero.setHighBits(32 - MaxValBits);
+    } else {
+      unsigned LHSValBits = 24 - LHSKnown.countMinLeadingZeros();
+      unsigned RHSValBits = 24 - RHSKnown.countMinLeadingZeros();
+      unsigned MaxValBits = std::min(LHSValBits + RHSValBits, 32u);
+      if (MaxValBits >= 32)
+        break;
+      Known.Zero.setHighBits(32 - MaxValBits);
+    }
     break;
   }
   case AMDGPUISD::PERM: {
     ConstantSDNode *CMask = dyn_cast<ConstantSDNode>(Op.getOperand(2));
     if (!CMask)
       return;

     KnownBits LHSKnown = DAG.computeKnownBits(Op.getOperand(0), Depth + 1);
[178 lines not shown]

test/CodeGen/AMDGPU/lshl64-to-32.ll

[117 lines not shown]
 ; GCN-NEXT: v_lshlrev_b32_e32 v1, 2, v0
 ; GCN-NEXT: v_mov_b32_e32 v2, 0
 ; GCN-NEXT: s_waitcnt lgkmcnt(0)
 ; GCN-NEXT: s_mov_b64 s[4:5], s[2:3]
 ; GCN-NEXT: buffer_load_dword v1, v[1:2], s[4:7], 0 addr64
 ; GCN-NEXT: s_mov_b64 s[2:3], s[6:7]
 ; GCN-NEXT: s_waitcnt vmcnt(0)
 ; GCN-NEXT: v_or_b32_e32 v1, 0x800000, v1
-; GCN-NEXT: v_mul_i32_i24_e32 v1, -7, v1
+; GCN-NEXT: v_mul_i32_i24_e32 v1, 0xfffff9, v1
 ; GCN-NEXT: v_lshlrev_b32_e32 v1, 3, v1
 ; GCN-NEXT: v_lshlrev_b32_e32 v3, 3, v0
 ; GCN-NEXT: v_mov_b32_e32 v4, v2
 ; GCN-NEXT: buffer_store_dwordx2 v[1:2], v[3:4], s[0:3], 0 addr64
 ; GCN-NEXT: s_endpgm
 bb:
   %tmp = tail call i32 @llvm.amdgcn.workitem.id.x()
   %tmp2 = sext i32 %tmp to i64
[12 lines not shown]
