llvm/lib/Target/PowerPC/PPCISelLowering.cpp
14137 ↗	(On Diff #157346)	Why the call to `getNode()`? An `SDValue` is implicitly convertible to `bool` and will be false if this is a default-constructed one.
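To illustrate the point about the `bool` conversion, here is a minimal self-contained sketch. `FakeNode`, `FakeValue`, `tryCombine`, and `combineProducedValue` are hypothetical stand-ins invented for this example, not LLVM's classes. The sketch assumes only what the comment states: a default-constructed value holds no node and converts to `false`.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not LLVM's SDNode/SDValue.
struct FakeNode {
  int Opcode;
};

class FakeValue {
  FakeNode *Node = nullptr;

public:
  FakeValue() = default;
  explicit FakeValue(FakeNode *N) : Node(N) {}

  FakeNode *getNode() const { return Node; }

  // True only when a node is attached, so a default-constructed value is false.
  explicit operator bool() const { return Node != nullptr; }
};

// A combine that fails returns a default-constructed value.
FakeValue tryCombine(FakeNode *N, bool Succeeds) {
  return Succeeds ? FakeValue(N) : FakeValue();
}

// Testing the value directly is equivalent to testing getNode() for null.
bool combineProducedValue(FakeNode *N, bool Succeeds) {
  if (FakeValue V = tryCombine(N, Succeeds)) {
    assert(V.getNode() == N);
    return true;
  }
  return false;
}
```

Declaring the value inside the `if` condition also keeps its scope limited to the branch that uses it.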

This revision now requires changes to proceed. Aug 13 2018, 7:27 AM

Added test case.
Addressed reviewers' comments.

nemanjai added inline comments. Aug 27 2018, 7:46 AM

lib/Target/PowerPC/PPCISelLowering.cpp
14192 ↗	(On Diff #161557)	Nit: indentation is off.
14195 ↗	(On Diff #161557)	Perhaps for brevity and readability, combine this into the `if`: `if (SDValue CRTruncValue = DAGCombineTruncBoolExt(N, DCI)) return CRTruncValue;`
14203 ↗	(On Diff #161557)	Nit: the more common way to declare references puts the ampersand immediately before the variable name.
14209 ↗	(On Diff #161557)	Why not just `Op0.getOpcode()` and `Op0.getOperand(0)`? Here as well as below.
lib/Target/PowerPC/PPCInstrInfo.td
237 ↗	(On Diff #161557)	I don't think it is OK to allow arbitrary integer types for the result nor arbitrary floating point types for operand 1. This really seems like it should use `SDTCisVT`. I realize that `PPCbuild_fp128` has the same issue and we missed it, but I think we should favour fixing that one (in a separate patch) rather than making this one match.
test/CodeGen/PowerPC/f128-bitcast.ll
1 ↗	(On Diff #161557)	I think once you add the BE portion to this patch, we should have another RUN to test that.

Since there are unaddressed comments, I'm moving this off of my Must Review queue until they are addressed.

This revision now requires changes to proceed. Sep 18 2018, 6:49 AM

Address Review Comments

Herald added a subscriber: jsji. Oct 1 2018, 12:22 PM

Instead of generating a special EXTRACT_FP128 node, could you bitcast to v2i64 and use EXTRACT_VECTOR_ELT instead? (I'm not that familiar with Power9, so it's possible that doesn't actually work for some reason.)

Removed the custom PPCISD node and it simplified the patch considerably.

Added conditions to handle bitconverts from f128 to a number of vector sizes (v2i64 down to v16i8).
Did not add the code for the bitconvert from the vector sizes to f128 because I have not been able to find an example to trigger the condition. It seems that the f128 is lowered into a store followed by a load.
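As a rough illustration of why a bitconvert between f128 and the 128-bit integer vector types can be treated as a pure reinterpretation, the sketch below views the same 16 bytes as lanes of different widths and copies them back unchanged. It is plain host C++, not backend code; `Bytes16`, `viewAs`, `toBytes`, and `demoLaneViews` are hypothetical names introduced for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Bytes16 = std::array<unsigned char, 16>;

// Reinterpret 16 bytes as 16 / sizeof(T) lanes of type T without changing bits.
template <typename T>
std::array<T, 16 / sizeof(T)> viewAs(const Bytes16 &Src) {
  static_assert(16 % sizeof(T) == 0, "lane size must divide 16 bytes");
  std::array<T, 16 / sizeof(T)> Lanes;
  std::memcpy(Lanes.data(), Src.data(), 16);
  return Lanes;
}

// Copy the lanes back into bytes; the round trip should be the identity.
template <typename T, std::size_t N>
Bytes16 toBytes(const std::array<T, N> &Lanes) {
  static_assert(sizeof(T) * N == 16, "expected exactly 16 bytes of lanes");
  Bytes16 Out;
  std::memcpy(Out.data(), Lanes.data(), 16);
  return Out;
}

// Usage sketch: 2 x 64, 4 x 32, 8 x 16 and 16 x 8 lanes over the same bytes.
void demoLaneViews() {
  Bytes16 Src{};
  for (std::size_t I = 0; I < Src.size(); ++I)
    Src[I] = static_cast<unsigned char>(I * 17);
  assert(viewAs<std::uint64_t>(Src).size() == 2);
  assert(viewAs<std::uint32_t>(Src).size() == 4);
  assert(viewAs<std::uint16_t>(Src).size() == 8);
  assert(viewAs<std::uint8_t>(Src).size() == 16);
  assert(toBytes(viewAs<std::uint64_t>(Src)) == Src);
  assert(toBytes(viewAs<std::uint8_t>(Src)) == Src);
}
```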

efriedma added inline comments. Oct 4 2018, 11:21 AM

lib/Target/PowerPC/PPCISelLowering.cpp
14324 ↗	(On Diff #168330)	80 columns. I assume you want something like `(uint32_t)DAG.getDataLayout().isBigEndian()` here, not "0".
14330 ↗	(On Diff #168330)	The second operand of the shift might not be a constant; getConstantOperandVal will crash in that case.
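The hazard with a non-constant shift amount can be sketched outside of LLVM as follows. The types `Operand`, `ConstantOperand`, and `RegisterOperand` and the helper `isShiftBy64` are hypothetical stand-ins; `dynamic_cast` is used here only as an analogue of a checked cast that yields a null pointer when the operand is not a constant, rather than assuming it is one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for DAG operands; not LLVM's classes.
struct Operand {
  virtual ~Operand() = default;
};

struct ConstantOperand : Operand {
  explicit ConstantOperand(std::uint64_t V) : Value(V) {}
  std::uint64_t Value;
};

struct RegisterOperand : Operand {};

// Returns true only for a constant shift amount equal to 64.
// The checked cast yields nullptr for a non-constant operand, so the
// null test must come before the value is read.
bool isShiftBy64(const Operand *ShiftAmt) {
  const auto *C = dynamic_cast<const ConstantOperand *>(ShiftAmt);
  if (!C || C->Value != 64)
    return false;
  return true;
}

// Usage sketch: only a constant 64 passes the guard.
void demoShiftGuard() {
  ConstantOperand By64(64), By32(32);
  RegisterOperand Variable;
  assert(isShiftBy64(&By64));
  assert(!isShiftBy64(&By32));
  assert(!isShiftBy64(&Variable));
}
```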

Addressed review comments.

Also added Big Endian to the test to cover that case.

nemanjai added inline comments. Oct 19 2018, 3:28 AM

lib/Target/PowerPC/PPCISelLowering.cpp
14314 ↗	(On Diff #168502)	This logic is fairly deeply nested. Can you convert some of the conditions to early exits? This one certainly looks to fit the bill.
14331 ↗	(On Diff #168502)	For clarity, please do the null pointer check first.

Simplified some of the conditionals (easier to read) and added an early exit.

nemanjai added inline comments. Oct 22 2018, 8:00 AM

lib/Target/PowerPC/PPCISelLowering.cpp
14331 ↗	(On Diff #170207)	I think this can be further simplified and the duplicated code can be removed by essentially "looking through" the right shift and updating the element you need to extract.
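The "look through the shift" idea can be modelled in a few lines of standalone C++. This is a sketch under stated assumptions, not the patch itself: `Bits128`, `eltToExtract`, `extractLane`, and `truncViaExtract` are hypothetical names, and the lane layout assumed is that element 0 of the two-element vector holds the low 64 bits on little-endian targets and the high 64 bits on big-endian targets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 128-bit value as two 64-bit halves.
struct Bits128 {
  std::uint64_t Low;  // bits 0..63
  std::uint64_t High; // bits 64..127
};

// Start from the lane that holds the low half, then flip the index when the
// truncate is fed by a logical right shift of 64.
int eltToExtract(bool IsBigEndian, bool ShiftedRightBy64) {
  int Elt = IsBigEndian ? 1 : 0;
  if (ShiftedRightBy64)
    Elt = Elt ? 0 : 1;
  return Elt;
}

// Assumed lane layout: lane 0 is the low half on little-endian and the
// high half on big-endian.
std::uint64_t extractLane(const Bits128 &V, bool IsBigEndian, int Elt) {
  bool WantLow = IsBigEndian ? (Elt == 1) : (Elt == 0);
  return WantLow ? V.Low : V.High;
}

// Models trunc(i128 -> i64), optionally preceded by a right shift of 64.
std::uint64_t truncViaExtract(const Bits128 &V, bool IsBigEndian,
                              bool ShiftedRightBy64) {
  int Elt = eltToExtract(IsBigEndian, ShiftedRightBy64);
  assert(Elt == 0 || Elt == 1);
  return extractLane(V, IsBigEndian, Elt);
}
```

With this shape, the shifted and unshifted cases share one extraction path and differ only in the element index, which is what removes the duplicated code.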

Simplified the code further.

I think the changes that are needed are clear enough that this doesn't require another review cycle. Approving this with the assumption that the required change will be made on the commit.

lib/Target/PowerPC/PPCISelLowering.cpp
14325 ↗	(On Diff #170442)	This seems kind of misleading - makes the reader wonder what this "constant bit" is. I think something like `// Switch the element number to extract.` is a lot more descriptive.
14329 ↗	(On Diff #170442)	There needs to be an `else` here that will `return SDValue()`, since it is not OK to keep going if we have an `SRL` with a non-constant shift amount or a shift amount other than 64. However, rather than an `else`, I think it would be nicer to flip the condition and exit early: `// The right shift has to be by 64 bits.` followed by `if (!ConstNode || ConstNode->getZExtValue() != 64) return SDValue();`
14336 ↗	(On Diff #170442)	The comment is redundant - basically just says that we're inside the if block.

This revision is now accepted and ready to land. Oct 23 2018, 2:53 AM

Closed by commit rL345053: [Power9] Add __float128 support in the backend for bitcast to a i128 (authored by stefanp). Oct 23 2018, 10:13 AM

This revision was automatically updated to reflect the committed changes.

Diff 170689

llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h

[1,087 unchanged lines not shown; context: private:]
   SDValue DAGCombineBuildVector(SDNode *N, DAGCombinerInfo &DCI) const;
   SDValue DAGCombineTruncBoolExt(SDNode *N, DAGCombinerInfo &DCI) const;
   SDValue combineStoreFPToInt(SDNode *N, DAGCombinerInfo &DCI) const;
   SDValue combineFPToIntToFP(SDNode *N, DAGCombinerInfo &DCI) const;
   SDValue combineSHL(SDNode *N, DAGCombinerInfo &DCI) const;
   SDValue combineSRA(SDNode *N, DAGCombinerInfo &DCI) const;
   SDValue combineSRL(SDNode *N, DAGCombinerInfo &DCI) const;
   SDValue combineADD(SDNode *N, DAGCombinerInfo &DCI) const;
+  SDValue combineTRUNCATE(SDNode *N, DAGCombinerInfo &DCI) const;

   /// ConvertSETCCToSubtract - looks at SETCC that compares ints. It replaces
   /// SETCC with integer subtraction when (1) there is a legal way of doing it
   /// (2) keeping the result of comparison in GPR has performance benefit.
   SDValue ConvertSETCCToSubtract(SDNode *N, DAGCombinerInfo &DCI) const;

   SDValue getSqrtEstimate(SDValue Operand, SelectionDAG &DAG, int Enabled,
                           int &RefinementSteps, bool &UseOneConstNR,
[67 unchanged lines not shown]

llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp


[1,064 unchanged lines not shown; context: PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,]
   setTargetDAGCombine(ISD::INTRINSIC_WO_CHAIN);
   setTargetDAGCombine(ISD::INTRINSIC_W_CHAIN);
   setTargetDAGCombine(ISD::INTRINSIC_VOID);

   setTargetDAGCombine(ISD::SIGN_EXTEND);
   setTargetDAGCombine(ISD::ZERO_EXTEND);
   setTargetDAGCombine(ISD::ANY_EXTEND);

+  setTargetDAGCombine(ISD::TRUNCATE);
+
   if (Subtarget.useCRBits()) {
     setTargetDAGCombine(ISD::TRUNCATE);
     setTargetDAGCombine(ISD::SETCC);
     setTargetDAGCombine(ISD::SELECT_CC);
   }

   // Use reciprocal estimates.
   if (TM.Options.UnsafeFPMath) {
[8,548 unchanged lines not shown; context: void PPCTargetLowering::ReplaceNodeResults(SDNode *N,]
   }
   case ISD::FP_TO_SINT:
   case ISD::FP_TO_UINT:
     // LowerFP_TO_INT() can only handle f32 and f64.
     if (N->getOperand(0).getValueType() == MVT::ppcf128)
       return;
     Results.push_back(LowerFP_TO_INT(SDValue(N, 0), DAG, dl));
     return;
+  case ISD::BITCAST:
+    // Don't handle bitcast here.
+    return;
   }
 }

 //===----------------------------------------------------------------------===//
 // Other Lowering Code
 //===----------------------------------------------------------------------===//

 static Instruction* callIntrinsic(IRBuilder<> &Builder, Intrinsic::ID Id) {
[2,829 unchanged lines not shown; context: if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(N->getOperand(0))) {]
       return N->getOperand(0);
     }
     break;
   case ISD::SIGN_EXTEND:
   case ISD::ZERO_EXTEND:
   case ISD::ANY_EXTEND:
     return DAGCombineExtBoolTrunc(N, DCI);
   case ISD::TRUNCATE:
+    return combineTRUNCATE(N, DCI);
   case ISD::SETCC:
   case ISD::SELECT_CC:
     return DAGCombineTruncBoolExt(N, DCI);
   case ISD::SINT_TO_FP:
   case ISD::UINT_TO_FP:
     return combineFPToIntToFP(N, DCI);
   case ISD::STORE: {

[1,758 unchanged lines not shown]

 SDValue PPCTargetLowering::combineADD(SDNode *N, DAGCombinerInfo &DCI) const {
   if (auto Value = combineADDToADDZE(N, DCI.DAG, Subtarget))
     return Value;

   return SDValue();
 }

+// Detect TRUNCATE operations on bitcasts of float128 values.
+// What we are looking for here is the situation where we extract a subset
+// of bits from a 128 bit float.
+// This can be of two forms:
+// 1) BITCAST of f128 feeding TRUNCATE
+// 2) BITCAST of f128 feeding SRL (a shift) feeding TRUNCATE
+// The reason this is required is because we do not have a legal i128 type
+// and so we want to prevent having to store the f128 and then reload part
+// of it.
+SDValue PPCTargetLowering::combineTRUNCATE(SDNode *N,
+                                           DAGCombinerInfo &DCI) const {
+  // If we are using CRBits then try that first.
+  if (Subtarget.useCRBits()) {
+    // Check if CRBits did anything and return that if it did.
+    if (SDValue CRTruncValue = DAGCombineTruncBoolExt(N, DCI))
+      return CRTruncValue;
+  }
+
+  SDLoc dl(N);
+  SDValue Op0 = N->getOperand(0);
+
+  // Looking for a truncate of i128 to i64.
+  if (Op0.getValueType() != MVT::i128 || N->getValueType(0) != MVT::i64)
+    return SDValue();
+
+  int EltToExtract = DCI.DAG.getDataLayout().isBigEndian() ? 1 : 0;
+
+  // SRL feeding TRUNCATE.
+  if (Op0.getOpcode() == ISD::SRL) {
+    ConstantSDNode *ConstNode = dyn_cast<ConstantSDNode>(Op0.getOperand(1));
+    // The right shift has to be by 64 bits.
+    if (!ConstNode || ConstNode->getZExtValue() != 64)
+      return SDValue();
+
+    // Switch the element number to extract.
+    EltToExtract = EltToExtract ? 0 : 1;
+    // Update Op0 past the SRL.
+    Op0 = Op0.getOperand(0);
+  }
+
+  // BITCAST feeding a TRUNCATE possibly via SRL.
+  if (Op0.getOpcode() == ISD::BITCAST &&
+      Op0.getValueType() == MVT::i128 &&
+      Op0.getOperand(0).getValueType() == MVT::f128) {
+    SDValue Bitcast = DCI.DAG.getBitcast(MVT::v2i64, Op0.getOperand(0));
+    return DCI.DAG.getNode(
+        ISD::EXTRACT_VECTOR_ELT, dl, MVT::i64, Bitcast,
+        DCI.DAG.getTargetConstant(EltToExtract, dl, MVT::i32));
+  }
+  return SDValue();
+}
+
 bool PPCTargetLowering::mayBeEmittedAsTailCall(const CallInst *CI) const {
   // Only duplicate to increase tail-calls for the 64bit SysV ABIs.
   if (!Subtarget.isSVR4ABI() || !Subtarget.isPPC64())
     return false;

   // If not a tail call then no need to proceed.
   if (!CI->isTailCall())
     return false;
[52 unchanged lines not shown]

llvm/trunk/lib/Target/PowerPC/PPCInstrVSX.td

[1,034 unchanged lines not shown]
 def : Pat<(v2i64 (bitconvert v2f64:$A)),
           (COPY_TO_REGCLASS $A, VRRC)>;

 def : Pat<(v2f64 (bitconvert v1i128:$A)),
           (COPY_TO_REGCLASS $A, VRRC)>;
 def : Pat<(v1i128 (bitconvert v2f64:$A)),
           (COPY_TO_REGCLASS $A, VRRC)>;

+def : Pat<(v2i64 (bitconvert f128:$A)),
+          (COPY_TO_REGCLASS $A, VRRC)>;
+def : Pat<(v4i32 (bitconvert f128:$A)),
+          (COPY_TO_REGCLASS $A, VRRC)>;
+def : Pat<(v8i16 (bitconvert f128:$A)),
+          (COPY_TO_REGCLASS $A, VRRC)>;
+def : Pat<(v16i8 (bitconvert f128:$A)),
+          (COPY_TO_REGCLASS $A, VRRC)>;
+
 def : Pat<(v2f64 (PPCsvec2fp v4i32:$C, 0)),
           (v2f64 (XVCVSXWDP (v2i64 (XXMRGHW $C, $C))))>;
 def : Pat<(v2f64 (PPCsvec2fp v4i32:$C, 1)),
           (v2f64 (XVCVSXWDP (v2i64 (XXMRGLW $C, $C))))>;

 def : Pat<(v2f64 (PPCuvec2fp v4i32:$C, 0)),
           (v2f64 (XVCVUXWDP (v2i64 (XXMRGHW $C, $C))))>;
 def : Pat<(v2f64 (PPCuvec2fp v4i32:$C, 1)),
[2,952 unchanged lines not shown]

llvm/trunk/test/CodeGen/PowerPC/f128-bitcast.ll

+; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-unknown \
+; RUN:   -enable-ppc-quad-precision -verify-machineinstrs \
+; RUN:   -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | FileCheck %s
+; RUN: llc -mcpu=pwr9 -mtriple=powerpc64-unknown-unknown \
+; RUN:   -enable-ppc-quad-precision -verify-machineinstrs \
+; RUN:   -ppc-asm-full-reg-names \
+; RUN:   -ppc-vsr-nums-as-vr < %s | FileCheck %s --check-prefix=CHECK-BE
+
+; Function Attrs: norecurse nounwind readnone
+define i64 @getPart1(fp128 %in) local_unnamed_addr {
+entry:
+  %0 = bitcast fp128 %in to i128
+  %a.sroa.0.0.extract.trunc = trunc i128 %0 to i64
+  ret i64 %a.sroa.0.0.extract.trunc
+; CHECK-LABEL: getPart1
+; CHECK: mfvsrld r3, v2
+; CHECK-NEXT: blr
+; CHECK-BE-LABEL: getPart1
+; CHECK-BE: mfvsrld r3, v2
+; CHECK-BE-NEXT: blr
+}
+
+; Function Attrs: norecurse nounwind readnone
+define i64 @getPart2(fp128 %in) local_unnamed_addr {
+entry:
+  %0 = bitcast fp128 %in to i128
+  %a.sroa.0.8.extract.shift = lshr i128 %0, 64
+  %a.sroa.0.8.extract.trunc = trunc i128 %a.sroa.0.8.extract.shift to i64
+  ret i64 %a.sroa.0.8.extract.trunc
+; CHECK-LABEL: getPart2
+; CHECK: mfvsrd r3, v2
+; CHECK-NEXT: blr
+; CHECK-BE-LABEL: getPart2
+; CHECK-BE: mfvsrd r3, v2
+; CHECK-BE-NEXT: blr
+}
+
+; Function Attrs: norecurse nounwind readnone
+define i64 @checkBitcast(fp128 %in, <2 x i64> %in2, <2 x i64> *%out) local_unnamed_addr {
+entry:
+  %0 = bitcast fp128 %in to <2 x i64>
+  %1 = extractelement <2 x i64> %0, i64 0
+  %2 = add <2 x i64> %0, %in2
+  store <2 x i64> %2, <2 x i64> *%out, align 16
+  ret i64 %1
+; CHECK-LABEL: checkBitcast
+; CHECK: mfvsrld r3, v2
+; CHECK: blr
+; CHECK-BE-LABEL: checkBitcast
+; CHECK-BE: mfvsrd r3, v2
+; CHECK-BE: blr
+}

This is an archive of the discontinued LLVM Phabricator instance.

[Power9] Add __float128 support in the backend for bitcast to a i128
Closed · Public
