Details

Reviewers

uweigand
andrew.w.kaylor
hfinkel
craig.topper

Commits

rGf37bd01ddca9: [FPEnv] Expand constrained FP operations
rL334603: [FPEnv] Expand constrained FP operations

Summary

Add a helper function to expand constrained FP operations as needed.
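
As a rough illustration of what such an expansion does, the standalone sketch below unrolls a unary vector operation into one scalar operation per lane, then collects the per-lane results and the per-lane output chains. The `ExpandedOp` struct, the integer chain ids, and `expandUnaryStrictOp` are hypothetical stand-ins invented for this example; they are not LLVM APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the expanded node: the per-lane values play the
// role of a BUILD_VECTOR, and the per-lane chains play the role of the
// inputs to a TokenFactor.
struct ExpandedOp {
  std::vector<double> Values;
  std::vector<int> Chains;
};

// Unroll a unary vector op into one scalar op per lane. Every scalar op is
// modeled as consuming the same incoming chain (InChain) and producing a
// fresh output chain id. The numbering is illustrative only.
ExpandedOp expandUnaryStrictOp(const std::vector<double> &Lanes, int InChain,
                               double (*ScalarFn)(double)) {
  assert(ScalarFn && "need a scalar callback");
  ExpandedOp Out;
  int NextChain = InChain + 1;
  for (double Lane : Lanes) {
    Out.Values.push_back(ScalarFn(Lane));
    Out.Chains.push_back(NextChain++);
  }
  return Out;
}
```

Calling `expandUnaryStrictOp({4.0, 9.0}, 0, Fn)` with a doubling callback gives two values and two chains, one pair per lane.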

Diff Detail

Repository: rL LLVM

Event Timeline

cameron.mcinally created this revision. May 29 2018, 11:50 AM

Herald added a subscriber: llvm-commits. May 29 2018, 11:50 AM

Corrected a small whitespace problem.

Updated this patch to handle a constrained vector POWI. The arguments to that function do not have a uniform type.

I have not included a test case for POWI since the intrinsic declaration also needs to be updated to handle vectors. That is a separate issue and probably deserves its own Diff.

In D47491#1116472, @cameron.mcinally wrote:

Updated this patch to handle a constrained vector POWI. The arguments to that function do not have a uniform type.

I have not included a test case for POWI since the intrinsic declaration also needs to be updated to handle vectors. That is a separate issue and probably deserves its own Diff.

The LLVM LangRef states that the POWI intrinsic always takes a scalar i32 as the 2nd argument, even when the 1st argument is a vector. That seems wrong, but I'll save the llvm-dev discussion for another time.

I'll have to special case the POWI expansion in light of this...
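
To make the special case concrete, here is a small standalone sketch of a per-lane expansion where the operands do not share a type: the base is a vector, the exponent is a single scalar integer. `scalarPowi` and `expandPowi` are hypothetical helpers written for this example, not code from the patch.

```cpp
#include <cassert>
#include <vector>

// Scalar integer power by repeated multiplication; a stand-in for whatever
// the scalar POWI lowers to. Non-negative exponents only, to keep it short.
double scalarPowi(double Base, int Exp) {
  assert(Exp >= 0 && "sketch handles non-negative exponents only");
  double Result = 1.0;
  for (int I = 0; I < Exp; ++I)
    Result *= Base;
  return Result;
}

// Per-lane expansion with mixed operand types: the base is split into lanes,
// while the scalar exponent is passed unchanged to every scalar call.
std::vector<double> expandPowi(const std::vector<double> &Bases, int Exp) {
  std::vector<double> Out;
  Out.reserve(Bases.size());
  for (double Base : Bases)
    Out.push_back(scalarPowi(Base, Exp));
  return Out;
}
```

A generic loop that extracts lane `i` from every operand would not work here, because the exponent has no lanes to extract.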

Apologies for the churn. I've left out POWI for now since it's such an oddball. Will handle it in another Diff.

I'll let this Diff sit for review now...

I'm adding Craig Topper as a reviewer because he knows the vector selection DAG stuff better than I do. The constrained FP handling looks OK to me.

test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
1 ↗	(On Diff #149175)	I'm guessing you copied these run lines from somewhere else. It doesn't look like you need multiple runs or the extra check prefixes.
5 ↗	(On Diff #149175)	I think this check needs to do more than this. You should be verifying the complete expansion. Also, can you add checks for some other instructions? Some of these won't require expansion, right?

andrew.w.kaylor added a reviewer: craig.topper. Jun 7 2018, 10:50 AM

craig.topper added inline comments. Jun 7 2018, 12:16 PM

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
1133 ↗	(On Diff #149175)	This makes an ArrayRef to a temporary std::initializer_list. This is very unsafe. You just want EVT ValuesVTs[] = {EltVT, MVT::Other};
1157 ↗	(On Diff #149175)	Add SDValue in front of this and remove the earlier declaration.
1163 ↗	(On Diff #149175)	Add SDValue in front of these and remove the earlier declaration.
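
The lifetime concern in the first inline comment can be sketched without LLVM headers. `ArrayView` below is a hypothetical, much-simplified stand-in for a non-owning view such as `llvm::ArrayRef`, and `ValueKind` is an invented tag type. The sketch shows only the safe shape the reviewer suggests: give the storage a name so it outlives the view.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view, a much-simplified stand-in for llvm::ArrayRef.
template <typename T> class ArrayView {
  const T *Data;
  std::size_t Length;

public:
  // Built from an array reference; the view does not own or copy elements.
  template <std::size_t N>
  ArrayView(const T (&Arr)[N]) : Data(Arr), Length(N) {}

  std::size_t size() const { return Length; }
  const T &operator[](std::size_t I) const {
    assert(I < Length && "index out of range");
    return Data[I];
  }
};

enum class ValueKind { Element, Other }; // hypothetical value-type tags

// Consumer that reads through the view after the view has been created.
std::size_t countOther(ArrayView<ValueKind> VTs) {
  std::size_t N = 0;
  for (std::size_t I = 0; I < VTs.size(); ++I)
    if (VTs[I] == ValueKind::Other)
      ++N;
  return N;
}

std::size_t safePattern() {
  // The named array lives until the end of this function, so the view
  // stays valid for the whole call below.
  ValueKind ValueVTs[] = {ValueKind::Element, ValueKind::Other};
  ArrayView<ValueKind> View(ValueVTs);
  return countOther(View);
}
```

By contrast, a view initialized directly from a braced list points at temporary storage that is gone once the declaration statement finishes, so any later read through it is undefined behavior.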

Make updates as needed from the code review.

Thanks, guys.

I'd also like to point out that this Diff will slightly clobber Ulrich's D45576. But, I suspect that it will be a small change to accommodate the difference.

test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
5 ↗	(On Diff #149175)	"I think this check needs to do more than this. You should be verifying the complete expansion."

Thanks. I've added a check for the upper POW.

"Also, can you add checks for some other instructions? Some of these won't require expansion, right?"

POW is the only strict operation that I've found that currently needs expansion on X86. That said, I have a full set of tests that cover each strict operation. I could add them if you'd like, but it's a bit superfluous for this change. I do intend to add the full set once other patches are sent upstream. Thoughts?

I'm happy if @andrew.w.kaylor is happy.

andrew.w.kaylor added inline comments. Jun 11 2018, 4:25 PM

test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
5 ↗	(On Diff #149175)	I'd really like to see checks of the entire sequence to which this gets expanded. That seems like the only way to be certain it was expanded correctly. In some of the other tests I took a shortcut similar to what you did here because there was a 1-to-1 correspondence between the intrinsic and the instruction to which it was lowered. Even in that case it was probably not an ideal test. Regarding instructions that aren't supposed to be expanded, what I would like to see here is a test that uses a vector form of the constrained intrinsic and a check that verifies that we generated the corresponding vector instruction.

Non-string RINT and NEARBYINT should also be Expand on pre-SSE4.1 targets. So that would mean we should expand the strict versions too.

Err that should have said non-strict not non-string
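
The rule behind this remark, that a strict operation takes the same legalization action as its non-strict equivalent, can be sketched as a small lookup. In the patch the query goes through `TLI.getStrictFPOperationAction`; everything in the sketch below (`Opcode`, `Action`, `TargetActions`, `strictAction`) is a hypothetical model invented for illustration, and the table entries are made up to mirror the pre-SSE4.1 case described above.

```cpp
#include <cassert>
#include <map>

enum class Opcode {
  FSQRT, FRINT, FNEARBYINT,
  STRICT_FSQRT, STRICT_FRINT, STRICT_FNEARBYINT
};
enum class Action { Legal, Expand };

// Hypothetical per-target table; opcodes without an entry count as Legal.
class TargetActions {
  std::map<Opcode, Action> Table;

public:
  void set(Opcode Op, Action A) { Table[Op] = A; }
  Action get(Opcode Op) const {
    auto It = Table.find(Op);
    return It == Table.end() ? Action::Legal : It->second;
  }
};

// Map a strict opcode to its non-strict counterpart.
Opcode nonStrictEquivalent(Opcode Op) {
  switch (Op) {
  case Opcode::STRICT_FSQRT:      return Opcode::FSQRT;
  case Opcode::STRICT_FRINT:      return Opcode::FRINT;
  case Opcode::STRICT_FNEARBYINT: return Opcode::FNEARBYINT;
  default:
    assert(false && "not a strict opcode in this sketch");
    return Op;
  }
}

// A strict op is legalized exactly like its non-strict equivalent.
Action strictAction(const TargetActions &TLI, Opcode StrictOp) {
  return TLI.get(nonStrictEquivalent(StrictOp));
}
```

With FRINT and FNEARBYINT marked Expand in the table, the strict forms report Expand as well, while STRICT_FSQRT stays Legal.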

In D47491#1129202, @craig.topper wrote:

Non-string RINT and NEARBYINT should also be Expand on pre-SSE4.1 targets. So that would mean we should expand the strict versions too.

Ah, you're right. I do not have tests for those locally. I'll add a few.

cameron.mcinally added inline comments. Jun 12 2018, 7:27 AM

test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
5 ↗	(On Diff #149175)	Andrew, just to be clear, you'd like to check for more than the Op+MOVH? There's not a lot to the expansion with these. The only thing I left out of the checks is the loads to feed the POW. The rest is epilogue code.

I suspect you missed the change I made to check for the MOVH, but maybe I'm mistaken? Or maybe you're suggesting a more complicated vector operation? One where we have to generate two scalar operations and then shuffle them back together? I could see some value in that.

    %bb.0: # %entry
    pushq %rax
    .cfi_def_cfa_offset 16
    movsd .LCPI0_0(%rip), %xmm0 # xmm0 = mem[0],zero
    movsd .LCPI0_1(%rip), %xmm1 # xmm1 = mem[0],zero
    callq pow
    movlhps %xmm0, %xmm0 # xmm0 = xmm0[0,0]
    popq %rax
    .cfi_def_cfa_offset 8
    retq

In D47491#1129607, @cameron.mcinally wrote:

In D47491#1129202, @craig.topper wrote:

Non-string RINT and NEARBYINT should also be Expand on pre-SSE4.1 targets. So that would mean we should expand the strict versions too.

Ah, you're right. I do not have tests for those locally. I'll add a few.

Looking closer... the math lib calls are also expanded.

New patch for conversation...

Added more vector tests:
-> Arith ops remain vector
-> Math lib ops are expanded
-> Rounding ops are expanded

TODO: Work on the CHECKs to make sure everything is covered.

Are you aware of the update_llc_test_checks.py script that will autogenerate the CHECKs for you?

Update diff after running update_llc_test_checks.py.

In D47491#1129648, @craig.topper wrote:

Are you aware of the update_llc_test_checks.py script that will autogenerate the CHECKs for you?

Oh, nice. Thanks!

I don't get a lot of opportunities to upstream changes, so it's safe to assume my LLVM knowledge is at least 5 years old. ;)

It feels like we should be testing non-splat vectors here so we get two calls to the library functions in the output. Rather than one call and a reuse of the result.
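
A toy model of the concern, assuming only that scalar calls with identical arguments can be merged into one: count how many distinct calls survive for a given input vector. `distinctScalarCalls` is a hypothetical helper for this illustration and says nothing about how the compiler actually performs the merge.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Number of scalar library calls left once calls with identical arguments
// are merged. A splat input collapses to a single call.
std::size_t distinctScalarCalls(const std::vector<double> &Lanes) {
  assert(!Lanes.empty() && "a vector has at least one lane");
  std::set<double> UniqueArgs(Lanes.begin(), Lanes.end());
  return UniqueArgs.size();
}
```

In this model a splat such as `{42.0, 42.0}` leaves one call, while `{42.0, 42.1}` leaves two, which is the shape of output the expanded tests are meant to exercise.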

In D47491#1129736, @craig.topper wrote:

It feels like we should be testing non-splat vectors here so we get two calls to the library functions in the output. Rather than one call and a reuse of the result.

Ok, that seems reasonable...

Would you be okay with somewhat random input to the upper operation? It should be okay functionally, since only the first trap is reported. I ask since it can be tricky to find inputs that can trap for each of the different operations. There are finicky details that are unique to each operation. Seems unnecessarily cumbersome IMO. Thoughts?

Update tests to remove splat inputs.

Notice that the input values are not checked, so there's a modicum of false security there. The low/high parts of the result could be incorrectly swapped and the test would not catch it. Other than that, the test is more robust since two operations are generated for expanded operations.

LGTM. And I spoke to Andrew at work and he says it looks good to him too.

This revision is now accepted and ready to land. Jun 12 2018, 2:59 PM

Closed by commit rL334603: [FPEnv] Expand constrained FP operations (authored by mcinally). Jun 13 2018, 7:36 AM

This revision was automatically updated to reflect the committed changes.

cameron.mcinally mentioned this in D48149: Expand constrained FP POWI. Jun 13 2018, 1:29 PM

Diff 151163

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

@@ (124 lines not shown) @@ class VectorLegalizer {
   SDValue ExpandSELECT(SDValue Op);
   SDValue ExpandLoad(SDValue Op);
   SDValue ExpandStore(SDValue Op);
   SDValue ExpandFNEG(SDValue Op);
   SDValue ExpandFSUB(SDValue Op);
   SDValue ExpandBITREVERSE(SDValue Op);
   SDValue ExpandCTLZ(SDValue Op);
   SDValue ExpandCTTZ_ZERO_UNDEF(SDValue Op);
+  SDValue ExpandStrictFPOp(SDValue Op);

   /// Implements vector promotion.
   ///
   /// This is essentially just bitcasting the operands to a different type and
   /// bitcasting the result back to the original type.
   SDValue Promote(SDValue Op);

   /// Implements [SU]INT_TO_FP vector promotion.
   ///
@@ (141 lines not shown) @@ SDValue VectorLegalizer::LegalizeOp(SDValue Op) {

   for (SDNode::value_iterator J = Node->value_begin(), E = Node->value_end();
        J != E;
        ++J)
     HasVectorValue |= J->isVector();
   if (!HasVectorValue)
     return TranslateLegalizeResults(Op, Result);

-  EVT QueryType;
+  TargetLowering::LegalizeAction Action = TargetLowering::Legal;
   switch (Op.getOpcode()) {
   default:
     return TranslateLegalizeResults(Op, Result);
+  case ISD::STRICT_FSQRT:
+  case ISD::STRICT_FMA:
+  case ISD::STRICT_FPOW:
+  case ISD::STRICT_FPOWI:
+  case ISD::STRICT_FSIN:
+  case ISD::STRICT_FCOS:
+  case ISD::STRICT_FEXP:
+  case ISD::STRICT_FEXP2:
+  case ISD::STRICT_FLOG:
+  case ISD::STRICT_FLOG10:
+  case ISD::STRICT_FLOG2:
+  case ISD::STRICT_FRINT:
+  case ISD::STRICT_FNEARBYINT:
+    // These pseudo-ops get legalized as if they were their non-strict
+    // equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
+    // is also legal, but if ISD::FSQRT requires expansion then so does
+    // ISD::STRICT_FSQRT.
+    Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
+                                            Node->getValueType(0));
+    break;
   case ISD::ADD:
   case ISD::SUB:
   case ISD::MUL:
   case ISD::SDIV:
   case ISD::UDIV:
   case ISD::SREM:
   case ISD::UREM:
   case ISD::SDIVREM:
@@ (60 lines not shown) @@ SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
   case ISD::ZERO_EXTEND_VECTOR_INREG:
   case ISD::SMIN:
   case ISD::SMAX:
   case ISD::UMIN:
   case ISD::UMAX:
   case ISD::SMUL_LOHI:
   case ISD::UMUL_LOHI:
   case ISD::FCANONICALIZE:
-    QueryType = Node->getValueType(0);
+    Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
     break;
   case ISD::FP_ROUND_INREG:
-    QueryType = cast<VTSDNode>(Node->getOperand(1))->getVT();
+    Action = TLI.getOperationAction(Node->getOpcode(),
+               cast<VTSDNode>(Node->getOperand(1))->getVT());
     break;
   case ISD::SINT_TO_FP:
   case ISD::UINT_TO_FP:
-    QueryType = Node->getOperand(0).getValueType();
+    Action = TLI.getOperationAction(Node->getOpcode(),
+                                    Node->getOperand(0).getValueType());
     break;
   case ISD::MSCATTER:
-    QueryType = cast<MaskedScatterSDNode>(Node)->getValue().getValueType();
+    Action = TLI.getOperationAction(Node->getOpcode(),
+               cast<MaskedScatterSDNode>(Node)->getValue().getValueType());
     break;
   case ISD::MSTORE:
-    QueryType = cast<MaskedStoreSDNode>(Node)->getValue().getValueType();
+    Action = TLI.getOperationAction(Node->getOpcode(),
+               cast<MaskedStoreSDNode>(Node)->getValue().getValueType());
     break;
   }

   LLVM_DEBUG(dbgs() << "\nLegalizing vector op: "; Node->dump(&DAG));

-  switch (TLI.getOperationAction(Node->getOpcode(), QueryType)) {
+  switch (Action) {
   default: llvm_unreachable("This action is not supported yet!");
   case TargetLowering::Promote:
     Result = Promote(Op);
     Changed = true;
     break;
   case TargetLowering::Legal:
     LLVM_DEBUG(dbgs() << "Legal node: nothing to do\n");
     break;
@@ (296 lines not shown) @@ case ISD::SETCC:
     return UnrollVSETCC(Op);
   case ISD::BITREVERSE:
     return ExpandBITREVERSE(Op);
   case ISD::CTLZ:
   case ISD::CTLZ_ZERO_UNDEF:
     return ExpandCTLZ(Op);
   case ISD::CTTZ_ZERO_UNDEF:
     return ExpandCTTZ_ZERO_UNDEF(Op);
+  case ISD::STRICT_FSQRT:
+  case ISD::STRICT_FMA:
+  case ISD::STRICT_FPOW:
+  case ISD::STRICT_FSIN:
+  case ISD::STRICT_FCOS:
+  case ISD::STRICT_FEXP:
+  case ISD::STRICT_FEXP2:
+  case ISD::STRICT_FLOG:
+  case ISD::STRICT_FLOG10:
+  case ISD::STRICT_FLOG2:
+  case ISD::STRICT_FRINT:
+  case ISD::STRICT_FNEARBYINT:
+    return ExpandStrictFPOp(Op);
   default:
     return DAG.UnrollVectorOp(Op.getNode());
   }
 }

 SDValue VectorLegalizer::ExpandSELECT(SDValue Op) {
   // Lower a select instruction where the condition is a scalar and the
   // operands are vectors. Lower this select to VSELECT and implement it
@@ (370 lines not shown) @@ if (TLI.isOperationLegalOrCustom(ISD::CTTZ, Op.getValueType())) {
     SDLoc DL(Op);
     return DAG.getNode(ISD::CTTZ, DL, Op.getValueType(), Op.getOperand(0));
   }

   // Otherwise go ahead and unroll.
   return DAG.UnrollVectorOp(Op.getNode());
 }

+SDValue VectorLegalizer::ExpandStrictFPOp(SDValue Op) {
+  EVT VT = Op.getValueType();
+  EVT EltVT = VT.getVectorElementType();
+  unsigned NumElems = VT.getVectorNumElements();
+  unsigned NumOpers = Op.getNumOperands();
+  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+  EVT ValueVTs[] = {EltVT, MVT::Other};
+  SDValue Chain = Op.getOperand(0);
+  SDLoc dl(Op);
+
+  SmallVector<SDValue, 8> OpValues;
+  SmallVector<SDValue, 8> OpChains;
+  for (unsigned i = 0; i < NumElems; ++i) {
+    SmallVector<SDValue, 4> Opers;
+    SDValue Idx = DAG.getConstant(i, dl,
+                                  TLI.getVectorIdxTy(DAG.getDataLayout()));
+
+    // The Chain is the first operand.
+    Opers.push_back(Chain);
+
+    // Now process the remaining operands.
+    for (unsigned j = 1; j < NumOpers; ++j) {
+      SDValue Oper = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl,
+                                 EltVT, Op.getOperand(j), Idx);
+      Opers.push_back(Oper);
+    }
+
+    SDValue ScalarOp = DAG.getNode(Op->getOpcode(), dl, ValueVTs, Opers);
+
+    OpValues.push_back(ScalarOp.getValue(0));
+    OpChains.push_back(ScalarOp.getValue(1));
+  }
+
+  SDValue Result = DAG.getBuildVector(VT, dl, OpValues);
+  SDValue NewChain = DAG.getNode(ISD::TokenFactor, dl, MVT::Other, OpChains);
+
+  AddLegalizedOperand(Op.getValue(0), Result);
+  AddLegalizedOperand(Op.getValue(1), NewChain);
+
+  return NewChain;
+}
+
 SDValue VectorLegalizer::UnrollVSETCC(SDValue Op) {
   EVT VT = Op.getValueType();
   unsigned NumElems = VT.getVectorNumElements();
   EVT EltVT = VT.getVectorElementType();
   SDValue LHS = Op.getOperand(0), RHS = Op.getOperand(1), CC = Op.getOperand(2);
   EVT TmpEltVT = LHS.getValueType().getVectorElementType();
   SDLoc dl(Op);
   SmallVector<SDValue, 8> Ops(NumElems);
(22 lines not shown)

llvm/trunk/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -O3 -mtriple=x86_64-pc-linux < %s | FileCheck %s

				define <2 x double> @constrained_vector_fdiv() {
				; CHECK-LABEL: constrained_vector_fdiv:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: movapd {{.*#+}} xmm0 = [1.000000e+00,2.000000e+00]
				; CHECK-NEXT: divpd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: retq
				entry:
				%div = call <2 x double> @llvm.experimental.constrained.fdiv.v2f64(
				<2 x double> <double 1.000000e+00, double 2.000000e+00>,
				<2 x double> <double 1.000000e+01, double 1.000000e+01>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %div
				}

				define <2 x double> @constrained_vector_fmul(<2 x double> %a) {
				; CHECK-LABEL: constrained_vector_fmul:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: movapd {{.*#+}} xmm0 = [1.797693e+308,1.797693e+308]
				; CHECK-NEXT: mulpd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: retq
				entry:
				%mul = call <2 x double> @llvm.experimental.constrained.fmul.v2f64(
				<2 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF>,
				<2 x double> <double 2.000000e+00, double 3.000000e+00>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %mul
				}

				define <2 x double> @constrained_vector_fadd() {
				; CHECK-LABEL: constrained_vector_fadd:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: movapd {{.*#+}} xmm0 = [1.797693e+308,1.797693e+308]
				; CHECK-NEXT: addpd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: retq
				entry:
				%add = call <2 x double> @llvm.experimental.constrained.fadd.v2f64(
				<2 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF>,
				<2 x double> <double 1.000000e+00, double 1.000000e-01>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %add
				}

				define <2 x double> @constrained_vector_fsub() {
				; CHECK-LABEL: constrained_vector_fsub:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: movapd {{.*#+}} xmm0 = [-1.797693e+308,-1.797693e+308]
				; CHECK-NEXT: subpd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: retq
				entry:
				%sub = call <2 x double> @llvm.experimental.constrained.fsub.v2f64(
				<2 x double> <double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF>,
				<2 x double> <double 1.000000e+00, double 1.000000e-01>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %sub
				}

				define <2 x double> @constrained_vector_sqrt() {
				; CHECK-LABEL: constrained_vector_sqrt:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: sqrtpd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: retq
				entry:
				%sqrt = call <2 x double> @llvm.experimental.constrained.sqrt.v2f64(
				<2 x double> <double 42.0, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %sqrt
				}

				define <2 x double> @constrained_vector_pow() {
				; CHECK-LABEL: constrained_vector_pow:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
				; CHECK-NEXT: callq pow
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
				; CHECK-NEXT: callq pow
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%pow = call <2 x double> @llvm.experimental.constrained.pow.v2f64(
				<2 x double> <double 42.1, double 42.2>,
				<2 x double> <double 3.0, double 3.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %pow
				}

				define <2 x double> @constrained_vector_sin() {
				; CHECK-LABEL: constrained_vector_sin:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq sin
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq sin
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%sin = call <2 x double> @llvm.experimental.constrained.sin.v2f64(
				<2 x double> <double 42.0, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %sin
				}

				define <2 x double> @constrained_vector_cos() {
				; CHECK-LABEL: constrained_vector_cos:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq cos
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq cos
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%cos = call <2 x double> @llvm.experimental.constrained.cos.v2f64(
				<2 x double> <double 42.0, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %cos
				}

				define <2 x double> @constrained_vector_exp() {
				; CHECK-LABEL: constrained_vector_exp:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq exp
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq exp
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%exp = call <2 x double> @llvm.experimental.constrained.exp.v2f64(
				<2 x double> <double 42.0, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %exp
				}

				define <2 x double> @constrained_vector_exp2() {
				; CHECK-LABEL: constrained_vector_exp2:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq exp2
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq exp2
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%exp2 = call <2 x double> @llvm.experimental.constrained.exp2.v2f64(
				<2 x double> <double 42.1, double 42.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %exp2
				}

				define <2 x double> @constrained_vector_log() {
				; CHECK-LABEL: constrained_vector_log:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq log
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq log
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%log = call <2 x double> @llvm.experimental.constrained.log.v2f64(
				<2 x double> <double 42.0, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %log
				}

				define <2 x double> @constrained_vector_log10() {
				; CHECK-LABEL: constrained_vector_log10:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq log10
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq log10
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%log10 = call <2 x double> @llvm.experimental.constrained.log10.v2f64(
				<2 x double> <double 42.0, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %log10
				}

				define <2 x double> @constrained_vector_log2() {
				; CHECK-LABEL: constrained_vector_log2:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq log2
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq log2
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%log2 = call <2 x double> @llvm.experimental.constrained.log2.v2f64(
				<2 x double> <double 42.0, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %log2
				}

				define <2 x double> @constrained_vector_rint() {
				; CHECK-LABEL: constrained_vector_rint:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq rint
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq rint
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%rint = call <2 x double> @llvm.experimental.constrained.rint.v2f64(
				<2 x double> <double 42.1, double 42.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %rint
				}

				define <2 x double> @constrained_vector_nearbyint() {
				; CHECK-LABEL: constrained_vector_nearbyint:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq nearbyint
				; CHECK-NEXT: movaps %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq nearbyint
				; CHECK-NEXT: unpcklpd (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%nearby = call <2 x double> @llvm.experimental.constrained.nearbyint.v2f64(
				<2 x double> <double 42.1, double 42.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x double> %nearby
				}


				declare <2 x double> @llvm.experimental.constrained.fdiv.v2f64(<2 x double>, <2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.fmul.v2f64(<2 x double>, <2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.fadd.v2f64(<2 x double>, <2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.fsub.v2f64(<2 x double>, <2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.sqrt.v2f64(<2 x double>, metadata, metadata)
				declare <4 x double> @llvm.experimental.constrained.sqrt.v4f64(<4 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.pow.v2f64(<2 x double>, <2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.sin.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.cos.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.exp.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.exp2.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.log.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.log10.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.log2.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.rint.v2f64(<2 x double>, metadata, metadata)
				declare <2 x double> @llvm.experimental.constrained.nearbyint.v2f64(<2 x double>, metadata, metadata)

This is an archive of the discontinued LLVM Phabricator instance.

Expand constrained FP operations
Closed · Public
