This is an archive of the discontinued LLVM Phabricator instance.

[Strict FP] Allow custom operation actions
ClosedPublic

Authored by uweigand on Jul 24 2019, 10:22 AM.

Details

Reviewers

andrew.w.kaylor
cameron.mcinally
kpn
hfinkel
craig.topper

Commits

rG7b24dd741c6c: [Strict FP] Allow custom operation actions
rL368012: [Strict FP] Allow custom operation actions

Summary

This patch changes the DAG legalizer to respect the operation actions set by the target for strict floating-point operations. (Currently, the legalizer usually falls back to mutating the node to its non-strict equivalent (which is assumed to be legal), and only skips the mutation if the strict operation is marked Legal.)

With this patch, whenever a strict operation is marked as Legal or Custom, it is passed to the target as usual. Only if it is marked as Expand will the legalizer attempt to mutate to the non-strict operation. Note that this will now fail if the non-strict operation is itself marked as Custom -- the target will then have to provide a Custom definition for the strict operation as well.
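
The decision described in the summary can be sketched as a small standalone model. This is not the LLVM code: `Action`, `Outcome`, and `legalizeStrictOp` are hypothetical names introduced for illustration, Promote is left out, and the model ignores any real expansions or libcalls that exist for particular strict opcodes.

```cpp
#include <cassert>

// Toy stand-in for TargetLowering::LegalizeAction (assumption: Promote omitted).
enum class Action { Legal, Expand, Custom };

// What the legalizer does with a strict FP node in this simplified model.
enum class Outcome {
  PassToTarget,        // strict op is Legal or Custom: handled by the target
  MutateToNonStrict,   // strict op is Expand and the non-strict op is Legal
  FallbackUnavailable  // strict op is Expand, non-strict op is not Legal
};

// Post-patch behavior as described in the summary: the action registered for
// the strict opcode is consulted first; the non-strict action only matters
// for the Expand fallback.
inline Outcome legalizeStrictOp(Action StrictAction, Action NonStrictAction) {
  if (StrictAction == Action::Legal || StrictAction == Action::Custom)
    return Outcome::PassToTarget;
  if (NonStrictAction == Action::Legal)
    return Outcome::MutateToNonStrict;
  return Outcome::FallbackUnavailable;
}
```

In this model, a strict operation left at Expand whose non-strict counterpart is Custom ends in `FallbackUnavailable`, which is the case the summary says the target must now cover with its own Custom definition.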

Diff Detail

Repository: rL LLVM

Event Timeline

uweigand created this revision. Jul 24 2019, 10:22 AM

Herald added a project: Restricted Project. Jul 24 2019, 10:23 AM

Herald added subscribers: llvm-commits, jsji, MaskRay, nemanjai.

uweigand mentioned this in D63782: [FPEnv] Add fptosi and fptoui constrained intrinsics. Jul 24 2019, 10:26 AM

kpn added inline comments. Jul 24 2019, 11:39 AM

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)	In what way does it not honor the strict properties? Also, wouldn't it be a bug if we were asked to expand a strict node when the non-strict is legal?

Herald added a subscriber: wuzish. Jul 24 2019, 11:39 AM

It looks like we're unrolling some vectors where before we weren't. That seems unfortunate. Is that the reason for the generated code quality regressions?

In D65226#1599743, @kpn wrote:

It looks like we're unrolling some vectors where before we weren't. That seems unfortunate. Is that the reason for the generated code quality regressions?

+1. What's the reason behind the scalarization?

In D65226#1600009, @cameron.mcinally wrote:

In D65226#1599743, @kpn wrote:

It looks like we're unrolling some vectors where before we weren't. That seems unfortunate. Is that the reason for the generated code quality regressions?

+1. What's the reason behind the scalarization?

Those are cases where the non-strict operation actually is not legal: e.g. the X86 target sets the operation action for FADD and FSUB vector operations to Custom. The old code simply ignored that and just emitted FADD and FSUB anyway as if they were legal, and only by chance did they match an isel pattern anyway.

The target can get the old behavior back by simply handling STRICT_FADD and STRICT_FSUB directly (presumably also via Custom operations similar to ones it uses for FADD/FSUB), so this is only a "regression" for the current fallback implementation.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)
> In what way does it not honor the strict properties?

Well, the expansion is done by using a truncating store followed by a load. The truncating store is a non-strict operation which will not raise exceptions.

> Also, wouldn't it be a bug if we were asked to expand a strict node when the non-strict is legal?

Why would that be a bug? That's the current default behavior on all targets that don't (yet) support strict operations. (All strict operations default to expand, which is taken to mean replace by the non-strict version.)

In D65226#1600831, @uweigand wrote:

In D65226#1600009, @cameron.mcinally wrote:

In D65226#1599743, @kpn wrote:

It looks like we're unrolling some vectors where before we weren't. That seems unfortunate. Is that the reason for the generated code quality regressions?

+1. What's the reason behind the scalarization?

Those are cases where the non-strict operation actually is not legal: e.g. the X86 target sets the operation action for FADD and FSUB vector operations to Custom. The old code simply ignored that and just emitted FADD and FSUB anyway as if they were legal, and only by chance did they match an isel pattern anyway.

The target can get the old behavior back by simply handling STRICT_FADD and STRICT_FSUB directly (presumably also via Custom operations similar to ones it uses for FADD/FSUB), so this is only a "regression" for the current fallback implementation.

Ah, ok. So I think the x86 case is a little more subtle than just *not legal*. The backend checks for *some* Custom lowerings (horizontal ops), but ultimately the non-strict vector FADD/FSUB are legal on x86. The Custom lowering code just returns the original op.

In D65226#1601309, @cameron.mcinally wrote:

In D65226#1600831, @uweigand wrote:

In D65226#1600009, @cameron.mcinally wrote:

In D65226#1599743, @kpn wrote:

It looks like we're unrolling some vectors where before we weren't. That seems unfortunate. Is that the reason for the generated code quality regressions?

+1. What's the reason behind the scalarization?

Those are cases where the non-strict operation actually is not legal: e.g. the X86 target sets the operation action for FADD and FSUB vector operations to Custom. The old code simply ignored that and just emitted FADD and FSUB anyway as if they were legal, and only by chance did they match an isel pattern anyway.

The target can get the old behavior back by simply handling STRICT_FADD and STRICT_FSUB directly (presumably also via Custom operations similar to ones it uses for FADD/FSUB), so this is only a "regression" for the current fallback implementation.

Ah, ok. So I think the x86 case is a little more subtle than just *not legal*. The backend checks for *some* Custom lowerings (horizontal ops), but ultimately the non-strict vector FADD/FSUB are legal on x86. The Custom lowering code just returns the original op.

Yes, exactly. But there's no way common code can know that this is what the Custom lowering code does; in general, it is not OK to just pass an opcode to isel if the target classifies the op as Custom. But as I said, once the target actually handles the STRICT_ opcodes (either Custom or maybe even Legal, if that's what the target wants), then any regression will go away. (And as long as the target *doesn't* handle them, they don't really implement the strict semantics anyway and shouldn't be used for anything "real" anyway.)
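
The opt-in discussed above can be illustrated with a toy action table. Everything here is an assumption made for illustration: `ToyTarget`, its two opcodes, and the defaults (strict opcodes default to Expand, others to Legal) only mimic the shape of a target's operation-action table and are not the LLVM API.

```cpp
#include <cassert>
#include <map>

enum class Action { Legal, Expand, Custom };
enum class Opcode { FADD, STRICT_FADD };

// Hypothetical miniature of a target's per-opcode action table.
class ToyTarget {
  std::map<Opcode, Action> Actions;

public:
  void setOperationAction(Opcode Op, Action A) { Actions[Op] = A; }

  Action getOperationAction(Opcode Op) const {
    auto It = Actions.find(Op);
    if (It != Actions.end())
      return It->second;
    // Assumed defaults for this sketch: strict opcodes start out as Expand.
    return Op == Opcode::STRICT_FADD ? Action::Expand : Action::Legal;
  }
};

// True if common code hands the strict node to the target instead of taking
// the non-strict fallback path.
inline bool strictNodeReachesTarget(const ToyTarget &T, Opcode StrictOp) {
  Action A = T.getOperationAction(StrictOp);
  return A == Action::Legal || A == Action::Custom;
}

// X86-like setup from the thread: vector FADD is Custom, STRICT_FADD untouched.
inline ToyTarget makeFallbackOnlyTarget() {
  ToyTarget T;
  T.setOperationAction(Opcode::FADD, Action::Custom);
  return T;
}

// Same target after opting in by also marking the strict opcode Custom.
inline ToyTarget makeOptedInTarget() {
  ToyTarget T = makeFallbackOnlyTarget();
  T.setOperationAction(Opcode::STRICT_FADD, Action::Custom);
  return T;
}
```

With the first configuration, common code has no way to tell that the Custom lowering merely returns the original node, so it cannot safely treat the operation as legal; with the second, the target sees the strict node directly.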

I don't see a problem with scalarizing strict ops for Custom lowered nodes that would otherwise be legal. It's not ideal, I suppose, but if a target doesn't support the strict nodes in the backend, then it probably shouldn't be using the experimental intrinsics anyway.

I don't really have a strong opinion on this though, so will leave it to others that may have one...

So the PowerPC code regressions will be fixed once the strict tickets make it into the tree it sounds like.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)
>> In what way does it not honor the strict properties?

> Well, the expansion is done by using a truncating store followed by a load. The truncating store is a non-strict operation which will not raise exceptions.

Is it documented anywhere that the truncating store will not raise exceptions? It seems weird to me since for IEEE targets a truncating store must change the format of the bits anyway. What if an exponent is out of range of the smaller type, for example?

How many non-IEEE targets are there? I'm well aware that hardware is still very current that supports S/370's radix 16 FP, plus the radix 10 FP, but are there any other current forms of not-at-all-IEEE floating point?

A different angle: if a target cannot use EmitStackConvert(), then isn't it that target's responsibility to make sure this code path is never used? In that case we wouldn't need this subtle code here.

Do we even have any tests for EmitStackConvert()? I remember having a very hard time finding one.

>> Also, wouldn't it be a bug if we were asked to expand a strict node when the non-strict is legal?

> Why would that be a bug? That's the current default behavior on all targets that don't (yet) support strict operations. (All strict operations default to expand, which is taken to mean replace by the non-strict version.)

It sounds like we're changing what Expand means for strict operations. Previously it meant the same thing as it does for non-strict operations: use the fallback/default expansion. And there's plenty of code in place to do those expansions in ways that preserve the strictness. The strict and non-strict nodes have followed the same paths for the most part with the exception of strict { Custom, Promote } -> Expand.

Long term, on targets that support strict floating point, do we want Expand to have different meanings for strict and non-strict nodes? It worries me if they're different.

3707 ↗	(On Diff #211537)
Say, we can't emit libcalls for any old random thing. If we simply return false here then it will try -- and fail -- to emit a libcall. And failing to emit a libcall is _not_ fatal I'm pretty sure. So no tricky returning true when we didn't actually expand anything is needed here.

In D65226#1601520, @kpn wrote:

So the PowerPC code regressions will be fixed once the strict tickets make it into the tree it sounds like.

Yes, exactly.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)
> Is it documented anywhere that the truncating store will not raise exceptions?

Common code assumes that the truncating store will not raise FP exceptions, just like it assumes any DAG operation, except for the STRICT_... ones, will not raise FP exceptions. (Now, since it is a store, common code assumes that it may raise an exception because of memory access faults, but then again, in this specific use case, common code will recognize that the store targets a stack slot which can never fault.)

Therefore, common code might e.g. speculate the truncating store outside of a condition -- which would be incorrect if indeed FP exceptions are turned on. Similarly, common code might schedule this truncating store across a function call that might change rounding modes.

I guess a way to correctly implement this transformation would be to expand a STRICT_FP_ROUND into a "strict" version of a truncating store (which currently does not exist). But I don't think it would be worthwhile to add this until and unless this is actually useful on some target.

> It sounds like we're changing what Expand means for strict operations.

Well, I guess that's true in the sense that for now, Expand for strict operations allows for two implementations:
- either, implement the precise semantics of the operation in terms of other operations (this is the "traditional" meaning of Expand, and is in some cases doable for strict operations too, but usually only if you can implement a strict operation in terms of other strict operations);
- or, fall back to the non-strict semantics if strict semantics are not possible because the target doesn't (yet) implement any of those at all.

At some point in the future, we want the second option to go away, but we're not there yet. But on targets that do implement strict semantics in the backend, common code should never use that second option.

> And there's plenty of code in place to do those expansions in ways that preserve the strictness.

Not really? All the existing expansions end up in solely non-strict DAG opcodes, so how could they preserve the strictness?

> Long term, on targets that support strict floating point, do we want Expand to have different meanings for strict and non-strict nodes? It worries me if they're different.

No, long term the meaning should be the same again. The fallback is the short-term thing.

kpn added inline comments. Jul 26 2019, 9:18 AM

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)
> Common code assumes that the truncating store will not raise FP exceptions, just like it assumes any DAG operation, except for the STRICT_... ones, will not raise FP exceptions. (Now, since it is a store, common code assumes that it may raise an exception because of memory access faults, but then again, in this specific use case, common code will recognize that the store targets a stack slot which can never fault.)
>
> Therefore, common code might e.g. speculate the truncating store outside of a condition -- which would be incorrect if indeed FP exceptions are turned on. Similarly, common code might schedule this truncating store across a function call that might change rounding modes.

In the STRICT EmitStackConvert() case the store+load are chained together _and_ that chain is spliced into the chain where the STRICT node was formerly located. Does common code do transformations to the SDAG that reorder the chain? I thought the point of the chain was to enforce ordering.

> I guess a way to correctly implement this transformation would be to expand a STRICT_FP_ROUND into a "strict" version of a truncating store (which currently does not exist). But I don't think it would be worthwhile to add this until and unless this is actually useful on some target.

Agreed.

uweigand marked an inline comment as done. Jul 26 2019, 10:35 AM

uweigand added inline comments.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)
> In the STRICT EmitStackConvert() case the store+load are chained together _and_ that chain is spliced into the chain where the STRICT node was formerly located. Does common code do transformations to the SDAG that reorder the chain? I thought the point of the chain was to enforce ordering.

That's true on the SDAG, yes. But once the store is translated to MI, it will not be marked as potentially raising an FP exception (like strict MIs are), and therefore it might be reordered by the MI schedulers. The difference to a (hypothetical) strict truncating store would be that the latter would get translated to MI with the mayRaiseFPException flag on.
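
The scheduling concern can be made concrete with a deliberately tiny model. This is a sketch under stated assumptions, not how the LLVM MI schedulers are implemented: `ToyInstr`, its `MayChangeFPEnv` field, and `mayReorder` are hypothetical; only the idea of a per-instruction "may raise FP exception" flag comes from the comment above.

```cpp
#include <cassert>

// Hypothetical machine-instruction summary used only for this sketch.
struct ToyInstr {
  bool MayRaiseFPException; // models the flag carried by strict MIs
  bool MayChangeFPEnv;      // e.g. a call that may change the rounding mode
};

// Toy rule: two adjacent instructions may be swapped unless one of them may
// raise an FP exception and the other may change the FP environment.
inline bool mayReorder(const ToyInstr &A, const ToyInstr &B) {
  if (A.MayRaiseFPException && B.MayChangeFPEnv)
    return false;
  if (B.MayRaiseFPException && A.MayChangeFPEnv)
    return false;
  return true;
}
```

Under this rule, a plain truncating store (flag off) is free to move across a call that changes the rounding mode, while a hypothetical strict truncating store (flag on) is pinned in place, which is the difference the comment describes.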

kpn added inline comments. Jul 26 2019, 11:28 AM

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)	Should this expansion be there at all, then? Right now if this expansion is used we silently emit non-strict code. Shouldn't it just outright fail, or fail to expand at least? Same deal with STRICT_FP_EXTEND below.

uweigand marked an inline comment as done. Jul 26 2019, 11:49 AM

uweigand added inline comments.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2790 ↗	(On Diff #211537)	I guess it is there to ensure that the (currently still default) fallback to expand strict to non-strict continues to work on targets where FP_TRUNC is not legal. I agree this is somewhat questionable; I'd be fine with this expansion going away. Use of a constrained truncate/expand intrinsic on such a target should then fail ...

pengfei added a subscriber: pengfei. Jul 31 2019, 6:48 PM

Ping?

Should we move ahead with this? I believe this is still a pre-req for D63782 ...

In D65226#1615137, @uweigand wrote:

Ping?

Should we move ahead with this? I believe this is still a pre-req for D63782 ...

I have no objections. I'll defer to one of the other reviewers.

I'm still not thrilled with the tricky doing nothing but returning true code like I mentioned, but I don't think that should be enough to hold things up.

In D65226#1615192, @kpn wrote:

In D65226#1615137, @uweigand wrote:

Ping?

Should we move ahead with this? I believe this is still a pre-req for D63782 ...

I have no objections. I'll defer to one of the other reviewers.

I'm still not thrilled with the tricky doing nothing but returning true code like I mentioned, but I don't think that should be enough to hold things up.

LGTM. It does not look like there are any outstanding objections.

This revision is now accepted and ready to land. Aug 5 2019, 11:16 AM

Closed by commit rL368012: [Strict FP] Allow custom operation actions (authored by uweigand). Aug 6 2019, 3:42 AM

This revision was automatically updated to reflect the committed changes.

Thanks, Hal. I agree we'll need to clean up the "fallback" Expand logic once targets have moved to actually supporting strict FP nodes.

pengfei mentioned this in D70226: Add an option to disable strict float node mutating to an normal float node. Nov 18 2019, 6:56 AM

Revision Contents

Path                                                                   Size
llvm/trunk/include/llvm/CodeGen/TargetLowering.h                       11 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp                    57 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp              23 lines
llvm/trunk/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll    726 lines
llvm/trunk/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll        109 lines

Diff 213569

llvm/trunk/include/llvm/CodeGen/TargetLowering.h

[... 924 lines not shown ...]
@@ LegalizeAction getFixedPointOperationAction(unsigned Op, EVT VT,
     case ISD::UMULFIX:
       Supported = isSupportedFixedPointOperation(Op, VT, Scale);
       break;
     }

     return Supported ? Action : Expand;
   }

+  // If Op is a strict floating-point operation, return the result
+  // of getOperationAction for the equivalent non-strict operation.
   LegalizeAction getStrictFPOperationAction(unsigned Op, EVT VT) const {
     unsigned EqOpc;
     switch (Op) {
       default: llvm_unreachable("Unexpected FP pseudo-opcode");
       case ISD::STRICT_FADD: EqOpc = ISD::FADD; break;
       case ISD::STRICT_FSUB: EqOpc = ISD::FSUB; break;
       case ISD::STRICT_FMUL: EqOpc = ISD::FMUL; break;
       case ISD::STRICT_FDIV: EqOpc = ISD::FDIV; break;
[... 16 lines not shown ...]
@@ switch (Op) {
       case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break;
       case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break;
       case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break;
       case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break;
       case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;
       case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;
     }

-    auto Action = getOperationAction(EqOpc, VT);
-
-    // We don't currently handle Custom or Promote for strict FP pseudo-ops.
-    // For now, we just expand for those cases.
-    if (Action != Legal)
-      Action = Expand;
-
-    return Action;
+    return getOperationAction(EqOpc, VT);
   }

   /// Return true if the specified operation is legal on this target or can be
   /// made legal with custom lowering. This is used to help guide high-level
   /// lowering decisions.
   bool isOperationLegalOrCustom(unsigned Op, EVT VT) const {
     return (VT == MVT::Other || isTypeLegal(VT)) &&
       (getOperationAction(Op, VT) == Legal ||
[... 3,195 lines not shown ...]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

[... 1,091 lines not shown ...]
@@ if (Action == TargetLowering::Expand) {
       SDValue NewVal;
       NewVal = DAG.getNode(ISD::TRAP, SDLoc(Node), Node->getVTList(),
                            Node->getOperand(0));
       ReplaceNode(Node, NewVal.getNode());
       LegalizeOp(NewVal.getNode());
       return;
     }
     break;
-  case ISD::STRICT_FADD:
-  case ISD::STRICT_FSUB:
-  case ISD::STRICT_FMUL:
-  case ISD::STRICT_FDIV:
-  case ISD::STRICT_FREM:
-  case ISD::STRICT_FSQRT:
-  case ISD::STRICT_FMA:
-  case ISD::STRICT_FPOW:
-  case ISD::STRICT_FPOWI:
-  case ISD::STRICT_FSIN:
-  case ISD::STRICT_FCOS:
-  case ISD::STRICT_FEXP:
-  case ISD::STRICT_FEXP2:
-  case ISD::STRICT_FLOG:
-  case ISD::STRICT_FLOG10:
-  case ISD::STRICT_FLOG2:
-  case ISD::STRICT_FRINT:
-  case ISD::STRICT_FNEARBYINT:
-  case ISD::STRICT_FMAXNUM:
-  case ISD::STRICT_FMINNUM:
-  case ISD::STRICT_FCEIL:
-  case ISD::STRICT_FFLOOR:
-  case ISD::STRICT_FROUND:
-  case ISD::STRICT_FTRUNC:
-  case ISD::STRICT_FP_ROUND:
-  case ISD::STRICT_FP_EXTEND:
-    // These pseudo-ops get legalized as if they were their non-strict
-    // equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
-    // is also legal, but if ISD::FSQRT requires expansion then so does
-    // ISD::STRICT_FSQRT.
-    Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
-                                            Node->getValueType(0));
-    break;
   case ISD::SADDSAT:
   case ISD::UADDSAT:
   case ISD::SSUBSAT:
   case ISD::USUBSAT: {
     Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
     break;
   }
   case ISD::SMULFIX:
[... 1,669 lines not shown ...]
@@ if (VT.isInteger())
       Results.push_back(DAG.getConstant(0, dl, VT));
     else {
       assert(VT.isFloatingPoint() && "Unknown value type!");
       Results.push_back(DAG.getConstantFP(0, dl, VT));
     }
     break;
   }
   case ISD::STRICT_FP_ROUND:
+    // This expansion does not honor the "strict" properties anyway,
+    // so prefer falling back to the non-strict operation if legal.
+    if (TLI.getStrictFPOperationAction(Node->getOpcode(),
+                                       Node->getValueType(0))
+        == TargetLowering::Legal)
+      break;
     Tmp1 = EmitStackConvert(Node->getOperand(1),
                             Node->getValueType(0),
                             Node->getValueType(0), dl, Node->getOperand(0));
     ReplaceNode(Node, Tmp1.getNode());
     LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_FP_ROUND node\n");
     return true;
   case ISD::FP_ROUND:
   case ISD::BITCAST:
     Tmp1 = EmitStackConvert(Node->getOperand(0),
                             Node->getValueType(0),
                             Node->getValueType(0), dl);
     Results.push_back(Tmp1);
     break;
   case ISD::STRICT_FP_EXTEND:
+    // This expansion does not honor the "strict" properties anyway,
+    // so prefer falling back to the non-strict operation if legal.
+    if (TLI.getStrictFPOperationAction(Node->getOpcode(),
+                                       Node->getValueType(0))
+        == TargetLowering::Legal)
+      break;
     Tmp1 = EmitStackConvert(Node->getOperand(1),
                             Node->getOperand(1).getValueType(),
                             Node->getValueType(0), dl, Node->getOperand(0));
     ReplaceNode(Node, Tmp1.getNode());
     LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_FP_EXTEND node\n");
     return true;
   case ISD::FP_EXTEND:
     Tmp1 = EmitStackConvert(Node->getOperand(0),
[... 870 lines not shown ...]
   case ISD::JumpTable:
   case ISD::INTRINSIC_W_CHAIN:
   case ISD::INTRINSIC_WO_CHAIN:
   case ISD::INTRINSIC_VOID:
     // FIXME: Custom lowering for these operations shouldn't return null!
     break;
   }

+  if (Results.empty() && Node->isStrictFPOpcode()) {
+    // FIXME: We were asked to expand a strict floating-point operation,
+    // but there is currently no expansion implemented that would preserve
+    // the "strict" properties. For now, we just fall back to the non-strict
+    // version if that is legal on the target. The actual mutation of the
+    // operation will happen in SelectionDAGISel::DoInstructionSelection.
+    if (TLI.getStrictFPOperationAction(Node->getOpcode(),
+                                       Node->getValueType(0))
+        == TargetLowering::Legal)
+      return true;
+  }
+
   // Replace the original node with the legalized result.
   if (Results.empty()) {
     LLVM_DEBUG(dbgs() << "Cannot expand node\n");
     return false;
   }

   LLVM_DEBUG(dbgs() << "Successfully expanded node\n");
   ReplaceNode(Node, Results.data());
[... 873 lines not shown ...]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

[... 329 lines not shown ...]
@@ SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
   case ISD::STRICT_FMAXNUM:
   case ISD::STRICT_FMINNUM:
   case ISD::STRICT_FCEIL:
   case ISD::STRICT_FFLOOR:
   case ISD::STRICT_FROUND:
   case ISD::STRICT_FTRUNC:
   case ISD::STRICT_FP_ROUND:
   case ISD::STRICT_FP_EXTEND:
-    // These pseudo-ops get legalized as if they were their non-strict
-    // equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
-    // is also legal, but if ISD::FSQRT requires expansion then so does
-    // ISD::STRICT_FSQRT.
-    Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
-                                            Node->getValueType(0));
+    Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
+    // If we're asked to expand a strict vector floating-point operation,
+    // by default we're going to simply unroll it. That is usually the
+    // best approach, except in the case where the resulting strict (scalar)
+    // operations would themselves use the fallback mutation to non-strict.
+    // In that specific case, just do the fallback on the vector op.
+    if (Action == TargetLowering::Expand &&
+        TLI.getStrictFPOperationAction(Node->getOpcode(),
+                                       Node->getValueType(0))
+        == TargetLowering::Legal) {
+      EVT EltVT = Node->getValueType(0).getVectorElementType();
+      if (TLI.getOperationAction(Node->getOpcode(), EltVT)
+          == TargetLowering::Expand &&
+          TLI.getStrictFPOperationAction(Node->getOpcode(), EltVT)
+          == TargetLowering::Legal)
+        Action = TargetLowering::Legal;
+    }
     break;
   case ISD::ADD:
   case ISD::SUB:
   case ISD::MUL:
   case ISD::MULHS:
   case ISD::MULHU:
   case ISD::SDIV:
   case ISD::UDIV:
[... 1,069 lines not shown ...]
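
The vector-legalizer condition in the hunk above can be restated as a small standalone function. This is a simplified model for illustration only: `OpActions`, `usesNonStrictFallback`, and `vectorStrictAction` are hypothetical names, and the two fields stand for the results of `getOperationAction` on the strict opcode and `getStrictFPOperationAction` (the action of the equivalent non-strict opcode) respectively.

```cpp
#include <cassert>

enum class Action { Legal, Expand, Custom };

// Actions for one value type (either the vector type or its element type).
struct OpActions {
  Action Strict;    // action registered for the STRICT_ opcode itself
  Action NonStrict; // action of the equivalent non-strict opcode
};

// The strict op would be handled by mutating it to the non-strict op.
inline bool usesNonStrictFallback(const OpActions &A) {
  return A.Strict == Action::Expand && A.NonStrict == Action::Legal;
}

// Mirrors the condition in the hunk: keep the vector op intact (treat it as
// Legal here) only when both the vector op and the scalar ops produced by
// unrolling would take the non-strict fallback; otherwise keep the strict
// action, which for Expand means the vector op is unrolled.
inline Action vectorStrictAction(const OpActions &Vec, const OpActions &Elt) {
  if (usesNonStrictFallback(Vec) && usesNonStrictFallback(Elt))
    return Action::Legal;
  return Vec.Strict;
}
```

The second case in the usage below (non-strict vector op marked Custom) corresponds to the unrolling that the reviewers noticed in the X86 and PowerPC test updates.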

llvm/trunk/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll

[... 5,441 lines not shown ...]
@@ %nearby = call <1 x float> @llvm.experimental.constrained.nearbyint.v1f32(
 metadata !"round.dynamic",
 metadata !"fpexcept.strict")
 ret <1 x float> %nearby
 }

 define <2 x double> @constrained_vector_nearbyint_v2f64() {
 ; PC64LE-LABEL: constrained_vector_nearbyint_v2f64:
 ; PC64LE: # %bb.0: # %entry
+; PC64LE-NEXT: mflr 0
+; PC64LE-NEXT: std 0, 16(1)
+; PC64LE-NEXT: stdu 1, -64(1)
+; PC64LE-NEXT: .cfi_def_cfa_offset 64
+; PC64LE-NEXT: .cfi_offset lr, 16
 ; PC64LE-NEXT: addis 3, 2, .LCPI81_0@toc@ha
-; PC64LE-NEXT: addi 3, 3, .LCPI81_0@toc@l
-; PC64LE-NEXT: lxvd2x 0, 0, 3
-; PC64LE-NEXT: xxswapd 0, 0
-; PC64LE-NEXT: xvrdpic 34, 0
+; PC64LE-NEXT: lfd 1, .LCPI81_0@toc@l(3)
+; PC64LE-NEXT: bl nearbyint
+; PC64LE-NEXT: nop
+; PC64LE-NEXT: li 3, 48
+; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
+; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
+; PC64LE-NEXT: addis 3, 2, .LCPI81_1@toc@ha
+; PC64LE-NEXT: lfs 1, .LCPI81_1@toc@l(3)
+; PC64LE-NEXT: bl nearbyint
+; PC64LE-NEXT: nop
+; PC64LE-NEXT: li 3, 48
+; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
+; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
+; PC64LE-NEXT: xxmrghd 34, 1, 0
+; PC64LE-NEXT: addi 1, 1, 64
+; PC64LE-NEXT: ld 0, 16(1)
+; PC64LE-NEXT: mtlr 0
 ; PC64LE-NEXT: blr
 ;
 ; PC64LE9-LABEL: constrained_vector_nearbyint_v2f64:
 ; PC64LE9: # %bb.0: # %entry
+; PC64LE9-NEXT: mflr 0
+; PC64LE9-NEXT: std 0, 16(1)
+; PC64LE9-NEXT: stdu 1, -48(1)
+; PC64LE9-NEXT: .cfi_def_cfa_offset 48
+; PC64LE9-NEXT: .cfi_offset lr, 16
 ; PC64LE9-NEXT: addis 3, 2, .LCPI81_0@toc@ha
-; PC64LE9-NEXT: addi 3, 3, .LCPI81_0@toc@l
-; PC64LE9-NEXT: lxvx 0, 0, 3
-; PC64LE9-NEXT: xvrdpic 34, 0
+; PC64LE9-NEXT: lfd 1, .LCPI81_0@toc@l(3)
+; PC64LE9-NEXT: bl nearbyint
+; PC64LE9-NEXT: nop
+; PC64LE9-NEXT: addis 3, 2, .LCPI81_1@toc@ha
+; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
+; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
+; PC64LE9-NEXT: lfs 1, .LCPI81_1@toc@l(3)
+; PC64LE9-NEXT: bl nearbyint
+; PC64LE9-NEXT: nop
+; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
+; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
+; PC64LE9-NEXT: xxmrghd 34, 1, 0
+; PC64LE9-NEXT: addi 1, 1, 48
+; PC64LE9-NEXT: ld 0, 16(1)
+; PC64LE9-NEXT: mtlr 0
 ; PC64LE9-NEXT: blr
 entry:
 %nearby = call <2 x double> @llvm.experimental.constrained.nearbyint.v2f64(
 <2 x double> <double 42.1, double 42.0>,
 metadata !"round.dynamic",
 metadata !"fpexcept.strict")
 ret <2 x double> %nearby
 }
[... 92 lines not shown ...]
@@ entry:
 ret <3 x float> %nearby
 }

 define <3 x double> @constrained_vector_nearby_v3f64() {
 ; PC64LE-LABEL: constrained_vector_nearby_v3f64:
 ; PC64LE: # %bb.0: # %entry
 ; PC64LE-NEXT: mflr 0
 ; PC64LE-NEXT: std 0, 16(1)
-; PC64LE-NEXT: stdu 1, -32(1)
-; PC64LE-NEXT: .cfi_def_cfa_offset 32
+; PC64LE-NEXT: stdu 1, -80(1)
+; PC64LE-NEXT: .cfi_def_cfa_offset 80
 ; PC64LE-NEXT: .cfi_offset lr, 16
+; PC64LE-NEXT: .cfi_offset v31, -16
+; PC64LE-NEXT: li 3, 64
+; PC64LE-NEXT: stxvd2x 63, 1, 3 # 16-byte Folded Spill
 ; PC64LE-NEXT: addis 3, 2, .LCPI83_0@toc@ha
 ; PC64LE-NEXT: lfd 1, .LCPI83_0@toc@l(3)
 ; PC64LE-NEXT: bl nearbyint
 ; PC64LE-NEXT: nop
+; PC64LE-NEXT: li 3, 48
+; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
+; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
 ; PC64LE-NEXT: addis 3, 2, .LCPI83_1@toc@ha
+; PC64LE-NEXT: lfs 1, .LCPI83_1@toc@l(3)
+; PC64LE-NEXT: bl nearbyint
+; PC64LE-NEXT: nop
+; PC64LE-NEXT: li 3, 48
+; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
+; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
+; PC64LE-NEXT: addis 3, 2, .LCPI83_2@toc@ha
+; PC64LE-NEXT: xxmrghd 63, 0, 1
+; PC64LE-NEXT: lfd 1, .LCPI83_2@toc@l(3)
+; PC64LE-NEXT: bl nearbyint
+; PC64LE-NEXT: nop
+; PC64LE-NEXT: li 3, 64
 ; PC64LE-NEXT: fmr 3, 1
-; PC64LE-NEXT: addi 3, 3, .LCPI83_1@toc@l
-; PC64LE-NEXT: lxvd2x 0, 0, 3
-; PC64LE-NEXT: xxswapd 0, 0
-; PC64LE-NEXT: xvrdpic 2, 0
-; PC64LE-NEXT: xxswapd 0, 2
-; PC64LE-NEXT: # kill: def $f2 killed $f2 killed $vsl2
-; PC64LE-NEXT: fmr 1, 0
-; PC64LE-NEXT: addi 1, 1, 32
+; PC64LE-NEXT: xxlor 1, 63, 63
+; PC64LE-NEXT: xxlor 2, 63, 63
+; PC64LE-NEXT: lxvd2x 63, 1, 3 # 16-byte Folded Reload
+; PC64LE-NEXT: addi 1, 1, 80
 ; PC64LE-NEXT: ld 0, 16(1)
 ; PC64LE-NEXT: mtlr 0
 ; PC64LE-NEXT: blr
 ;
 ; PC64LE9-LABEL: constrained_vector_nearby_v3f64:
 ; PC64LE9: # %bb.0: # %entry
 ; PC64LE9-NEXT: mflr 0
 ; PC64LE9-NEXT: std 0, 16(1)
-; PC64LE9-NEXT: stdu 1, -32(1)
-; PC64LE9-NEXT: .cfi_def_cfa_offset 32
+; PC64LE9-NEXT: stdu 1, -64(1)
+; PC64LE9-NEXT: .cfi_def_cfa_offset 64
 ; PC64LE9-NEXT: .cfi_offset lr, 16
+; PC64LE9-NEXT: .cfi_offset v31, -16
 ; PC64LE9-NEXT: addis 3, 2, .LCPI83_0@toc@ha
 ; PC64LE9-NEXT: lfd 1, .LCPI83_0@toc@l(3)
+; PC64LE9-NEXT: stxv 63, 48(1) # 16-byte Folded Spill
; PC64LE9-NEXT: bl nearbyint		; PC64LE9-NEXT: bl nearbyint
; PC64LE9-NEXT: nop		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: addis 3, 2, .LCPI83_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI83_1@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI83_1@toc@l		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
; PC64LE9-NEXT: xvrdpic 2, 0		; PC64LE9-NEXT: lfs 1, .LCPI83_1@toc@l(3)
		; PC64LE9-NEXT: bl nearbyint
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: addis 3, 2, .LCPI83_2@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 63, 0, 1
		; PC64LE9-NEXT: lfd 1, .LCPI83_2@toc@l(3)
		; PC64LE9-NEXT: bl nearbyint
		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: fmr 3, 1		; PC64LE9-NEXT: fmr 3, 1
; PC64LE9-NEXT: xxswapd 1, 2		; PC64LE9-NEXT: xscpsgndp 1, 63, 63
; PC64LE9-NEXT: # kill: def $f1 killed $f1 killed $vsl1		; PC64LE9-NEXT: xscpsgndp 2, 63, 63
; PC64LE9-NEXT: # kill: def $f2 killed $f2 killed $vsl2		; PC64LE9-NEXT: lxv 63, 48(1) # 16-byte Folded Reload
; PC64LE9-NEXT: addi 1, 1, 32		; PC64LE9-NEXT: addi 1, 1, 64
; PC64LE9-NEXT: ld 0, 16(1)		; PC64LE9-NEXT: ld 0, 16(1)
; PC64LE9-NEXT: mtlr 0		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%nearby = call <3 x double> @llvm.experimental.constrained.nearbyint.v3f64(		%nearby = call <3 x double> @llvm.experimental.constrained.nearbyint.v3f64(
<3 x double> <double 42.0, double 42.1, double 42.2>,		<3 x double> <double 42.0, double 42.1, double 42.2>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <3 x double> %nearby		ret <3 x double> %nearby
}		}

define <4 x double> @constrained_vector_nearbyint_v4f64() {		define <4 x double> @constrained_vector_nearbyint_v4f64() {
; PC64LE-LABEL: constrained_vector_nearbyint_v4f64:		; PC64LE-LABEL: constrained_vector_nearbyint_v4f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
		; PC64LE-NEXT: mflr 0
		; PC64LE-NEXT: std 0, 16(1)
		; PC64LE-NEXT: stdu 1, -80(1)
		; PC64LE-NEXT: .cfi_def_cfa_offset 80
		; PC64LE-NEXT: .cfi_offset lr, 16
		; PC64LE-NEXT: .cfi_offset v31, -16
		; PC64LE-NEXT: li 3, 64
		; PC64LE-NEXT: stxvd2x 63, 1, 3 # 16-byte Folded Spill
; PC64LE-NEXT: addis 3, 2, .LCPI84_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI84_0@toc@ha
; PC64LE-NEXT: addis 4, 2, .LCPI84_1@toc@ha		; PC64LE-NEXT: lfd 1, .LCPI84_0@toc@l(3)
; PC64LE-NEXT: addi 3, 3, .LCPI84_0@toc@l		; PC64LE-NEXT: bl nearbyint
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: nop
; PC64LE-NEXT: addi 3, 4, .LCPI84_1@toc@l		; PC64LE-NEXT: li 3, 48
; PC64LE-NEXT: lxvd2x 1, 0, 3		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
; PC64LE-NEXT: xxswapd 1, 1		; PC64LE-NEXT: addis 3, 2, .LCPI84_1@toc@ha
; PC64LE-NEXT: xvrdpic 34, 0		; PC64LE-NEXT: lfd 1, .LCPI84_1@toc@l(3)
; PC64LE-NEXT: xvrdpic 35, 1		; PC64LE-NEXT: bl nearbyint
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: addis 3, 2, .LCPI84_2@toc@ha
		; PC64LE-NEXT: xxmrghd 63, 1, 0
		; PC64LE-NEXT: lfd 1, .LCPI84_2@toc@l(3)
		; PC64LE-NEXT: bl nearbyint
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI84_3@toc@ha
		; PC64LE-NEXT: lfd 1, .LCPI84_3@toc@l(3)
		; PC64LE-NEXT: bl nearbyint
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: vmr 2, 31
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: li 3, 64
		; PC64LE-NEXT: lxvd2x 63, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: xxmrghd 35, 1, 0
		; PC64LE-NEXT: addi 1, 1, 80
		; PC64LE-NEXT: ld 0, 16(1)
		; PC64LE-NEXT: mtlr 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_nearbyint_v4f64:		; PC64LE9-LABEL: constrained_vector_nearbyint_v4f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
		; PC64LE9-NEXT: mflr 0
		; PC64LE9-NEXT: std 0, 16(1)
		; PC64LE9-NEXT: stdu 1, -64(1)
		; PC64LE9-NEXT: .cfi_def_cfa_offset 64
		; PC64LE9-NEXT: .cfi_offset lr, 16
		; PC64LE9-NEXT: .cfi_offset v31, -16
; PC64LE9-NEXT: addis 3, 2, .LCPI84_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI84_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI84_0@toc@l		; PC64LE9-NEXT: lfd 1, .LCPI84_0@toc@l(3)
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: stxv 63, 48(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: bl nearbyint
		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: addis 3, 2, .LCPI84_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI84_1@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI84_1@toc@l		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE9-NEXT: xvrdpic 34, 0		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: lfd 1, .LCPI84_1@toc@l(3)
; PC64LE9-NEXT: xvrdpic 35, 0		; PC64LE9-NEXT: bl nearbyint
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: addis 3, 2, .LCPI84_2@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 63, 1, 0
		; PC64LE9-NEXT: lfd 1, .LCPI84_2@toc@l(3)
		; PC64LE9-NEXT: bl nearbyint
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: addis 3, 2, .LCPI84_3@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: lfd 1, .LCPI84_3@toc@l(3)
		; PC64LE9-NEXT: bl nearbyint
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: vmr 2, 31
		; PC64LE9-NEXT: lxv 63, 48(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 35, 1, 0
		; PC64LE9-NEXT: addi 1, 1, 64
		; PC64LE9-NEXT: ld 0, 16(1)
		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%nearby = call <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(		%nearby = call <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(
<4 x double> <double 42.1, double 42.2,		<4 x double> <double 42.1, double 42.2,
double 42.3, double 42.4>,		double 42.3, double 42.4>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <4 x double> %nearby		ret <4 x double> %nearby
[... 45 lines not shown ...]	%max = call <1 x float> @llvm.experimental.constrained.maxnum.v1f32(
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <1 x float> %max		ret <1 x float> %max
}		}

define <2 x double> @constrained_vector_maxnum_v2f64() {		define <2 x double> @constrained_vector_maxnum_v2f64() {
; PC64LE-LABEL: constrained_vector_maxnum_v2f64:		; PC64LE-LABEL: constrained_vector_maxnum_v2f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
		; PC64LE-NEXT: mflr 0
		; PC64LE-NEXT: std 0, 16(1)
		; PC64LE-NEXT: stdu 1, -64(1)
		; PC64LE-NEXT: .cfi_def_cfa_offset 64
		; PC64LE-NEXT: .cfi_offset lr, 16
; PC64LE-NEXT: addis 3, 2, .LCPI86_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI86_0@toc@ha
; PC64LE-NEXT: addis 4, 2, .LCPI86_1@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI86_1@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI86_0@toc@l		; PC64LE-NEXT: lfs 1, .LCPI86_0@toc@l(3)
; PC64LE-NEXT: addi 4, 4, .LCPI86_1@toc@l		; PC64LE-NEXT: lfs 2, .LCPI86_1@toc@l(4)
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: bl fmax
; PC64LE-NEXT: lxvd2x 1, 0, 4		; PC64LE-NEXT: nop
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: li 3, 48
; PC64LE-NEXT: xxswapd 1, 1		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE-NEXT: xvmaxdp 34, 1, 0		; PC64LE-NEXT: addis 4, 2, .LCPI86_3@toc@ha
		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI86_2@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI86_3@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI86_2@toc@l(3)
		; PC64LE-NEXT: bl fmax
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: xxmrghd 34, 1, 0
		; PC64LE-NEXT: addi 1, 1, 64
		; PC64LE-NEXT: ld 0, 16(1)
		; PC64LE-NEXT: mtlr 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_maxnum_v2f64:		; PC64LE9-LABEL: constrained_vector_maxnum_v2f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
		; PC64LE9-NEXT: mflr 0
		; PC64LE9-NEXT: std 0, 16(1)
		; PC64LE9-NEXT: stdu 1, -48(1)
		; PC64LE9-NEXT: .cfi_def_cfa_offset 48
		; PC64LE9-NEXT: .cfi_offset lr, 16
; PC64LE9-NEXT: addis 3, 2, .LCPI86_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI86_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI86_0@toc@l		; PC64LE9-NEXT: lfs 1, .LCPI86_0@toc@l(3)
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: addis 3, 2, .LCPI86_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI86_1@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI86_1@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI86_1@toc@l(3)
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: bl fmax
; PC64LE9-NEXT: xvmaxdp 34, 1, 0		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: addis 3, 2, .LCPI86_2@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: lfs 1, .LCPI86_2@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI86_3@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI86_3@toc@l(3)
		; PC64LE9-NEXT: bl fmax
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 34, 1, 0
		; PC64LE9-NEXT: addi 1, 1, 48
		; PC64LE9-NEXT: ld 0, 16(1)
		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%max = call <2 x double> @llvm.experimental.constrained.maxnum.v2f64(		%max = call <2 x double> @llvm.experimental.constrained.maxnum.v2f64(
<2 x double> <double 43.0, double 42.0>,		<2 x double> <double 43.0, double 42.0>,
<2 x double> <double 41.0, double 40.0>,		<2 x double> <double 41.0, double 40.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <2 x double> %max		ret <2 x double> %max
[... 112 lines not shown ...]	entry:
ret <3 x float> %max		ret <3 x float> %max
}		}

define <3 x double> @constrained_vector_max_v3f64() {		define <3 x double> @constrained_vector_max_v3f64() {
; PC64LE-LABEL: constrained_vector_max_v3f64:		; PC64LE-LABEL: constrained_vector_max_v3f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: mflr 0		; PC64LE-NEXT: mflr 0
; PC64LE-NEXT: std 0, 16(1)		; PC64LE-NEXT: std 0, 16(1)
; PC64LE-NEXT: stdu 1, -32(1)		; PC64LE-NEXT: stdu 1, -80(1)
; PC64LE-NEXT: .cfi_def_cfa_offset 32		; PC64LE-NEXT: .cfi_def_cfa_offset 80
; PC64LE-NEXT: .cfi_offset lr, 16		; PC64LE-NEXT: .cfi_offset lr, 16
; PC64LE-NEXT: addis 3, 2, .LCPI88_0@toc@ha		; PC64LE-NEXT: .cfi_offset v31, -16
		; PC64LE-NEXT: li 3, 64
; PC64LE-NEXT: addis 4, 2, .LCPI88_1@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI88_1@toc@ha
; PC64LE-NEXT: lfs 1, .LCPI88_0@toc@l(3)		; PC64LE-NEXT: stxvd2x 63, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI88_0@toc@ha
; PC64LE-NEXT: lfs 2, .LCPI88_1@toc@l(4)		; PC64LE-NEXT: lfs 2, .LCPI88_1@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI88_0@toc@l(3)
; PC64LE-NEXT: bl fmax		; PC64LE-NEXT: bl fmax
; PC64LE-NEXT: nop		; PC64LE-NEXT: nop
; PC64LE-NEXT: addis 3, 2, .LCPI88_2@toc@ha		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE-NEXT: addis 4, 2, .LCPI88_3@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI88_3@toc@ha
		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI88_2@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI88_3@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI88_2@toc@l(3)
		; PC64LE-NEXT: bl fmax
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: addis 4, 2, .LCPI88_5@toc@ha
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: addis 3, 2, .LCPI88_4@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI88_5@toc@l(4)
		; PC64LE-NEXT: xxmrghd 63, 1, 0
		; PC64LE-NEXT: lfs 1, .LCPI88_4@toc@l(3)
		; PC64LE-NEXT: bl fmax
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 64
; PC64LE-NEXT: fmr 3, 1		; PC64LE-NEXT: fmr 3, 1
; PC64LE-NEXT: addi 3, 3, .LCPI88_2@toc@l		; PC64LE-NEXT: xxlor 1, 63, 63
; PC64LE-NEXT: addi 4, 4, .LCPI88_3@toc@l		; PC64LE-NEXT: xxlor 2, 63, 63
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: lxvd2x 63, 1, 3 # 16-byte Folded Reload
; PC64LE-NEXT: lxvd2x 2, 0, 4		; PC64LE-NEXT: addi 1, 1, 80
; PC64LE-NEXT: xxswapd 0, 0
; PC64LE-NEXT: xxswapd 2, 2
; PC64LE-NEXT: xvmaxdp 2, 2, 0
; PC64LE-NEXT: xxswapd 0, 2
; PC64LE-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE-NEXT: fmr 1, 0
; PC64LE-NEXT: addi 1, 1, 32
; PC64LE-NEXT: ld 0, 16(1)		; PC64LE-NEXT: ld 0, 16(1)
; PC64LE-NEXT: mtlr 0		; PC64LE-NEXT: mtlr 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_max_v3f64:		; PC64LE9-LABEL: constrained_vector_max_v3f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: mflr 0		; PC64LE9-NEXT: mflr 0
; PC64LE9-NEXT: std 0, 16(1)		; PC64LE9-NEXT: std 0, 16(1)
; PC64LE9-NEXT: stdu 1, -32(1)		; PC64LE9-NEXT: stdu 1, -64(1)
; PC64LE9-NEXT: .cfi_def_cfa_offset 32		; PC64LE9-NEXT: .cfi_def_cfa_offset 64
; PC64LE9-NEXT: .cfi_offset lr, 16		; PC64LE9-NEXT: .cfi_offset lr, 16
		; PC64LE9-NEXT: .cfi_offset v31, -16
; PC64LE9-NEXT: addis 3, 2, .LCPI88_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI88_0@toc@ha
; PC64LE9-NEXT: lfs 1, .LCPI88_0@toc@l(3)		; PC64LE9-NEXT: lfs 1, .LCPI88_0@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI88_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI88_1@toc@ha
; PC64LE9-NEXT: lfs 2, .LCPI88_1@toc@l(3)		; PC64LE9-NEXT: lfs 2, .LCPI88_1@toc@l(3)
		; PC64LE9-NEXT: stxv 63, 48(1) # 16-byte Folded Spill
; PC64LE9-NEXT: bl fmax		; PC64LE9-NEXT: bl fmax
; PC64LE9-NEXT: nop		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: addis 3, 2, .LCPI88_2@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI88_2@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI88_2@toc@l		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: lfs 1, .LCPI88_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI88_3@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI88_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI88_3@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI88_3@toc@l(3)
		; PC64LE9-NEXT: bl fmax
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: addis 3, 2, .LCPI88_4@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 63, 1, 0
		; PC64LE9-NEXT: lfs 1, .LCPI88_4@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI88_5@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI88_5@toc@l(3)
		; PC64LE9-NEXT: bl fmax
		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: fmr 3, 1		; PC64LE9-NEXT: fmr 3, 1
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: xscpsgndp 1, 63, 63
; PC64LE9-NEXT: xvmaxdp 2, 1, 0		; PC64LE9-NEXT: xscpsgndp 2, 63, 63
; PC64LE9-NEXT: xxswapd 1, 2		; PC64LE9-NEXT: lxv 63, 48(1) # 16-byte Folded Reload
; PC64LE9-NEXT: # kill: def $f1 killed $f1 killed $vsl1		; PC64LE9-NEXT: addi 1, 1, 64
; PC64LE9-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE9-NEXT: addi 1, 1, 32
; PC64LE9-NEXT: ld 0, 16(1)		; PC64LE9-NEXT: ld 0, 16(1)
; PC64LE9-NEXT: mtlr 0		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%max = call <3 x double> @llvm.experimental.constrained.maxnum.v3f64(		%max = call <3 x double> @llvm.experimental.constrained.maxnum.v3f64(
<3 x double> <double 43.0, double 44.0, double 45.0>,		<3 x double> <double 43.0, double 44.0, double 45.0>,
<3 x double> <double 40.0, double 41.0, double 42.0>,		<3 x double> <double 40.0, double 41.0, double 42.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <3 x double> %max		ret <3 x double> %max
}		}

define <4 x double> @constrained_vector_maxnum_v4f64() {		define <4 x double> @constrained_vector_maxnum_v4f64() {
; PC64LE-LABEL: constrained_vector_maxnum_v4f64:		; PC64LE-LABEL: constrained_vector_maxnum_v4f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI89_0@toc@ha		; PC64LE-NEXT: mflr 0
		; PC64LE-NEXT: std 0, 16(1)
		; PC64LE-NEXT: stdu 1, -80(1)
		; PC64LE-NEXT: .cfi_def_cfa_offset 80
		; PC64LE-NEXT: .cfi_offset lr, 16
		; PC64LE-NEXT: .cfi_offset v31, -16
		; PC64LE-NEXT: li 3, 64
; PC64LE-NEXT: addis 4, 2, .LCPI89_1@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI89_1@toc@ha
; PC64LE-NEXT: addis 5, 2, .LCPI89_2@toc@ha		; PC64LE-NEXT: stxvd2x 63, 1, 3 # 16-byte Folded Spill
; PC64LE-NEXT: addis 6, 2, .LCPI89_3@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI89_0@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI89_0@toc@l		; PC64LE-NEXT: lfs 2, .LCPI89_1@toc@l(4)
; PC64LE-NEXT: addi 4, 4, .LCPI89_1@toc@l		; PC64LE-NEXT: lfs 1, .LCPI89_0@toc@l(3)
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: bl fmax
; PC64LE-NEXT: lxvd2x 1, 0, 4		; PC64LE-NEXT: nop
; PC64LE-NEXT: addi 3, 5, .LCPI89_2@toc@l		; PC64LE-NEXT: li 3, 48
; PC64LE-NEXT: addi 4, 6, .LCPI89_3@toc@l		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE-NEXT: lxvd2x 2, 0, 3		; PC64LE-NEXT: addis 4, 2, .LCPI89_3@toc@ha
; PC64LE-NEXT: lxvd2x 3, 0, 4		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: addis 3, 2, .LCPI89_2@toc@ha
; PC64LE-NEXT: xxswapd 1, 1		; PC64LE-NEXT: lfs 2, .LCPI89_3@toc@l(4)
; PC64LE-NEXT: xxswapd 2, 2		; PC64LE-NEXT: lfs 1, .LCPI89_2@toc@l(3)
; PC64LE-NEXT: xxswapd 3, 3		; PC64LE-NEXT: bl fmax
; PC64LE-NEXT: xvmaxdp 34, 1, 0		; PC64LE-NEXT: nop
; PC64LE-NEXT: xvmaxdp 35, 3, 2		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: addis 4, 2, .LCPI89_5@toc@ha
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: addis 3, 2, .LCPI89_4@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI89_5@toc@l(4)
		; PC64LE-NEXT: xxmrghd 63, 1, 0
		; PC64LE-NEXT: lfs 1, .LCPI89_4@toc@l(3)
		; PC64LE-NEXT: bl fmax
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: addis 4, 2, .LCPI89_7@toc@ha
		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI89_6@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI89_7@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI89_6@toc@l(3)
		; PC64LE-NEXT: bl fmax
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: vmr 2, 31
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: li 3, 64
		; PC64LE-NEXT: lxvd2x 63, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: xxmrghd 35, 1, 0
		; PC64LE-NEXT: addi 1, 1, 80
		; PC64LE-NEXT: ld 0, 16(1)
		; PC64LE-NEXT: mtlr 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_maxnum_v4f64:		; PC64LE9-LABEL: constrained_vector_maxnum_v4f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
		; PC64LE9-NEXT: mflr 0
		; PC64LE9-NEXT: std 0, 16(1)
		; PC64LE9-NEXT: stdu 1, -64(1)
		; PC64LE9-NEXT: .cfi_def_cfa_offset 64
		; PC64LE9-NEXT: .cfi_offset lr, 16
		; PC64LE9-NEXT: .cfi_offset v31, -16
; PC64LE9-NEXT: addis 3, 2, .LCPI89_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI89_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI89_0@toc@l		; PC64LE9-NEXT: lfs 1, .LCPI89_0@toc@l(3)
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: addis 3, 2, .LCPI89_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI89_1@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI89_1@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI89_1@toc@l(3)
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: stxv 63, 48(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: bl fmax
		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: addis 3, 2, .LCPI89_2@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI89_2@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI89_2@toc@l		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE9-NEXT: xvmaxdp 34, 1, 0		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: lfs 1, .LCPI89_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI89_3@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI89_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI89_3@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI89_3@toc@l(3)
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: bl fmax
; PC64LE9-NEXT: xvmaxdp 35, 1, 0		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: addis 3, 2, .LCPI89_4@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 63, 1, 0
		; PC64LE9-NEXT: lfs 1, .LCPI89_4@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI89_5@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI89_5@toc@l(3)
		; PC64LE9-NEXT: bl fmax
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: addis 3, 2, .LCPI89_6@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: lfs 1, .LCPI89_6@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI89_7@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI89_7@toc@l(3)
		; PC64LE9-NEXT: bl fmax
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: vmr 2, 31
		; PC64LE9-NEXT: lxv 63, 48(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 35, 1, 0
		; PC64LE9-NEXT: addi 1, 1, 64
		; PC64LE9-NEXT: ld 0, 16(1)
		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%max = call <4 x double> @llvm.experimental.constrained.maxnum.v4f64(		%max = call <4 x double> @llvm.experimental.constrained.maxnum.v4f64(
<4 x double> <double 44.0, double 45.0,		<4 x double> <double 44.0, double 45.0,
double 46.0, double 47.0>,		double 46.0, double 47.0>,
<4 x double> <double 40.0, double 41.0,		<4 x double> <double 40.0, double 41.0,
double 42.0, double 43.0>,		double 42.0, double 43.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
[... 47 lines not shown ...]	%min = call <1 x float> @llvm.experimental.constrained.minnum.v1f32(
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <1 x float> %min		ret <1 x float> %min
}		}

define <2 x double> @constrained_vector_minnum_v2f64() {		define <2 x double> @constrained_vector_minnum_v2f64() {
; PC64LE-LABEL: constrained_vector_minnum_v2f64:		; PC64LE-LABEL: constrained_vector_minnum_v2f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
		; PC64LE-NEXT: mflr 0
		; PC64LE-NEXT: std 0, 16(1)
		; PC64LE-NEXT: stdu 1, -64(1)
		; PC64LE-NEXT: .cfi_def_cfa_offset 64
		; PC64LE-NEXT: .cfi_offset lr, 16
; PC64LE-NEXT: addis 3, 2, .LCPI91_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI91_0@toc@ha
; PC64LE-NEXT: addis 4, 2, .LCPI91_1@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI91_1@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI91_0@toc@l		; PC64LE-NEXT: lfs 1, .LCPI91_0@toc@l(3)
; PC64LE-NEXT: addi 4, 4, .LCPI91_1@toc@l		; PC64LE-NEXT: lfs 2, .LCPI91_1@toc@l(4)
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: bl fmin
; PC64LE-NEXT: lxvd2x 1, 0, 4		; PC64LE-NEXT: nop
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: li 3, 48
; PC64LE-NEXT: xxswapd 1, 1		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE-NEXT: xvmindp 34, 1, 0		; PC64LE-NEXT: addis 4, 2, .LCPI91_3@toc@ha
		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI91_2@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI91_3@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI91_2@toc@l(3)
		; PC64LE-NEXT: bl fmin
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: xxmrghd 34, 1, 0
		; PC64LE-NEXT: addi 1, 1, 64
		; PC64LE-NEXT: ld 0, 16(1)
		; PC64LE-NEXT: mtlr 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_minnum_v2f64:		; PC64LE9-LABEL: constrained_vector_minnum_v2f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
		; PC64LE9-NEXT: mflr 0
		; PC64LE9-NEXT: std 0, 16(1)
		; PC64LE9-NEXT: stdu 1, -48(1)
		; PC64LE9-NEXT: .cfi_def_cfa_offset 48
		; PC64LE9-NEXT: .cfi_offset lr, 16
; PC64LE9-NEXT: addis 3, 2, .LCPI91_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI91_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI91_0@toc@l		; PC64LE9-NEXT: lfs 1, .LCPI91_0@toc@l(3)
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: addis 3, 2, .LCPI91_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI91_1@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI91_1@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI91_1@toc@l(3)
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: bl fmin
; PC64LE9-NEXT: xvmindp 34, 1, 0		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: addis 3, 2, .LCPI91_2@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: lfs 1, .LCPI91_2@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI91_3@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI91_3@toc@l(3)
		; PC64LE9-NEXT: bl fmin
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 34, 1, 0
		; PC64LE9-NEXT: addi 1, 1, 48
		; PC64LE9-NEXT: ld 0, 16(1)
		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%min = call <2 x double> @llvm.experimental.constrained.minnum.v2f64(		%min = call <2 x double> @llvm.experimental.constrained.minnum.v2f64(
<2 x double> <double 43.0, double 42.0>,		<2 x double> <double 43.0, double 42.0>,
<2 x double> <double 41.0, double 40.0>,		<2 x double> <double 41.0, double 40.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <2 x double> %min		ret <2 x double> %min
[... 112 lines not shown ...]	entry:
ret <3 x float> %min		ret <3 x float> %min
}		}

define <3 x double> @constrained_vector_min_v3f64() {		define <3 x double> @constrained_vector_min_v3f64() {
; PC64LE-LABEL: constrained_vector_min_v3f64:		; PC64LE-LABEL: constrained_vector_min_v3f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: mflr 0		; PC64LE-NEXT: mflr 0
; PC64LE-NEXT: std 0, 16(1)		; PC64LE-NEXT: std 0, 16(1)
; PC64LE-NEXT: stdu 1, -32(1)		; PC64LE-NEXT: stdu 1, -80(1)
; PC64LE-NEXT: .cfi_def_cfa_offset 32		; PC64LE-NEXT: .cfi_def_cfa_offset 80
; PC64LE-NEXT: .cfi_offset lr, 16		; PC64LE-NEXT: .cfi_offset lr, 16
; PC64LE-NEXT: addis 3, 2, .LCPI93_0@toc@ha		; PC64LE-NEXT: .cfi_offset v31, -16
		; PC64LE-NEXT: li 3, 64
; PC64LE-NEXT: addis 4, 2, .LCPI93_1@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI93_1@toc@ha
; PC64LE-NEXT: lfs 1, .LCPI93_0@toc@l(3)		; PC64LE-NEXT: stxvd2x 63, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI93_0@toc@ha
; PC64LE-NEXT: lfs 2, .LCPI93_1@toc@l(4)		; PC64LE-NEXT: lfs 2, .LCPI93_1@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI93_0@toc@l(3)
; PC64LE-NEXT: bl fmin		; PC64LE-NEXT: bl fmin
; PC64LE-NEXT: nop		; PC64LE-NEXT: nop
; PC64LE-NEXT: addis 3, 2, .LCPI93_2@toc@ha		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE-NEXT: addis 4, 2, .LCPI93_3@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI93_3@toc@ha
		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI93_2@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI93_3@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI93_2@toc@l(3)
		; PC64LE-NEXT: bl fmin
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: addis 4, 2, .LCPI93_5@toc@ha
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: addis 3, 2, .LCPI93_4@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI93_5@toc@l(4)
		; PC64LE-NEXT: xxmrghd 63, 1, 0
		; PC64LE-NEXT: lfs 1, .LCPI93_4@toc@l(3)
		; PC64LE-NEXT: bl fmin
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 64
; PC64LE-NEXT: fmr 3, 1		; PC64LE-NEXT: fmr 3, 1
; PC64LE-NEXT: addi 3, 3, .LCPI93_2@toc@l		; PC64LE-NEXT: xxlor 1, 63, 63
; PC64LE-NEXT: addi 4, 4, .LCPI93_3@toc@l		; PC64LE-NEXT: xxlor 2, 63, 63
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: lxvd2x 63, 1, 3 # 16-byte Folded Reload
; PC64LE-NEXT: lxvd2x 2, 0, 4		; PC64LE-NEXT: addi 1, 1, 80
; PC64LE-NEXT: xxswapd 0, 0
; PC64LE-NEXT: xxswapd 2, 2
; PC64LE-NEXT: xvmindp 2, 2, 0
; PC64LE-NEXT: xxswapd 0, 2
; PC64LE-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE-NEXT: fmr 1, 0
; PC64LE-NEXT: addi 1, 1, 32
; PC64LE-NEXT: ld 0, 16(1)		; PC64LE-NEXT: ld 0, 16(1)
; PC64LE-NEXT: mtlr 0		; PC64LE-NEXT: mtlr 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_min_v3f64:		; PC64LE9-LABEL: constrained_vector_min_v3f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: mflr 0		; PC64LE9-NEXT: mflr 0
; PC64LE9-NEXT: std 0, 16(1)		; PC64LE9-NEXT: std 0, 16(1)
; PC64LE9-NEXT: stdu 1, -32(1)		; PC64LE9-NEXT: stdu 1, -64(1)
; PC64LE9-NEXT: .cfi_def_cfa_offset 32		; PC64LE9-NEXT: .cfi_def_cfa_offset 64
; PC64LE9-NEXT: .cfi_offset lr, 16		; PC64LE9-NEXT: .cfi_offset lr, 16
		; PC64LE9-NEXT: .cfi_offset v31, -16
; PC64LE9-NEXT: addis 3, 2, .LCPI93_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI93_0@toc@ha
; PC64LE9-NEXT: lfs 1, .LCPI93_0@toc@l(3)		; PC64LE9-NEXT: lfs 1, .LCPI93_0@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI93_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI93_1@toc@ha
; PC64LE9-NEXT: lfs 2, .LCPI93_1@toc@l(3)		; PC64LE9-NEXT: lfs 2, .LCPI93_1@toc@l(3)
		; PC64LE9-NEXT: stxv 63, 48(1) # 16-byte Folded Spill
; PC64LE9-NEXT: bl fmin		; PC64LE9-NEXT: bl fmin
; PC64LE9-NEXT: nop		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: addis 3, 2, .LCPI93_2@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI93_2@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI93_2@toc@l		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: lfs 1, .LCPI93_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI93_3@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI93_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI93_3@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI93_3@toc@l(3)
		; PC64LE9-NEXT: bl fmin
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: addis 3, 2, .LCPI93_4@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 63, 1, 0
		; PC64LE9-NEXT: lfs 1, .LCPI93_4@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI93_5@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI93_5@toc@l(3)
		; PC64LE9-NEXT: bl fmin
		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: fmr 3, 1		; PC64LE9-NEXT: fmr 3, 1
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: xscpsgndp 1, 63, 63
; PC64LE9-NEXT: xvmindp 2, 1, 0		; PC64LE9-NEXT: xscpsgndp 2, 63, 63
; PC64LE9-NEXT: xxswapd 1, 2		; PC64LE9-NEXT: lxv 63, 48(1) # 16-byte Folded Reload
; PC64LE9-NEXT: # kill: def $f1 killed $f1 killed $vsl1		; PC64LE9-NEXT: addi 1, 1, 64
; PC64LE9-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE9-NEXT: addi 1, 1, 32
; PC64LE9-NEXT: ld 0, 16(1)		; PC64LE9-NEXT: ld 0, 16(1)
; PC64LE9-NEXT: mtlr 0		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%min = call <3 x double> @llvm.experimental.constrained.minnum.v3f64(		%min = call <3 x double> @llvm.experimental.constrained.minnum.v3f64(
<3 x double> <double 43.0, double 44.0, double 45.0>,		<3 x double> <double 43.0, double 44.0, double 45.0>,
<3 x double> <double 40.0, double 41.0, double 42.0>,		<3 x double> <double 40.0, double 41.0, double 42.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <3 x double> %min		ret <3 x double> %min
}		}

define <4 x double> @constrained_vector_minnum_v4f64() {		define <4 x double> @constrained_vector_minnum_v4f64() {
; PC64LE-LABEL: constrained_vector_minnum_v4f64:		; PC64LE-LABEL: constrained_vector_minnum_v4f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI94_0@toc@ha		; PC64LE-NEXT: mflr 0
		; PC64LE-NEXT: std 0, 16(1)
		; PC64LE-NEXT: stdu 1, -80(1)
		; PC64LE-NEXT: .cfi_def_cfa_offset 80
		; PC64LE-NEXT: .cfi_offset lr, 16
		; PC64LE-NEXT: .cfi_offset v31, -16
		; PC64LE-NEXT: li 3, 64
; PC64LE-NEXT: addis 4, 2, .LCPI94_1@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI94_1@toc@ha
; PC64LE-NEXT: addis 5, 2, .LCPI94_2@toc@ha		; PC64LE-NEXT: stxvd2x 63, 1, 3 # 16-byte Folded Spill
; PC64LE-NEXT: addis 6, 2, .LCPI94_3@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI94_0@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI94_0@toc@l		; PC64LE-NEXT: lfs 2, .LCPI94_1@toc@l(4)
; PC64LE-NEXT: addi 4, 4, .LCPI94_1@toc@l		; PC64LE-NEXT: lfs 1, .LCPI94_0@toc@l(3)
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: bl fmin
; PC64LE-NEXT: lxvd2x 1, 0, 4		; PC64LE-NEXT: nop
; PC64LE-NEXT: addi 3, 5, .LCPI94_2@toc@l		; PC64LE-NEXT: li 3, 48
; PC64LE-NEXT: addi 4, 6, .LCPI94_3@toc@l		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE-NEXT: lxvd2x 2, 0, 3		; PC64LE-NEXT: addis 4, 2, .LCPI94_3@toc@ha
; PC64LE-NEXT: lxvd2x 3, 0, 4		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: addis 3, 2, .LCPI94_2@toc@ha
; PC64LE-NEXT: xxswapd 1, 1		; PC64LE-NEXT: lfs 2, .LCPI94_3@toc@l(4)
; PC64LE-NEXT: xxswapd 2, 2		; PC64LE-NEXT: lfs 1, .LCPI94_2@toc@l(3)
; PC64LE-NEXT: xxswapd 3, 3		; PC64LE-NEXT: bl fmin
; PC64LE-NEXT: xvmindp 34, 1, 0		; PC64LE-NEXT: nop
; PC64LE-NEXT: xvmindp 35, 3, 2		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: addis 4, 2, .LCPI94_5@toc@ha
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: addis 3, 2, .LCPI94_4@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI94_5@toc@l(4)
		; PC64LE-NEXT: xxmrghd 63, 1, 0
		; PC64LE-NEXT: lfs 1, .LCPI94_4@toc@l(3)
		; PC64LE-NEXT: bl fmin
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: addis 4, 2, .LCPI94_7@toc@ha
		; PC64LE-NEXT: stxvd2x 1, 1, 3 # 16-byte Folded Spill
		; PC64LE-NEXT: addis 3, 2, .LCPI94_6@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI94_7@toc@l(4)
		; PC64LE-NEXT: lfs 1, .LCPI94_6@toc@l(3)
		; PC64LE-NEXT: bl fmin
		; PC64LE-NEXT: nop
		; PC64LE-NEXT: li 3, 48
		; PC64LE-NEXT: vmr 2, 31
		; PC64LE-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE-NEXT: lxvd2x 0, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: li 3, 64
		; PC64LE-NEXT: lxvd2x 63, 1, 3 # 16-byte Folded Reload
		; PC64LE-NEXT: xxmrghd 35, 1, 0
		; PC64LE-NEXT: addi 1, 1, 80
		; PC64LE-NEXT: ld 0, 16(1)
		; PC64LE-NEXT: mtlr 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_minnum_v4f64:		; PC64LE9-LABEL: constrained_vector_minnum_v4f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
		; PC64LE9-NEXT: mflr 0
		; PC64LE9-NEXT: std 0, 16(1)
		; PC64LE9-NEXT: stdu 1, -64(1)
		; PC64LE9-NEXT: .cfi_def_cfa_offset 64
		; PC64LE9-NEXT: .cfi_offset lr, 16
		; PC64LE9-NEXT: .cfi_offset v31, -16
; PC64LE9-NEXT: addis 3, 2, .LCPI94_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI94_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI94_0@toc@l		; PC64LE9-NEXT: lfs 1, .LCPI94_0@toc@l(3)
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: addis 3, 2, .LCPI94_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI94_1@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI94_1@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI94_1@toc@l(3)
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: stxv 63, 48(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: bl fmin
		; PC64LE9-NEXT: nop
; PC64LE9-NEXT: addis 3, 2, .LCPI94_2@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI94_2@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI94_2@toc@l		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
; PC64LE9-NEXT: xvmindp 34, 1, 0		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: lfs 1, .LCPI94_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI94_3@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI94_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI94_3@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI94_3@toc@l(3)
; PC64LE9-NEXT: lxvx 1, 0, 3		; PC64LE9-NEXT: bl fmin
; PC64LE9-NEXT: xvmindp 35, 1, 0		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: addis 3, 2, .LCPI94_4@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 63, 1, 0
		; PC64LE9-NEXT: lfs 1, .LCPI94_4@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI94_5@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI94_5@toc@l(3)
		; PC64LE9-NEXT: bl fmin
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: addis 3, 2, .LCPI94_6@toc@ha
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: stxv 1, 32(1) # 16-byte Folded Spill
		; PC64LE9-NEXT: lfs 1, .LCPI94_6@toc@l(3)
		; PC64LE9-NEXT: addis 3, 2, .LCPI94_7@toc@ha
		; PC64LE9-NEXT: lfs 2, .LCPI94_7@toc@l(3)
		; PC64LE9-NEXT: bl fmin
		; PC64LE9-NEXT: nop
		; PC64LE9-NEXT: lxv 0, 32(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: vmr 2, 31
		; PC64LE9-NEXT: lxv 63, 48(1) # 16-byte Folded Reload
		; PC64LE9-NEXT: # kill: def $f1 killed $f1 def $vsl1
		; PC64LE9-NEXT: xxmrghd 35, 1, 0
		; PC64LE9-NEXT: addi 1, 1, 64
		; PC64LE9-NEXT: ld 0, 16(1)
		; PC64LE9-NEXT: mtlr 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%min = call <4 x double> @llvm.experimental.constrained.minnum.v4f64(		%min = call <4 x double> @llvm.experimental.constrained.minnum.v4f64(
<4 x double> <double 44.0, double 45.0,		<4 x double> <double 44.0, double 45.0,
double 46.0, double 47.0>,		double 46.0, double 47.0>,
<4 x double> <double 40.0, double 41.0,		<4 x double> <double 40.0, double 41.0,
double 42.0, double 43.0>,		double 42.0, double 43.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
[...]

llvm/trunk/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

[...]	%add = call <1 x float> @llvm.experimental.constrained.fadd.v1f32(
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <1 x float> %add		ret <1 x float> %add
}		}

define <2 x double> @constrained_vector_fadd_v2f64() {		define <2 x double> @constrained_vector_fadd_v2f64() {
; CHECK-LABEL: constrained_vector_fadd_v2f64:		; CHECK-LABEL: constrained_vector_fadd_v2f64:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: movapd {{.*#+}} xmm0 = [1.7976931348623157E+308,1.7976931348623157E+308]		; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
; CHECK-NEXT: addpd {{.*}}(%rip), %xmm0		; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
		; CHECK-NEXT: addsd %xmm0, %xmm1
		; CHECK-NEXT: addsd {{.*}}(%rip), %xmm0
		; CHECK-NEXT: unpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
; CHECK-NEXT: retq		; CHECK-NEXT: retq
;		;
; AVX-LABEL: constrained_vector_fadd_v2f64:		; AVX-LABEL: constrained_vector_fadd_v2f64:
; AVX: # %bb.0: # %entry		; AVX: # %bb.0: # %entry
; AVX-NEXT: vmovapd {{.*#+}} xmm0 = [1.7976931348623157E+308,1.7976931348623157E+308]		; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
; AVX-NEXT: vaddpd {{.*}}(%rip), %xmm0, %xmm0		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm1
		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm0
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
; AVX-NEXT: retq		; AVX-NEXT: retq
entry:		entry:
%add = call <2 x double> @llvm.experimental.constrained.fadd.v2f64(		%add = call <2 x double> @llvm.experimental.constrained.fadd.v2f64(
<2 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF>,		<2 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF>,
<2 x double> <double 1.000000e+00, double 1.000000e-01>,		<2 x double> <double 1.000000e+00, double 1.000000e-01>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <2 x double> %add		ret <2 x double> %add
[...]	%add = call <3 x float> @llvm.experimental.constrained.fadd.v3f32(
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <3 x float> %add		ret <3 x float> %add
}		}

define <3 x double> @constrained_vector_fadd_v3f64() {		define <3 x double> @constrained_vector_fadd_v3f64() {
; CHECK-LABEL: constrained_vector_fadd_v3f64:		; CHECK-LABEL: constrained_vector_fadd_v3f64:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: movapd {{.*#+}} xmm0 = [1.7976931348623157E+308,1.7976931348623157E+308]		; CHECK-NEXT: xorpd %xmm2, %xmm2
; CHECK-NEXT: addpd {{.*}}(%rip), %xmm0		; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
; CHECK-NEXT: xorpd %xmm1, %xmm1		; CHECK-NEXT: addsd %xmm1, %xmm2
		; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
		; CHECK-NEXT: addsd %xmm1, %xmm0
; CHECK-NEXT: addsd {{.*}}(%rip), %xmm1		; CHECK-NEXT: addsd {{.*}}(%rip), %xmm1
; CHECK-NEXT: movsd %xmm1, -{{[0-9]+}}(%rsp)		; CHECK-NEXT: movsd %xmm2, -{{[0-9]+}}(%rsp)
; CHECK-NEXT: movapd %xmm0, %xmm1
; CHECK-NEXT: unpckhpd {{.*#+}} xmm1 = xmm1[1],xmm0[1]
; CHECK-NEXT: fldl -{{[0-9]+}}(%rsp)		; CHECK-NEXT: fldl -{{[0-9]+}}(%rsp)
; CHECK-NEXT: retq		; CHECK-NEXT: retq
;		;
; AVX-LABEL: constrained_vector_fadd_v3f64:		; AVX-LABEL: constrained_vector_fadd_v3f64:
; AVX: # %bb.0: # %entry		; AVX: # %bb.0: # %entry
; AVX-NEXT: vxorpd %xmm0, %xmm0, %xmm0		; AVX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm0		; AVX-NEXT: vmovsd {{.*#+}} xmm1 = mem[0],zero
; AVX-NEXT: vmovapd {{.*#+}} xmm1 = [1.7976931348623157E+308,1.7976931348623157E+308]		; AVX-NEXT: vaddsd %xmm0, %xmm1, %xmm0
; AVX-NEXT: vaddpd {{.*}}(%rip), %xmm1, %xmm1		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm1, %xmm2
		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm1, %xmm1
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]
; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0		; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
; AVX-NEXT: retq		; AVX-NEXT: retq
entry:		entry:
%add = call <3 x double> @llvm.experimental.constrained.fadd.v3f64(		%add = call <3 x double> @llvm.experimental.constrained.fadd.v3f64(
<3 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF,		<3 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF,
double 0x7FEFFFFFFFFFFFFF>,		double 0x7FEFFFFFFFFFFFFF>,
<3 x double> <double 2.0, double 1.0, double 0.0>,		<3 x double> <double 2.0, double 1.0, double 0.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <3 x double> %add		ret <3 x double> %add
}		}

define <4 x double> @constrained_vector_fadd_v4f64() {		define <4 x double> @constrained_vector_fadd_v4f64() {
; CHECK-LABEL: constrained_vector_fadd_v4f64:		; CHECK-LABEL: constrained_vector_fadd_v4f64:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: movapd {{.*#+}} xmm1 = [1.7976931348623157E+308,1.7976931348623157E+308]		; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
; CHECK-NEXT: movapd {{.*#+}} xmm0 = [1.0E+0,1.0000000000000001E-1]		; CHECK-NEXT: movsd {{.*#+}} xmm2 = mem[0],zero
; CHECK-NEXT: addpd %xmm1, %xmm0		; CHECK-NEXT: addsd %xmm1, %xmm2
; CHECK-NEXT: addpd {{.*}}(%rip), %xmm1		; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
		; CHECK-NEXT: addsd %xmm1, %xmm0
		; CHECK-NEXT: unpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]
		; CHECK-NEXT: movsd {{.*#+}} xmm2 = mem[0],zero
		; CHECK-NEXT: addsd %xmm1, %xmm2
		; CHECK-NEXT: addsd {{.*}}(%rip), %xmm1
		; CHECK-NEXT: unpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]
; CHECK-NEXT: retq		; CHECK-NEXT: retq
;		;
; AVX-LABEL: constrained_vector_fadd_v4f64:		; AVX-LABEL: constrained_vector_fadd_v4f64:
; AVX: # %bb.0: # %entry		; AVX: # %bb.0: # %entry
; AVX-NEXT: vmovapd {{.*#+}} ymm0 = [1.7976931348623157E+308,1.7976931348623157E+308,1.7976931348623157E+308,1.7976931348623157E+308]		; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
; AVX-NEXT: vaddpd {{.*}}(%rip), %ymm0, %ymm0		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm1
		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm2
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm2[0],xmm1[0]
		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm2
		; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm0
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]
		; AVX-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
; AVX-NEXT: retq		; AVX-NEXT: retq
entry:		entry:
%add = call <4 x double> @llvm.experimental.constrained.fadd.v4f64(		%add = call <4 x double> @llvm.experimental.constrained.fadd.v4f64(
<4 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF,		<4 x double> <double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF,
double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF>,		double 0x7FEFFFFFFFFFFFFF, double 0x7FEFFFFFFFFFFFFF>,
<4 x double> <double 1.000000e+00, double 1.000000e-01,		<4 x double> <double 1.000000e+00, double 1.000000e-01,
double 2.000000e+00, double 2.000000e-01>,		double 2.000000e+00, double 2.000000e-01>,
metadata !"round.dynamic",		metadata !"round.dynamic",
[...]	%sub = call <1 x float> @llvm.experimental.constrained.fsub.v1f32(
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <1 x float> %sub		ret <1 x float> %sub
}		}

define <2 x double> @constrained_vector_fsub_v2f64() {		define <2 x double> @constrained_vector_fsub_v2f64() {
; CHECK-LABEL: constrained_vector_fsub_v2f64:		; CHECK-LABEL: constrained_vector_fsub_v2f64:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: movapd {{.*#+}} xmm0 = [-1.7976931348623157E+308,-1.7976931348623157E+308]		; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
; CHECK-NEXT: subpd {{.*}}(%rip), %xmm0		; CHECK-NEXT: movapd %xmm0, %xmm1
		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm1
		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm0
		; CHECK-NEXT: unpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
; CHECK-NEXT: retq		; CHECK-NEXT: retq
;		;
; AVX-LABEL: constrained_vector_fsub_v2f64:		; AVX-LABEL: constrained_vector_fsub_v2f64:
; AVX: # %bb.0: # %entry		; AVX: # %bb.0: # %entry
; AVX-NEXT: vmovapd {{.*#+}} xmm0 = [-1.7976931348623157E+308,-1.7976931348623157E+308]		; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
; AVX-NEXT: vsubpd {{.*}}(%rip), %xmm0, %xmm0		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm0, %xmm1
		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm0, %xmm0
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
; AVX-NEXT: retq		; AVX-NEXT: retq
entry:		entry:
%sub = call <2 x double> @llvm.experimental.constrained.fsub.v2f64(		%sub = call <2 x double> @llvm.experimental.constrained.fsub.v2f64(
<2 x double> <double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF>,		<2 x double> <double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF>,
<2 x double> <double 1.000000e+00, double 1.000000e-01>,		<2 x double> <double 1.000000e+00, double 1.000000e-01>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <2 x double> %sub		ret <2 x double> %sub
[...]	entry:
ret <3 x float> %sub		ret <3 x float> %sub
}		}

define <3 x double> @constrained_vector_fsub_v3f64() {		define <3 x double> @constrained_vector_fsub_v3f64() {
; CHECK-LABEL: constrained_vector_fsub_v3f64:		; CHECK-LABEL: constrained_vector_fsub_v3f64:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: xorpd %xmm0, %xmm0		; CHECK-NEXT: xorpd %xmm0, %xmm0
; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero		; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
; CHECK-NEXT: subsd %xmm0, %xmm1		; CHECK-NEXT: movapd %xmm1, %xmm2
; CHECK-NEXT: movapd {{.*#+}} xmm0 = [-1.7976931348623157E+308,-1.7976931348623157E+308]		; CHECK-NEXT: subsd %xmm0, %xmm2
; CHECK-NEXT: subpd {{.*}}(%rip), %xmm0		; CHECK-NEXT: movapd %xmm1, %xmm0
; CHECK-NEXT: movsd %xmm1, -{{[0-9]+}}(%rsp)		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm0
; CHECK-NEXT: movapd %xmm0, %xmm1		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm1
; CHECK-NEXT: unpckhpd {{.*#+}} xmm1 = xmm1[1],xmm0[1]		; CHECK-NEXT: movsd %xmm2, -{{[0-9]+}}(%rsp)
; CHECK-NEXT: fldl -{{[0-9]+}}(%rsp)		; CHECK-NEXT: fldl -{{[0-9]+}}(%rsp)
; CHECK-NEXT: retq		; CHECK-NEXT: retq
;		;
; AVX-LABEL: constrained_vector_fsub_v3f64:		; AVX-LABEL: constrained_vector_fsub_v3f64:
; AVX: # %bb.0: # %entry		; AVX: # %bb.0: # %entry
; AVX-NEXT: vxorpd %xmm0, %xmm0, %xmm0		; AVX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
; AVX-NEXT: vmovsd {{.*#+}} xmm1 = mem[0],zero		; AVX-NEXT: vmovsd {{.*#+}} xmm1 = mem[0],zero
; AVX-NEXT: vsubsd %xmm0, %xmm1, %xmm0		; AVX-NEXT: vsubsd %xmm0, %xmm1, %xmm0
; AVX-NEXT: vmovapd {{.*#+}} xmm1 = [-1.7976931348623157E+308,-1.7976931348623157E+308]		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm1, %xmm2
; AVX-NEXT: vsubpd {{.*}}(%rip), %xmm1, %xmm1		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm1, %xmm1
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]
; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0		; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
; AVX-NEXT: retq		; AVX-NEXT: retq
entry:		entry:
%sub = call <3 x double> @llvm.experimental.constrained.fsub.v3f64(		%sub = call <3 x double> @llvm.experimental.constrained.fsub.v3f64(
<3 x double> <double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF,		<3 x double> <double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF,
double 0xFFEFFFFFFFFFFFFF>,		double 0xFFEFFFFFFFFFFFFF>,
<3 x double> <double 2.0, double 1.0, double 0.0>,		<3 x double> <double 2.0, double 1.0, double 0.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <3 x double> %sub		ret <3 x double> %sub
}		}

define <4 x double> @constrained_vector_fsub_v4f64() {		define <4 x double> @constrained_vector_fsub_v4f64() {
; CHECK-LABEL: constrained_vector_fsub_v4f64:		; CHECK-LABEL: constrained_vector_fsub_v4f64:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: movapd {{.*#+}} xmm1 = [-1.7976931348623157E+308,-1.7976931348623157E+308]		; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
		; CHECK-NEXT: movapd %xmm1, %xmm2
		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm2
; CHECK-NEXT: movapd %xmm1, %xmm0		; CHECK-NEXT: movapd %xmm1, %xmm0
; CHECK-NEXT: subpd {{.*}}(%rip), %xmm0		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm0
; CHECK-NEXT: subpd {{.*}}(%rip), %xmm1		; CHECK-NEXT: unpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]
		; CHECK-NEXT: movapd %xmm1, %xmm2
		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm2
		; CHECK-NEXT: subsd {{.*}}(%rip), %xmm1
		; CHECK-NEXT: unpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]
; CHECK-NEXT: retq		; CHECK-NEXT: retq
;		;
; AVX-LABEL: constrained_vector_fsub_v4f64:		; AVX-LABEL: constrained_vector_fsub_v4f64:
; AVX: # %bb.0: # %entry		; AVX: # %bb.0: # %entry
; AVX-NEXT: vmovapd {{.*#+}} ymm0 = [-1.7976931348623157E+308,-1.7976931348623157E+308,-1.7976931348623157E+308,-1.7976931348623157E+308]		; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
; AVX-NEXT: vsubpd {{.*}}(%rip), %ymm0, %ymm0		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm0, %xmm1
		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm0, %xmm2
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm2[0],xmm1[0]
		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm0, %xmm2
		; AVX-NEXT: vsubsd {{.*}}(%rip), %xmm0, %xmm0
		; AVX-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]
		; AVX-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
; AVX-NEXT: retq		; AVX-NEXT: retq
entry:		entry:
%sub = call <4 x double> @llvm.experimental.constrained.fsub.v4f64(		%sub = call <4 x double> @llvm.experimental.constrained.fsub.v4f64(
<4 x double> <double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF,		<4 x double> <double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF,
double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF>,		double 0xFFEFFFFFFFFFFFFF, double 0xFFEFFFFFFFFFFFFF>,
<4 x double> <double 1.000000e+00, double 1.000000e-01,		<4 x double> <double 1.000000e+00, double 1.000000e-01,
double 2.000000e+00, double 2.000000e-01>,		double 2.000000e+00, double 2.000000e-01>,
metadata !"round.dynamic",		metadata !"round.dynamic",
[...]

Revision Contents

Diff 213569

llvm/trunk/include/llvm/CodeGen/TargetLowering.h

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

llvm/trunk/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll

llvm/trunk/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
