This is an archive of the discontinued LLVM Phabricator instance.

Differential D127254

[SelectionDAGISel] Chain any mayRaiseFPException instruction created from a strict FP node
Needs Review · Public

Authored by craig.topper on Jun 7 2022, 2:48 PM.

Details

Reviewers

andrew.w.kaylor
uweigand
jonpa
kpn
efriedma
nemanjai

Summary

Tablegen doesn't set the OPFL_Chain flag for all the mayRaiseFPException
instructions in an isel output pattern. In practice, it only sets it
for the root of the output pattern, and only if the pattern in the
instruction class in tablegen is empty or contains a strict FP node.

It's unclear to me that tablegen has enough information to set the
OPFL_Chain flag.

The result of this is that we don't always add a chain through
mayRaiseFPException instructions created during isel.

This patch updates the isel machinery to detect when we're emitting
nodes for a pattern that contained a strict FP node, and to thread
a chain through all mayRaiseFPException nodes we create.

Some isel patterns, particularly on PowerPC, emit the same compare
instruction multiple times in their output pattern and expect the copies
to be CSEd. To ensure they CSE, this patch avoids updating InputChain
until the end of the match or until we find an OPFL_Chain flag.
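
As a rough illustration of why deferring the InputChain update matters for CSE, here is a minimal, self-contained C++ sketch. It does not use the LLVM API: `ToyDAG`, `getMachineNode` (a toy method, not the real `SelectionDAG::getMachineNode`), and `countDistinctCompares` are hypothetical names, and a node is modeled as nothing more than an (opcode, operand, input chain) key. The assumption is only that two emitted nodes can be unified when all of their inputs, including the chain, are identical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for the DAG's CSE map (hypothetical, not the LLVM API).
// A node is identified by its opcode, one value operand, and the id of
// the chain it consumes.
struct ToyDAG {
  using Key = std::tuple<std::string, int, int>; // opcode, operand, chain in
  std::map<Key, int> CSEMap;
  int NextId = 1;

  // Returns the id of an existing identical node, or creates a new one.
  int getMachineNode(const std::string &Opc, int Operand, int ChainIn) {
    Key K{Opc, Operand, ChainIn};
    auto It = CSEMap.find(K);
    if (It != CSEMap.end())
      return It->second;
    int Id = NextId++;
    CSEMap[K] = Id;
    return Id;
  }
};

// Emits the same compare twice, as a pattern that replicates its input
// would. With UpdateChainEagerly, InputChain advances after every emitted
// node; otherwise the output chains are only collected and InputChain is
// left alone until the end of the match.
int countDistinctCompares(bool UpdateChainEagerly) {
  ToyDAG DAG;
  int InputChain = 0; // id 0 plays the role of the incoming chain
  std::vector<int> StrictFPChains;
  std::vector<int> Emitted;
  for (int I = 0; I < 2; ++I) {
    int N = DAG.getMachineNode("FCMPO", /*Operand=*/42, InputChain);
    Emitted.push_back(N);
    if (UpdateChainEagerly)
      InputChain = N; // the next node is chained after this one
    else
      StrictFPChains.push_back(N); // parallel chain, merged later
  }
  return Emitted[0] == Emitted[1] ? 1 : 2;
}
```

In this toy model, eager updating gives the second compare a different chain input, so it becomes a second node; with the deferred update both requests hit the same key and collapse into one.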

Hopefully fixes PR54617.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
620 ms	x64 debian > LLVM.CodeGen/PowerPC::fp-strict-fcmp.ll
60,070 ms	x64 debian > ThreadSanitizer-x86_64.ThreadSanitizer-x86_64::restore_stack.cpp

Event Timeline

craig.topper created this revision. · Jun 7 2022, 2:48 PM

Herald added a project: Restricted Project. · Jun 7 2022, 2:48 PM

Herald added subscribers: StephenFan, steven.zhang, kbarton and 2 others.

craig.topper requested review of this revision. · Jun 7 2022, 2:48 PM

Herald added a project: Restricted Project. · Jun 7 2022, 2:48 PM

efriedma added inline comments. · Jun 7 2022, 2:57 PM

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
625	What's up with the repeated fcmpo instructions? Are the compares repeated in the input, or does your code to ensure CSE not cover this case?

craig.topper added inline comments. · Jun 7 2022, 3:12 PM

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
625	They exist in the input to isel, originating from type legalization.

Expand float operand: t14: i1,ch = strict_fsetccs t0, t5, ConstantFP:ppcf128<APFloat(4746794007248502784)>, setlt:ch

Creating new node: t26: i1,ch = strict_fsetccs t0, t2, ConstantFP:f64<2.147484e+09>, setoeq:ch
Creating new node: t27: i1,ch = strict_fsetccs t26:1, t4, ConstantFP:f64<0.000000e+00>, setlt:ch
Creating new node: t28: i1 = and t26, t27
Creating new node: t30: i1,ch = strict_fsetccs t27:1, t2, ConstantFP:f64<2.147484e+09>, setune:ch
Creating new node: t31: i1,ch = strict_fsetccs t30:1, t2, ConstantFP:f64<2.147484e+09>, setlt:ch
Creating new node: t32: i1 = and t30, t31
Creating new node: t33: i1 = or t32, t2

Would it make this code any simpler to say that nodes that can raise FP exceptions always have a chain, even if the original operation isn't strict? I mean, that restricts scheduling a bit, but not sure we care. Maybe it doesn't help, though...

How many PowerPC patterns need the CSE code? Would it make sense to get rid of the CSE code, and just reimplement the relevant patterns using custom lowering, or something like that?

Besides that, I can't see any way to reduce the complexity here. Maybe extract the "Update InputChain if there are any strict fp nodes" into a helper, since it looks like it's copy-pasted a few times.
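
For illustration, the helper suggested above could have roughly the following shape. This is a self-contained sketch, not code from the patch: `ToyChain` stands in for `SDValue`, `makeTokenFactor` stands in for the `CurDAG->getNode(ISD::TokenFactor, ...)` call, and the name `flushStrictFPChains` is hypothetical. Only the control flow mirrors the repeated "Update InputChain if there are any strict fp nodes" block in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for SDValue: just a printable name.
using ToyChain = std::string;

// Hypothetical stand-in for building an ISD::TokenFactor over the chains.
ToyChain makeTokenFactor(const std::vector<ToyChain> &Chains) {
  ToyChain Result = "TokenFactor(";
  for (std::size_t I = 0; I != Chains.size(); ++I) {
    if (I != 0)
      Result += ", ";
    Result += Chains[I];
  }
  return Result + ")";
}

// Folds any pending strict FP chains into InputChain and clears the list.
// Leaves InputChain untouched when nothing is pending.
void flushStrictFPChains(ToyChain &InputChain,
                         std::vector<ToyChain> &StrictFPChains) {
  if (StrictFPChains.empty())
    return;
  InputChain = StrictFPChains.size() == 1 ? StrictFPChains[0]
                                          : makeTokenFactor(StrictFPChains);
  StrictFPChains.clear();
}
```

With a helper of this shape, each of the three sites in SelectCodeCommon (before emitting an OPFL_Chain node, at the end of a MorphNodeTo, and in OPC_CompleteMatch) would presumably reduce to a single call.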

Harbormaster completed remote builds in B168418: Diff 434963. · Jun 7 2022, 3:40 PM

craig.topper added a reviewer: nemanjai. · Jun 9 2022, 9:16 PM

How many PowerPC patterns need the CSE code? Would it make sense to get rid of the CSE code, and just reimplement the relevant patterns using custom lowering, or something like that?

It's basically anything that uses CRNotPat in PPCInstrInfo.td. This is because crnot is defined as

def crnot : OutPatFrag<(ops node:$in),
                      (CRNOR $in, $in)>;

which replicates the $in. I think this is quite a few patterns to custom select.

I tried replacing CRNOR $in, $in with CRXOR $in, (CRSET) and relying on PPCDAGToDAGISel::PeepholeCROps to turn it into a CRNOR. This works except at -O0.

Another option could be to add a CRNOT CodeGenOnly unary opcode.

@nemanjai what do you think?

From a SystemZ perspective this looks good to me.

llvm/test/CodeGen/SystemZ/vector-constrained-fp-intrinsics.ll
4301	Not sure why this patch causes the two output registers to be computed in reverse order now, but either order should be fine. (And now the order in vectorized code matches the order in scalar code ...)

In D127254#3572361, @craig.topper wrote:
How many PowerPC patterns need the CSE code? Would it make sense to get rid of the CSE code, and just reimplement the relevant patterns using custom lowering, or something like that?

It's basically anything that uses CRNotPat in PPCInstrInfo.td. This is because crnot is defined as
def crnot : OutPatFrag<(ops node:$in),
                      (CRNOR $in, $in)>;
which replicates the $in. I think this is quite a few patterns to custom select.

I tried replacing CRNOR $in, $in with CRXOR $in, (CRSET) and relying on PPCDAGToDAGISel::PeepholeCROps to turn it into a CRNOR. This works except at -O0.

Another option could be to add a CRNOT CodeGenOnly unary opcode.

@nemanjai what do you think?

I am really sorry about the delay in responding to this. I think a unary pseudo for CRNOT is a perfectly reasonable solution and is in line with other instructions for which we wanted to avoid duplicating inputs that have chains (such as XXPERMDIs, etc.).

I'll put up a patch to do this ASAP.

In D127254#3775212, @nemanjai wrote:

I am really sorry about the delay in responding to this. I think a unary pseudo for CRNOT is a perfectly reasonable solution and is in line with other instructions for which we wanted to avoid duplicating inputs that have chains (such as XXPERMDIs, etc.).

I'll put up a patch to do this ASAP.

Posted https://reviews.llvm.org/D133577 that handles this.

Revision Contents

Path	Size
llvm/include/llvm/CodeGen/SelectionDAGISel.h	3 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp	92 lines
llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll	20 lines
llvm/test/CodeGen/PowerPC/fp-strict-fcmp-noopt.ll	18 lines
llvm/test/CodeGen/PowerPC/nofpexcept.ll	12 lines
llvm/test/CodeGen/PowerPC/ppcf128-constrained-fp-intrinsics.ll	16 lines
llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll	92 lines
llvm/test/CodeGen/SystemZ/vector-constrained-fp-intrinsics.ll	8 lines

Diff 434963

llvm/include/llvm/CodeGen/SelectionDAGISel.h

//===-- llvm/CodeGen/SelectionDAGISel.h - Common Base Class------- C++ --===//		//===-- llvm/CodeGen/SelectionDAGISel.h - Common Base Class------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 311 Lines • ▼ Show 20 Lines	private:
void CannotYetSelect(SDNode *N);		void CannotYetSelect(SDNode *N);

void Select_FREEZE(SDNode *N);		void Select_FREEZE(SDNode *N);
void Select_ARITH_FENCE(SDNode *N);		void Select_ARITH_FENCE(SDNode *N);

private:		private:
void DoInstructionSelection();		void DoInstructionSelection();
SDNode *MorphNode(SDNode *Node, unsigned TargetOpc, SDVTList VTList,		SDNode *MorphNode(SDNode *Node, unsigned TargetOpc, SDVTList VTList,
ArrayRef<SDValue> Ops, unsigned EmitNodeInfo);		ArrayRef<SDValue> Ops, unsigned EmitNodeInfo,
		bool NodeHasChain);

/// Prepares the landing pad to take incoming values or do other EH		/// Prepares the landing pad to take incoming values or do other EH
/// personality specific tasks. Returns true if the block should be		/// personality specific tasks. Returns true if the block should be
/// instruction selected, false if no code should be emitted for it.		/// instruction selected, false if no code should be emitted for it.
bool PrepareEHLandingPad();		bool PrepareEHLandingPad();

/// Perform instruction selection on all basic blocks in the function.		/// Perform instruction selection on all basic blocks in the function.
void SelectAllBasicBlocks(const Function &Fn);		void SelectAllBasicBlocks(const Function &Fn);
Show All 35 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp

//===- SelectionDAGISel.cpp - Implement the SelectionDAGISel class --------===//		//===- SelectionDAGISel.cpp - Implement the SelectionDAGISel class --------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 2,364 Lines • ▼ Show 20 Lines	HandleMergeInputChains(SmallVectorImpl<SDNode*> &ChainNodesMatched,
// Return merged chain.		// Return merged chain.
if (InputChains.size() == 1)		if (InputChains.size() == 1)
return InputChains[0];		return InputChains[0];
return CurDAG->getNode(ISD::TokenFactor, SDLoc(ChainNodesMatched[0]),		return CurDAG->getNode(ISD::TokenFactor, SDLoc(ChainNodesMatched[0]),
MVT::Other, InputChains);		MVT::Other, InputChains);
}		}

/// MorphNode - Handle morphing a node in place for the selector.		/// MorphNode - Handle morphing a node in place for the selector.
SDNode *SelectionDAGISel::		SDNode *SelectionDAGISel::MorphNode(SDNode *Node, unsigned TargetOpc,
MorphNode(SDNode *Node, unsigned TargetOpc, SDVTList VTList,		SDVTList VTList, ArrayRef<SDValue> Ops,
ArrayRef<SDValue> Ops, unsigned EmitNodeInfo) {		unsigned EmitNodeInfo, bool NodeHasChain) {
// It is possible we're using MorphNodeTo to replace a node with no		// It is possible we're using MorphNodeTo to replace a node with no
// normal results with one that has a normal result (or we could be		// normal results with one that has a normal result (or we could be
// adding a chain) and the input could have glue and chains as well.		// adding a chain) and the input could have glue and chains as well.
// In this case we need to shift the operands down.		// In this case we need to shift the operands down.
// FIXME: This is a horrible hack and broken in obscure cases, no worse		// FIXME: This is a horrible hack and broken in obscure cases, no worse
// than the old isel though.		// than the old isel though.
int OldGlueResultNo = -1, OldChainResultNo = -1;		int OldGlueResultNo = -1, OldChainResultNo = -1;

Show All 25 Lines	if ((EmitNodeInfo & OPFL_GlueOutput) && OldGlueResultNo != -1 &&
(unsigned)OldGlueResultNo != ResNumResults-1)		(unsigned)OldGlueResultNo != ResNumResults-1)
ReplaceUses(SDValue(Node, OldGlueResultNo),		ReplaceUses(SDValue(Node, OldGlueResultNo),
SDValue(Res, ResNumResults - 1));		SDValue(Res, ResNumResults - 1));

if ((EmitNodeInfo & OPFL_GlueOutput) != 0)		if ((EmitNodeInfo & OPFL_GlueOutput) != 0)
--ResNumResults;		--ResNumResults;

// Move the chain reference if needed.		// Move the chain reference if needed.
if ((EmitNodeInfo & OPFL_Chain) && OldChainResultNo != -1 &&		if (NodeHasChain && OldChainResultNo != -1 &&
(unsigned)OldChainResultNo != ResNumResults-1)		(unsigned)OldChainResultNo != ResNumResults - 1)
ReplaceUses(SDValue(Node, OldChainResultNo),		ReplaceUses(SDValue(Node, OldChainResultNo),
SDValue(Res, ResNumResults - 1));		SDValue(Res, ResNumResults - 1));

// Otherwise, no replacement happened because the node already exists. Replace		// Otherwise, no replacement happened because the node already exists. Replace
// Uses of the old node with the new one.		// Uses of the old node with the new one.
if (Res != Node) {		if (Res != Node) {
ReplaceNode(Node, Res);		ReplaceNode(Node, Res);
} else {		} else {
▲ Show 20 Lines • Show All 388 Lines • ▼ Show 20 Lines	void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
SDValue InputChain, InputGlue;		SDValue InputChain, InputGlue;

// ChainNodesMatched - If a pattern matches nodes that have input/output		// ChainNodesMatched - If a pattern matches nodes that have input/output
// chains, the OPC_EmitMergeInputChains operation is emitted which indicates		// chains, the OPC_EmitMergeInputChains operation is emitted which indicates
// which ones they are. The result is captured into this list so that we can		// which ones they are. The result is captured into this list so that we can
// update the chain results when the pattern is complete.		// update the chain results when the pattern is complete.
SmallVector<SDNode*, 3> ChainNodesMatched;		SmallVector<SDNode*, 3> ChainNodesMatched;

		// Collect chains for machine nodes that may raise exceptions. We allow
		// multiple MachineSDNodes to be emitted with parallel chains. We'll join
		// them with a TokenFactor. This helps CSE identical nodes with in a match.
		SmallVector<SDValue, 2> StrictFPChains;

LLVM_DEBUG(dbgs() << "ISEL: Starting pattern match\n");		LLVM_DEBUG(dbgs() << "ISEL: Starting pattern match\n");

// Determine where to start the interpreter. Normally we start at opcode #0,		// Determine where to start the interpreter. Normally we start at opcode #0,
// but if the state machine starts with an OPC_SwitchOpcode, then we		// but if the state machine starts with an OPC_SwitchOpcode, then we
// accelerate the first lookup (which is guaranteed to be hot) with the		// accelerate the first lookup (which is guaranteed to be hot) with the
// OpcodeOffset table.		// OpcodeOffset table.
unsigned MatcherIndex = 0;		unsigned MatcherIndex = 0;

▲ Show 20 Lines • Show All 534 Lines • ▼ Show 20 Lines	#endif
}		}

case OPC_EmitNode: case OPC_MorphNodeTo:		case OPC_EmitNode: case OPC_MorphNodeTo:
case OPC_EmitNode0: case OPC_EmitNode1: case OPC_EmitNode2:		case OPC_EmitNode0: case OPC_EmitNode1: case OPC_EmitNode2:
case OPC_MorphNodeTo0: case OPC_MorphNodeTo1: case OPC_MorphNodeTo2: {		case OPC_MorphNodeTo0: case OPC_MorphNodeTo1: case OPC_MorphNodeTo2: {
uint16_t TargetOpc = MatcherTable[MatcherIndex++];		uint16_t TargetOpc = MatcherTable[MatcherIndex++];
TargetOpc |= (unsigned short)MatcherTable[MatcherIndex++] << 8;		TargetOpc |= (unsigned short)MatcherTable[MatcherIndex++] << 8;
unsigned EmitNodeInfo = MatcherTable[MatcherIndex++];		unsigned EmitNodeInfo = MatcherTable[MatcherIndex++];

		// We need to perform this check before potentially modifying one of the
		// nodes via MorphNode.
		bool MayRaiseFPException =
		llvm::any_of(ChainNodesMatched, [this](SDNode *N) {
		return mayRaiseFPException(N) && !N->getFlags().hasNoFPExcept();
		});

		bool NodeMayRaiseFPException =
		MayRaiseFPException && TII->get(TargetOpc).mayRaiseFPException();
		bool NodeHasChain =
		(EmitNodeInfo & OPFL_Chain) || NodeMayRaiseFPException;

// Get the result VT list.		// Get the result VT list.
unsigned NumVTs;		unsigned NumVTs;
// If this is one of the compressed forms, get the number of VTs based		// If this is one of the compressed forms, get the number of VTs based
// on the Opcode. Otherwise read the next byte from the table.		// on the Opcode. Otherwise read the next byte from the table.
if (Opcode >= OPC_MorphNodeTo0 && Opcode <= OPC_MorphNodeTo2)		if (Opcode >= OPC_MorphNodeTo0 && Opcode <= OPC_MorphNodeTo2)
NumVTs = Opcode - OPC_MorphNodeTo0;		NumVTs = Opcode - OPC_MorphNodeTo0;
else if (Opcode >= OPC_EmitNode0 && Opcode <= OPC_EmitNode2)		else if (Opcode >= OPC_EmitNode0 && Opcode <= OPC_EmitNode2)
NumVTs = Opcode - OPC_EmitNode0;		NumVTs = Opcode - OPC_EmitNode0;
else		else
NumVTs = MatcherTable[MatcherIndex++];		NumVTs = MatcherTable[MatcherIndex++];
SmallVector<EVT, 4> VTs;		SmallVector<EVT, 4> VTs;
for (unsigned i = 0; i != NumVTs; ++i) {		for (unsigned i = 0; i != NumVTs; ++i) {
MVT::SimpleValueType VT =		MVT::SimpleValueType VT =
(MVT::SimpleValueType)MatcherTable[MatcherIndex++];		(MVT::SimpleValueType)MatcherTable[MatcherIndex++];
if (VT == MVT::iPTR)		if (VT == MVT::iPTR)
VT = TLI->getPointerTy(CurDAG->getDataLayout()).SimpleTy;		VT = TLI->getPointerTy(CurDAG->getDataLayout()).SimpleTy;
VTs.push_back(VT);		VTs.push_back(VT);
}		}

if (EmitNodeInfo & OPFL_Chain)		if (NodeHasChain)
VTs.push_back(MVT::Other);		VTs.push_back(MVT::Other);
if (EmitNodeInfo & OPFL_GlueOutput)		if (EmitNodeInfo & OPFL_GlueOutput)
VTs.push_back(MVT::Glue);		VTs.push_back(MVT::Glue);

// This is hot code, so optimize the two most common cases of 1 and 2		// This is hot code, so optimize the two most common cases of 1 and 2
// results.		// results.
SDVTList VTList;		SDVTList VTList;
if (VTs.size() == 1)		if (VTs.size() == 1)
Show All 14 Lines	case OPC_MorphNodeTo0: case OPC_MorphNodeTo1: case OPC_MorphNodeTo2: {
assert(RecNo < RecordedNodes.size() && "Invalid EmitNode");		assert(RecNo < RecordedNodes.size() && "Invalid EmitNode");
Ops.push_back(RecordedNodes[RecNo].first);		Ops.push_back(RecordedNodes[RecNo].first);
}		}

// If there are variadic operands to add, handle them now.		// If there are variadic operands to add, handle them now.
if (EmitNodeInfo & OPFL_VariadicInfo) {		if (EmitNodeInfo & OPFL_VariadicInfo) {
// Determine the start index to copy from.		// Determine the start index to copy from.
unsigned FirstOpToCopy = getNumFixedFromVariadicInfo(EmitNodeInfo);		unsigned FirstOpToCopy = getNumFixedFromVariadicInfo(EmitNodeInfo);
FirstOpToCopy += (EmitNodeInfo & OPFL_Chain) ? 1 : 0;		FirstOpToCopy += NodeHasChain ? 1 : 0;
assert(NodeToMatch->getNumOperands() >= FirstOpToCopy &&		assert(NodeToMatch->getNumOperands() >= FirstOpToCopy &&
"Invalid variadic node");		"Invalid variadic node");
// Copy all of the variadic operands, not including a potential glue		// Copy all of the variadic operands, not including a potential glue
// input.		// input.
for (unsigned i = FirstOpToCopy, e = NodeToMatch->getNumOperands();		for (unsigned i = FirstOpToCopy, e = NodeToMatch->getNumOperands();
i != e; ++i) {		i != e; ++i) {
SDValue V = NodeToMatch->getOperand(i);		SDValue V = NodeToMatch->getOperand(i);
if (V.getValueType() == MVT::Glue) break;		if (V.getValueType() == MVT::Glue) break;
Ops.push_back(V);		Ops.push_back(V);
}		}
}		}

		if (EmitNodeInfo & OPFL_Chain) {
		// Update InputChain if there are any strict fp nodes.
		if (!StrictFPChains.empty()) {
		if (StrictFPChains.size() == 1)
		InputChain = StrictFPChains[0];
		else
		InputChain =
		CurDAG->getNode(ISD::TokenFactor, SDLoc(StrictFPChains[0]),
		MVT::Other, StrictFPChains);
		StrictFPChains.clear();
		}
		}

// If this has chain/glue inputs, add them.		// If this has chain/glue inputs, add them.
if (EmitNodeInfo & OPFL_Chain)		if (NodeHasChain)
Ops.push_back(InputChain);		Ops.push_back(InputChain);
if ((EmitNodeInfo & OPFL_GlueInput) && InputGlue.getNode() != nullptr)		if ((EmitNodeInfo & OPFL_GlueInput) && InputGlue.getNode() != nullptr)
Ops.push_back(InputGlue);		Ops.push_back(InputGlue);

// Check whether any matched node could raise an FP exception. Since all
// such nodes must have a chain, it suffices to check ChainNodesMatched.
// We need to perform this check before potentially modifying one of the
// nodes via MorphNode.
bool MayRaiseFPException =
llvm::any_of(ChainNodesMatched, [this](SDNode *N) {
return mayRaiseFPException(N) && !N->getFlags().hasNoFPExcept();
});

// Create the node.		// Create the node.
MachineSDNode *Res = nullptr;		MachineSDNode *Res = nullptr;
bool IsMorphNodeTo = Opcode == OPC_MorphNodeTo ||		bool IsMorphNodeTo = Opcode == OPC_MorphNodeTo ||
(Opcode >= OPC_MorphNodeTo0 && Opcode <= OPC_MorphNodeTo2);		(Opcode >= OPC_MorphNodeTo0 && Opcode <= OPC_MorphNodeTo2);
if (!IsMorphNodeTo) {		if (!IsMorphNodeTo) {
// If this is a normal EmitNode command, just create the new node and		// If this is a normal EmitNode command, just create the new node and
// add the results to the RecordedNodes list.		// add the results to the RecordedNodes list.
Res = CurDAG->getMachineNode(TargetOpc, SDLoc(NodeToMatch),		Res = CurDAG->getMachineNode(TargetOpc, SDLoc(NodeToMatch),
Show All 11 Lines	case OPC_MorphNodeTo0: case OPC_MorphNodeTo1: case OPC_MorphNodeTo2: {
SelectionDAG::DAGNodeDeletedListener NDL(CurDAG, [&](SDNode *N,		SelectionDAG::DAGNodeDeletedListener NDL(CurDAG, [&](SDNode *N,
SDNode *E) {		SDNode *E) {
CurDAG->salvageDebugInfo(*N);		CurDAG->salvageDebugInfo(*N);
auto &Chain = ChainNodesMatched;		auto &Chain = ChainNodesMatched;
assert((!E || !is_contained(Chain, N)) &&		assert((!E || !is_contained(Chain, N)) &&
"Chain node replaced during MorphNode");		"Chain node replaced during MorphNode");
llvm::erase_value(Chain, N);		llvm::erase_value(Chain, N);
});		});
Res = cast<MachineSDNode>(MorphNode(NodeToMatch, TargetOpc, VTList,		Res = cast<MachineSDNode>(MorphNode(NodeToMatch, TargetOpc, VTList, Ops,
Ops, EmitNodeInfo));		EmitNodeInfo, NodeHasChain));
}		}

// Set the NoFPExcept flag when no original matched node could		// Set the NoFPExcept flag when no original matched node could
// raise an FP exception, but the new node potentially might.		// raise an FP exception, but the new node potentially might.
if (!MayRaiseFPException && mayRaiseFPException(Res)) {		if (!MayRaiseFPException && mayRaiseFPException(Res)) {
SDNodeFlags Flags = Res->getFlags();		SDNodeFlags Flags = Res->getFlags();
Flags.setNoFPExcept(true);		Flags.setNoFPExcept(true);
Res->setFlags(Flags);		Res->setFlags(Flags);
}		}

// If the node had chain/glue results, update our notion of the current		// If the node had chain/glue results, update our notion of the current
// chain and glue.		// chain and glue.
if (EmitNodeInfo & OPFL_GlueOutput) {		if (EmitNodeInfo & OPFL_GlueOutput) {
InputGlue = SDValue(Res, VTs.size()-1);		InputGlue = SDValue(Res, VTs.size()-1);
if (EmitNodeInfo & OPFL_Chain)		if (EmitNodeInfo & OPFL_Chain) {
		assert(StrictFPChains.empty() && "Chain node and strict FP node?");
InputChain = SDValue(Res, VTs.size()-2);		InputChain = SDValue(Res, VTs.size()-2);
} else if (EmitNodeInfo & OPFL_Chain)		} else if (NodeMayRaiseFPException)
		StrictFPChains.push_back(SDValue(Res, VTs.size() - 2));
		} else if (EmitNodeInfo & OPFL_Chain) {
		assert(StrictFPChains.empty() && "Chain node and strict FP node?");
InputChain = SDValue(Res, VTs.size()-1);		InputChain = SDValue(Res, VTs.size()-1);
		} else if (NodeMayRaiseFPException)
		StrictFPChains.push_back(SDValue(Res, VTs.size() - 1));

// If the OPFL_MemRefs glue is set on this node, slap all of the		// If the OPFL_MemRefs glue is set on this node, slap all of the
// accumulated memrefs onto it.		// accumulated memrefs onto it.
//		//
// FIXME: This is vastly incorrect for patterns with multiple outputs		// FIXME: This is vastly incorrect for patterns with multiple outputs
// instructions that access memory and for ComplexPatterns that match		// instructions that access memory and for ComplexPatterns that match
// loads.		// loads.
if (EmitNodeInfo & OPFL_MemRefs) {		if (EmitNodeInfo & OPFL_MemRefs) {
Show All 24 Lines	case OPC_MorphNodeTo0: case OPC_MorphNodeTo1: case OPC_MorphNodeTo2: {
LLVM_DEBUG(if (!MatchedMemRefs.empty() && Res->memoperands_empty()) dbgs()		LLVM_DEBUG(if (!MatchedMemRefs.empty() && Res->memoperands_empty()) dbgs()
<< " Dropping mem operands\n";		<< " Dropping mem operands\n";
dbgs() << " " << (IsMorphNodeTo ? "Morphed" : "Created")		dbgs() << " " << (IsMorphNodeTo ? "Morphed" : "Created")
<< " node: ";		<< " node: ";
Res->dump(CurDAG););		Res->dump(CurDAG););

// If this was a MorphNodeTo then we're completely done!		// If this was a MorphNodeTo then we're completely done!
if (IsMorphNodeTo) {		if (IsMorphNodeTo) {
		// Update InputChain if there are any strict fp nodes.
		if (!StrictFPChains.empty()) {
		if (StrictFPChains.size() == 1)
		InputChain = StrictFPChains[0];
		else
		InputChain =
		CurDAG->getNode(ISD::TokenFactor, SDLoc(StrictFPChains[0]),
		MVT::Other, StrictFPChains);
		StrictFPChains.clear();
		}

// Update chain uses.		// Update chain uses.
UpdateChains(Res, InputChain, ChainNodesMatched, true);		UpdateChains(Res, InputChain, ChainNodesMatched, true);
return;		return;
}		}
continue;		continue;
}		}

case OPC_CompleteMatch: {		case OPC_CompleteMatch: {
Show All 18 Lines	case OPC_CompleteMatch: {
NodeToMatch->getValueType(i) == MVT::iPTR ||		NodeToMatch->getValueType(i) == MVT::iPTR ||
Res.getValueType() == MVT::iPTR ||		Res.getValueType() == MVT::iPTR ||
NodeToMatch->getValueType(i).getSizeInBits() ==		NodeToMatch->getValueType(i).getSizeInBits() ==
Res.getValueSizeInBits()) &&		Res.getValueSizeInBits()) &&
"invalid replacement");		"invalid replacement");
ReplaceUses(SDValue(NodeToMatch, i), Res);		ReplaceUses(SDValue(NodeToMatch, i), Res);
}		}

		// Update InputChain if there are any strict fp nodes.
		if (!StrictFPChains.empty()) {
		if (StrictFPChains.size() == 1)
		InputChain = StrictFPChains[0];
		else
		InputChain =
		CurDAG->getNode(ISD::TokenFactor, SDLoc(StrictFPChains[0]),
		MVT::Other, StrictFPChains);
		StrictFPChains.clear();
		}

// Update chain uses.		// Update chain uses.
UpdateChains(NodeToMatch, InputChain, ChainNodesMatched, false);		UpdateChains(NodeToMatch, InputChain, ChainNodesMatched, false);

// If the root node defines glue, we need to update it to the glue result.		// If the root node defines glue, we need to update it to the glue result.
// TODO: This never happens in our tests and I think it can be removed /		// TODO: This never happens in our tests and I think it can be removed /
// replaced with an assert, but if we do it this the way the change is		// replaced with an assert, but if we do it this the way the change is
// NFC.		// NFC.
if (NodeToMatch->getValueType(NodeToMatch->getNumValues() - 1) ==		if (NodeToMatch->getValueType(NodeToMatch->getNumValues() - 1) ==
▲ Show 20 Lines • Show All 123 Lines • Show Last 20 Lines

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll

	Show First 20 Lines • Show All 615 Lines • ▼ Show 20 Lines
	; P8-NEXT: addis r3, r2, .LCPI13_0@toc@ha			; P8-NEXT: addis r3, r2, .LCPI13_0@toc@ha
	; P8-NEXT: xxlxor f3, f3, f3			; P8-NEXT: xxlxor f3, f3, f3
	; P8-NEXT: std r30, 112(r1) # 8-byte Folded Spill			; P8-NEXT: std r30, 112(r1) # 8-byte Folded Spill
	; P8-NEXT: lfs f0, .LCPI13_0@toc@l(r3)			; P8-NEXT: lfs f0, .LCPI13_0@toc@l(r3)
	; P8-NEXT: lis r3, -32768			; P8-NEXT: lis r3, -32768
	; P8-NEXT: fcmpo cr0, f2, f3			; P8-NEXT: fcmpo cr0, f2, f3
	; P8-NEXT: xxlxor f3, f3, f3			; P8-NEXT: xxlxor f3, f3, f3
	; P8-NEXT: fcmpo cr1, f1, f0			; P8-NEXT: fcmpo cr1, f1, f0
	; P8-NEXT: crand 4*cr5+lt, 4*cr1+eq, lt			; P8-NEXT: fcmpo cr5, f1, f0
	; P8-NEXT: crandc 4*cr5+gt, 4*cr1+lt, 4*cr1+eq			; P8-NEXT: fcmpo cr6, f1, f0
	; P8-NEXT: cror 4*cr5+lt, 4*cr5+gt, 4*cr5+lt			; P8-NEXT: crand 4*cr7+lt, 4*cr1+eq, lt
				; P8-NEXT: crandc 4*cr5+lt, 4*cr6+lt, 4*cr5+eq
				; P8-NEXT: cror 4*cr5+lt, 4*cr5+lt, 4*cr7+lt
	; P8-NEXT: isel r30, 0, r3, 4*cr5+lt			; P8-NEXT: isel r30, 0, r3, 4*cr5+lt
	; P8-NEXT: bc 12, 4*cr5+lt, .LBB13_2			; P8-NEXT: bc 12, 4*cr5+lt, .LBB13_2
	; P8-NEXT: # %bb.1: # %entry			; P8-NEXT: # %bb.1: # %entry
	; P8-NEXT: fmr f3, f0			; P8-NEXT: fmr f3, f0
	; P8-NEXT: .LBB13_2: # %entry			; P8-NEXT: .LBB13_2: # %entry
	; P8-NEXT: xxlxor f4, f4, f4			; P8-NEXT: xxlxor f4, f4, f4
	; P8-NEXT: bl __gcc_qsub			; P8-NEXT: bl __gcc_qsub
	; P8-NEXT: nop			; P8-NEXT: nop
	Show All 24 Lines
	; P9-NEXT: addis r3, r2, .LCPI13_0@toc@ha			; P9-NEXT: addis r3, r2, .LCPI13_0@toc@ha
	; P9-NEXT: xxlxor f3, f3, f3			; P9-NEXT: xxlxor f3, f3, f3
	; P9-NEXT: lfs f0, .LCPI13_0@toc@l(r3)			; P9-NEXT: lfs f0, .LCPI13_0@toc@l(r3)
	; P9-NEXT: fcmpo cr1, f2, f3			; P9-NEXT: fcmpo cr1, f2, f3
	; P9-NEXT: lis r3, -32768			; P9-NEXT: lis r3, -32768
	; P9-NEXT: fcmpo cr0, f1, f0			; P9-NEXT: fcmpo cr0, f1, f0
	; P9-NEXT: xxlxor f3, f3, f3			; P9-NEXT: xxlxor f3, f3, f3
	; P9-NEXT: crand 4*cr5+lt, eq, 4*cr1+lt			; P9-NEXT: crand 4*cr5+lt, eq, 4*cr1+lt
	; P9-NEXT: crandc 4*cr5+gt, lt, eq			; P9-NEXT: fcmpo cr0, f1, f0
				; P9-NEXT: fcmpo cr1, f1, f0
				; P9-NEXT: crandc 4*cr5+gt, 4*cr1+lt, eq
	; P9-NEXT: cror 4*cr5+lt, 4*cr5+gt, 4*cr5+lt			; P9-NEXT: cror 4*cr5+lt, 4*cr5+gt, 4*cr5+lt
	; P9-NEXT: isel r30, 0, r3, 4*cr5+lt			; P9-NEXT: isel r30, 0, r3, 4*cr5+lt
	; P9-NEXT: bc 12, 4*cr5+lt, .LBB13_2			; P9-NEXT: bc 12, 4*cr5+lt, .LBB13_2
	; P9-NEXT: # %bb.1: # %entry			; P9-NEXT: # %bb.1: # %entry
	; P9-NEXT: fmr f3, f0			; P9-NEXT: fmr f3, f0
	; P9-NEXT: .LBB13_2: # %entry			; P9-NEXT: .LBB13_2: # %entry
	; P9-NEXT: xxlxor f4, f4, f4			; P9-NEXT: xxlxor f4, f4, f4
	; P9-NEXT: bl __gcc_qsub			; P9-NEXT: bl __gcc_qsub
	Show All 24 Lines
	; NOVSX-NEXT: .cfi_offset lr, 16			; NOVSX-NEXT: .cfi_offset lr, 16
	; NOVSX-NEXT: .cfi_offset cr2, 8			; NOVSX-NEXT: .cfi_offset cr2, 8
	; NOVSX-NEXT: addis r3, r2, .LCPI13_0@toc@ha			; NOVSX-NEXT: addis r3, r2, .LCPI13_0@toc@ha
	; NOVSX-NEXT: addis r4, r2, .LCPI13_1@toc@ha			; NOVSX-NEXT: addis r4, r2, .LCPI13_1@toc@ha
	; NOVSX-NEXT: lfs f0, .LCPI13_0@toc@l(r3)			; NOVSX-NEXT: lfs f0, .LCPI13_0@toc@l(r3)
	; NOVSX-NEXT: lfs f4, .LCPI13_1@toc@l(r4)			; NOVSX-NEXT: lfs f4, .LCPI13_1@toc@l(r4)
	; NOVSX-NEXT: fcmpo cr0, f1, f0			; NOVSX-NEXT: fcmpo cr0, f1, f0
	; NOVSX-NEXT: fcmpo cr1, f2, f4			; NOVSX-NEXT: fcmpo cr1, f2, f4
				; NOVSX-NEXT: fcmpo cr5, f1, f0
				; NOVSX-NEXT: fcmpo cr6, f1, f0
	; NOVSX-NEXT: fmr f3, f4			; NOVSX-NEXT: fmr f3, f4
	; NOVSX-NEXT: crand 4*cr5+lt, eq, 4*cr1+lt			; NOVSX-NEXT: crand 4*cr7+lt, eq, 4*cr1+lt
	; NOVSX-NEXT: crandc 4*cr5+gt, lt, eq			; NOVSX-NEXT: crandc 4*cr5+lt, 4*cr6+lt, 4*cr5+eq
	; NOVSX-NEXT: cror 4*cr2+lt, 4*cr5+gt, 4*cr5+lt			; NOVSX-NEXT: cror 4*cr2+lt, 4*cr5+lt, 4*cr7+lt
	; NOVSX-NEXT: bc 12, 4*cr2+lt, .LBB13_2			; NOVSX-NEXT: bc 12, 4*cr2+lt, .LBB13_2
	; NOVSX-NEXT: # %bb.1: # %entry			; NOVSX-NEXT: # %bb.1: # %entry
	; NOVSX-NEXT: fmr f3, f0			; NOVSX-NEXT: fmr f3, f0
	; NOVSX-NEXT: .LBB13_2: # %entry			; NOVSX-NEXT: .LBB13_2: # %entry
	; NOVSX-NEXT: bl __gcc_qsub			; NOVSX-NEXT: bl __gcc_qsub
	; NOVSX-NEXT: nop			; NOVSX-NEXT: nop
	; NOVSX-NEXT: mffs f0			; NOVSX-NEXT: mffs f0
	; NOVSX-NEXT: mtfsb1 31			; NOVSX-NEXT: mtfsb1 31
	▲ Show 20 Lines • Show All 315 Lines • Show Last 20 Lines

llvm/test/CodeGen/PowerPC/fp-strict-fcmp-noopt.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \			; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
	; RUN: < %s -mtriple=powerpc64-unknown-linux -mcpu=pwr9 -O0 | FileCheck %s			; RUN: < %s -mtriple=powerpc64-unknown-linux -mcpu=pwr9 -O0 | FileCheck %s

	define i32 @une_ppcf128(ppc_fp128 %a, ppc_fp128 %b) #0 {			define i32 @une_ppcf128(ppc_fp128 %a, ppc_fp128 %b) #0 {
	; CHECK-LABEL: une_ppcf128:			; CHECK-LABEL: une_ppcf128:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: fcmpu cr0, f1, f3			; CHECK-NEXT: fcmpu cr0, f1, f3
	; CHECK-NEXT: crmove 4*cr5+lt, eq			; CHECK-NEXT: crmove 4*cr5+lt, eq
	; CHECK-NEXT: fcmpu cr1, f2, f4			; CHECK-NEXT: fcmpu cr0, f2, f4
	; CHECK-NEXT: crmove 4*cr5+gt, 4*cr1+eq			; CHECK-NEXT: crmove 4*cr5+gt, eq
	; CHECK-NEXT: crnot 4*cr5+gt, 4*cr5+gt			; CHECK-NEXT: crnot 4*cr5+gt, 4*cr5+gt
	; CHECK-NEXT: crand 4*cr5+gt, 4*cr5+lt, 4*cr5+gt			; CHECK-NEXT: crand 4*cr5+gt, 4*cr5+lt, 4*cr5+gt
				; CHECK-NEXT: fcmpu cr0, f1, f3
	; CHECK-NEXT: crmove 4*cr5+lt, eq			; CHECK-NEXT: crmove 4*cr5+lt, eq
	; CHECK-NEXT: crnot 4*cr5+lt, 4*cr5+lt			; CHECK-NEXT: crnot 4*cr5+lt, 4*cr5+lt
	; CHECK-NEXT: crand 4*cr5+lt, 4*cr5+lt, 4*cr5+lt			; CHECK-NEXT: fcmpu cr0, f1, f3
				; CHECK-NEXT: crmove 4*cr5+eq, eq
				; CHECK-NEXT: crnot 4*cr5+eq, 4*cr5+eq
				; CHECK-NEXT: crand 4*cr5+lt, 4*cr5+lt, 4*cr5+eq
	; CHECK-NEXT: cror 4*cr5+lt, 4*cr5+lt, 4*cr5+gt			; CHECK-NEXT: cror 4*cr5+lt, 4*cr5+lt, 4*cr5+gt
	; CHECK-NEXT: li r4, 0			; CHECK-NEXT: li r4, 0
	; CHECK-NEXT: li r3, 1			; CHECK-NEXT: li r3, 1
	; CHECK-NEXT: isel r3, r3, r4, 4*cr5+lt			; CHECK-NEXT: isel r3, r3, r4, 4*cr5+lt
	; CHECK-NEXT: clrldi r3, r3, 32			; CHECK-NEXT: clrldi r3, r3, 32
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%0 = call i1 @llvm.experimental.constrained.fcmp.ppcf128(ppc_fp128 %a, ppc_fp128 %b, metadata !"une", metadata !"fpexcept.strict") #0			%0 = call i1 @llvm.experimental.constrained.fcmp.ppcf128(ppc_fp128 %a, ppc_fp128 %b, metadata !"une", metadata !"fpexcept.strict") #0
	%1 = zext i1 %0 to i32			%1 = zext i1 %0 to i32
	ret i32 %1			ret i32 %1
	}			}

	; This is a different branch from une			; This is a different branch from une
	define i32 @ogt_ppcf128(ppc_fp128 %a, ppc_fp128 %b) #0 {			define i32 @ogt_ppcf128(ppc_fp128 %a, ppc_fp128 %b) #0 {
	; CHECK-LABEL: ogt_ppcf128:			; CHECK-LABEL: ogt_ppcf128:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: fcmpu cr0, f1, f3			; CHECK-NEXT: fcmpu cr0, f1, f3
	; CHECK-NEXT: crmove 4*cr5+lt, eq			; CHECK-NEXT: crmove 4*cr5+lt, eq
	; CHECK-NEXT: fcmpu cr1, f2, f4			; CHECK-NEXT: fcmpu cr0, f2, f4
	; CHECK-NEXT: crmove 4*cr5+gt, 4*cr1+gt			; CHECK-NEXT: crmove 4*cr5+gt, gt
	; CHECK-NEXT: crand 4*cr5+gt, 4*cr5+lt, 4*cr5+gt			; CHECK-NEXT: crand 4*cr5+gt, 4*cr5+lt, 4*cr5+gt
				; CHECK-NEXT: fcmpu cr0, f1, f3
				; CHECK-NEXT: fcmpu cr1, f1, f3
				; CHECK-NEXT: crmove 4*cr5+eq, 4*cr1+gt
	; CHECK-NEXT: crmove 4*cr5+lt, eq			; CHECK-NEXT: crmove 4*cr5+lt, eq
	; CHECK-NEXT: crnot 4*cr5+lt, 4*cr5+lt			; CHECK-NEXT: crnot 4*cr5+lt, 4*cr5+lt
	; CHECK-NEXT: crmove 4*cr5+eq, gt
	; CHECK-NEXT: crand 4*cr5+lt, 4*cr5+lt, 4*cr5+eq			; CHECK-NEXT: crand 4*cr5+lt, 4*cr5+lt, 4*cr5+eq
	; CHECK-NEXT: cror 4*cr5+lt, 4*cr5+lt, 4*cr5+gt			; CHECK-NEXT: cror 4*cr5+lt, 4*cr5+lt, 4*cr5+gt
	; CHECK-NEXT: li r4, 0			; CHECK-NEXT: li r4, 0
	; CHECK-NEXT: li r3, 1			; CHECK-NEXT: li r3, 1
	; CHECK-NEXT: isel r3, r3, r4, 4*cr5+lt			; CHECK-NEXT: isel r3, r3, r4, 4*cr5+lt
	; CHECK-NEXT: clrldi r3, r3, 32			; CHECK-NEXT: clrldi r3, r3, 32
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	▲ Show 20 Lines • Show All 72 Lines • Show Last 20 Lines

llvm/test/CodeGen/PowerPC/nofpexcept.ll

Show First 20 Lines • Show All 107 Lines • ▼ Show 20 Lines	define void @fptoint_nofpexcept(ppc_fp128 %p, fp128 %m, i32* %addr1, i64* %addr2) {
; CHECK-NEXT: [[DFLOADf32_:%[0-9]+]]:vssrc = DFLOADf32 target-flags(ppc-toc-lo) %const.0, killed [[ADDIStocHA8_]] :: (load (s32) from constant-pool)		; CHECK-NEXT: [[DFLOADf32_:%[0-9]+]]:vssrc = DFLOADf32 target-flags(ppc-toc-lo) %const.0, killed [[ADDIStocHA8_]] :: (load (s32) from constant-pool)
; CHECK-NEXT: [[COPY9:%[0-9]+]]:f8rc = COPY [[DFLOADf32_]]		; CHECK-NEXT: [[COPY9:%[0-9]+]]:f8rc = COPY [[DFLOADf32_]]
; CHECK-NEXT: [[FCMPOD:%[0-9]+]]:crrc = FCMPOD [[COPY4]], [[COPY9]]		; CHECK-NEXT: [[FCMPOD:%[0-9]+]]:crrc = FCMPOD [[COPY4]], [[COPY9]]
; CHECK-NEXT: [[COPY10:%[0-9]+]]:crbitrc = COPY [[FCMPOD]].sub_eq		; CHECK-NEXT: [[COPY10:%[0-9]+]]:crbitrc = COPY [[FCMPOD]].sub_eq
; CHECK-NEXT: [[XXLXORdpz:%[0-9]+]]:f8rc = XXLXORdpz		; CHECK-NEXT: [[XXLXORdpz:%[0-9]+]]:f8rc = XXLXORdpz
; CHECK-NEXT: [[FCMPOD1:%[0-9]+]]:crrc = FCMPOD [[COPY3]], [[XXLXORdpz]]		; CHECK-NEXT: [[FCMPOD1:%[0-9]+]]:crrc = FCMPOD [[COPY3]], [[XXLXORdpz]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:crbitrc = COPY [[FCMPOD1]].sub_lt		; CHECK-NEXT: [[COPY11:%[0-9]+]]:crbitrc = COPY [[FCMPOD1]].sub_lt
; CHECK-NEXT: [[CRAND:%[0-9]+]]:crbitrc = CRAND killed [[COPY10]], killed [[COPY11]]		; CHECK-NEXT: [[CRAND:%[0-9]+]]:crbitrc = CRAND killed [[COPY10]], killed [[COPY11]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:crbitrc = COPY [[FCMPOD]].sub_eq		; CHECK-NEXT: [[FCMPOD2:%[0-9]+]]:crrc = FCMPOD [[COPY4]], [[COPY9]]
; CHECK-NEXT: [[COPY13:%[0-9]+]]:crbitrc = COPY [[FCMPOD]].sub_lt		; CHECK-NEXT: [[COPY12:%[0-9]+]]:crbitrc = COPY [[FCMPOD2]].sub_eq
		; CHECK-NEXT: [[FCMPOD3:%[0-9]+]]:crrc = FCMPOD [[COPY4]], [[COPY9]]
		; CHECK-NEXT: [[COPY13:%[0-9]+]]:crbitrc = COPY [[FCMPOD3]].sub_lt
; CHECK-NEXT: [[CRANDC:%[0-9]+]]:crbitrc = CRANDC killed [[COPY13]], killed [[COPY12]]		; CHECK-NEXT: [[CRANDC:%[0-9]+]]:crbitrc = CRANDC killed [[COPY13]], killed [[COPY12]]
; CHECK-NEXT: [[CROR:%[0-9]+]]:crbitrc = CROR killed [[CRANDC]], killed [[CRAND]]		; CHECK-NEXT: [[CROR:%[0-9]+]]:crbitrc = CROR killed [[CRANDC]], killed [[CRAND]]
; CHECK-NEXT: [[LIS:%[0-9]+]]:gprc_and_gprc_nor0 = LIS 32768		; CHECK-NEXT: [[LIS:%[0-9]+]]:gprc_and_gprc_nor0 = LIS 32768
; CHECK-NEXT: [[LI:%[0-9]+]]:gprc_and_gprc_nor0 = LI 0		; CHECK-NEXT: [[LI:%[0-9]+]]:gprc_and_gprc_nor0 = LI 0
; CHECK-NEXT: [[ISEL:%[0-9]+]]:gprc = ISEL [[LI]], [[LIS]], [[CROR]]		; CHECK-NEXT: [[ISEL:%[0-9]+]]:gprc = ISEL [[LI]], [[LIS]], [[CROR]]
; CHECK-NEXT: BC [[CROR]], %bb.2		; CHECK-NEXT: BC [[CROR]], %bb.2
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1.entry:		; CHECK-NEXT: bb.1.entry:
Show All 9 Lines	define void @fptoint_nofpexcept(ppc_fp128 %p, fp128 %m, i32* %addr1, i64* %addr2) {
; CHECK-NEXT: $f4 = COPY [[XXLXORdpz]]		; CHECK-NEXT: $f4 = COPY [[XXLXORdpz]]
; CHECK-NEXT: BL8_NOP &__gcc_qsub, csr_ppc64_altivec, implicit-def dead $lr8, implicit $rm, implicit $f1, implicit $f2, implicit $f3, implicit $f4, implicit $x2, implicit-def $r1, implicit-def $f1, implicit-def $f2		; CHECK-NEXT: BL8_NOP &__gcc_qsub, csr_ppc64_altivec, implicit-def dead $lr8, implicit $rm, implicit $f1, implicit $f2, implicit $f3, implicit $f4, implicit $x2, implicit-def $r1, implicit-def $f1, implicit-def $f2
; CHECK-NEXT: ADJCALLSTACKUP 32, 0, implicit-def dead $r1, implicit $r1		; CHECK-NEXT: ADJCALLSTACKUP 32, 0, implicit-def dead $r1, implicit $r1
; CHECK-NEXT: [[COPY14:%[0-9]+]]:f8rc = COPY $f1		; CHECK-NEXT: [[COPY14:%[0-9]+]]:f8rc = COPY $f1
; CHECK-NEXT: [[COPY15:%[0-9]+]]:f8rc = COPY $f2		; CHECK-NEXT: [[COPY15:%[0-9]+]]:f8rc = COPY $f2
; CHECK-NEXT: [[MFFS1:%[0-9]+]]:f8rc = MFFS implicit $rm		; CHECK-NEXT: [[MFFS1:%[0-9]+]]:f8rc = MFFS implicit $rm
; CHECK-NEXT: MTFSB1 31, implicit-def $rm, implicit-def $rm		; CHECK-NEXT: MTFSB1 31, implicit-def $rm, implicit-def $rm
; CHECK-NEXT: MTFSB0 30, implicit-def $rm, implicit-def $rm		; CHECK-NEXT: MTFSB0 30, implicit-def $rm, implicit-def $rm
; CHECK-NEXT: %37:f8rc = nofpexcept FADD [[COPY15]], [[COPY14]], implicit $rm		; CHECK-NEXT: %39:f8rc = nofpexcept FADD [[COPY15]], [[COPY14]], implicit $rm
; CHECK-NEXT: MTFSFb 1, [[MFFS1]], implicit-def $rm		; CHECK-NEXT: MTFSFb 1, [[MFFS1]], implicit-def $rm
; CHECK-NEXT: %38:vsfrc = nofpexcept XSCVDPSXWS killed %37, implicit $rm		; CHECK-NEXT: %40:vsfrc = nofpexcept XSCVDPSXWS killed %39, implicit $rm
; CHECK-NEXT: [[MFVSRWZ3:%[0-9]+]]:gprc = MFVSRWZ killed %38		; CHECK-NEXT: [[MFVSRWZ3:%[0-9]+]]:gprc = MFVSRWZ killed %40
; CHECK-NEXT: [[XOR:%[0-9]+]]:gprc = XOR killed [[MFVSRWZ3]], killed [[ISEL]]		; CHECK-NEXT: [[XOR:%[0-9]+]]:gprc = XOR killed [[MFVSRWZ3]], killed [[ISEL]]
; CHECK-NEXT: STW killed [[XOR]], 0, [[COPY1]] :: (volatile store (s32) into %ir.addr1)		; CHECK-NEXT: STW killed [[XOR]], 0, [[COPY1]] :: (volatile store (s32) into %ir.addr1)
; CHECK-NEXT: BLR8 implicit $lr8, implicit $rm		; CHECK-NEXT: BLR8 implicit $lr8, implicit $rm
entry:		entry:
%conv1 = tail call i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0		%conv1 = tail call i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0
store volatile i32 %conv1, i32* %addr1, align 4		store volatile i32 %conv1, i32* %addr1, align 4
%conv2 = tail call i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0		%conv2 = tail call i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0
store volatile i32 %conv2, i32* %addr1, align 4		store volatile i32 %conv2, i32* %addr1, align 4
▲ Show 20 Lines • Show All 91 Lines • Show Last 20 Lines

llvm/test/CodeGen/PowerPC/ppcf128-constrained-fp-intrinsics.ll

	Show First 20 Lines • Show All 1,291 Lines • ▼ Show 20 Lines
	; PC64LE-NEXT: stdu 1, -48(1)			; PC64LE-NEXT: stdu 1, -48(1)
	; PC64LE-NEXT: addis 3, 2, .LCPI31_0@toc@ha			; PC64LE-NEXT: addis 3, 2, .LCPI31_0@toc@ha
	; PC64LE-NEXT: xxlxor 3, 3, 3			; PC64LE-NEXT: xxlxor 3, 3, 3
	; PC64LE-NEXT: lfs 0, .LCPI31_0@toc@l(3)			; PC64LE-NEXT: lfs 0, .LCPI31_0@toc@l(3)
	; PC64LE-NEXT: lis 3, -32768			; PC64LE-NEXT: lis 3, -32768
	; PC64LE-NEXT: fcmpo 0, 2, 3			; PC64LE-NEXT: fcmpo 0, 2, 3
	; PC64LE-NEXT: xxlxor 3, 3, 3			; PC64LE-NEXT: xxlxor 3, 3, 3
	; PC64LE-NEXT: fcmpo 1, 1, 0			; PC64LE-NEXT: fcmpo 1, 1, 0
	; PC64LE-NEXT: crand 20, 6, 0			; PC64LE-NEXT: fcmpo 5, 1, 0
	; PC64LE-NEXT: crandc 21, 4, 6			; PC64LE-NEXT: fcmpo 6, 1, 0
	; PC64LE-NEXT: cror 20, 21, 20			; PC64LE-NEXT: crand 28, 6, 0
				; PC64LE-NEXT: crandc 20, 24, 22
				; PC64LE-NEXT: cror 20, 20, 28
	; PC64LE-NEXT: isel 30, 0, 3, 20			; PC64LE-NEXT: isel 30, 0, 3, 20
	; PC64LE-NEXT: bc 12, 20, .LBB31_2			; PC64LE-NEXT: bc 12, 20, .LBB31_2
	; PC64LE-NEXT: # %bb.1: # %entry			; PC64LE-NEXT: # %bb.1: # %entry
	; PC64LE-NEXT: fmr 3, 0			; PC64LE-NEXT: fmr 3, 0
	; PC64LE-NEXT: .LBB31_2: # %entry			; PC64LE-NEXT: .LBB31_2: # %entry
	; PC64LE-NEXT: xxlxor 4, 4, 4			; PC64LE-NEXT: xxlxor 4, 4, 4
	; PC64LE-NEXT: bl __gcc_qsub			; PC64LE-NEXT: bl __gcc_qsub
	; PC64LE-NEXT: nop			; PC64LE-NEXT: nop
	Show All 20 Lines
	; PC64LE9-NEXT: addis 3, 2, .LCPI31_0@toc@ha			; PC64LE9-NEXT: addis 3, 2, .LCPI31_0@toc@ha
	; PC64LE9-NEXT: xxlxor 3, 3, 3			; PC64LE9-NEXT: xxlxor 3, 3, 3
	; PC64LE9-NEXT: lfs 0, .LCPI31_0@toc@l(3)			; PC64LE9-NEXT: lfs 0, .LCPI31_0@toc@l(3)
	; PC64LE9-NEXT: fcmpo 1, 2, 3			; PC64LE9-NEXT: fcmpo 1, 2, 3
	; PC64LE9-NEXT: lis 3, -32768			; PC64LE9-NEXT: lis 3, -32768
	; PC64LE9-NEXT: fcmpo 0, 1, 0			; PC64LE9-NEXT: fcmpo 0, 1, 0
	; PC64LE9-NEXT: xxlxor 3, 3, 3			; PC64LE9-NEXT: xxlxor 3, 3, 3
	; PC64LE9-NEXT: crand 20, 2, 4			; PC64LE9-NEXT: crand 20, 2, 4
	; PC64LE9-NEXT: crandc 21, 0, 2			; PC64LE9-NEXT: fcmpo 0, 1, 0
				; PC64LE9-NEXT: fcmpo 1, 1, 0
				; PC64LE9-NEXT: crandc 21, 4, 2
	; PC64LE9-NEXT: cror 20, 21, 20			; PC64LE9-NEXT: cror 20, 21, 20
	; PC64LE9-NEXT: isel 30, 0, 3, 20			; PC64LE9-NEXT: isel 30, 0, 3, 20
	; PC64LE9-NEXT: bc 12, 20, .LBB31_2			; PC64LE9-NEXT: bc 12, 20, .LBB31_2
	; PC64LE9-NEXT: # %bb.1: # %entry			; PC64LE9-NEXT: # %bb.1: # %entry
	; PC64LE9-NEXT: fmr 3, 0			; PC64LE9-NEXT: fmr 3, 0
	; PC64LE9-NEXT: .LBB31_2: # %entry			; PC64LE9-NEXT: .LBB31_2: # %entry
	; PC64LE9-NEXT: xxlxor 4, 4, 4			; PC64LE9-NEXT: xxlxor 4, 4, 4
	; PC64LE9-NEXT: bl __gcc_qsub			; PC64LE9-NEXT: bl __gcc_qsub
	Show All 19 Lines
	; PC64-NEXT: mfcr 12			; PC64-NEXT: mfcr 12
	; PC64-NEXT: stw 12, 8(1)			; PC64-NEXT: stw 12, 8(1)
	; PC64-NEXT: stdu 1, -128(1)			; PC64-NEXT: stdu 1, -128(1)
	; PC64-NEXT: addis 3, 2, .LCPI31_0@toc@ha			; PC64-NEXT: addis 3, 2, .LCPI31_0@toc@ha
	; PC64-NEXT: lfs 0, .LCPI31_0@toc@l(3)			; PC64-NEXT: lfs 0, .LCPI31_0@toc@l(3)
	; PC64-NEXT: addis 3, 2, .LCPI31_1@toc@ha			; PC64-NEXT: addis 3, 2, .LCPI31_1@toc@ha
	; PC64-NEXT: lfs 4, .LCPI31_1@toc@l(3)			; PC64-NEXT: lfs 4, .LCPI31_1@toc@l(3)
	; PC64-NEXT: fcmpo 0, 1, 0			; PC64-NEXT: fcmpo 0, 1, 0
	; PC64-NEXT: crandc 21, 0, 2
	; PC64-NEXT: fcmpo 1, 2, 4			; PC64-NEXT: fcmpo 1, 2, 4
	; PC64-NEXT: crand 20, 2, 4			; PC64-NEXT: crand 20, 2, 4
				; PC64-NEXT: fcmpo 0, 1, 0
				; PC64-NEXT: fcmpo 1, 1, 0
				; PC64-NEXT: crandc 21, 4, 2
	; PC64-NEXT: cror 8, 21, 20			; PC64-NEXT: cror 8, 21, 20
	; PC64-NEXT: fmr 3, 4			; PC64-NEXT: fmr 3, 4
	; PC64-NEXT: bc 12, 8, .LBB31_2			; PC64-NEXT: bc 12, 8, .LBB31_2
	; PC64-NEXT: # %bb.1: # %entry			; PC64-NEXT: # %bb.1: # %entry
	; PC64-NEXT: fmr 3, 0			; PC64-NEXT: fmr 3, 0
	; PC64-NEXT: .LBB31_2: # %entry			; PC64-NEXT: .LBB31_2: # %entry
	; PC64-NEXT: bl __gcc_qsub			; PC64-NEXT: bl __gcc_qsub
	; PC64-NEXT: nop			; PC64-NEXT: nop
	▲ Show 20 Lines • Show All 729 Lines • Show Last 20 Lines

llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll

Show First 20 Lines • Show All 5,216 Lines • ▼ Show 20 Lines	%max = call <3 x double> @llvm.experimental.constrained.maxnum.v3f64(
<3 x double> %y,		<3 x double> %y,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x double> %max		ret <3 x double> %max
}		}

define <4 x double> @constrained_vector_maxnum_v4f64(<4 x double> %x, <4 x double> %y) #0 {		define <4 x double> @constrained_vector_maxnum_v4f64(<4 x double> %x, <4 x double> %y) #0 {
; PC64LE-LABEL: constrained_vector_maxnum_v4f64:		; PC64LE-LABEL: constrained_vector_maxnum_v4f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: xvmaxdp 34, 34, 36
; PC64LE-NEXT: xvmaxdp 35, 35, 37		; PC64LE-NEXT: xvmaxdp 35, 35, 37
		; PC64LE-NEXT: xvmaxdp 34, 34, 36
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_maxnum_v4f64:		; PC64LE9-LABEL: constrained_vector_maxnum_v4f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: xvmaxdp 34, 34, 36
; PC64LE9-NEXT: xvmaxdp 35, 35, 37		; PC64LE9-NEXT: xvmaxdp 35, 35, 37
		; PC64LE9-NEXT: xvmaxdp 34, 34, 36
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%max = call <4 x double> @llvm.experimental.constrained.maxnum.v4f64(		%max = call <4 x double> @llvm.experimental.constrained.maxnum.v4f64(
<4 x double> %x,		<4 x double> %x,
<4 x double> %y,		<4 x double> %y,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <4 x double> %max		ret <4 x double> %max
}		}
▲ Show 20 Lines • Show All 217 Lines • ▼ Show 20 Lines	%min = call <3 x double> @llvm.experimental.constrained.minnum.v3f64(
<3 x double> %y,		<3 x double> %y,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x double> %min		ret <3 x double> %min
}		}

define <4 x double> @constrained_vector_minnum_v4f64(<4 x double> %x, <4 x double> %y) #0 {		define <4 x double> @constrained_vector_minnum_v4f64(<4 x double> %x, <4 x double> %y) #0 {
; PC64LE-LABEL: constrained_vector_minnum_v4f64:		; PC64LE-LABEL: constrained_vector_minnum_v4f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: xvmindp 34, 34, 36
; PC64LE-NEXT: xvmindp 35, 35, 37		; PC64LE-NEXT: xvmindp 35, 35, 37
		; PC64LE-NEXT: xvmindp 34, 34, 36
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_minnum_v4f64:		; PC64LE9-LABEL: constrained_vector_minnum_v4f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: xvmindp 34, 34, 36
; PC64LE9-NEXT: xvmindp 35, 35, 37		; PC64LE9-NEXT: xvmindp 35, 35, 37
		; PC64LE9-NEXT: xvmindp 34, 34, 36
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%min = call <4 x double> @llvm.experimental.constrained.minnum.v4f64(		%min = call <4 x double> @llvm.experimental.constrained.minnum.v4f64(
<4 x double> %x,		<4 x double> %x,
<4 x double> %y,		<4 x double> %y,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <4 x double> %min		ret <4 x double> %min
}		}
▲ Show 20 Lines • Show All 1,890 Lines • ▼ Show 20 Lines	%result = call <2 x float>
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
ret <2 x float> %result		ret <2 x float> %result
}		}

define <3 x double> @constrained_vector_sitofp_v3f64_v3i32(<3 x i32> %x) #0 {		define <3 x double> @constrained_vector_sitofp_v3f64_v3i32(<3 x i32> %x) #0 {
; PC64LE-LABEL: constrained_vector_sitofp_v3f64_v3i32:		; PC64LE-LABEL: constrained_vector_sitofp_v3f64_v3i32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: xxswapd 0, 34		; PC64LE-NEXT: xxsldwi 0, 34, 34, 1
; PC64LE-NEXT: xxsldwi 1, 34, 34, 1		; PC64LE-NEXT: xxswapd 1, 34
; PC64LE-NEXT: mfvsrwz 3, 34		; PC64LE-NEXT: mfvsrwz 3, 34
; PC64LE-NEXT: mtfprwa 3, 3		; PC64LE-NEXT: mffprwz 4, 0
; PC64LE-NEXT: mffprwz 3, 0
; PC64LE-NEXT: mffprwz 4, 1
; PC64LE-NEXT: mtfprwa 0, 3		; PC64LE-NEXT: mtfprwa 0, 3
		; PC64LE-NEXT: mffprwz 3, 1
; PC64LE-NEXT: mtfprwa 2, 4		; PC64LE-NEXT: mtfprwa 2, 4
; PC64LE-NEXT: xscvsxddp 1, 0		; PC64LE-NEXT: xscvsxddp 3, 0
		; PC64LE-NEXT: mtfprwa 0, 3
; PC64LE-NEXT: xscvsxddp 2, 2		; PC64LE-NEXT: xscvsxddp 2, 2
; PC64LE-NEXT: xscvsxddp 3, 3		; PC64LE-NEXT: xscvsxddp 1, 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_sitofp_v3f64_v3i32:		; PC64LE9-LABEL: constrained_vector_sitofp_v3f64_v3i32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: li 3, 0		; PC64LE9-NEXT: li 3, 4
; PC64LE9-NEXT: vextuwrx 3, 3, 2		; PC64LE9-NEXT: vextuwrx 3, 3, 2
; PC64LE9-NEXT: mtfprwa 0, 3		; PC64LE9-NEXT: mtfprwa 0, 3
; PC64LE9-NEXT: li 3, 4		; PC64LE9-NEXT: li 3, 0
; PC64LE9-NEXT: vextuwrx 3, 3, 2		; PC64LE9-NEXT: vextuwrx 3, 3, 2
; PC64LE9-NEXT: xscvsxddp 1, 0		; PC64LE9-NEXT: xscvsxddp 2, 0
; PC64LE9-NEXT: mtfprwa 0, 3		; PC64LE9-NEXT: mtfprwa 0, 3
; PC64LE9-NEXT: mfvsrwz 3, 34		; PC64LE9-NEXT: mfvsrwz 3, 34
; PC64LE9-NEXT: xscvsxddp 2, 0		; PC64LE9-NEXT: xscvsxddp 1, 0
; PC64LE9-NEXT: mtfprwa 0, 3		; PC64LE9-NEXT: mtfprwa 0, 3
; PC64LE9-NEXT: xscvsxddp 3, 0		; PC64LE9-NEXT: xscvsxddp 3, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%result = call <3 x double>		%result = call <3 x double>
@llvm.experimental.constrained.sitofp.v3f64.v3i32(<3 x i32> %x,		@llvm.experimental.constrained.sitofp.v3f64.v3i32(<3 x i32> %x,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines	%result = call <3 x float>
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
ret <3 x float> %result		ret <3 x float> %result
}		}

define <3 x double> @constrained_vector_sitofp_v3f64_v3i64(<3 x i64> %x) #0 {		define <3 x double> @constrained_vector_sitofp_v3f64_v3i64(<3 x i64> %x) #0 {
; PC64LE-LABEL: constrained_vector_sitofp_v3f64_v3i64:		; PC64LE-LABEL: constrained_vector_sitofp_v3f64_v3i64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: mtfprd 0, 3		; PC64LE-NEXT: mtfprd 0, 5
; PC64LE-NEXT: mtfprd 2, 4		; PC64LE-NEXT: mtfprd 1, 4
; PC64LE-NEXT: mtfprd 3, 5		; PC64LE-NEXT: mtfprd 4, 3
; PC64LE-NEXT: xscvsxddp 1, 0		; PC64LE-NEXT: xscvsxddp 3, 0
; PC64LE-NEXT: xscvsxddp 2, 2		; PC64LE-NEXT: xscvsxddp 2, 1
; PC64LE-NEXT: xscvsxddp 3, 3		; PC64LE-NEXT: xscvsxddp 1, 4
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_sitofp_v3f64_v3i64:		; PC64LE9-LABEL: constrained_vector_sitofp_v3f64_v3i64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: mtfprd 0, 3
; PC64LE9-NEXT: xscvsxddp 1, 0
; PC64LE9-NEXT: mtfprd 0, 4
; PC64LE9-NEXT: xscvsxddp 2, 0
; PC64LE9-NEXT: mtfprd 0, 5		; PC64LE9-NEXT: mtfprd 0, 5
; PC64LE9-NEXT: xscvsxddp 3, 0		; PC64LE9-NEXT: xscvsxddp 3, 0
		; PC64LE9-NEXT: mtfprd 0, 4
		; PC64LE9-NEXT: xscvsxddp 2, 0
		; PC64LE9-NEXT: mtfprd 0, 3
		; PC64LE9-NEXT: xscvsxddp 1, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%result = call <3 x double>		%result = call <3 x double>
@llvm.experimental.constrained.sitofp.v3f64.v3i64(<3 x i64> %x,		@llvm.experimental.constrained.sitofp.v3f64.v3i64(<3 x i64> %x,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
ret <3 x double> %result		ret <3 x double> %result
}		}
▲ Show 20 Lines • Show All 439 Lines • ▼ Show 20 Lines	%result = call <2 x float>
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
ret <2 x float> %result		ret <2 x float> %result
}		}

define <3 x double> @constrained_vector_uitofp_v3f64_v3i32(<3 x i32> %x) #0 {		define <3 x double> @constrained_vector_uitofp_v3f64_v3i32(<3 x i32> %x) #0 {
; PC64LE-LABEL: constrained_vector_uitofp_v3f64_v3i32:		; PC64LE-LABEL: constrained_vector_uitofp_v3f64_v3i32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: xxswapd 0, 34		; PC64LE-NEXT: xxsldwi 0, 34, 34, 1
; PC64LE-NEXT: xxsldwi 1, 34, 34, 1		; PC64LE-NEXT: xxswapd 1, 34
; PC64LE-NEXT: mfvsrwz 3, 34		; PC64LE-NEXT: mfvsrwz 3, 34
; PC64LE-NEXT: mtfprwz 3, 3		; PC64LE-NEXT: mffprwz 4, 0
; PC64LE-NEXT: mffprwz 3, 0
; PC64LE-NEXT: mffprwz 4, 1
; PC64LE-NEXT: mtfprwz 0, 3		; PC64LE-NEXT: mtfprwz 0, 3
		; PC64LE-NEXT: mffprwz 3, 1
; PC64LE-NEXT: mtfprwz 2, 4		; PC64LE-NEXT: mtfprwz 2, 4
; PC64LE-NEXT: xscvuxddp 1, 0		; PC64LE-NEXT: xscvuxddp 3, 0
		; PC64LE-NEXT: mtfprwz 0, 3
; PC64LE-NEXT: xscvuxddp 2, 2		; PC64LE-NEXT: xscvuxddp 2, 2
; PC64LE-NEXT: xscvuxddp 3, 3		; PC64LE-NEXT: xscvuxddp 1, 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_uitofp_v3f64_v3i32:		; PC64LE9-LABEL: constrained_vector_uitofp_v3f64_v3i32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: li 3, 0		; PC64LE9-NEXT: li 3, 4
; PC64LE9-NEXT: vextuwrx 3, 3, 2		; PC64LE9-NEXT: vextuwrx 3, 3, 2
; PC64LE9-NEXT: mtfprwz 0, 3		; PC64LE9-NEXT: mtfprwz 0, 3
; PC64LE9-NEXT: li 3, 4		; PC64LE9-NEXT: li 3, 0
; PC64LE9-NEXT: vextuwrx 3, 3, 2		; PC64LE9-NEXT: vextuwrx 3, 3, 2
; PC64LE9-NEXT: xscvuxddp 1, 0		; PC64LE9-NEXT: xscvuxddp 2, 0
; PC64LE9-NEXT: mtfprwz 0, 3		; PC64LE9-NEXT: mtfprwz 0, 3
; PC64LE9-NEXT: mfvsrwz 3, 34		; PC64LE9-NEXT: mfvsrwz 3, 34
; PC64LE9-NEXT: xscvuxddp 2, 0		; PC64LE9-NEXT: xscvuxddp 1, 0
; PC64LE9-NEXT: mtfprwz 0, 3		; PC64LE9-NEXT: mtfprwz 0, 3
; PC64LE9-NEXT: xscvuxddp 3, 0		; PC64LE9-NEXT: xscvuxddp 3, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%result = call <3 x double>		%result = call <3 x double>
@llvm.experimental.constrained.uitofp.v3f64.v3i32(<3 x i32> %x,		@llvm.experimental.constrained.uitofp.v3f64.v3i32(<3 x i32> %x,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines	%result = call <3 x float>
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
ret <3 x float> %result		ret <3 x float> %result
}		}

define <3 x double> @constrained_vector_uitofp_v3f64_v3i64(<3 x i64> %x) #0 {		define <3 x double> @constrained_vector_uitofp_v3f64_v3i64(<3 x i64> %x) #0 {
; PC64LE-LABEL: constrained_vector_uitofp_v3f64_v3i64:		; PC64LE-LABEL: constrained_vector_uitofp_v3f64_v3i64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: mtfprd 0, 3		; PC64LE-NEXT: mtfprd 0, 5
; PC64LE-NEXT: mtfprd 2, 4		; PC64LE-NEXT: mtfprd 1, 4
; PC64LE-NEXT: mtfprd 3, 5		; PC64LE-NEXT: mtfprd 4, 3
; PC64LE-NEXT: xscvuxddp 1, 0		; PC64LE-NEXT: xscvuxddp 3, 0
; PC64LE-NEXT: xscvuxddp 2, 2		; PC64LE-NEXT: xscvuxddp 2, 1
; PC64LE-NEXT: xscvuxddp 3, 3		; PC64LE-NEXT: xscvuxddp 1, 4
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_uitofp_v3f64_v3i64:		; PC64LE9-LABEL: constrained_vector_uitofp_v3f64_v3i64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: mtfprd 0, 3
; PC64LE9-NEXT: xscvuxddp 1, 0
; PC64LE9-NEXT: mtfprd 0, 4
; PC64LE9-NEXT: xscvuxddp 2, 0
; PC64LE9-NEXT: mtfprd 0, 5		; PC64LE9-NEXT: mtfprd 0, 5
; PC64LE9-NEXT: xscvuxddp 3, 0		; PC64LE9-NEXT: xscvuxddp 3, 0
		; PC64LE9-NEXT: mtfprd 0, 4
		; PC64LE9-NEXT: xscvuxddp 2, 0
		; PC64LE9-NEXT: mtfprd 0, 3
		; PC64LE9-NEXT: xscvuxddp 1, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%result = call <3 x double>		%result = call <3 x double>
@llvm.experimental.constrained.uitofp.v3f64.v3i64(<3 x i64> %x,		@llvm.experimental.constrained.uitofp.v3f64.v3i64(<3 x i64> %x,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict") #0		metadata !"fpexcept.strict") #0
ret <3 x double> %result		ret <3 x double> %result
}		}
▲ Show 20 Lines • Show All 409 Lines • Show Last 20 Lines

llvm/test/CodeGen/SystemZ/vector-constrained-fp-intrinsics.ll

	Show First 20 Lines • Show All 4,290 Lines • ▼ Show 20 Lines
	; S390X-NEXT: fidbr %f0, 0, %f3			; S390X-NEXT: fidbr %f0, 0, %f3
	; S390X-NEXT: br %r14			; S390X-NEXT: br %r14
	;			;
	; SZ13-LABEL: constrained_vector_rint_v4f64:			; SZ13-LABEL: constrained_vector_rint_v4f64:
	; SZ13: # %bb.0: # %entry			; SZ13: # %bb.0: # %entry
	; SZ13-NEXT: larl %r1, .LCPI79_0			; SZ13-NEXT: larl %r1, .LCPI79_0
	; SZ13-NEXT: vl %v0, 0(%r1), 3			; SZ13-NEXT: vl %v0, 0(%r1), 3
	; SZ13-NEXT: larl %r1, .LCPI79_1			; SZ13-NEXT: larl %r1, .LCPI79_1
	; SZ13-NEXT: vfidb %v24, %v0, 0, 0
	; SZ13-NEXT: vl %v0, 0(%r1), 3
	; SZ13-NEXT: vfidb %v26, %v0, 0, 0			; SZ13-NEXT: vfidb %v26, %v0, 0, 0
				; SZ13-NEXT: vl %v0, 0(%r1), 3
				; SZ13-NEXT: vfidb %v24, %v0, 0, 0
				uweigand (inline comment): Not sure why this patch causes the two output registers to be computed in reverse order now, but either order should be fine. (And now the order in vectorized code matches the order in scalar code ...)
	; SZ13-NEXT: br %r14			; SZ13-NEXT: br %r14
	entry:			entry:
	%rint = call <4 x double> @llvm.experimental.constrained.rint.v4f64(			%rint = call <4 x double> @llvm.experimental.constrained.rint.v4f64(
	<4 x double> <double 42.1, double 42.2,			<4 x double> <double 42.1, double 42.2,
	double 42.3, double 42.4>,			double 42.3, double 42.4>,
	metadata !"round.dynamic",			metadata !"round.dynamic",
	metadata !"fpexcept.strict") #0			metadata !"fpexcept.strict") #0
	ret <4 x double> %rint			ret <4 x double> %rint
	▲ Show 20 Lines • Show All 215 Lines • ▼ Show 20 Lines
	; S390X-NEXT: lmg %r14, %r15, 296(%r15)			; S390X-NEXT: lmg %r14, %r15, 296(%r15)
	; S390X-NEXT: br %r14			; S390X-NEXT: br %r14
	;			;
	; SZ13-LABEL: constrained_vector_nearbyint_v4f64:			; SZ13-LABEL: constrained_vector_nearbyint_v4f64:
	; SZ13: # %bb.0: # %entry			; SZ13: # %bb.0: # %entry
	; SZ13-NEXT: larl %r1, .LCPI84_0			; SZ13-NEXT: larl %r1, .LCPI84_0
	; SZ13-NEXT: vl %v0, 0(%r1), 3			; SZ13-NEXT: vl %v0, 0(%r1), 3
	; SZ13-NEXT: larl %r1, .LCPI84_1			; SZ13-NEXT: larl %r1, .LCPI84_1
	; SZ13-NEXT: vfidb %v24, %v0, 4, 0
	; SZ13-NEXT: vl %v0, 0(%r1), 3
	; SZ13-NEXT: vfidb %v26, %v0, 4, 0			; SZ13-NEXT: vfidb %v26, %v0, 4, 0
				; SZ13-NEXT: vl %v0, 0(%r1), 3
				; SZ13-NEXT: vfidb %v24, %v0, 4, 0
	; SZ13-NEXT: br %r14			; SZ13-NEXT: br %r14
	entry:			entry:
	%nearby = call <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(			%nearby = call <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(
	<4 x double> <double 42.1, double 42.2,			<4 x double> <double 42.1, double 42.2,
	double 42.3, double 42.4>,			double 42.3, double 42.4>,
	metadata !"round.dynamic",			metadata !"round.dynamic",
	metadata !"fpexcept.strict") #0			metadata !"fpexcept.strict") #0
	ret <4 x double> %nearby			ret <4 x double> %nearby
	▲ Show 20 Lines • Show All 1,809 Lines • Show Last 20 Lines

Diff 434963