This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/lib/Target/PowerPC/
PPCISelLowering.h (1/2)
PPCISelLowering.cpp (22/39)
PPCInstrSPE.td
PPCInstrVSX.td (3/3)
llvm/test/CodeGen/PowerPC/
fp-strict-conv-f128.ll (2/2)
fp-strict-conv.ll (4/4)

Differential D81537

[PowerPC] Support constrained fp operation for scalar fptosi/fptoui
ClosedPublic

Authored by qiucf on Jun 10 2020, 12:27 AM.


Details

Reviewers

steven.zhang
nemanjai
kbarton
kpn
jsji
uweigand

Group Reviewers

Restricted Project

Commits

rG131b3b9ed4ef: [PowerPC] Support constrained scalar fptosi/fptoui

Summary

This patch adds support for constrained conversion operation (fptoui/fptosi) from f32/f64 to i32/i64.

Vector support will be done in following patches. For targets older than ISA 2.06, we need to make strict_fsetcc/strict_fsetccs work well first.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

qiucf created this revision.Jun 10 2020, 12:27 AM

Herald added a project: Restricted Project. Jun 10 2020, 12:27 AM

Herald added subscribers: llvm-commits, shchenz, hiraditya.

steven.zhang added inline comments.Jun 10 2020, 1:14 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8144	The parameter SDLoc is not needed. And can we rename the function to something like getFpNode()? You don't need "strict" in the function name, as it is already one of the function parameters.
8147	This is not good coding practice. Please use `Strict ? 1 : 0`, or look for a suitable API on SDValue.
8148	!Strict is not needed.
8200–8209	The logic in LowerFP_TO_INTForReuse and LowerFP_TO_INTDirectMove is nearly the same between lines 8168 and 8196. That is expected, as the only difference between the two is how the data is moved from FPR to GPR. So can we add another function to do the conversion? Something like:
LowerFP_TO_INTDirectMove:
V = convertToFp()
MFVSR V
LowerFP_TO_INTForReuse:
V = convertToFp()
Store V
Load V
8263	This could be something that we can improve later. We should mark it as legal instead of checking it here if I understand the intention correctly.
llvm/test/CodeGen/PowerPC/fp-strict-conv.ll
9	Add run for SPE target.
11	Add a test for fp128.

steven.zhang added inline comments.Jun 10 2020, 1:14 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8150	Please give the default initializer to avoid uninitialized local variable if llvm_unreachable is off.
llvm/test/CodeGen/PowerPC/fp-strict-conv.ll
172	Is this attribute needed?
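
To illustrate the default-initializer comment on line 8150 above, here is a minimal standalone sketch. The `ConvKind` enum, the opcode numbers, and `selectOpcode` are hypothetical and not taken from the patch; `SKETCH_UNREACHABLE` is an assumed stand-in for `llvm_unreachable` in a build where the marker does not stop execution.

```cpp
#include <cassert>

// Stand-in for llvm_unreachable in a build where it expands to nothing.
// (Assumption for illustration only; this is not the real LLVM macro.)
#define SKETCH_UNREACHABLE(msg) ((void)0)

// Hypothetical conversion kinds; "Other" models an unexpected input.
enum class ConvKind { SignedWord, UnsignedWord, SignedDWord, Other };

// Hypothetical opcode numbers, not real PPCISD values.
constexpr unsigned InvalidOpcode = 0;

unsigned selectOpcode(ConvKind K) {
  // Default initializer: the variable is never read while indeterminate,
  // even if the "unreachable" branch is actually taken.
  unsigned Opc = InvalidOpcode;
  switch (K) {
  case ConvKind::SignedWord:
    Opc = 1;
    break;
  case ConvKind::UnsignedWord:
    Opc = 2;
    break;
  case ConvKind::SignedDWord:
    Opc = 3;
    break;
  default:
    SKETCH_UNREACHABLE("unexpected conversion kind");
    break;
  }
  return Opc;
}
```

Without the initializer, the `default` path would return an indeterminate value in such a build.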

Harbormaster failed remote builds in B59746: Diff 269746!Jun 10 2020, 1:36 AM

Address Steven's comments.

qiucf added inline comments.Jun 10 2020, 8:07 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8148	If `Strict` is `false`, we can allow Chain to be null.
llvm/test/CodeGen/PowerPC/fp-strict-conv.ll
172	All function calls done in a function that uses constrained floating point intrinsics must have the strictfp attribute. Although output won't change if we remove this attr. It's better to keep it according to langref.

Remove redundant chain logic.

Harbormaster failed remote builds in B59912: Diff 270025!Jun 10 2020, 9:02 PM

steven.zhang added inline comments.Jun 10 2020, 10:18 PM

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
16	So, what is it if it is ppcfp128 ?

Harbormaster failed remote builds in B59914: Diff 270027!Jun 10 2020, 10:39 PM

qiucf added a child revision: D81669: [PowerPC] Support constrained fp operation for scalar sitofp/uitofp.Jun 11 2020, 9:48 AM

Some code style comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8253–8255	Don't use Tmp; use a meaningful name instead.
8254	Op.getValueType() ?

steven.zhang added inline comments.Jun 12 2020, 2:23 AM

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
4	Please specify option -enable-ppc-quad-precision to enable the quad precision support in powerpc.

qiucf mentioned this in D81818: [NFC] [PowerPC] Use shared method in FP_TO_INT and INT_TO_FP lowering.Jun 14 2020, 7:56 PM

Rebase after D81818 and add f128 support

qiucf added a parent revision: D81818: [NFC] [PowerPC] Use shared method in FP_TO_INT and INT_TO_FP lowering.Jun 14 2020, 10:32 PM

Harbormaster failed remote builds in B60259: Diff 270660!Jun 14 2020, 10:56 PM

steven.zhang added inline comments.Jun 15 2020, 1:57 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8145	The Strict parameter is not needed as you can check the value of the Chain to know it.
8174	IsStrict, IsSigned
llvm/lib/Target/PowerPC/PPCInstrVSX.td
3278	Please move them into a group that has truncating semantics. This is not a bitconvert.

Address some style-related comments

Harbormaster failed remote builds in B61196: Diff 272360!Jun 22 2020, 3:11 AM

steven.zhang added inline comments.Jun 22 2020, 3:52 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Don’t do this for spe target and remove the test for spe. Sorry about the back and forth.
8158	You don't need this assertion now.
8203	So, do we have problem if it is strict opcode in this code path?
8261	Move the assertion into convertFPToInt.
8428	ConvertIntToFP and ConvertFPToInt should have the same parameters.
8533	Please remove such kind of change as it is not part of your change.

nemanjai added subscribers: jhibbits, chmeee.Jun 24 2020, 5:39 AM

nemanjai added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	I would defer to @jhibbits @chmeee (not sure which of Justin's ID's is active) regarding SPE bits.
8150	This function appears to just take a node and produce a strict version of that node. It seems like there isn't anything target dependent about such an operation so it is very suspicious to me why this is in the PPC back end. If for some reason it has to be here, please explain why in a comment. If the only target specific part of this is the `STRICT_MFVSR` node, then at least the rest can be handled by target independent code, can't it?
8153	This seems dangerous to me. You are deciding whether to return a strict node based on whether a valid Chain is provided. I am personally against making decisions based on orthogonal concerns. If the caller wants a strict node, that should be explicit rather than this strange implicit contract of "If you want a strict node, provide a valid chain."
llvm/lib/Target/PowerPC/PPCISelLowering.h
435	Why? The instruction simply moves bits around. It does not cause any exceptions, it is not subject to rounding, etc. If this is necessary, it needs to be clear from the comment why.
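
The concern about the implicit "provide a chain to get a strict node" contract (line 8153 above) can be sketched in isolation. This is a toy model, not the SelectionDAG API: `ToyChain`, `ToyOpcode`, and both helper functions are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Toy stand-in for a chain value; a null pointer means "no chain".
struct ToyChain {
  int Id;
};

// Hypothetical opcode pair: a non-strict conversion and its strict twin.
enum class ToyOpcode { Convert, StrictConvert };

// Implicit contract: strictness is inferred from whether a chain was passed.
// A caller that happens to have a chain at hand silently gets a strict node.
ToyOpcode pickOpcodeImplicit(const ToyChain *Chain) {
  return Chain ? ToyOpcode::StrictConvert : ToyOpcode::Convert;
}

// Explicit contract: the caller states what it wants; the chain is only
// checked for consistency with that request.
ToyOpcode pickOpcodeExplicit(bool IsStrict, const ToyChain *Chain) {
  assert((!IsStrict || Chain != nullptr) && "strict node requires a chain");
  (void)Chain; // only used by the assertion above
  return IsStrict ? ToyOpcode::StrictConvert : ToyOpcode::Convert;
}
```

With the explicit form, a non-strict caller that passes a chain still gets the non-strict opcode, which is the orthogonality the reviewer asks for.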

jhibbits added inline comments.Jun 24 2020, 6:48 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	What are the semantic differences between STRICT_FP_TO_UINT and FP_TO_UINT? EFDCTUIZ/EFSCTUIZ and their signed counterparts, which we currently use for the FP_TO_{U,S}INT, saturate if they can't be represented as a 32-bit integer, and round toward zero always (the non-Z variants round via the current rounding mode).

uweigand added inline comments.Jun 24 2020, 7:31 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	FP_TO_UINT assumes the current rounding mode is default, and exception conditions can be ignored. With STRICT_FP_TO_UINT those assumptions no longer apply, so it would appear that those instructions you mention should not be used there.

uweigand added inline comments.Jun 25 2020, 8:43 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Ah -- my last comment was incorrect, sorry for any confusion. STRICT_FP_TO_UINT/SINT are in fact an exception to most "STRICT" operations in that they do not use the current rounding mode, but always round towards zero. (Following the C standard as well as the LLVM IR specification.) So for these operations the only difference between strict and non-strict variants is whether exception conditions can be ignored or not.

jhibbits added inline comments.Jun 25 2020, 12:10 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Ah, okay. So, if I understand correctly, EF{S,D}CT{S,U}I should be used for fp_to_{s,u}int, and the current 'Z' variants should be used for the strict_fp_to_*int.

uweigand added inline comments.Jun 26 2020, 5:50 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Hmm, if I'm reading the SPE_PEM correctly, I think the "Z" variants are in fact correct for both strict and non-strict variants: they round towards zero (which both variants do), and they handle exceptions (which the strict variant requires, while the non-strict variant doesn't care). The non-"Z" variants seem wrong either way since they use the current rounding mode, which is incorrect for both strict and non-strict variants.
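
The rounding point made in this thread (float-to-integer conversion truncates toward zero regardless of the dynamic rounding mode) can be observed from plain C++. The sketch below is only an analogy at the source level, not PowerPC or SPE code. It assumes a hosted platform where `std::fesetround(FE_UPWARD)` succeeds, and it uses `volatile` to discourage compile-time folding. The helper names are made up for this example.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Plain cast: C and C++ define this as truncation toward zero, independent
// of the current rounding mode (the behaviour fptosi also has).
long castToLong(double X) {
  volatile double V = X; // volatile: discourage constant folding
  return static_cast<long>(V);
}

// lrint: rounds according to the current rounding mode, which is what a
// conversion that honours the dynamic rounding mode would do.
long lrintToLong(double X) {
  volatile double V = X;
  return std::lrint(V);
}

// Runs F under the given rounding mode, then restores the previous mode.
// Returns false if the platform refused to set the mode.
template <typename Fn> bool withRounding(int Mode, Fn F) {
  const int Saved = std::fegetround();
  if (std::fesetround(Mode) != 0)
    return false;
  F();
  std::fesetround(Saved);
  return true;
}
```

Under `FE_UPWARD`, `castToLong(2.2)` still truncates, while `lrintToLong(2.2)` follows the rounding mode and rounds up.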

jhibbits added inline comments.Jun 26 2020, 2:35 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Thanks for that explanation @uweigand now I understand. So the change here looks fine to me.

steven.zhang added inline comments.Jun 28 2020, 6:53 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	The reason why I asked to remove the spe here is to split this patch into two, one for PowerPC and another one for spe which need some inputs from spe experts. Does it make sense ?

qiucf marked 9 inline comments as done.Jun 29 2020, 2:33 AM

qiucf added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8150	Thanks for the comments. The method (including some overloads) doesn't help much. So I removed it and wrote a simple helper method to get strict version of ppc-specific opcode. There's method `mutateStrictFPToFP` doing similar things but maybe not suitable here. I'll send an NFC to make opcode conversion a hook so that each target can benefit from it.
8203	Do you mean these PPC-specific opcodes are not strict? But the result is either load/store or direct moved. What we do here is to keep operands of value consistent. So changing these opcodes to strict may be unnecessary.
8428	FPToInt is round-then-move, while IntToFP is move-then-round. So when IntToFP we need extra information from original `Op` besides the moved `Src`.
llvm/lib/Target/PowerPC/PPCISelLowering.h
435	Because (1) this prevents it being combined somewhere unexpectedly; (2) all strict nodes have extra operand for their chains, so replacing original `strict_*` node with non-strict one will cause operands mismatch. I added necessary comments. Thanks.

Removed SPE logic from this revision.
Add some comments for strict nodes.
Removed getFPNode method.
Addressed other minor comments.

qiucf mentioned this in D82747: [PowerPC] Support constrained int/fp conversion in SPE targets.Jun 29 2020, 2:39 AM

Harbormaster failed remote builds in B62108: Diff 274020!Jun 29 2020, 2:40 AM

qiucf marked 2 inline comments as done.Jun 29 2020, 2:41 AM

qiucf added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	I split D82747 out from this. It's much clearer and independent from this.

qiucf mentioned this in D81669: [PowerPC] Support constrained fp operation for scalar sitofp/uitofp.Jun 29 2020, 2:42 AM

Reflect changes after strict conversion for SPE and enable-ppc-quad-precision's removal.

Harbormaster failed remote builds in B64649: Diff 278704!Jul 17 2020, 3:14 AM

Ping..

LGTM now. Please hold on for several days to see if @nemanjai or @uweigand have comments.

This revision is now accepted and ready to land.Aug 4 2020, 2:34 AM

This doesn't look correct. As far as I can see, none of the conversion functions were actually changed to handle strict operations. For one, you'll need strict variants of all the PowerPC-specific conversion operations, use them in all the conversion subroutines, and consistently track their chain nodes.

The patch only adds a strict variant of the direct move, which seems to me the only operation where actually a strict version is not required ...

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8230	This doesn't look right. The Chain produced by this strict node just vanishes, this cannot be correct.
8247	And you need strict versions of all these conversion operations, I'd assume.
8258	Nothing in the function actually handles strict nodes, that cannot be right.
8307	Why do we need a strict version of a plain move?
8312	Again, nothing in this function actually handles strict nodes ...

In D81537#2193216, @uweigand wrote:

This doesn't look correct. As far as I can see, none of the conversion functions were actually changed to handle strict operations. For one, you'll need strict variants of all the PowerPC-specific conversion operations, use them in all the conversion subroutines, and consistently track their chain nodes.

The patch only adds a strict variant of the direct move, which seems to me the only operation where actually a strict version is not required ...

Thanks for pointing them out! I have something unclear about chains:

(1) If a constrained operation is expanded into several FP nodes a-b-c, they should all have chain set to former operation (b's chain is a, c's chain is b) even if they have def relationship?

(2) In MachineInstr emitting after ISel, chains are identified just by operand type (countOperand), so some chains are not ignored and assert hit. Is this expected?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8307	Yes, a strict move doesn't look reasonable. But the original `strict_fptosi` node will be replaced by the result. So if directly return the move, operands will not match (no chain in `mfvsr`). Is there a better way here?

In D81537#2207249, @qiucf wrote:


(1) If a constrained operation is expanded into several FP nodes a-b-c, they should all have chain set to former operation (b's chain is a, c's chain is b) even if they have def relationship?

That may depend on the specific semantics of which of those nodes may or may not trap. In some cases, the original sequence may in fact not be valid at all for strict mode. But if it is, then they'll need to be chained up properly. If they have data dependencies, then it usually makes sense for the chain to follow that dependency. In other cases, there may be an option for more flexibility by allowing certain operations to be re-scheduled. In those cases you'd give the same input chain to all operations and collect all output chains via a TokenFactor.
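
The two chaining options described above can be sketched with a toy graph. This is not the SelectionDAG API: `ToyNode`, `ToyDAG`, `chainSerially`, and `chainInParallel` are hypothetical names, and the "TokenFactor" node here is just a label on a node that joins several chains.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy node: a name plus the indices of the nodes it is ordered after.
struct ToyNode {
  std::string Name;
  std::vector<int> ChainInputs;
};

struct ToyDAG {
  std::vector<ToyNode> Nodes;

  // Adds a node and returns its index, which doubles as its output chain.
  int add(std::string Name, std::vector<int> ChainInputs) {
    Nodes.push_back({std::move(Name), std::move(ChainInputs)});
    return static_cast<int>(Nodes.size()) - 1;
  }
};

// Option 1: the chain follows the data dependency. Each node consumes the
// output chain of the previous one; the last chain is the overall result.
int chainSerially(ToyDAG &G, int InChain, const std::vector<std::string> &Ops) {
  int Chain = InChain;
  for (const std::string &Op : Ops)
    Chain = G.add(Op, {Chain});
  return Chain;
}

// Option 2: every node gets the same input chain, so the nodes stay free to
// be reordered; a TokenFactor-like node collects all output chains.
int chainInParallel(ToyDAG &G, int InChain,
                    const std::vector<std::string> &Ops) {
  std::vector<int> OutChains;
  for (const std::string &Op : Ops)
    OutChains.push_back(G.add(Op, {InChain}));
  return G.add("TokenFactor", OutChains);
}
```

In both cases every output chain is consumed by something, which matches the later remark that a chain should never simply be ignored.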

(2) In MachineInstr emitting after ISel, chains are identified just by operand type (countOperand), so some chains are not ignored and assert hit. Is this expected?

I'm not sure I understand what specific case you're referring to. But in any case, a chain should *never* be simply ignored, that would always be a bug.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8307	So this gets expanded to a PPCISD::FCTI... variant (inside convertFPToInt) followed by the MFVSR. Now, with proper chain handling, the input chain of the strict_fptosi is consumed by a strict variant of FCTI..., and the output chain of that STRICT_FCTI... is then the correct output chain for the whole operation. The data output (only) of the STRICT_FCTI... acts then as the input of the MFVSR, and the output of the MFVSR is the correct value output of the whole operation. So in short, you need to replace
(out-val, out-chain) = strict_fptosi (in-val, in-chain)
by
(tmp-val, out-chain) = STRICT_FCTI... (in-val, in-chain)
out-val = MFVSR (tmp-val)
This probably will require some ReplaceAllUses... instead of just returning a result, as is already done elsewhere with chain output instructions.
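
The (out-val, out-chain) replacement described in the comment above can be modeled with a toy graph. This is a sketch under loud assumptions: it does not use the SelectionDAG API, `ToyValue`, `ToyNode`, `ToyDAG`, and `lowerStrictFPToInt` are hypothetical, and the opcode strings are labels only.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A value is "result number Result of node Node" (0 = data, 1 = chain).
struct ToyValue {
  int Node;
  int Result;
};

struct ToyNode {
  std::string Opcode;
  std::vector<ToyValue> Operands;
};

struct ToyDAG {
  std::vector<ToyNode> Nodes;

  // Adds a node and returns its index.
  int add(std::string Opcode, std::vector<ToyValue> Operands) {
    Nodes.push_back({std::move(Opcode), std::move(Operands)});
    return static_cast<int>(Nodes.size()) - 1;
  }
};

// Returns {out-val, out-chain} for a strict FP-to-int conversion:
//   (tmp-val, out-chain) = STRICT_FCTI (in-val, in-chain)
//   out-val              = MFVSR (tmp-val)
std::pair<ToyValue, ToyValue> lowerStrictFPToInt(ToyDAG &G, ToyValue InVal,
                                                 ToyValue InChain) {
  // The strict conversion consumes the incoming chain and produces a new one.
  int Conv = G.add("STRICT_FCTI", {InVal, InChain});
  // The move is chain-free: it only consumes the data result of Conv.
  int Mov = G.add("MFVSR", {ToyValue{Conv, 0}});
  // Value comes from the move, chain comes from the conversion.
  return {ToyValue{Mov, 0}, ToyValue{Conv, 1}};
}
```

The design point is that the plain move needs no strict variant: the chain of the whole operation is carried by the conversion node alone.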

Thanks for the detailed explanation!

Remove strict mfvsr
Update tests
Add strict fc*
Add chains to some expanded operation

uweigand added inline comments.Aug 13 2020, 5:00 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8068	This still looks wrong to me. I think you should replace the chain with the one from Conv, and return the value (i.e. Mov). Something like:
SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
DAG.ReplaceAllUsesOfValueWith(SDValue(Op, 1), Conv.getValue(1));
return Mov;
(Actually, returning the value is then the same in strict and non-strict cases, so this can be merged.) Or, another possible approach would be to not use ReplaceAllUses at all but reconstruct the multi-output value via DAG.getMergeValues. Something like:
SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
return DAG.getMergeValues({Mov, Conv.getValue(1)}, dl);
llvm/lib/Target/PowerPC/PPCInstrVSX.td
3772	This also doesn't look quite correct. The XSCVQP... instructions are not (yet?) marked as mayRaiseFPException, instead they're marked as hasSideEffects. This means that the exception flag is probably not going to be automatically transferred over to the MI level. I think if the instructions are changed to set mayRaiseFPException, that should work correctly. But it would be best to have a test case that validates that the "nofpexcept" marker is transferred depending on the value of the "fpexcept." metadata in the strict intrinsic (in LLVM IR).
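
As a rough illustration of the dependency mentioned above, the sketch below maps the exception-behavior metadata string of a constrained intrinsic to a yes/no decision about a "nofpexcept"-style marker. It is a toy under stated assumptions, not LLVM code: the enum and both functions are hypothetical, and the rule that only `fpexcept.ignore` permits the marker is an assumption about the intended behaviour rather than something stated in this review.

```cpp
#include <cassert>
#include <string>

// Toy mirror of the three exception-behavior metadata values.
enum class ToyExceptionBehavior { Ignore, MayTrap, Strict };

// Parses the metadata string; returns false for anything unrecognized.
bool parseExceptionBehavior(const std::string &MD, ToyExceptionBehavior &Out) {
  if (MD == "fpexcept.ignore") {
    Out = ToyExceptionBehavior::Ignore;
    return true;
  }
  if (MD == "fpexcept.maytrap") {
    Out = ToyExceptionBehavior::MayTrap;
    return true;
  }
  if (MD == "fpexcept.strict") {
    Out = ToyExceptionBehavior::Strict;
    return true;
  }
  return false;
}

// Assumption: the marker is only appropriate when the intrinsic states that
// FP exceptions can be ignored.
bool shouldMarkNoFPExcept(ToyExceptionBehavior EB) {
  return EB == ToyExceptionBehavior::Ignore;
}
```

A regression test of the kind requested would check both directions: the marker present for the ignore case and absent for the strict case.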

Return a merge_value for round and move.
Set fp exception bit for f128 round instructions.

llvm/lib/Target/PowerPC/PPCInstrVSX.td
3772	Thanks for the reminder. The FP exception bits in PPC instruction definition files need to be carefully re-examined with more tests..

This LGTM now. Thanks!

Closed by commit rG131b3b9ed4ef: [PowerPC] Support constrained scalar fptosi/fptoui (authored by qiucf). Aug 19 2020, 10:35 PM

This revision was automatically updated to reflect the committed changes.

qiucf added a commit: rG131b3b9ed4ef: [PowerPC] Support constrained scalar fptosi/fptoui.

qiucf mentioned this in D71287: [PowerPC] Use fcti[dw] instructions in additional cases.Dec 29 2020, 7:30 PM

Revision Contents

Path

Size

llvm/

lib/

Target/

PowerPC/

3 lines

131 lines

8 lines

6 lines

test/

CodeGen/

PowerPC/

fp-strict-conv-f128.ll

263 lines

fp-strict-conv.ll

261 lines

Diff 270025

llvm/lib/Target/PowerPC/PPCISelLowering.h

(425 lines not shown; inside enum NodeType : unsigned {)
     /// lower (IDX=1) half of v4f32 to v2f64.
     FP_EXTEND_HALF,

     /// MAT_PCREL_ADDR = Materialize a PC Relative address. This can be done
     /// either through an add like PADDI or through a PC Relative load like
     /// PLD.
     MAT_PCREL_ADDR,

+    /// Constrained direct move from VSR instruction.
+    STRICT_MFVSR = ISD::FIRST_TARGET_STRICTFP_OPCODE,
     [inline comments on this line: see the thread for line 435 in the Event Timeline]

     /// CHAIN = STBRX CHAIN, GPRC, Ptr, Type - This is a
     /// byte-swapping store instruction. It byte-swaps the low "Type" bits of
     /// the GPRC input, then stores it through Ptr. Type can be either i16 or
     /// i32.
     STBRX = ISD::FIRST_TARGET_MEMORY_OPCODE,

     /// GPRC, CHAIN = LBRX CHAIN, Ptr, Type - This is a
     /// byte-swapping load instruction. It loads "Type" bits, byte swaps it,
(826 lines not shown)

llvm/lib/Target/PowerPC/PPCISelLowering.cpp


(417 lines not shown)
   // PowerPC does not have BRCOND which requires SetCC
   if (!Subtarget.useCRBits())
     setOperationAction(ISD::BRCOND, MVT::Other, Expand);

   setOperationAction(ISD::BR_JT, MVT::Other, Expand);

   if (Subtarget.hasSPE()) {
     // SPE has built-in conversions
+    setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Legal);
     setOperationAction(ISD::FP_TO_SINT, MVT::i32, Legal);
     setOperationAction(ISD::SINT_TO_FP, MVT::i32, Legal);
     setOperationAction(ISD::UINT_TO_FP, MVT::i32, Legal);
   } else {
     // PowerPC turns FP_TO_SINT into FCTIWZ and some load/stores.
     setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
+    setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Custom);

     // PowerPC does not have [U|S]INT_TO_FP
     setOperationAction(ISD::SINT_TO_FP, MVT::i32, Expand);
     setOperationAction(ISD::UINT_TO_FP, MVT::i32, Expand);
   }

   if (Subtarget.hasDirectMove() && isPPC64) {
     setOperationAction(ISD::BITCAST, MVT::f32, Legal);
(115 lines not shown)
   setCondCodeAction(ISD::SETOGE, MVT::f64, Expand);
   setCondCodeAction(ISD::SETOLE, MVT::f32, Expand);
   setCondCodeAction(ISD::SETOLE, MVT::f64, Expand);
   setCondCodeAction(ISD::SETONE, MVT::f32, Expand);
   setCondCodeAction(ISD::SETONE, MVT::f64, Expand);

   if (Subtarget.has64BitSupport()) {
     // They also have instructions for converting between i64 and fp.
+    setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Custom);
+    setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Expand);
     setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);
     setOperationAction(ISD::FP_TO_UINT, MVT::i64, Expand);
     setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);
     setOperationAction(ISD::UINT_TO_FP, MVT::i64, Expand);
     // This is just the low 32 bits of a (signed) fp->i64 conversion.
     // We cannot do this with Promote because i64 is not a legal type.
+    setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Custom);
     setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);

     if (Subtarget.hasLFIWAX() || Subtarget.isPPC64())
       setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);
   } else {
     // PowerPC does not have FP_TO_UINT on 32-bit implementations.
-    if (Subtarget.hasSPE())
+    if (Subtarget.hasSPE()) {
+      setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Legal);
       setOperationAction(ISD::FP_TO_UINT, MVT::i32, Legal);
       [inline comments on this line: see the thread for line 582 in the Event Timeline]
-    else
+    } else {
+      setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Expand);
       setOperationAction(ISD::FP_TO_UINT, MVT::i32, Expand);
+    }
   }

   // With the instructions enabled under FPCVT, we can do everything.
   if (Subtarget.hasFPCVT()) {
     if (Subtarget.has64BitSupport()) {
+      setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Custom);
+      setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Custom);
       setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);
       setOperationAction(ISD::FP_TO_UINT, MVT::i64, Custom);
       setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);
       setOperationAction(ISD::UINT_TO_FP, MVT::i64, Custom);
     }

+    setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Custom);
+    setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Custom);
     setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
     setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);
     setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);
     setOperationAction(ISD::UINT_TO_FP, MVT::i32, Custom);
   }

   if (Subtarget.use64BitRegs()) {
     // 64-bit PowerPC implementations can support i64 types directly
(931 lines not shown)
   case PPCISD::FADDRTZ: return "PPCISD::FADDRTZ";
   case PPCISD::TC_RETURN: return "PPCISD::TC_RETURN";
   case PPCISD::CR6SET: return "PPCISD::CR6SET";
   case PPCISD::CR6UNSET: return "PPCISD::CR6UNSET";
   case PPCISD::PPC32_GOT: return "PPCISD::PPC32_GOT";
   case PPCISD::PPC32_PICGOT: return "PPCISD::PPC32_PICGOT";
   case PPCISD::ADDIS_GOT_TPREL_HA: return "PPCISD::ADDIS_GOT_TPREL_HA";
   case PPCISD::LD_GOT_TPREL_L: return "PPCISD::LD_GOT_TPREL_L";
   case PPCISD::ADD_TLS: return "PPCISD::ADD_TLS";
   [Lint: Pre-merge checks: clang-format: please reformat the code, putting the return on its own line after "case PPCISD::STRICT_MFVSR:"]
   case PPCISD::ADDIS_TLSGD_HA: return "PPCISD::ADDIS_TLSGD_HA";
   case PPCISD::ADDI_TLSGD_L: return "PPCISD::ADDI_TLSGD_L";
   case PPCISD::GET_TLS_ADDR: return "PPCISD::GET_TLS_ADDR";
   case PPCISD::ADDI_TLSGD_L_ADDR: return "PPCISD::ADDI_TLSGD_L_ADDR";
   case PPCISD::ADDIS_TLSLD_HA: return "PPCISD::ADDIS_TLSLD_HA";
   case PPCISD::ADDI_TLSLD_L: return "PPCISD::ADDI_TLSLD_L";
   case PPCISD::GET_TLSLD_ADDR: return "PPCISD::GET_TLSLD_ADDR";
   case PPCISD::ADDI_TLSLD_L_ADDR: return "PPCISD::ADDI_TLSLD_L_ADDR";
(17 lines not shown)
   case PPCISD::BUILD_SPE64: return "PPCISD::BUILD_SPE64";
   case PPCISD::EXTRACT_SPE: return "PPCISD::EXTRACT_SPE";
   case PPCISD::EXTSWSLI: return "PPCISD::EXTSWSLI";
   case PPCISD::LD_VSX_LH: return "PPCISD::LD_VSX_LH";
   case PPCISD::FP_EXTEND_HALF: return "PPCISD::FP_EXTEND_HALF";
   case PPCISD::MAT_PCREL_ADDR: return "PPCISD::MAT_PCREL_ADDR";
   case PPCISD::LD_SPLAT: return "PPCISD::LD_SPLAT";
   case PPCISD::FNMSUB: return "PPCISD::FNMSUB";
+  case PPCISD::STRICT_MFVSR: return "PPCISD::STRICT_MFVSR";
   }
   return nullptr;
 }

 EVT PPCTargetLowering::getSetCCResultType(const DataLayout &DL, LLVMContext &C,
                                           EVT VT) const {
   if (!VT.isVector())
	return Subtarget.useCRBits() ? MVT::i1 : MVT::i32;			return Subtarget.useCRBits() ? MVT::i1 : MVT::i32;
	▲ Show 20 Lines • Show All 1,906 Lines • ▼ Show 20 Lines
	(!DAG.getTarget().Options.NoNaNsFPMath && !Flags.hasNoNaNs()))			(!DAG.getTarget().Options.NoNaNsFPMath && !Flags.hasNoNaNs()))
	return Op;			return Op;

	// If the RHS of the comparison is a 0.0, we don't need to do the			// If the RHS of the comparison is a 0.0, we don't need to do the
	// subtraction at all.			// subtraction at all.
	SDValue Sel1;			SDValue Sel1;
	if (isFloatingPointZero(RHS))			if (isFloatingPointZero(RHS))
	switch (CC) {			switch (CC) {
	default: break; // SETUO etc aren't handled by fsel.			default: break; // SETUO etc aren't handled by fsel.
				uweigand (Done): This still looks wrong to me. I think you should replace the chain with the one from Conv, and return the value (i.e. Mov). Something like:
				  SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
				  DAG.ReplaceAllUsesOfValueWith(SDValue(Op, 1), Conv.getValue(1));
				  return Mov;
				(Actually, returning the value is then the same in the strict and non-strict cases, so this can be merged.) Another possible approach would be to not use ReplaceAllUses at all but to reconstruct the multi-output value via DAG.getMergeValues. Something like:
				  SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
				  return DAG.getMergeValues({Mov, Conv.getValue(1)}, dl);
	case ISD::SETNE:			case ISD::SETNE:
	std::swap(TV, FV);			std::swap(TV, FV);
	LLVM_FALLTHROUGH;			LLVM_FALLTHROUGH;
	case ISD::SETEQ:			case ISD::SETEQ:
	if (LHS.getValueType() == MVT::f32) // Comparison is always 64-bits			if (LHS.getValueType() == MVT::f32) // Comparison is always 64-bits
	LHS = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, LHS);			LHS = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, LHS);
	Sel1 = DAG.getNode(PPCISD::FSEL, dl, ResVT, LHS, TV, FV);			Sel1 = DAG.getNode(PPCISD::FSEL, dl, ResVT, LHS, TV, FV);
	if (Sel1.getValueType() == MVT::f32) // Comparison is always 64-bits			if (Sel1.getValueType() == MVT::f32) // Comparison is always 64-bits
	Show All 27 Lines
	case ISD::SETNE:			case ISD::SETNE:
	std::swap(TV, FV);			std::swap(TV, FV);
	LLVM_FALLTHROUGH;			LLVM_FALLTHROUGH;
	case ISD::SETEQ:			case ISD::SETEQ:
	Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, LHS, RHS, Flags);			Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, LHS, RHS, Flags);
	if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits			if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits
	Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);			Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);
	Sel1 = DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, TV, FV);			Sel1 = DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, TV, FV);
	if (Sel1.getValueType() == MVT::f32) // Comparison is always 64-bits			if (Sel1.getValueType() == MVT::f32) // Comparison is always 64-bits
				Lint: Pre-merge checks: clang-tidy: warning: invalid case style for variable 'dl' [readability-identifier-naming]
	Sel1 = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Sel1);			Sel1 = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Sel1);
	return DAG.getNode(PPCISD::FSEL, dl, ResVT,			return DAG.getNode(PPCISD::FSEL, dl, ResVT,
	DAG.getNode(ISD::FNEG, dl, MVT::f64, Cmp), Sel1, FV);			DAG.getNode(ISD::FNEG, dl, MVT::f64, Cmp), Sel1, FV);
	case ISD::SETULT:			case ISD::SETULT:
	case ISD::SETLT:			case ISD::SETLT:
	Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, LHS, RHS, Flags);			Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, LHS, RHS, Flags);
	if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits			if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits
	Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);			Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);
	Show All 10 Lines
	if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits			if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits
	Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);			Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);
	return DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, FV, TV);			return DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, FV, TV);
	case ISD::SETOLE:			case ISD::SETOLE:
	case ISD::SETLE:			case ISD::SETLE:
	Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, RHS, LHS, Flags);			Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, RHS, LHS, Flags);
	if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits			if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits
	Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);			Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);
	return DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, TV, FV);			return DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, TV, FV);
				Lint: Pre-merge checks: clang-tidy: warning: invalid case style for variable 'dl' [readability-identifier-naming]
	}			}
	return Op;			return Op;
	}			}

	void PPCTargetLowering::LowerFP_TO_INTForReuse(SDValue Op, ReuseLoadInfo &RLI,			static SDValue getFPNode(unsigned Opc, EVT VT, SDValue Op, SDValue Chain,
				steven.zhang (Done): The parameter SDLoc is not needed. Can we also rename the function to getFpNode() or something similar? You don't need "strict" in the function name, as it is already one of the function parameters.
	SelectionDAG &DAG,			SelectionDAG &DAG, bool Strict) {
				steven.zhang (Done): The Strict parameter is not needed, as you can check the value of the Chain to know it.
	const SDLoc &dl) const {			SDLoc dl(Op);
	assert(Op.getOperand(0).getValueType().isFloatingPoint());			if (!Strict)
				steven.zhang (Done): This is not good code practice. Please use `Strict ? 1 : 0`, or look for a suitable API in SDValue.
	SDValue Src = Op.getOperand(0);			return DAG.getNode(Opc, dl, VT, Op);
				steven.zhang (Done): !Strict is not needed.
				qiucf (Author, Done): If `Strict` is `false`, we can allow Chain to be null.
	if (Src.getValueType() == MVT::f32)
	Src = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Src);

				// Try to generate a STRICT node version
				steven.zhang (Done): Please give this a default initializer to avoid an uninitialized local variable if llvm_unreachable is off.
				nemanjai (Done): This function appears to just take a node and produce a strict version of that node. It seems like there isn't anything target dependent about such an operation, so it is very suspicious to me why this is in the PPC back end. If for some reason it has to be here, please explain why in a comment. If the only target-specific part of this is the `STRICT_MFVSR` node, then at least the rest can be handled by target-independent code, can't it?
				qiucf (Author, Done): Thanks for the comments. The method (including some overloads) doesn't help much, so I removed it and wrote a simple helper method to get the strict version of a PPC-specific opcode. There is a method `mutateStrictFPToFP` that does similar things, but it may not be suitable here. I'll send an NFC patch to make opcode conversion a hook so that each target can benefit from it.
				assert((!Strict || Chain) && "Missing chain for creating strict nodes");
				unsigned NewOpc = ISD::DELETED_NODE;
				switch (Opc) {
				nemanjai (Done): This seems dangerous to me. You are deciding whether to return a strict node based on whether a valid Chain is provided. I am personally against making decisions based on orthogonal concerns. If the caller wants a strict node, that should be explicit rather than this strange implicit contract of "If you want a strict node, provide a valid chain."
				default:
				llvm_unreachable("getFPNode called with unexpected opcode!");
				case PPCISD::MFVSR:
				NewOpc = PPCISD::STRICT_MFVSR;
				break;
				steven.zhang (Done): You don't need this assertion now.
				#define DAG_INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC, DAGN) \
				case ISD::DAGN: \
				NewOpc = ISD::STRICT_##DAGN; \
				break;
				#define CMP_INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC, DAGN)
				#include "llvm/IR/ConstrainedOps.def"
				#undef DAG_INSTRUCTION
				#undef CMP_INSTRUCTION
				}
				return DAG.getNode(NewOpc, dl, {VT, MVT::Other}, {Chain, Op});
				}

				Lint: Pre-merge checks: clang-tidy: warning: invalid case style for function 'LowerFP_TO_INTForReuse' [readability-identifier-naming]
				static SDValue convertFPToInt(SDValue Op, SelectionDAG &DAG,
				const PPCSubtarget &Subtarget) {
				Lint: Pre-merge checks: clang-tidy: warning: invalid case style for parameter 'dl' [readability-identifier-naming]
				SDLoc dl(Op);
				bool Strict = Op->isStrictFPOpcode();
				steven.zhang (Done): IsStrict, IsSigned
				bool Signed = Op.getOpcode() == ISD::FP_TO_SINT ||
				Op.getOpcode() == ISD::STRICT_FP_TO_SINT;
				// For strict nodes, source is the second operand.
				SDValue Src = Op.getOperand(Strict ? 1 : 0);
				SDValue FPChain;
				if (Strict)
				FPChain = Op.getOperand(0);
				assert(Src.getValueType().isFloatingPoint());
				if (Src.getValueType() == MVT::f32)
				Src = getFPNode(ISD::FP_EXTEND, MVT::f64, Src, FPChain, DAG, Strict);
	SDValue Tmp;			SDValue Tmp;
	switch (Op.getSimpleValueType().SimpleTy) {			switch (Op.getSimpleValueType().SimpleTy) {
	default: llvm_unreachable("Unhandled FP_TO_INT type in custom expander!");			default: llvm_unreachable("Unhandled FP_TO_INT type in custom expander!");
	case MVT::i32:			case MVT::i32:
	Tmp = DAG.getNode(			Tmp = DAG.getNode(
	Op.getOpcode() == ISD::FP_TO_SINT			Signed ? PPCISD::FCTIWZ
	? PPCISD::FCTIWZ			: (Subtarget.hasFPCVT() ? PPCISD::FCTIWUZ : PPCISD::FCTIDZ),
	: (Subtarget.hasFPCVT() ? PPCISD::FCTIWUZ : PPCISD::FCTIDZ),
	dl, MVT::f64, Src);			dl, MVT::f64, Src);
	break;			break;
	case MVT::i64:			case MVT::i64:
	assert((Op.getOpcode() == ISD::FP_TO_SINT || Subtarget.hasFPCVT()) &&			assert((Signed || Subtarget.hasFPCVT()) &&
	"i64 FP_TO_UINT is supported only with FPCVT");			"i64 FP_TO_UINT is supported only with FPCVT");
	Tmp = DAG.getNode(Op.getOpcode()==ISD::FP_TO_SINT ? PPCISD::FCTIDZ :			Tmp = DAG.getNode(Signed ? PPCISD::FCTIDZ : PPCISD::FCTIDUZ, dl, MVT::f64,
	PPCISD::FCTIDUZ,			Src);
	dl, MVT::f64, Src);
	break;			break;
	}			}
				return Tmp;
				}

				steven.zhang (Not Done): So, do we have a problem if it is a strict opcode in this code path?
				qiucf (Author, Done): Do you mean these PPC-specific opcodes are not strict? But the result is either stored and loaded or moved directly. What we do here is keep the value operands consistent, so changing these opcodes to strict may be unnecessary.
				void PPCTargetLowering::LowerFP_TO_INTForReuse(SDValue Op, ReuseLoadInfo &RLI,
				SelectionDAG &DAG,
				const SDLoc &dl) const {
				SDValue Tmp = convertFPToInt(Op, DAG, Subtarget);
				bool Signed = Op.getOpcode() == ISD::FP_TO_SINT ||
				Op.getOpcode() == ISD::STRICT_FP_TO_SINT;
				steven.zhang (Done): The logic in LowerFP_TO_INTForReuse and LowerFP_TO_INTDirectMove is nearly the same between lines 8168 and 8196. That is expected, as the only difference between the two is how the data is moved from FPR to GPR. So, can we add another function to do the conversion? Something like:
				  LowerFP_TO_INTDirectMove: V = convertToFp(); MFVSR V
				  LowerFP_TO_INTForReuse: V = convertToFp(); Store V; Load V

	// Convert the FP value to an int value through memory.			// Convert the FP value to an int value through memory.
	bool i32Stack = Op.getValueType() == MVT::i32 && Subtarget.hasSTFIWX() &&			bool i32Stack = Op.getValueType() == MVT::i32 && Subtarget.hasSTFIWX() &&
	(Op.getOpcode() == ISD::FP_TO_SINT || Subtarget.hasFPCVT());			(Signed || Subtarget.hasFPCVT());
	SDValue FIPtr = DAG.CreateStackTemporary(i32Stack ? MVT::i32 : MVT::f64);			SDValue FIPtr = DAG.CreateStackTemporary(i32Stack ? MVT::i32 : MVT::f64);
	int FI = cast<FrameIndexSDNode>(FIPtr)->getIndex();			int FI = cast<FrameIndexSDNode>(FIPtr)->getIndex();
	MachinePointerInfo MPI =			MachinePointerInfo MPI =
	MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI);			MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI);

	// Emit a store to the stack slot.			// Emit a store to the stack slot.
	SDValue Chain;			SDValue Chain;
	Align Alignment(DAG.getEVTAlign(Tmp.getValueType()));			Align Alignment(DAG.getEVTAlign(Tmp.getValueType()));
	if (i32Stack) {			if (i32Stack) {
	MachineFunction &MF = DAG.getMachineFunction();			MachineFunction &MF = DAG.getMachineFunction();
	Alignment = Align(4);			Alignment = Align(4);
	MachineMemOperand *MMO =			MachineMemOperand *MMO =
	MF.getMachineMemOperand(MPI, MachineMemOperand::MOStore, 4, Alignment);			MF.getMachineMemOperand(MPI, MachineMemOperand::MOStore, 4, Alignment);
	SDValue Ops[] = { DAG.getEntryNode(), Tmp, FIPtr };			SDValue Ops[] = { DAG.getEntryNode(), Tmp, FIPtr };
	Chain = DAG.getMemIntrinsicNode(PPCISD::STFIWX, dl,			Chain = DAG.getMemIntrinsicNode(PPCISD::STFIWX, dl,
	DAG.getVTList(MVT::Other), Ops, MVT::i32, MMO);			DAG.getVTList(MVT::Other), Ops, MVT::i32, MMO);
	} else			} else
				uweigand (Not Done): This doesn't look right. The Chain produced by this strict node just vanishes; this cannot be correct.
	Chain = DAG.getStore(DAG.getEntryNode(), dl, Tmp, FIPtr, MPI, Alignment);			Chain = DAG.getStore(DAG.getEntryNode(), dl, Tmp, FIPtr, MPI, Alignment);

	// Result is a load from the stack slot. If loading 4 bytes, make sure to			// Result is a load from the stack slot. If loading 4 bytes, make sure to
	// add in a bias on big endian.			// add in a bias on big endian.
	if (Op.getValueType() == MVT::i32 && !i32Stack) {			if (Op.getValueType() == MVT::i32 && !i32Stack) {
	FIPtr = DAG.getNode(ISD::ADD, dl, FIPtr.getValueType(), FIPtr,			FIPtr = DAG.getNode(ISD::ADD, dl, FIPtr.getValueType(), FIPtr,
	DAG.getConstant(4, dl, FIPtr.getValueType()));			DAG.getConstant(4, dl, FIPtr.getValueType()));
	MPI = MPI.getWithOffset(Subtarget.isLittleEndian() ? 0 : 4);			MPI = MPI.getWithOffset(Subtarget.isLittleEndian() ? 0 : 4);
	}			}

	RLI.Chain = Chain;			RLI.Chain = Chain;
	RLI.Ptr = FIPtr;			RLI.Ptr = FIPtr;
	RLI.MPI = MPI;			RLI.MPI = MPI;
	RLI.Alignment = Alignment;			RLI.Alignment = Alignment;
	}			}
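
The "bias on big endian" mentioned in the stack-slot load of LowerFP_TO_INTForReuse can be illustrated outside LLVM. The following is a minimal sketch, assuming an 8-byte slot that holds the converted integer and a 4-byte load of its low word; `hostIsLittleEndian` and `loadLowWord` are hypothetical helper names used only for illustration and are not part of the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: detects the byte order of the host at run time.
inline bool hostIsLittleEndian() {
  const std::uint16_t Probe = 1;
  unsigned char FirstByte;
  std::memcpy(&FirstByte, &Probe, 1);
  return FirstByte == 1;
}

// Hypothetical helper: reads the low 32 bits of a 64-bit "stack slot".
// On little-endian hosts the low word starts at byte 0 of the slot;
// on big-endian hosts it starts at byte 4, hence the bias.
inline std::uint32_t loadLowWord(std::uint64_t Slot) {
  unsigned char Bytes[8];
  std::memcpy(Bytes, &Slot, 8);
  const std::size_t Offset = hostIsLittleEndian() ? 0 : 4;
  std::uint32_t Low;
  std::memcpy(&Low, Bytes + Offset, 4);
  return Low;
}
```

This only models where the low word lives in memory; it does not reproduce how the lowering computes the pointer or the MachinePointerInfo offset.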

	/// Custom lowers floating point to integer conversions to use			/// Custom lowers floating point to integer conversions to use
				uweigand (Not Done): And you need strict versions of all these conversion operations, I'd assume.
	/// the direct move instructions available in ISA 2.07 to avoid the			/// the direct move instructions available in ISA 2.07 to avoid the
	/// need for load/store combinations.			/// need for load/store combinations.
	SDValue PPCTargetLowering::LowerFP_TO_INTDirectMove(SDValue Op,			SDValue PPCTargetLowering::LowerFP_TO_INTDirectMove(SDValue Op,
	SelectionDAG &DAG,			SelectionDAG &DAG,
	const SDLoc &dl) const {			const SDLoc &dl) const {
	assert(Op.getOperand(0).getValueType().isFloatingPoint());			SDValue Tmp = convertFPToInt(Op, DAG, Subtarget);
	SDValue Src = Op.getOperand(0);			return getFPNode(PPCISD::MFVSR, Op.getSimpleValueType().SimpleTy, Tmp,
				steven.zhang (Done): Op.getValueType()?
				Op.getOperand(0), DAG, Op->isStrictFPOpcode());
				steven.zhang (Done): Don't use Tmp; use a more meaningful name.
	if (Src.getValueType() == MVT::f32)
	Src = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Src);

	SDValue Tmp;
	switch (Op.getSimpleValueType().SimpleTy) {
	default: llvm_unreachable("Unhandled FP_TO_INT type in custom expander!");
	case MVT::i32:
	Tmp = DAG.getNode(
	Op.getOpcode() == ISD::FP_TO_SINT
	? PPCISD::FCTIWZ
	: (Subtarget.hasFPCVT() ? PPCISD::FCTIWUZ : PPCISD::FCTIDZ),
	dl, MVT::f64, Src);
	Tmp = DAG.getNode(PPCISD::MFVSR, dl, MVT::i32, Tmp);
	break;
	case MVT::i64:
	assert((Op.getOpcode() == ISD::FP_TO_SINT || Subtarget.hasFPCVT()) &&
	"i64 FP_TO_UINT is supported only with FPCVT");
	Tmp = DAG.getNode(Op.getOpcode()==ISD::FP_TO_SINT ? PPCISD::FCTIDZ :
	PPCISD::FCTIDUZ,
	dl, MVT::f64, Src);
	Tmp = DAG.getNode(PPCISD::MFVSR, dl, MVT::i64, Tmp);
	break;
	}
	return Tmp;
	}			}

	SDValue PPCTargetLowering::LowerFP_TO_INT(SDValue Op, SelectionDAG &DAG,			SDValue PPCTargetLowering::LowerFP_TO_INT(SDValue Op, SelectionDAG &DAG,
				uweigand (Not Done): Nothing in the function actually handles strict nodes; that cannot be right.
	const SDLoc &dl) const {			const SDLoc &dl) const {
				bool Strict = Op->isStrictFPOpcode();
				SDValue Src = Op.getOperand(Strict ? 1 : 0);
				steven.zhang (Done): Move the assertion into convertFPToInt.
	// FP to INT conversions are legal for f128.			// FP to INT conversions are legal for f128.
	if (EnableQuadPrecision && (Op->getOperand(0).getValueType() == MVT::f128))			if (EnableQuadPrecision && (Src.getValueType() == MVT::f128))
				steven.zhang (Not Done): This is something we could improve later. If I understand the intention correctly, we should mark it as legal instead of checking it here.
	return Op;			return Op;

	// Expand ppcf128 to i32 by hand for the benefit of llvm-gcc bootstrap on			// Expand ppcf128 to i32 by hand for the benefit of llvm-gcc bootstrap on
	// PPC (the libcall is not available).			// PPC (the libcall is not available).
	if (Op.getOperand(0).getValueType() == MVT::ppcf128) {			if (Src.getValueType() == MVT::ppcf128) {
	if (Op.getValueType() == MVT::i32) {			if (Op.getValueType() == MVT::i32) {
	if (Op.getOpcode() == ISD::FP_TO_SINT) {			if (Op.getOpcode() == ISD::FP_TO_SINT) {
	SDValue Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, dl,			SDValue Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, dl,
	MVT::f64, Op.getOperand(0),			MVT::f64, Op.getOperand(0),
	DAG.getIntPtrConstant(0, dl));			DAG.getIntPtrConstant(0, dl));
	SDValue Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, dl,			SDValue Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, dl,
	MVT::f64, Op.getOperand(0),			MVT::f64, Op.getOperand(0),
	DAG.getIntPtrConstant(1, dl));			DAG.getIntPtrConstant(1, dl));
	Show All 22 Lines
	ISD::SETGE);			ISD::SETGE);
	}			}
	}			}

	return SDValue();			return SDValue();
	}			}

	if (Subtarget.hasDirectMove() && Subtarget.isPPC64())			if (Subtarget.hasDirectMove() && Subtarget.isPPC64())
	return LowerFP_TO_INTDirectMove(Op, DAG, dl);			return LowerFP_TO_INTDirectMove(Op, DAG, dl);
				uweigand (Not Done): Why do we need a strict version of a plain move?
				qiucf (Author, Done): Yes, a strict move doesn't look reasonable. But the original `strict_fptosi` node will be replaced by the result, so if we directly return the move, the operands will not match (there is no chain in `mfvsr`). Is there a better way here?
				uweigand (Not Done): So this gets expanded to a PPCISD::FCTI... variant (inside convertFPToInt) followed by the MFVSR. Now, with proper chain handling, the input chain of the strict_fptosi is consumed by a strict variant of FCTI..., and the output chain of that STRICT_FCTI... is then the correct output chain for the whole operation. The data output (only) of the STRICT_FCTI... then acts as the input of the MFVSR, and the output of the MFVSR is the correct value output of the whole operation. So in short, you need to replace
				  (out-val, out-chain) = strict_fptosi (in-val, in-chain)
				by
				  (tmp-val, out-chain) = STRICT_FCTI... (in-val, in-chain)
				  out-val = MFVSR (tmp-val)
				This will probably require some ReplaceAllUses... instead of just returning a result, as is already done elsewhere with chain output instructions.
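
The chain threading described in the comment above can be sketched with a toy graph. This is a rough illustration only: `ToyNode`, `ToyDAG`, `Lowered`, and `lowerStrictFPToSInt` are hypothetical names and are not the LLVM SelectionDAG API, and `STRICT_FCTIWZ` merely stands in for whichever strict FCTI... variant the patch ends up defining. The point is that the strict conversion consumes the input chain and produces the output chain, while the move only consumes the data result:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a DAG node (not LLVM's SDNode).
struct ToyNode {
  std::string Opcode;
  int ValueInput;      // index of the node feeding the data operand, -1 if none
  int ChainInput;      // index of the node feeding the chain operand, -1 if none
  bool HasChainResult; // true if this node also produces a chain
};

struct ToyDAG {
  std::vector<ToyNode> Nodes;
  int add(const std::string &Opc, int Val, int Chain, bool HasChainResult) {
    Nodes.push_back({Opc, Val, Chain, HasChainResult});
    return static_cast<int>(Nodes.size()) - 1;
  }
};

// The pair that replaces (out-val, out-chain) of the original strict node.
struct Lowered {
  int Value; // node whose data result is the final value (the move)
  int Chain; // node whose chain result is the final chain (the conversion)
};

// (tmp-val, out-chain) = STRICT_FCTI...(in-val, in-chain)
//  out-val             = MFVSR(tmp-val)
inline Lowered lowerStrictFPToSInt(ToyDAG &DAG, int InVal, int InChain) {
  int Conv = DAG.add("STRICT_FCTIWZ", InVal, InChain, /*HasChainResult=*/true);
  int Mov = DAG.add("MFVSR", Conv, /*Chain=*/-1, /*HasChainResult=*/false);
  return {Mov, Conv};
}
```

In this model the move needs no strict counterpart, because the ordering constraint is carried entirely by the conversion node.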

	ReuseLoadInfo RLI;			ReuseLoadInfo RLI;
	LowerFP_TO_INTForReuse(Op, RLI, DAG, dl);			LowerFP_TO_INTForReuse(Op, RLI, DAG, dl);

	return DAG.getLoad(Op.getValueType(), dl, RLI.Chain, RLI.Ptr, RLI.MPI,			return DAG.getLoad(Op.getValueType(), dl, RLI.Chain, RLI.Ptr, RLI.MPI,
				uweigand (Not Done): Again, nothing in this function actually handles strict nodes ...
	RLI.Alignment, RLI.MMOFlags(), RLI.AAInfo, RLI.Ranges);			RLI.Alignment, RLI.MMOFlags(), RLI.AAInfo, RLI.Ranges);
	}			}

	// We're trying to insert a regular store, S, and then a load, L. If the			// We're trying to insert a regular store, S, and then a load, L. If the
	// incoming value, O, is a load, we might just be able to have our load use the			// incoming value, O, is a load, we might just be able to have our load use the
	// address used by O. However, we don't know if anything else will store to			// address used by O. However, we don't know if anything else will store to
	// that address before we can load from it. To prevent this situation, we need			// that address before we can load from it. To prevent this situation, we need
	// to insert our load, L, into the chain as a peer of O. To do this, we give L			// to insert our load, L, into the chain as a peer of O. To do this, we give L
	▲ Show 20 Lines • Show All 99 Lines • ▼ Show 20 Lines
	/// need for load/store combinations.			/// need for load/store combinations.
	SDValue PPCTargetLowering::LowerINT_TO_FPDirectMove(SDValue Op,			SDValue PPCTargetLowering::LowerINT_TO_FPDirectMove(SDValue Op,
	SelectionDAG &DAG,			SelectionDAG &DAG,
	const SDLoc &dl) const {			const SDLoc &dl) const {
	assert((Op.getValueType() == MVT::f32 ||			assert((Op.getValueType() == MVT::f32 ||
	Op.getValueType() == MVT::f64) &&			Op.getValueType() == MVT::f64) &&
	"Invalid floating point type as target of conversion");			"Invalid floating point type as target of conversion");
	assert(Subtarget.hasFPCVT() &&			assert(Subtarget.hasFPCVT() &&
	"Int to FP conversions with direct moves require FPCVT");			"Int to FP conversions with direct moves require FPCVT");
				steven.zhang (Not Done): ConvertIntToFP and ConvertFPToInt should have the same parameters.
				qiucf (Author, Done): FPToInt is round-then-move, while IntToFP is move-then-round. So for IntToFP we need extra information from the original `Op` besides the moved `Src`.
	SDValue FP;			SDValue FP;
	SDValue Src = Op.getOperand(0);			SDValue Src = Op.getOperand(0);
	bool SinglePrec = Op.getValueType() == MVT::f32;			bool SinglePrec = Op.getValueType() == MVT::f32;
	bool WordInt = Src.getSimpleValueType().SimpleTy == MVT::i32;			bool WordInt = Src.getSimpleValueType().SimpleTy == MVT::i32;
	bool Signed = Op.getOpcode() == ISD::SINT_TO_FP;			bool Signed = Op.getOpcode() == ISD::SINT_TO_FP;
	unsigned ConvOp = Signed ? (SinglePrec ? PPCISD::FCFIDS : PPCISD::FCFID) :			unsigned ConvOp = Signed ? (SinglePrec ? PPCISD::FCFIDS : PPCISD::FCFID) :
	(SinglePrec ? PPCISD::FCFIDUS : PPCISD::FCFIDU);			(SinglePrec ? PPCISD::FCFIDUS : PPCISD::FCFIDU);

	▲ Show 20 Lines • Show All 88 Lines • ▼ Show 20 Lines
	EVT OutVT = Op.getValueType();			EVT OutVT = Op.getValueType();
	if (OutVT.isVector() && OutVT.isFloatingPoint() &&			if (OutVT.isVector() && OutVT.isFloatingPoint() &&
	isOperationCustom(Op.getOpcode(), InVT))			isOperationCustom(Op.getOpcode(), InVT))
	return LowerINT_TO_FPVector(Op, DAG, dl);			return LowerINT_TO_FPVector(Op, DAG, dl);

	// Conversions to f128 are legal.			// Conversions to f128 are legal.
	if (EnableQuadPrecision && (Op.getValueType() == MVT::f128))			if (EnableQuadPrecision && (Op.getValueType() == MVT::f128))
	return Op;			return Op;

				steven.zhang (Done): Please remove this kind of change, as it is not part of your patch.
	if (Subtarget.hasQPX() && Op.getOperand(0).getValueType() == MVT::v4i1) {			if (Subtarget.hasQPX() && Op.getOperand(0).getValueType() == MVT::v4i1) {
	if (Op.getValueType() != MVT::v4f32 && Op.getValueType() != MVT::v4f64)			if (Op.getValueType() != MVT::v4f32 && Op.getValueType() != MVT::v4f64)
	return SDValue();			return SDValue();

	SDValue Value = Op.getOperand(0);			SDValue Value = Op.getOperand(0);
	// The values are now known to be -1 (false) or 1 (true). To convert this			// The values are now known to be -1 (false) or 1 (true). To convert this
	// into 0 (false) and 1 (true), add 1 and then divide by 2 (multiply by 0.5).			// into 0 (false) and 1 (true), add 1 and then divide by 2 (multiply by 0.5).
	// This can be done with an fma and the 0.5 constant: (V+1.0)*0.5 = 0.5*V+0.5			// This can be done with an fma and the 0.5 constant: (V+1.0)*0.5 = 0.5*V+0.5
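
The identity in the comment above, (V+1.0)*0.5 = 0.5*V+0.5, can be checked on scalars with the standard library. This is a minimal sketch under the assumption that a lane holds exactly -1.0 or 1.0; `boolLaneToZeroOne` is a hypothetical name and the QPX lowering itself operates on vectors, not on a scalar `double`:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar model of one lane: maps -1.0 (false) to 0.0 and
// 1.0 (true) to 1.0 using a single fused multiply-add, 0.5*V + 0.5.
inline double boolLaneToZeroOne(double V) {
  return std::fma(V, 0.5, 0.5);
}
```

Both inputs are exactly representable and so are the results, so no rounding is involved for these two values.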
	▲ Show 20 Lines • Show All 1,717 Lines • ▼ Show 20 Lines
	case ISD::EH_DWARF_CFA: return LowerEH_DWARF_CFA(Op, DAG);			case ISD::EH_DWARF_CFA: return LowerEH_DWARF_CFA(Op, DAG);
	case ISD::EH_SJLJ_SETJMP: return lowerEH_SJLJ_SETJMP(Op, DAG);			case ISD::EH_SJLJ_SETJMP: return lowerEH_SJLJ_SETJMP(Op, DAG);
	case ISD::EH_SJLJ_LONGJMP: return lowerEH_SJLJ_LONGJMP(Op, DAG);			case ISD::EH_SJLJ_LONGJMP: return lowerEH_SJLJ_LONGJMP(Op, DAG);

	case ISD::LOAD: return LowerLOAD(Op, DAG);			case ISD::LOAD: return LowerLOAD(Op, DAG);
	case ISD::STORE: return LowerSTORE(Op, DAG);			case ISD::STORE: return LowerSTORE(Op, DAG);
	case ISD::TRUNCATE: return LowerTRUNCATE(Op, DAG);			case ISD::TRUNCATE: return LowerTRUNCATE(Op, DAG);
	case ISD::SELECT_CC: return LowerSELECT_CC(Op, DAG);			case ISD::SELECT_CC: return LowerSELECT_CC(Op, DAG);
				case ISD::STRICT_FP_TO_UINT:
				case ISD::STRICT_FP_TO_SINT:
	case ISD::FP_TO_UINT:			case ISD::FP_TO_UINT:
	case ISD::FP_TO_SINT: return LowerFP_TO_INT(Op, DAG, SDLoc(Op));			case ISD::FP_TO_SINT: return LowerFP_TO_INT(Op, DAG, SDLoc(Op));
	case ISD::UINT_TO_FP:			case ISD::UINT_TO_FP:
	case ISD::SINT_TO_FP: return LowerINT_TO_FP(Op, DAG);			case ISD::SINT_TO_FP: return LowerINT_TO_FP(Op, DAG);
	case ISD::FLT_ROUNDS_: return LowerFLT_ROUNDS_(Op, DAG);			case ISD::FLT_ROUNDS_: return LowerFLT_ROUNDS_(Op, DAG);

	// Lower 64-bit shifts.			// Lower 64-bit shifts.
	case ISD::SHL_PARTS: return LowerSHL_PARTS(Op, DAG);			case ISD::SHL_PARTS: return LowerSHL_PARTS(Op, DAG);
	▲ Show 20 Lines • Show All 74 Lines • ▼ Show 20 Lines
	if (VT == MVT::i64) {			if (VT == MVT::i64) {
	SDValue NewNode = LowerVAARG(SDValue(N, 1), DAG);			SDValue NewNode = LowerVAARG(SDValue(N, 1), DAG);

	Results.push_back(NewNode);			Results.push_back(NewNode);
	Results.push_back(NewNode.getValue(1));			Results.push_back(NewNode.getValue(1));
	}			}
	return;			return;
	}			}
				case ISD::STRICT_FP_TO_SINT:
				case ISD::STRICT_FP_TO_UINT:
	case ISD::FP_TO_SINT:			case ISD::FP_TO_SINT:
	case ISD::FP_TO_UINT:			case ISD::FP_TO_UINT:
	// LowerFP_TO_INT() can only handle f32 and f64.			// LowerFP_TO_INT() can only handle f32 and f64.
	if (N->getOperand(0).getValueType() == MVT::ppcf128)			if (N->getOperand((int)N->isStrictFPOpcode()).getValueType() ==
				MVT::ppcf128)
	return;			return;
	Results.push_back(LowerFP_TO_INT(SDValue(N, 0), DAG, dl));			Results.push_back(LowerFP_TO_INT(SDValue(N, 0), DAG, dl));
	return;			return;
	case ISD::TRUNCATE: {			case ISD::TRUNCATE: {
	EVT TrgVT = N->getValueType(0);			EVT TrgVT = N->getValueType(0);
	EVT OpVT = N->getOperand(0).getValueType();			EVT OpVT = N->getOperand(0).getValueType();
	if (TrgVT.isVector() &&			if (TrgVT.isVector() &&
	isOperationCustom(N->getOpcode(), TrgVT) &&			isOperationCustom(N->getOpcode(), TrgVT) &&
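Note on the operand index used in the ReplaceNodeResults hunk above: strict FP nodes carry the chain as operand 0, so the floating-point source moves to operand 1. That is what `N->getOperand((int)N->isStrictFPOpcode())` relies on. The following is only a minimal sketch of that indexing. `ToyNode`, `ValueKind`, `sourceOperandIndex`, and `sourceKind` are hypothetical stand-ins, not LLVM types or APIs, and the ternary form follows the reviewer's suggestion of `Strict ? 1 : 0`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcode set; the names mirror the ISD opcodes in the diff, but this is
// not the real llvm::ISD enum.
enum class Opcode { FP_TO_SINT, FP_TO_UINT, STRICT_FP_TO_SINT, STRICT_FP_TO_UINT };

// Toy operand kinds (hypothetical); Chain stands for the chain operand.
enum class ValueKind { Chain, F32, F64, PPCF128 };

// Toy stand-in for a DAG node (hypothetical, not llvm::SDNode).
struct ToyNode {
  Opcode opcode;
  std::vector<ValueKind> operands;
};

// Strict variants are the ones that take a chain as their first operand.
inline bool isStrictFPOpcode(Opcode op) {
  return op == Opcode::STRICT_FP_TO_SINT || op == Opcode::STRICT_FP_TO_UINT;
}

// Index of the floating-point source: 1 for strict nodes, 0 otherwise.
inline std::size_t sourceOperandIndex(const ToyNode &n) {
  return isStrictFPOpcode(n.opcode) ? 1 : 0;
}

// Kind of the floating-point source operand.
inline ValueKind sourceKind(const ToyNode &n) {
  const std::size_t idx = sourceOperandIndex(n);
  assert(idx < n.operands.size() && "node has no source operand");
  return n.operands[idx];
}
```

With this layout, a check such as "is the source ppcf128?" reads the same for both node flavors, because the index is computed in one place.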

llvm/lib/Target/PowerPC/PPCInstrSPE.td

def EFDCTSI : EFXForm_2a<757, (outs gprc:$RT), (ins sperc:$RB),
[]>;		[]>;

def EFDCTSIDZ : EFXForm_2a<747, (outs gprc:$RT), (ins sperc:$RB),		def EFDCTSIDZ : EFXForm_2a<747, (outs gprc:$RT), (ins sperc:$RB),
"efdctsidz $RT, $RB", IIC_FPDGeneral,		"efdctsidz $RT, $RB", IIC_FPDGeneral,
[]>;		[]>;

def EFDCTSIZ : EFXForm_2a<762, (outs gprc:$RT), (ins sperc:$RB),		def EFDCTSIZ : EFXForm_2a<762, (outs gprc:$RT), (ins sperc:$RB),
"efdctsiz $RT, $RB", IIC_FPDGeneral,		"efdctsiz $RT, $RB", IIC_FPDGeneral,
[(set i32:$RT, (fp_to_sint f64:$RB))]>;		[(set i32:$RT, (any_fp_to_sint f64:$RB))]>;

def EFDCTUF : EFXForm_2a<758, (outs sperc:$RT), (ins spe4rc:$RB),		def EFDCTUF : EFXForm_2a<758, (outs sperc:$RT), (ins spe4rc:$RB),
"efdctuf $RT, $RB", IIC_FPDGeneral, []>;		"efdctuf $RT, $RB", IIC_FPDGeneral, []>;

def EFDCTUI : EFXForm_2a<756, (outs gprc:$RT), (ins sperc:$RB),		def EFDCTUI : EFXForm_2a<756, (outs gprc:$RT), (ins sperc:$RB),
"efdctui $RT, $RB", IIC_FPDGeneral,		"efdctui $RT, $RB", IIC_FPDGeneral,
[]>;		[]>;

def EFDCTUIDZ : EFXForm_2a<746, (outs gprc:$RT), (ins sperc:$RB),		def EFDCTUIDZ : EFXForm_2a<746, (outs gprc:$RT), (ins sperc:$RB),
"efdctuidz $RT, $RB", IIC_FPDGeneral,		"efdctuidz $RT, $RB", IIC_FPDGeneral,
[]>;		[]>;

def EFDCTUIZ : EFXForm_2a<760, (outs gprc:$RT), (ins sperc:$RB),		def EFDCTUIZ : EFXForm_2a<760, (outs gprc:$RT), (ins sperc:$RB),
"efdctuiz $RT, $RB", IIC_FPDGeneral,		"efdctuiz $RT, $RB", IIC_FPDGeneral,
[(set i32:$RT, (fp_to_uint f64:$RB))]>;		[(set i32:$RT, (any_fp_to_uint f64:$RB))]>;

def EFDDIV : EFXForm_1<745, (outs sperc:$RT), (ins sperc:$RA, sperc:$RB),		def EFDDIV : EFXForm_1<745, (outs sperc:$RT), (ins sperc:$RA, sperc:$RB),
"efddiv $RT, $RA, $RB", IIC_FPDivD,		"efddiv $RT, $RA, $RB", IIC_FPDivD,
[(set f64:$RT, (fdiv f64:$RA, f64:$RB))]>;		[(set f64:$RT, (fdiv f64:$RA, f64:$RB))]>;

def EFDMUL : EFXForm_1<744, (outs sperc:$RT), (ins sperc:$RA, sperc:$RB),		def EFDMUL : EFXForm_1<744, (outs sperc:$RT), (ins sperc:$RA, sperc:$RB),
"efdmul $RT, $RA, $RB", IIC_FPDGeneral,		"efdmul $RT, $RA, $RB", IIC_FPDGeneral,
[(set f64:$RT, (fmul f64:$RA, f64:$RB))]>;		[(set f64:$RT, (fmul f64:$RA, f64:$RB))]>;
(59 lines not shown)
def EFSCTSF : EFXForm_2a<727, (outs spe4rc:$RT), (ins spe4rc:$RB),
"efsctsf $RT, $RB", IIC_FPSGeneral, []>;		"efsctsf $RT, $RB", IIC_FPSGeneral, []>;

def EFSCTSI : EFXForm_2a<725, (outs gprc:$RT), (ins spe4rc:$RB),		def EFSCTSI : EFXForm_2a<725, (outs gprc:$RT), (ins spe4rc:$RB),
"efsctsi $RT, $RB", IIC_FPSGeneral,		"efsctsi $RT, $RB", IIC_FPSGeneral,
[]>;		[]>;

def EFSCTSIZ : EFXForm_2a<730, (outs gprc:$RT), (ins spe4rc:$RB),		def EFSCTSIZ : EFXForm_2a<730, (outs gprc:$RT), (ins spe4rc:$RB),
"efsctsiz $RT, $RB", IIC_FPSGeneral,		"efsctsiz $RT, $RB", IIC_FPSGeneral,
[(set i32:$RT, (fp_to_sint f32:$RB))]>;		[(set i32:$RT, (any_fp_to_sint f32:$RB))]>;

def EFSCTUF : EFXForm_2a<726, (outs sperc:$RT), (ins spe4rc:$RB),		def EFSCTUF : EFXForm_2a<726, (outs sperc:$RT), (ins spe4rc:$RB),
"efsctuf $RT, $RB", IIC_FPSGeneral, []>;		"efsctuf $RT, $RB", IIC_FPSGeneral, []>;

def EFSCTUI : EFXForm_2a<724, (outs gprc:$RT), (ins spe4rc:$RB),		def EFSCTUI : EFXForm_2a<724, (outs gprc:$RT), (ins spe4rc:$RB),
"efsctui $RT, $RB", IIC_FPSGeneral,		"efsctui $RT, $RB", IIC_FPSGeneral,
[]>;		[]>;

def EFSCTUIZ : EFXForm_2a<728, (outs gprc:$RT), (ins spe4rc:$RB),		def EFSCTUIZ : EFXForm_2a<728, (outs gprc:$RT), (ins spe4rc:$RB),
"efsctuiz $RT, $RB", IIC_FPSGeneral,		"efsctuiz $RT, $RB", IIC_FPSGeneral,
[(set i32:$RT, (fp_to_uint f32:$RB))]>;		[(set i32:$RT, (any_fp_to_uint f32:$RB))]>;

def EFSDIV : EFXForm_1<713, (outs spe4rc:$RT), (ins spe4rc:$RA, spe4rc:$RB),		def EFSDIV : EFXForm_1<713, (outs spe4rc:$RT), (ins spe4rc:$RA, spe4rc:$RB),
"efsdiv $RT, $RA, $RB", IIC_FPDivD,		"efsdiv $RT, $RA, $RB", IIC_FPDivD,
[(set f32:$RT, (fdiv f32:$RA, f32:$RB))]>;		[(set f32:$RT, (fdiv f32:$RA, f32:$RB))]>;

def EFSMUL : EFXForm_1<712, (outs spe4rc:$RT), (ins spe4rc:$RA, spe4rc:$RB),		def EFSMUL : EFXForm_1<712, (outs spe4rc:$RT), (ins spe4rc:$RA, spe4rc:$RB),
"efsmul $RT, $RA, $RB", IIC_FPGeneral,		"efsmul $RT, $RA, $RB", IIC_FPGeneral,
[(set f32:$RT, (fmul f32:$RA, f32:$RB))]>;		[(set f32:$RT, (fmul f32:$RA, f32:$RB))]>;
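The SPE hunks above replace `fp_to_sint`/`fp_to_uint` with `any_fp_to_sint`/`any_fp_to_uint`, which LLVM defines as pattern fragments that match both the regular node and its strict counterpart. The sketch below models only that matching behavior. The `Opcode` enum and the three functions are hypothetical illustrations, not TableGen output or LLVM APIs; the two mnemonics are the ones attached to the f64-to-i32 patterns in the diff.

```cpp
#include <cassert>
#include <string>

// Toy opcode set (hypothetical); FDIV is included only as a non-matching case.
enum class Opcode { FP_TO_SINT, STRICT_FP_TO_SINT, FP_TO_UINT, STRICT_FP_TO_UINT, FDIV };

// Models any_fp_to_sint: accepts the plain or the strict signed conversion.
constexpr bool matchesAnyFpToSint(Opcode op) {
  return op == Opcode::FP_TO_SINT || op == Opcode::STRICT_FP_TO_SINT;
}

// Models any_fp_to_uint: accepts the plain or the strict unsigned conversion.
constexpr bool matchesAnyFpToUint(Opcode op) {
  return op == Opcode::FP_TO_UINT || op == Opcode::STRICT_FP_TO_UINT;
}

// Picks the SPE mnemonic for an f64 -> i32 conversion, as in the EFDCTSIZ and
// EFDCTUIZ patterns shown in the diff.
inline std::string selectSpeF64ToI32(Opcode op) {
  assert((matchesAnyFpToSint(op) || matchesAnyFpToUint(op)) &&
         "not an fp-to-int conversion");
  return matchesAnyFpToSint(op) ? "efdctsiz" : "efdctuiz";
}
```

Under this model, a strict and a non-strict conversion select the same instruction, which is consistent with the SPE check lines in the new tests.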

llvm/lib/Target/PowerPC/PPCInstrVSX.td

	def PPCvabsd : SDNode<"PPCISD::VABSD", SDTVabsd, []>;			def PPCvabsd : SDNode<"PPCISD::VABSD", SDTVabsd, []>;

	def PPCfpexth : SDNode<"PPCISD::FP_EXTEND_HALF", SDT_PPCfpexth, []>;			def PPCfpexth : SDNode<"PPCISD::FP_EXTEND_HALF", SDT_PPCfpexth, []>;
	def PPCldvsxlh : SDNode<"PPCISD::LD_VSX_LH", SDT_PPCldvsxlh,			def PPCldvsxlh : SDNode<"PPCISD::LD_VSX_LH", SDT_PPCldvsxlh,
	[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;			[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
	def PPCldsplat : SDNode<"PPCISD::LD_SPLAT", SDT_PPCldsplat,			def PPCldsplat : SDNode<"PPCISD::LD_SPLAT", SDT_PPCldsplat,
	[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;			[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;

				def PPCstrict_mfvsr : SDNode<"PPCISD::STRICT_MFVSR", SDTUnaryOp, [SDNPHasChain]>;

	//-------------------------- Predicate definitions ---------------------------//			//-------------------------- Predicate definitions ---------------------------//
	def HasVSX : Predicate<"PPCSubTarget->hasVSX()">;			def HasVSX : Predicate<"PPCSubTarget->hasVSX()">;
	def IsLittleEndian : Predicate<"PPCSubTarget->isLittleEndian()">;			def IsLittleEndian : Predicate<"PPCSubTarget->isLittleEndian()">;
	def IsBigEndian : Predicate<"!PPCSubTarget->isLittleEndian()">;			def IsBigEndian : Predicate<"!PPCSubTarget->isLittleEndian()">;
	def HasOnlySwappingMemOps : Predicate<"!PPCSubTarget->hasP9Vector()">;			def HasOnlySwappingMemOps : Predicate<"!PPCSubTarget->hasP9Vector()">;
	def HasP8Vector : Predicate<"PPCSubTarget->hasP8Vector()">;			def HasP8Vector : Predicate<"PPCSubTarget->hasP8Vector()">;
	def HasDirectMove : Predicate<"PPCSubTarget->hasDirectMove()">;			def HasDirectMove : Predicate<"PPCSubTarget->hasDirectMove()">;
	def NoP9Vector : Predicate<"!PPCSubTarget->hasP9Vector()">;			def NoP9Vector : Predicate<"!PPCSubTarget->hasP9Vector()">;
	(1,982 lines not shown)
	def : Pat<(f32 (bitconvert i32:$A)),			def : Pat<(f32 (bitconvert i32:$A)),
	(f32 (XSCVSPDPN			(f32 (XSCVSPDPN
	(XXSLDWI MovesToVSR.LE_WORD_1, MovesToVSR.LE_WORD_1, 1)))>;			(XXSLDWI MovesToVSR.LE_WORD_1, MovesToVSR.LE_WORD_1, 1)))>;

	// bitconvert f64 -> i64			// bitconvert f64 -> i64
	// (move to GPR, nothing else needed)			// (move to GPR, nothing else needed)
	def : Pat<(i64 (bitconvert f64:$S)),			def : Pat<(i64 (bitconvert f64:$S)),
	(i64 (MFVSRD $S))>;			(i64 (MFVSRD $S))>;
				def : Pat<(i64 (PPCstrict_mfvsr f64:$A)),
				steven.zhang: Please move these into a group that has truncating semantics. This is not a bitconvert.
				(i64 (MFVSRD f64:$A))>;

	// bitconvert i64 -> f64			// bitconvert i64 -> f64
	// (move to FPR, nothing else needed)			// (move to FPR, nothing else needed)
	def : Pat<(f64 (bitconvert i64:$S)),			def : Pat<(f64 (bitconvert i64:$S)),
	(f64 (MTVSRD $S))>;			(f64 (MTVSRD $S))>;

	// Rounding to integer.			// Rounding to integer.
	def : Pat<(i64 (lrint f64:$S)),			def : Pat<(i64 (lrint f64:$S)),
	(i64 (MFVSRD (FCTID $S)))>;			(i64 (MFVSRD (FCTID $S)))>;
	def : Pat<(i64 (lrint f32:$S)),			def : Pat<(i64 (lrint f32:$S)),
	(i64 (MFVSRD (FCTID (COPY_TO_REGCLASS $S, F8RC))))>;			(i64 (MFVSRD (FCTID (COPY_TO_REGCLASS $S, F8RC))))>;
	def : Pat<(i64 (llrint f64:$S)),			def : Pat<(i64 (llrint f64:$S)),
	(i64 (MFVSRD (FCTID $S)))>;			(i64 (MFVSRD (FCTID $S)))>;
	def : Pat<(i64 (llrint f32:$S)),			def : Pat<(i64 (llrint f32:$S)),
	(i64 (MFVSRD (FCTID (COPY_TO_REGCLASS $S, F8RC))))>;			(i64 (MFVSRD (FCTID (COPY_TO_REGCLASS $S, F8RC))))>;
	def : Pat<(i64 (lround f64:$S)),			def : Pat<(i64 (lround f64:$S)),
	(i64 (MFVSRD (FCTID (XSRDPI $S))))>;			(i64 (MFVSRD (FCTID (XSRDPI $S))))>;
	def : Pat<(i64 (lround f32:$S)),			def : Pat<(i64 (lround f32:$S)),
	(i64 (MFVSRD (FCTID (XSRDPI (COPY_TO_REGCLASS $S, VSFRC)))))>;			(i64 (MFVSRD (FCTID (XSRDPI (COPY_TO_REGCLASS $S, VSFRC)))))>;
	def : Pat<(i64 (llround f64:$S)),			def : Pat<(i64 (llround f64:$S)),
	(i64 (MFVSRD (FCTID (XSRDPI $S))))>;			(i64 (MFVSRD (FCTID (XSRDPI $S))))>;
	def : Pat<(i64 (llround f32:$S)),			def : Pat<(i64 (llround f32:$S)),
	(i64 (MFVSRD (FCTID (XSRDPI (COPY_TO_REGCLASS $S, VSFRC)))))>;			(i64 (MFVSRD (FCTID (XSRDPI (COPY_TO_REGCLASS $S, VSFRC)))))>;
				def : Pat<(i32 (PPCstrict_mfvsr f64:$A)),
				(i32 (MFVSRWZ f64:$A))>;

	// Alternate patterns for PPCmtvsrz where the output is v8i16 or v16i8 instead			// Alternate patterns for PPCmtvsrz where the output is v8i16 or v16i8 instead
	// of f64			// of f64
	def : Pat<(v8i16 (PPCmtvsrz i32:$A)),			def : Pat<(v8i16 (PPCmtvsrz i32:$A)),
	(v8i16 (SUBREG_TO_REG (i64 1), (MTVSRWZ $A), sub_64))>;			(v8i16 (SUBREG_TO_REG (i64 1), (MTVSRWZ $A), sub_64))>;
	def : Pat<(v16i8 (PPCmtvsrz i32:$A)),			def : Pat<(v16i8 (PPCmtvsrz i32:$A)),
	(v16i8 (SUBREG_TO_REG (i64 1), (MTVSRWZ $A), sub_64))>;			(v16i8 (SUBREG_TO_REG (i64 1), (MTVSRWZ $A), sub_64))>;

	(451 lines not shown)
	(STXSIBX (XSCVDPSXWS f64:$src), xoaddr:$dst)>;			(STXSIBX (XSCVDPSXWS f64:$src), xoaddr:$dst)>;

	// Instructions for store(fptoui).			// Instructions for store(fptoui).
	def : Pat<(PPCstore_scal_int_from_vsr			def : Pat<(PPCstore_scal_int_from_vsr
	(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xaddrX4:$dst, 8),			(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xaddrX4:$dst, 8),
	(STXSDX (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),			(STXSDX (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),
	xaddrX4:$dst)>;			xaddrX4:$dst)>;
	def : Pat<(PPCstore_scal_int_from_vsr			def : Pat<(PPCstore_scal_int_from_vsr
	(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), iaddrX4:$dst, 8),			(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), iaddrX4:$dst, 8),
				uweigand: This also doesn't look quite correct. The XSCVQP... instructions are not (yet?) marked as mayRaiseFPException; instead they're marked as hasSideEffects. This means that the exception flag is probably not going to be automatically transferred over to the MI level. I think if the instructions are changed to set mayRaiseFPException, that should work correctly. But it would be best to have a test case that validates that the "nofpexcept" marker is transferred depending on the value of the "fpexcept." metadata in the strict intrinsic (in LLVM IR).
				qiucf: Thanks for the reminder. The FP exception bits in the PPC instruction definition files need to be carefully re-examined with more tests.
	(STXSD (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),			(STXSD (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),
	iaddrX4:$dst)>;			iaddrX4:$dst)>;
	def : Pat<(PPCstore_scal_int_from_vsr			def : Pat<(PPCstore_scal_int_from_vsr
	(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xoaddr:$dst, 4),			(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xoaddr:$dst, 4),
	(STXSIWX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), xoaddr:$dst)>;			(STXSIWX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), xoaddr:$dst)>;
	def : Pat<(PPCstore_scal_int_from_vsr			def : Pat<(PPCstore_scal_int_from_vsr
	(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xoaddr:$dst, 2),			(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xoaddr:$dst, 2),
	(STXSIHX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), xoaddr:$dst)>;			(STXSIHX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), xoaddr:$dst)>;

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64-unknown-linux -mcpu=pwr8 | FileCheck %s \
				; RUN: -check-prefix=P8
				steven.zhang: Please specify the option -enable-ppc-quad-precision to enable quad-precision support on PowerPC.
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr9 | FileCheck %s \
				; RUN: -check-prefix=P9
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr8 -mattr=-vsx | \
				; RUN: FileCheck %s -check-prefix=NOVSX
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names < %s -mcpu=e500 \
				; RUN: -mtriple=powerpc-unknown-linux-gnu -mattr=spe | FileCheck %s \
				; RUN: -check-prefix=SPE

				declare i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.f128(fp128, metadata)
				steven.zhang: So what happens if the type is ppcfp128?
				declare i64 @llvm.experimental.constrained.fptoui.i64.f128(fp128, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128, metadata)

				define signext i32 @q_to_i32(fp128 %m) #0 {
				; P8-LABEL: q_to_i32:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixkfsi
				; P8-NEXT: nop
				; P8-NEXT: extsw r3, r3
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_i32:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixkfsi
				; P9-NEXT: nop
				; P9-NEXT: extsw r3, r3
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_i32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixkfsi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: extsw r3, r3
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: q_to_i32:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: bl __fixkfsi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = tail call i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				ret i32 %conv
				}

				define i64 @q_to_i64(fp128 %m) #0 {
				; P8-LABEL: q_to_i64:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixkfdi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_i64:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixkfdi
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_i64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixkfdi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: q_to_i64:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: bl __fixkfdi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = tail call i64 @llvm.experimental.constrained.fptosi.i64.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				ret i64 %conv
				}

				define i64 @q_to_u64(fp128 %m) #0 {
				; P8-LABEL: q_to_u64:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixunskfdi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_u64:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixunskfdi
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_u64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixunskfdi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: q_to_u64:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: bl __fixunskfdi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = tail call i64 @llvm.experimental.constrained.fptoui.i64.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				ret i64 %conv
				}

				define zeroext i32 @q_to_u32(fp128 %m) #0 {
				; P8-LABEL: q_to_u32:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixunskfsi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_u32:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixunskfsi
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_u32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixunskfsi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: q_to_u32:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: bl __fixunskfsi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = tail call i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				ret i32 %conv
				}

				attributes #0 = { strictfp }

llvm/test/CodeGen/PowerPC/fp-strict-conv.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64-unknown-linux -mcpu=pwr8 | FileCheck %s
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr9 | FileCheck %s
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr8 -mattr=-vsx | \
				; RUN: FileCheck %s -check-prefix=NOVSX
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names < %s -mcpu=e500 \
				steven.zhang: Add a RUN line for the SPE target.
				; RUN: -mtriple=powerpc-unknown-linux-gnu -mattr=spe | FileCheck %s \
				; RUN: -check-prefix=SPE
				steven.zhang: Add a test for fp128.

				declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.f64(double, metadata)
				declare i64 @llvm.experimental.constrained.fptoui.i64.f64(double, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)

				declare i32 @llvm.experimental.constrained.fptosi.i32.f32(float, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.f32(float, metadata)
				declare i64 @llvm.experimental.constrained.fptoui.i64.f32(float, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.f32(float, metadata)

				declare i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.f128(fp128, metadata)
				declare i64 @llvm.experimental.constrained.fptoui.i64.f128(fp128, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128, metadata)

				define i32 @d_to_i32(double %m) #0 {
				; CHECK-LABEL: d_to_i32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_i32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwz r3, -4(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: d_to_i32:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: evmergelo r3, r3, r4
				; SPE-NEXT: efdctsiz r3, r3
				; SPE-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptosi.i32.f64(double %m, metadata !"fpexcept.ignore") #0
				ret i32 %conv
				}

				define i64 @d_to_i64(double %m) #0 {
				; CHECK-LABEL: d_to_i64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_i64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctidz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: d_to_i64:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: evmergelo r4, r3, r4
				; SPE-NEXT: evmergehi r3, r4, r4
				; SPE-NEXT: # kill: def $r4 killed $r4 killed $s4
				; SPE-NEXT: # kill: def $r3 killed $r3 killed $s3
				; SPE-NEXT: bl __fixdfdi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptosi.i64.f64(double %m, metadata !"fpexcept.ignore") #0
				ret i64 %conv
				}

				define i64 @d_to_u64(double %m) #0 {
				; CHECK-LABEL: d_to_u64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_u64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiduz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: d_to_u64:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: evmergelo r4, r3, r4
				; SPE-NEXT: evmergehi r3, r4, r4
				; SPE-NEXT: # kill: def $r4 killed $r4 killed $s4
				; SPE-NEXT: # kill: def $r3 killed $r3 killed $s3
				; SPE-NEXT: bl __fixunsdfdi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptoui.i64.f64(double %m, metadata !"fpexcept.ignore") #0
				ret i64 %conv
				}

				define zeroext i32 @d_to_u32(double %m) #0 {
				; CHECK-LABEL: d_to_u32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: clrldi r3, r3, 32
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_u32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwuz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwz r3, -4(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: d_to_u32:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: evmergelo r3, r3, r4
				; SPE-NEXT: efdctuiz r3, r3
				; SPE-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptoui.i32.f64(double %m, metadata !"fpexcept.ignore") #0
				ret i32 %conv
				}

				define signext i32 @f_to_i32(float %m) #0 {
				; CHECK-LABEL: f_to_i32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: extsw r3, r3
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_i32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwa r3, -4(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: f_to_i32:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: efsctsiz r3, r3
				; SPE-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptosi.i32.f32(float %m, metadata !"fpexcept.ignore") #0
				ret i32 %conv
				}
				steven.zhang: Is this attribute needed?
				qiucf: > All function calls done in a function that uses constrained floating point intrinsics must have the strictfp attribute.
				The output won't change if we remove this attribute, but it's better to keep it, in line with the LangRef.

				define i64 @f_to_i64(float %m) #0 {
				; CHECK-LABEL: f_to_i64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_i64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctidz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: f_to_i64:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: bl __fixsfdi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptosi.i64.f32(float %m, metadata !"fpexcept.ignore") #0
				ret i64 %conv
				}

				define i64 @f_to_u64(float %m) #0 {
				; CHECK-LABEL: f_to_u64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_u64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiduz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: f_to_u64:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: mflr r0
				; SPE-NEXT: stw r0, 4(r1)
				; SPE-NEXT: stwu r1, -16(r1)
				; SPE-NEXT: .cfi_def_cfa_offset 16
				; SPE-NEXT: .cfi_offset lr, 4
				; SPE-NEXT: bl __fixunssfdi
				; SPE-NEXT: lwz r0, 20(r1)
				; SPE-NEXT: addi r1, r1, 16
				; SPE-NEXT: mtlr r0
				; SPE-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptoui.i64.f32(float %m, metadata !"fpexcept.ignore") #0
				ret i64 %conv
				}

				define zeroext i32 @f_to_u32(float %m) #0 {
				; CHECK-LABEL: f_to_u32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: clrldi r3, r3, 32
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_u32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwuz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwz r3, -4(r1)
				; NOVSX-NEXT: blr
				;
				; SPE-LABEL: f_to_u32:
				; SPE: # %bb.0: # %entry
				; SPE-NEXT: efsctuiz r3, r3
				; SPE-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptoui.i32.f32(float %m, metadata !"fpexcept.ignore") #0
				ret i32 %conv
				}

				attributes #0 = { strictfp }
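
Note on the expected output above: the NOVSX and SPE check lines use `fctiwz`/`fctidz` and `efdctsiz`/`efdctuiz`, whose trailing "z" denotes round toward zero, matching the truncating semantics of fptosi/fptoui. The sketch below illustrates that rounding direction for in-range inputs only. `fpToSInt32` is a hypothetical helper written for this note, not part of the patch, and it says nothing about FP exception behavior.

```cpp
#include <cassert>
#include <cstdint>

// Converts a double to a signed 32-bit integer, rounding toward zero.
// Assumption: the caller passes a value whose truncation fits in int32_t;
// out-of-range or NaN inputs are rejected by the assert, because the C++
// conversion is undefined for them.
inline std::int32_t fpToSInt32(double x) {
  assert(x > -2147483649.0 && x < 2147483648.0 && "input out of int32 range");
  return static_cast<std::int32_t>(x);
}
```

For example, 2.9 becomes 2 and -2.9 becomes -2; the fractional part is dropped rather than rounded to nearest.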