This is an archive of the discontinued LLVM Phabricator instance.

Differential D81537

[PowerPC] Support constrained fp operation for scalar fptosi/fptoui
ClosedPublic

Authored by qiucf on Jun 10 2020, 12:27 AM.

Details

Reviewers

steven.zhang
nemanjai
kbarton
kpn
jsji
uweigand

Group Reviewers

Restricted Project

Commits

rG131b3b9ed4ef: [PowerPC] Support constrained scalar fptosi/fptoui

Summary

This patch adds support for constrained conversion operations (fptoui/fptosi) from f32/f64 to i32/i64.

Vector support will follow in later patches. For targets older than ISA 2.06, we need to make strict_fsetcc/strict_fsetccs work well first.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

qiucf created this revision.Jun 10 2020, 12:27 AM

Herald added a project: Restricted Project. Jun 10 2020, 12:27 AM

Herald added subscribers: llvm-commits, shchenz, hiraditya.

steven.zhang added inline comments.Jun 10 2020, 1:14 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8144	The SDLoc parameter is not needed. Also, can we rename the function to something like getFpNode()? You don't need "strict" in the function name, as it is already one of the function parameters.
8148	!Strict is not needed.
8172	This is not good coding practice. Please use Strict ? 1 : 0, or look for a suitable API in SDValue.
8196	The logic in LowerFP_TO_INTForReuse and LowerFP_TO_INTDirectMove is nearly the same between lines 8168 and 8196. That is expected, as the only difference between the two is how the data is moved from FPR to GPR. So can we add another function to do the conversion? Something like:
LowerFP_TO_INTDirectMove:
  V = convertToFp()
  MFVSR V
LowerFP_TO_INTForReuse:
  V = convertToFp()
  Store V
  Load V
8278	This could be something that we can improve later. We should mark it as legal instead of checking it here, if I understand the intention correctly.
llvm/test/CodeGen/PowerPC/fp-strict-conv.ll
9	Add run for SPE target.
11	Add a test for fp128.

steven.zhang added inline comments.Jun 10 2020, 1:14 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8150	Please give it a default initializer to avoid an uninitialized local variable when llvm_unreachable is off.
llvm/test/CodeGen/PowerPC/fp-strict-conv.ll
172	Is this attribute needed?

Harbormaster failed remote builds in B59746: Diff 269746!Jun 10 2020, 1:36 AM

Address Steven's comments.

qiucf added inline comments.Jun 10 2020, 8:07 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8148	If `Strict` is `false`, we can allow Chain to be null.
llvm/test/CodeGen/PowerPC/fp-strict-conv.ll
172	All function calls done in a function that uses constrained floating point intrinsics must have the strictfp attribute. Although the output won't change if we remove this attribute, it's better to keep it, according to the LangRef.

Remove redundant chain logic.

Harbormaster failed remote builds in B59912: Diff 270025!Jun 10 2020, 9:02 PM

steven.zhang added inline comments.Jun 10 2020, 10:18 PM

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
16	So, what is it if it is ppcfp128 ?

Harbormaster failed remote builds in B59914: Diff 270027!Jun 10 2020, 10:39 PM

qiucf added a child revision: D81669: [PowerPC] Support constrained fp operation for scalar sitofp/uitofp.Jun 11 2020, 9:48 AM

Some code style comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8253	Don't use Tmp; use a meaningful name instead.
8254	Op.getValueType() ?

steven.zhang added inline comments.Jun 12 2020, 2:23 AM

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
4	Please specify option -enable-ppc-quad-precision to enable the quad precision support in powerpc.

qiucf mentioned this in D81818: [NFC] [PowerPC] Use shared method in FP_TO_INT and INT_TO_FP lowering.Jun 14 2020, 7:56 PM

Rebase after D81818 and add f128 support

qiucf added a parent revision: D81818: [NFC] [PowerPC] Use shared method in FP_TO_INT and INT_TO_FP lowering.Jun 14 2020, 10:32 PM

Harbormaster failed remote builds in B60259: Diff 270660!Jun 14 2020, 10:56 PM

steven.zhang added inline comments.Jun 15 2020, 1:57 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8145	The Strict parameter is not needed as you can check the value of the Chain to know it.
8174	IsStrict, IsSigned
llvm/lib/Target/PowerPC/PPCInstrVSX.td
3278	Please move them into a group that has truncating semantics. This is not a bitconvert.

Address some style-related comments

Harbormaster failed remote builds in B61196: Diff 272360!Jun 22 2020, 3:11 AM

steven.zhang added inline comments.Jun 22 2020, 3:52 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Don’t do this for spe target and remove the test for spe. Sorry about the back and forth.
8158	You don't need this assertion now.
8203	So, do we have a problem if it is a strict opcode in this code path?
8261	Move the assertion into convertFPToInt.
8428	ConvertIntToFP and ConvertFPToInt should have the same parameters.
8533	Please remove this kind of change, as it is not part of your patch.

nemanjai added subscribers: jhibbits, chmeee.Jun 24 2020, 5:39 AM

nemanjai added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	I would defer to @jhibbits @chmeee (not sure which of Justin's ID's is active) regarding SPE bits.
8150	This function appears to just take a node and produce a strict version of that node. It seems like there isn't anything target dependent about such an operation so it is very suspicious to me why this is in the PPC back end. If for some reason it has to be here, please explain why in a comment. If the only target specific part of this is the `STRICT_MFVSR` node, then at least the rest can be handled by target independent code, can't it?
8153	This seems dangerous to me. You are deciding whether to return a strict node based on whether a valid Chain is provided. I am personally against making decisions based on orthogonal concerns. If the caller wants a strict node, that should be explicit rather than this strange implicit contract of "If you want a strict node, provide a valid chain."
llvm/lib/Target/PowerPC/PPCISelLowering.h
440	Why? The instruction simply moves bits around. It does not cause any exceptions, it is not subject to rounding, etc. If this is necessary, it needs to be clear from the comment why.

jhibbits added inline comments.Jun 24 2020, 6:48 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	What are the semantic differences between STRICT_FP_TO_UINT and FP_TO_UINT? EFDCTUIZ/EFSCTUIZ and their signed counterparts, which we currently use for the FP_TO_{U,S}INT, saturate if they can't be represented as a 32-bit integer, and round toward zero always (the non-Z variants round via the current rounding mode).

uweigand added inline comments.Jun 24 2020, 7:31 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	FP_TO_UINT assumes the current rounding mode is default, and exception conditions can be ignored. With STRICT_FP_TO_UINT those assumptions no longer apply, so it would appear that those instructions you mention should not be used there.

uweigand added inline comments.Jun 25 2020, 8:43 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Ah -- my last comment was incorrect, sorry for any confusion. STRICT_FP_TO_UINT/SINT are in fact an exception to most "STRICT" operations in that they do not use the current rounding mode, but always round towards zero. (Following the C standard as well as the LLVM IR specification.) So for these operations the only difference between strict and non-strict variants is whether exception conditions can be ignored or not.

jhibbits added inline comments.Jun 25 2020, 12:10 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Ah, okay. So, if I understand correctly, EF{S,D}CT{S,U}I should be used for fp_to_{s,u}int, and the current 'Z' variants should be used for the strict_fp_to_*int.

uweigand added inline comments.Jun 26 2020, 5:50 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Hmm, if I'm reading the SPE_PEM correctly, I think the "Z" variants are in fact correct for both strict and non-strict variants: they round towards zero (which both variants do), and they handle exceptions (which the strict variant requires, while the non-strict variant doesn't care). The non-"Z" variants seem wrong either way since they use the current rounding mode, which is incorrect for both strict and non-strict variants.
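
The rounding distinction discussed in this thread can be observed outside LLVM too. The following is a minimal standalone C++ sketch, not part of the patch: a C++ cast from double to an integer type truncates toward zero whatever the dynamic rounding mode is, which mirrors the semantics described for fptosi/fptoui and the "Z" instruction variants, while std::lrint honours the current mode, as the non-"Z" variants are said to. The helper names are made up for illustration, and the volatile variables are only there to discourage the compiler from folding or moving the conversions.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Truncating conversion, as fptosi/fptoui specify (strict or not).
// The volatile read keeps the conversion from being folded at compile time.
long truncToLong(double X) {
  volatile double V = X;
  return static_cast<long>(V);
}

// The same truncating cast, performed while a non-default rounding mode
// is active. The previous mode is restored before returning.
long truncUnderMode(double X, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  volatile long R = truncToLong(X);
  std::fesetround(Saved);
  return R;
}

// Conversion that follows the dynamic rounding mode, for contrast.
long roundWithMode(double X, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  volatile double V = X;
  volatile long R = std::lrint(V);
  std::fesetround(Saved);
  return R;
}
```

With FE_UPWARD active, truncUnderMode(2.2, FE_UPWARD) still yields 2, whereas roundWithMode(2.2, FE_UPWARD) yields 3. Whether exception flags may be ignored, the one remaining difference between the strict and non-strict forms, is not modelled here.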

jhibbits added inline comments.Jun 26 2020, 2:35 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	Thanks for that explanation @uweigand now I understand. So the change here looks fine to me.

steven.zhang added inline comments.Jun 28 2020, 6:53 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	The reason I asked to remove the SPE part here is to split this patch into two: one for PowerPC and another for SPE, which needs some input from SPE experts. Does that make sense?

qiucf marked 9 inline comments as done.Jun 29 2020, 2:33 AM

qiucf added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8150	Thanks for the comments. The method (including some overloads) doesn't help much, so I removed it and wrote a simple helper method to get the strict version of a PPC-specific opcode. There's a method `mutateStrictFPToFP` doing similar things, but it may not be suitable here. I'll send an NFC patch to make opcode conversion a hook so that each target can benefit from it.
8203	Do you mean these PPC-specific opcodes are not strict? But the result is either load/store or direct moved. What we do here is to keep operands of value consistent. So changing these opcodes to strict may be unnecessary.
8428	FPToInt is round-then-move, while IntToFP is move-then-round. So for IntToFP we need extra information from the original `Op` besides the moved `Src`.
llvm/lib/Target/PowerPC/PPCISelLowering.h
440	Because (1) this prevents it from being combined somewhere unexpectedly; (2) all strict nodes have an extra operand for their chains, so replacing the original `strict_*` node with a non-strict one will cause an operand mismatch. I added the necessary comments. Thanks.

Removed SPE logic from this revision.
Add some comments for strict nodes.
Removed getFPNode method.
Addressed other minor comments.

qiucf mentioned this in D82747: [PowerPC] Support constrained int/fp conversion in SPE targets.Jun 29 2020, 2:39 AM

Harbormaster failed remote builds in B62108: Diff 274020!Jun 29 2020, 2:40 AM

qiucf marked 2 inline comments as done.Jun 29 2020, 2:41 AM

qiucf added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
582	I split D82747 out from this. It's much clearer and independent from this.

qiucf mentioned this in D81669: [PowerPC] Support constrained fp operation for scalar sitofp/uitofp.Jun 29 2020, 2:42 AM

Reflect changes after strict conversion for SPE and enable-ppc-quad-precision's removal.

Harbormaster failed remote builds in B64649: Diff 278704!Jul 17 2020, 3:14 AM

Ping..

LGTM now. Please hold on for several days to see if @nemanjai or @uweigand have comments.

This revision is now accepted and ready to land.Aug 4 2020, 2:34 AM

This doesn't look correct. As far as I can see, none of the conversion functions were actually changed to handle strict operations. For one, you'll need strict variants of all the PowerPC-specific conversion operations, use them in all the conversion subroutines, and consistently track their chain nodes.

The patch only adds a strict variant of the direct move, which seems to me the only operation where actually a strict version is not required ...

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8230	This doesn't look right. The Chain produced by this strict node just vanishes, this cannot be correct.
8247	And you need strict versions of all these conversion operations, I'd assume.
8258	Nothing in the function actually handles strict nodes, that cannot be right.
8307	Why do we need a strict version of a plain move?
8312	Again, nothing in this function actually handles strict nodes ...

In D81537#2193216, @uweigand wrote:

This doesn't look correct. As far as I can see, none of the conversion functions were actually changed to handle strict operations. For one, you'll need strict variants of all the PowerPC-specific conversion operations, use them in all the conversion subroutines, and consistently track their chain nodes.

The patch only adds a strict variant of the direct move, which seems to me the only operation where actually a strict version is not required ...

Thanks for pointing them out! I have something unclear about chains:

(1) If a constrained operation is expanded into several FP nodes a-b-c, they should all have chain set to former operation (b's chain is a, c's chain is b) even if they have def relationship?

(2) In MachineInstr emitting after ISel, chains are identified just by operand type (countOperand), so some chains are not ignored and assert hit. Is this expected?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8307	Yes, a strict move doesn't look reasonable. But the original `strict_fptosi` node will be replaced by the result. So if directly return the move, operands will not match (no chain in `mfvsr`). Is there a better way here?

In D81537#2207249, @qiucf wrote:

In D81537#2193216, @uweigand wrote:

This doesn't look correct. As far as I can see, none of the conversion functions were actually changed to handle strict operations. For one, you'll need strict variants of all the PowerPC-specific conversion operations, use them in all the conversion subroutines, and consistently track their chain nodes.

The patch only adds a strict variant of the direct move, which seems to me the only operation where actually a strict version is not required ...

Thanks for pointing them out! I have something unclear about chains:

(1) If a constrained operation is expanded into several FP nodes a-b-c, they should all have chain set to former operation (b's chain is a, c's chain is b) even if they have def relationship?

That may depend on the specific semantics of which of those nodes may or may not trap. In some cases, the original sequence may in fact not be valid at all for strict mode. But if it is, then they'll need to be chained up properly. If they have data dependencies, then it usually makes sense for the chain to follow that dependency. In other cases, there may be an option for more flexibility by allowing certain operations to be re-scheduled. In those cases you'd give the same input chain to all operations and collect all output chains via a TokenFactor.
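
To make the two chaining shapes concrete, here is a toy standalone C++ sketch. It does not use SelectionDAG at all: operations are plain integers, a chain is modelled only as a "must come before" edge, and the function names are hypothetical. It counts how many schedules each shape permits for three trapping operations (0, 1, 2) followed by a consumer of the final chain (3).

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// An ordering edge: first must be scheduled before second.
using Edge = std::pair<int, int>;

// Counts the schedules of NumOps operations that respect every edge,
// by brute-force enumeration of all permutations.
int countValidSchedules(int NumOps, const std::vector<Edge> &Edges) {
  std::vector<int> Order(NumOps);
  for (int I = 0; I < NumOps; ++I)
    Order[I] = I;

  int Count = 0;
  do {
    // Pos[Op] is the slot that Op occupies in this candidate schedule.
    std::vector<int> Pos(NumOps);
    for (int Slot = 0; Slot < NumOps; ++Slot)
      Pos[Order[Slot]] = Slot;

    bool Ok = true;
    for (const Edge &E : Edges)
      Ok = Ok && Pos[E.first] < Pos[E.second];
    if (Ok)
      ++Count;
  } while (std::next_permutation(Order.begin(), Order.end()));
  return Count;
}

// Serial chaining: each operation takes the previous one's output chain.
std::vector<Edge> serialChain() { return {{0, 1}, {1, 2}, {2, 3}}; }

// Shared input chain: operations 0..2 are unordered among themselves, and
// a TokenFactor-like join (folded into operation 3 here) waits for all.
std::vector<Edge> tokenFactorJoin() { return {{0, 3}, {1, 3}, {2, 3}}; }
```

Under these assumptions the serial shape admits exactly one schedule, while the joined shape admits six, which is the extra scheduling freedom the comment refers to.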

(2) In MachineInstr emitting after ISel, chains are identified just by operand type (countOperand), so some chains are not ignored and assert hit. Is this expected?

I'm not sure I understand what specific case you're referring to. But in any case, a chain should *never* be simply ignored, that would always be a bug.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8307	So this gets expanded to a PPCISD::FCTI... variant (inside convertFPToInt) followed by the MFVSR. Now, with proper chain handling, the input chain of the strict_fptosi is consumed by a strict variant of FCTI..., and the output chain of that STRICT_FCTI... is then the correct output chain for the whole operation. Only the data output of the STRICT_FCTI... then acts as the input of the MFVSR, and the output of the MFVSR is the correct value output of the whole operation. So in short, you need to replace
  (out-val, out-chain) = strict_fptosi (in-val, in-chain)
by
  (tmp-val, out-chain) = STRICT_FCTI... (in-val, in-chain)
  out-val = MFVSR (tmp-val)
This will probably require some ReplaceAllUses... instead of just returning a result, as is already done elsewhere with chain output instructions.
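
The replacement described above can be sketched with a toy graph in standalone C++. None of the types below are LLVM's: Value, Node, Graph and lowerStrictFPToSInt are hypothetical stand-ins, and STRICT_FCTIWZ is just one of the strict conversion opcodes used as an example. The point is only the wiring: the conversion consumes the input chain and supplies the output chain, while the move sees nothing but the data result.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A reference to one result of a node: Result 0 is data, Result 1 is chain.
struct Value {
  int Node;
  int Result;
};

struct Node {
  std::string Opcode;
  std::vector<Value> Operands;
  int NumResults;
};

struct Graph {
  std::vector<Node> Nodes;

  // Appends a node and returns a reference to its first result.
  Value add(const std::string &Opcode, std::vector<Value> Operands,
            int NumResults) {
    Nodes.push_back({Opcode, Operands, NumResults});
    return {static_cast<int>(Nodes.size()) - 1, 0};
  }
};

// The (value, chain) pair that stands in for the original strict node.
struct Lowered {
  Value Data;
  Value Chain;
};

// Lowers a strict fp-to-int conversion. As with strict nodes in the patch,
// the chain is operand 0 and the source value is operand 1.
Lowered lowerStrictFPToSInt(Graph &G, Value InChain, Value InVal) {
  // The conversion may trap: it takes the input chain and yields a chain.
  Value Conv = G.add("STRICT_FCTIWZ", {InChain, InVal}, 2);
  // The move only transfers bits: data in, data out, no chain.
  Value Mov = G.add("MFVSR", {Value{Conv.Node, 0}}, 1);
  return {Mov, Value{Conv.Node, 1}};
}

// Lowers one conversion in a fresh graph and reports which opcodes supply
// the data and chain results, as "<data opcode>/<chain opcode>".
std::string describeLowering() {
  Graph G;
  Value Entry = G.add("EntryToken", {}, 1);
  Value Src = G.add("LoadF64", {}, 1);
  Lowered L = lowerStrictFPToSInt(G, Entry, Src);
  return G.Nodes[L.Data.Node].Opcode + "/" + G.Nodes[L.Chain.Node].Opcode;
}
```

In this model describeLowering() returns "MFVSR/STRICT_FCTIWZ": users of the old value are fed by the move, and users of the old chain are fed by the conversion, so no chain is dropped.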

Thanks for the detailed explanation!

Remove strict mfvsr
Update tests
Add strict fc*
Add chains to some expanded operation

uweigand added inline comments.Aug 13 2020, 5:00 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8068	This still looks wrong to me. I think you should replace the chain with the one from Conv, and return the value (i.e. Mov). Something like:
  SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
  DAG.ReplaceAllUsesOfValueWith(SDValue(Op, 1), Conv.getValue(1));
  return Mov;
(Actually, returning the value is then the same in strict and non-strict cases, so this can be merged.) Or, another possible approach would be to not use ReplaceAllUses at all but reconstruct the multi-output value via DAG.getMergeValues. Something like:
  SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
  return DAG.getMergeValues({Mov, Conv.getValue(1)}, dl);
llvm/lib/Target/PowerPC/PPCInstrVSX.td
3772	This also doesn't look quite correct. The XSCVQP... instructions are not (yet?) marked as mayRaiseFPException, instead they're marked as hasSideEffects. This means that the exception flag is probably not going to be automatically transferred over to the MI level. I think if the instructions are changed to set mayRaiseFPException, that should work correctly. But it would be best to have a test case that validates that the "nofpexcept" marker is transferred depending on the value of the "fpexcept." metadata in the strict intrinsic (in LLVM IR).

Return a merge_value for round and move.
Set fp exception bit for f128 round instructions.

llvm/lib/Target/PowerPC/PPCInstrVSX.td
3772	Thanks for the reminder. The FP exception bits in PPC instruction definition files need to be carefully re-examined with more tests..

This LGTM now. Thanks!

Closed by commit rG131b3b9ed4ef: [PowerPC] Support constrained scalar fptosi/fptoui (authored by qiucf). Aug 19 2020, 10:35 PM

This revision was automatically updated to reflect the committed changes.

qiucf added a commit: rG131b3b9ed4ef: [PowerPC] Support constrained scalar fptosi/fptoui.

qiucf mentioned this in D71287: [PowerPC] Use fcti[dw] instructions in additional cases.Dec 29 2020, 7:30 PM

Revision Contents

Path	Size
llvm/lib/Target/PowerPC/	6 lines, 107 lines, 6 lines, 24 lines, 27 lines
llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll	602 lines
llvm/test/CodeGen/PowerPC/fp-strict-conv.ll	181 lines

Diff 286711

llvm/lib/Target/PowerPC/PPCISelLowering.h

[...] enum NodeType : unsigned {
/// lower (IDX=1) half of v4f32 to v2f64.		/// lower (IDX=1) half of v4f32 to v2f64.
FP_EXTEND_HALF,		FP_EXTEND_HALF,

/// MAT_PCREL_ADDR = Materialize a PC Relative address. This can be done		/// MAT_PCREL_ADDR = Materialize a PC Relative address. This can be done
/// either through an add like PADDI or through a PC Relative load like		/// either through an add like PADDI or through a PC Relative load like
/// PLD.		/// PLD.
MAT_PCREL_ADDR,		MAT_PCREL_ADDR,

		// Constrained conversion from floating point to int
		STRICT_FCTIDZ = ISD::FIRST_TARGET_STRICTFP_OPCODE,
		STRICT_FCTIWZ,
		STRICT_FCTIDUZ,
		STRICT_FCTIWUZ,

/// CHAIN = STBRX CHAIN, GPRC, Ptr, Type - This is a		/// CHAIN = STBRX CHAIN, GPRC, Ptr, Type - This is a
/// byte-swapping store instruction. It byte-swaps the low "Type" bits of		/// byte-swapping store instruction. It byte-swaps the low "Type" bits of
/// the GPRC input, then stores it through Ptr. Type can be either i16 or		/// the GPRC input, then stores it through Ptr. Type can be either i16 or
/// i32.		/// i32.
STBRX = ISD::FIRST_TARGET_MEMORY_OPCODE,		STBRX = ISD::FIRST_TARGET_MEMORY_OPCODE,

/// GPRC, CHAIN = LBRX CHAIN, Ptr, Type - This is a		/// GPRC, CHAIN = LBRX CHAIN, Ptr, Type - This is a
/// byte-swapping load instruction. It loads "Type" bits, byte swaps it,		/// byte-swapping load instruction. It loads "Type" bits, byte swaps it,
[...]

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

[...] if (Subtarget.hasSPE()) {
setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Legal);		setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Legal);
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i32, Legal);		setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i32, Legal);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i32, Legal);		setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i32, Legal);
setOperationAction(ISD::FP_TO_SINT, MVT::i32, Legal);		setOperationAction(ISD::FP_TO_SINT, MVT::i32, Legal);
setOperationAction(ISD::SINT_TO_FP, MVT::i32, Legal);		setOperationAction(ISD::SINT_TO_FP, MVT::i32, Legal);
setOperationAction(ISD::UINT_TO_FP, MVT::i32, Legal);		setOperationAction(ISD::UINT_TO_FP, MVT::i32, Legal);
} else {		} else {
// PowerPC turns FP_TO_SINT into FCTIWZ and some load/stores.		// PowerPC turns FP_TO_SINT into FCTIWZ and some load/stores.
		setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Custom);
setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);		setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);

// PowerPC does not have [U|S]INT_TO_FP		// PowerPC does not have [U|S]INT_TO_FP
setOperationAction(ISD::SINT_TO_FP, MVT::i32, Expand);		setOperationAction(ISD::SINT_TO_FP, MVT::i32, Expand);
setOperationAction(ISD::UINT_TO_FP, MVT::i32, Expand);		setOperationAction(ISD::UINT_TO_FP, MVT::i32, Expand);
}		}

if (Subtarget.hasDirectMove() && isPPC64) {		if (Subtarget.hasDirectMove() && isPPC64) {
[...] PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setCondCodeAction(ISD::SETUGT, MVT::f64, Expand);		setCondCodeAction(ISD::SETUGT, MVT::f64, Expand);
setCondCodeAction(ISD::SETUEQ, MVT::f32, Expand);		setCondCodeAction(ISD::SETUEQ, MVT::f32, Expand);
setCondCodeAction(ISD::SETUEQ, MVT::f64, Expand);		setCondCodeAction(ISD::SETUEQ, MVT::f64, Expand);
setCondCodeAction(ISD::SETOGE, MVT::f32, Expand);		setCondCodeAction(ISD::SETOGE, MVT::f32, Expand);
setCondCodeAction(ISD::SETOGE, MVT::f64, Expand);		setCondCodeAction(ISD::SETOGE, MVT::f64, Expand);
setCondCodeAction(ISD::SETOLE, MVT::f32, Expand);		setCondCodeAction(ISD::SETOLE, MVT::f32, Expand);
setCondCodeAction(ISD::SETOLE, MVT::f64, Expand);		setCondCodeAction(ISD::SETOLE, MVT::f64, Expand);
setCondCodeAction(ISD::SETONE, MVT::f32, Expand);		setCondCodeAction(ISD::SETONE, MVT::f32, Expand);
setCondCodeAction(ISD::SETONE, MVT::f64, Expand);		setCondCodeAction(ISD::SETONE, MVT::f64, Expand);

if (Subtarget.has64BitSupport()) {		if (Subtarget.has64BitSupport()) {
// They also have instructions for converting between i64 and fp.		// They also have instructions for converting between i64 and fp.
		setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Custom);
		setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Expand);
setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);		setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);
setOperationAction(ISD::FP_TO_UINT, MVT::i64, Expand);		setOperationAction(ISD::FP_TO_UINT, MVT::i64, Expand);
setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);		setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);
setOperationAction(ISD::UINT_TO_FP, MVT::i64, Expand);		setOperationAction(ISD::UINT_TO_FP, MVT::i64, Expand);
// This is just the low 32 bits of a (signed) fp->i64 conversion.		// This is just the low 32 bits of a (signed) fp->i64 conversion.
// We cannot do this with Promote because i64 is not a legal type.		// We cannot do this with Promote because i64 is not a legal type.
		setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Custom);
setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);		setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);

if (Subtarget.hasLFIWAX() || Subtarget.isPPC64())		if (Subtarget.hasLFIWAX() || Subtarget.isPPC64())
setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);		setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);
} else {		} else {
// PowerPC does not have FP_TO_UINT on 32-bit implementations.		// PowerPC does not have FP_TO_UINT on 32-bit implementations.
if (Subtarget.hasSPE()) {		if (Subtarget.hasSPE()) {
setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Legal);		setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Legal);
setOperationAction(ISD::FP_TO_UINT, MVT::i32, Legal);		setOperationAction(ISD::FP_TO_UINT, MVT::i32, Legal);
} else		} else {
		setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Expand);
setOperationAction(ISD::FP_TO_UINT, MVT::i32, Expand);		setOperationAction(ISD::FP_TO_UINT, MVT::i32, Expand);
}		}
		}

// With the instructions enabled under FPCVT, we can do everything.		// With the instructions enabled under FPCVT, we can do everything.
if (Subtarget.hasFPCVT()) {		if (Subtarget.hasFPCVT()) {
if (Subtarget.has64BitSupport()) {		if (Subtarget.has64BitSupport()) {
		setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Custom);
		setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Custom);
setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);		setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);
setOperationAction(ISD::FP_TO_UINT, MVT::i64, Custom);		setOperationAction(ISD::FP_TO_UINT, MVT::i64, Custom);
setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);		setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);
setOperationAction(ISD::UINT_TO_FP, MVT::i64, Custom);		setOperationAction(ISD::UINT_TO_FP, MVT::i64, Custom);
}		}

		setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Custom);
		setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Custom);
setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);		setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);		setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);
setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);		setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);
setOperationAction(ISD::UINT_TO_FP, MVT::i32, Custom);		setOperationAction(ISD::UINT_TO_FP, MVT::i32, Custom);
}		}

if (Subtarget.use64BitRegs()) {		if (Subtarget.use64BitRegs()) {
// 64-bit PowerPC implementations can support i64 types directly		// 64-bit PowerPC implementations can support i64 types directly
[...] const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode) const {
case PPCISD::BUILD_SPE64: return "PPCISD::BUILD_SPE64";		case PPCISD::BUILD_SPE64: return "PPCISD::BUILD_SPE64";
case PPCISD::EXTRACT_SPE: return "PPCISD::EXTRACT_SPE";		case PPCISD::EXTRACT_SPE: return "PPCISD::EXTRACT_SPE";
case PPCISD::EXTSWSLI: return "PPCISD::EXTSWSLI";		case PPCISD::EXTSWSLI: return "PPCISD::EXTSWSLI";
case PPCISD::LD_VSX_LH: return "PPCISD::LD_VSX_LH";		case PPCISD::LD_VSX_LH: return "PPCISD::LD_VSX_LH";
case PPCISD::FP_EXTEND_HALF: return "PPCISD::FP_EXTEND_HALF";		case PPCISD::FP_EXTEND_HALF: return "PPCISD::FP_EXTEND_HALF";
case PPCISD::MAT_PCREL_ADDR: return "PPCISD::MAT_PCREL_ADDR";		case PPCISD::MAT_PCREL_ADDR: return "PPCISD::MAT_PCREL_ADDR";
case PPCISD::LD_SPLAT: return "PPCISD::LD_SPLAT";		case PPCISD::LD_SPLAT: return "PPCISD::LD_SPLAT";
case PPCISD::FNMSUB: return "PPCISD::FNMSUB";		case PPCISD::FNMSUB: return "PPCISD::FNMSUB";
		case PPCISD::STRICT_FCTIDZ:
		return "PPCISD::STRICT_FCTIDZ";
		case PPCISD::STRICT_FCTIWZ:
		return "PPCISD::STRICT_FCTIWZ";
		case PPCISD::STRICT_FCTIDUZ:
		return "PPCISD::STRICT_FCTIDUZ";
		case PPCISD::STRICT_FCTIWUZ:
		return "PPCISD::STRICT_FCTIWUZ";
}		}
return nullptr;		return nullptr;
}		}

EVT PPCTargetLowering::getSetCCResultType(const DataLayout &DL, LLVMContext &C,		EVT PPCTargetLowering::getSetCCResultType(const DataLayout &DL, LLVMContext &C,
EVT VT) const {		EVT VT) const {
if (!VT.isVector())		if (!VT.isVector())
return Subtarget.useCRBits() ? MVT::i1 : MVT::i32;		return Subtarget.useCRBits() ? MVT::i1 : MVT::i32;
[6,458 lines not shown]	case ISD::SETLE:
Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, RHS, LHS, Flags);		Cmp = DAG.getNode(ISD::FSUB, dl, CmpVT, RHS, LHS, Flags);
if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits		if (Cmp.getValueType() == MVT::f32) // Comparison is always 64-bits
Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);		Cmp = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Cmp);
return DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, TV, FV);		return DAG.getNode(PPCISD::FSEL, dl, ResVT, Cmp, TV, FV);
}		}
return Op;		return Op;
}		}

		static unsigned getPPCStrictOpcode(unsigned Opc) {
		switch (Opc) {
		default:
		llvm_unreachable("No strict version of this opcode!");
		case PPCISD::FCTIDZ:
		return PPCISD::STRICT_FCTIDZ;
		case PPCISD::FCTIWZ:
		return PPCISD::STRICT_FCTIWZ;
		case PPCISD::FCTIDUZ:
		return PPCISD::STRICT_FCTIDUZ;
		case PPCISD::FCTIWUZ:
		return PPCISD::STRICT_FCTIWUZ;
		}
		}

static SDValue convertFPToInt(SDValue Op, SelectionDAG &DAG,		static SDValue convertFPToInt(SDValue Op, SelectionDAG &DAG,
const PPCSubtarget &Subtarget) {		const PPCSubtarget &Subtarget) {
SDLoc dl(Op);		SDLoc dl(Op);
bool IsSigned = Op.getOpcode() == ISD::FP_TO_SINT;		bool IsStrict = Op->isStrictFPOpcode();
SDValue Src = Op.getOperand(0);		bool IsSigned = Op.getOpcode() == ISD::FP_TO_SINT ||
		Op.getOpcode() == ISD::STRICT_FP_TO_SINT;
		// For strict nodes, source is the second operand.
		SDValue Src = Op.getOperand(IsStrict ? 1 : 0);
		SDValue Chain = IsStrict ? Op.getOperand(0) : SDValue();
assert(Src.getValueType().isFloatingPoint());		assert(Src.getValueType().isFloatingPoint());
if (Src.getValueType() == MVT::f32)		if (Src.getValueType() == MVT::f32) {
		if (IsStrict) {
		Src = DAG.getNode(ISD::STRICT_FP_EXTEND, dl, {MVT::f64, MVT::Other},
		{Chain, Src});
		Chain = Src.getValue(1);
		} else
Src = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Src);		Src = DAG.getNode(ISD::FP_EXTEND, dl, MVT::f64, Src);
		}
SDValue Conv;		SDValue Conv;
		unsigned Opc = ISD::DELETED_NODE;
switch (Op.getSimpleValueType().SimpleTy) {		switch (Op.getSimpleValueType().SimpleTy) {
default: llvm_unreachable("Unhandled FP_TO_INT type in custom expander!");		default: llvm_unreachable("Unhandled FP_TO_INT type in custom expander!");
case MVT::i32:		case MVT::i32:
Conv = DAG.getNode(		Opc = IsSigned ? PPCISD::FCTIWZ
IsSigned ? PPCISD::FCTIWZ		: (Subtarget.hasFPCVT() ? PPCISD::FCTIWUZ : PPCISD::FCTIDZ);
: (Subtarget.hasFPCVT() ? PPCISD::FCTIWUZ : PPCISD::FCTIDZ),
dl, MVT::f64, Src);
break;		break;
case MVT::i64:		case MVT::i64:
assert((IsSigned || Subtarget.hasFPCVT()) &&		assert((IsSigned || Subtarget.hasFPCVT()) &&
"i64 FP_TO_UINT is supported only with FPCVT");		"i64 FP_TO_UINT is supported only with FPCVT");
Conv = DAG.getNode(IsSigned ? PPCISD::FCTIDZ : PPCISD::FCTIDUZ, dl,		Opc = IsSigned ? PPCISD::FCTIDZ : PPCISD::FCTIDUZ;
MVT::f64, Src);		}
		if (IsStrict) {
		Opc = getPPCStrictOpcode(Opc);
		Conv = DAG.getNode(Opc, dl, {MVT::f64, MVT::Other}, {Chain, Src});
		} else {
		Conv = DAG.getNode(Opc, dl, MVT::f64, Src);
}		}
return Conv;		return Conv;
}		}

void PPCTargetLowering::LowerFP_TO_INTForReuse(SDValue Op, ReuseLoadInfo &RLI,		void PPCTargetLowering::LowerFP_TO_INTForReuse(SDValue Op, ReuseLoadInfo &RLI,
SelectionDAG &DAG,		SelectionDAG &DAG,
const SDLoc &dl) const {		const SDLoc &dl) const {
SDValue Tmp = convertFPToInt(Op, DAG, Subtarget);		SDValue Tmp = convertFPToInt(Op, DAG, Subtarget);
bool IsSigned = Op.getOpcode() == ISD::FP_TO_SINT;		bool IsSigned = Op.getOpcode() == ISD::FP_TO_SINT ||
		Op.getOpcode() == ISD::STRICT_FP_TO_SINT;
		bool IsStrict = Op->isStrictFPOpcode();

// Convert the FP value to an int value through memory.		// Convert the FP value to an int value through memory.
bool i32Stack = Op.getValueType() == MVT::i32 && Subtarget.hasSTFIWX() &&		bool i32Stack = Op.getValueType() == MVT::i32 && Subtarget.hasSTFIWX() &&
(IsSigned || Subtarget.hasFPCVT());		(IsSigned || Subtarget.hasFPCVT());
SDValue FIPtr = DAG.CreateStackTemporary(i32Stack ? MVT::i32 : MVT::f64);		SDValue FIPtr = DAG.CreateStackTemporary(i32Stack ? MVT::i32 : MVT::f64);
int FI = cast<FrameIndexSDNode>(FIPtr)->getIndex();		int FI = cast<FrameIndexSDNode>(FIPtr)->getIndex();
MachinePointerInfo MPI =		MachinePointerInfo MPI =
MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI);		MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI);

// Emit a store to the stack slot.		// Emit a store to the stack slot.
SDValue Chain;		SDValue Chain = IsStrict ? Tmp.getValue(1) : DAG.getEntryNode();
Align Alignment(DAG.getEVTAlign(Tmp.getValueType()));		Align Alignment(DAG.getEVTAlign(Tmp.getValueType()));
if (i32Stack) {		if (i32Stack) {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
Alignment = Align(4);		Alignment = Align(4);
MachineMemOperand *MMO =		MachineMemOperand *MMO =
MF.getMachineMemOperand(MPI, MachineMemOperand::MOStore, 4, Alignment);		MF.getMachineMemOperand(MPI, MachineMemOperand::MOStore, 4, Alignment);
SDValue Ops[] = { DAG.getEntryNode(), Tmp, FIPtr };		SDValue Ops[] = { Chain, Tmp, FIPtr };
Chain = DAG.getMemIntrinsicNode(PPCISD::STFIWX, dl,		Chain = DAG.getMemIntrinsicNode(PPCISD::STFIWX, dl,
DAG.getVTList(MVT::Other), Ops, MVT::i32, MMO);		DAG.getVTList(MVT::Other), Ops, MVT::i32, MMO);
} else		} else
Chain = DAG.getStore(DAG.getEntryNode(), dl, Tmp, FIPtr, MPI, Alignment);		Chain = DAG.getStore(Chain, dl, Tmp, FIPtr, MPI, Alignment);

// Result is a load from the stack slot. If loading 4 bytes, make sure to		// Result is a load from the stack slot. If loading 4 bytes, make sure to
// add in a bias on big endian.		// add in a bias on big endian.
if (Op.getValueType() == MVT::i32 && !i32Stack) {		if (Op.getValueType() == MVT::i32 && !i32Stack) {
FIPtr = DAG.getNode(ISD::ADD, dl, FIPtr.getValueType(), FIPtr,		FIPtr = DAG.getNode(ISD::ADD, dl, FIPtr.getValueType(), FIPtr,
DAG.getConstant(4, dl, FIPtr.getValueType()));		DAG.getConstant(4, dl, FIPtr.getValueType()));
MPI = MPI.getWithOffset(Subtarget.isLittleEndian() ? 0 : 4);		MPI = MPI.getWithOffset(Subtarget.isLittleEndian() ? 0 : 4);
}		}

RLI.Chain = Chain;		RLI.Chain = Chain;
RLI.Ptr = FIPtr;		RLI.Ptr = FIPtr;
RLI.MPI = MPI;		RLI.MPI = MPI;
RLI.Alignment = Alignment;		RLI.Alignment = Alignment;
}		}

/// Custom lowers floating point to integer conversions to use		/// Custom lowers floating point to integer conversions to use
/// the direct move instructions available in ISA 2.07 to avoid the		/// the direct move instructions available in ISA 2.07 to avoid the
/// need for load/store combinations.		/// need for load/store combinations.
SDValue PPCTargetLowering::LowerFP_TO_INTDirectMove(SDValue Op,		SDValue PPCTargetLowering::LowerFP_TO_INTDirectMove(SDValue Op,
SelectionDAG &DAG,		SelectionDAG &DAG,
const SDLoc &dl) const {		const SDLoc &dl) const {
assert(Op.getOperand(0).getValueType().isFloatingPoint());		SDValue Conv = convertFPToInt(Op, DAG, Subtarget);
return DAG.getNode(PPCISD::MFVSR, dl, Op.getSimpleValueType().SimpleTy,		SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
convertFPToInt(Op, DAG, Subtarget));		if (Op->isStrictFPOpcode())
		return DAG.getMergeValues({Mov, Conv.getValue(1)}, dl);
		else
		uweigand [Done]: This still looks wrong to me. I think you should replace the chain with the one from Conv, and return the value (i.e. Mov). Something like:
		  SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
		  DAG.ReplaceAllUsesOfValueWith(SDValue(Op, 1), Conv.getValue(1));
		  return Mov;
		(Actually, returning the value is then the same in the strict and non-strict cases, so this can be merged.) Another possible approach would be to not use ReplaceAllUses at all but to reconstruct the multi-output value via DAG.getMergeValues. Something like:
		  SDValue Mov = DAG.getNode(PPCISD::MFVSR, dl, Op.getValueType(), Conv);
		  return DAG.getMergeValues({Mov, Conv.getValue(1)}, dl);
		return Mov;
}		}
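
The chain handling that the reviewers ask for in this function can be pictured with a small standalone model. This is only a sketch: `Node`, `Value`, `Graph`, `Lowered` and `lowerStrictFPToSInt` are hypothetical stand-ins invented for illustration and are not LLVM types or APIs. It shows the intended shape of the result: the strict conversion consumes the incoming chain and produces the outgoing chain, while the plain move only consumes the conversion's data result.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for SelectionDAG concepts (hypothetical, not LLVM types).
struct Node;

// Roughly analogous to an SDValue: a node plus a result index.
struct Value {
  const Node *N = nullptr;
  unsigned ResNo = 0;
};

struct Node {
  std::string Opcode;
  std::vector<Value> Operands;
  unsigned NumResults;
};

// Owns all nodes so that Value handles stay valid.
class Graph {
  std::vector<std::unique_ptr<Node>> Nodes;

public:
  Value add(std::string Opc, std::vector<Value> Ops, unsigned NumResults) {
    Nodes.push_back(std::make_unique<Node>(
        Node{std::move(Opc), std::move(Ops), NumResults}));
    return Value{Nodes.back().get(), 0};
  }
};

// The pair (out-val, out-chain) that replaces the original strict node.
struct Lowered {
  Value Val;
  Value Chain;
};

// Model of: (tmp-val, out-chain) = STRICT_FCTIWZ(in-chain, in-val)
//           out-val              = MFVSR(tmp-val)
Lowered lowerStrictFPToSInt(Graph &G, Value InChain, Value Src) {
  Value Conv = G.add("STRICT_FCTIWZ", {InChain, Src}, 2);
  Value ConvChain{Conv.N, 1};            // second result is the chain
  Value Mov = G.add("MFVSR", {Conv}, 1); // the move needs no chain
  return {Mov, ConvChain};
}
```

In this model the caller receives both results explicitly, which mirrors the `getMergeValues` variant from the review; the `ReplaceAllUsesOfValueWith` variant would instead rewire users of the old chain result and return only the value.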

SDValue PPCTargetLowering::LowerFP_TO_INT(SDValue Op, SelectionDAG &DAG,		SDValue PPCTargetLowering::LowerFP_TO_INT(SDValue Op, SelectionDAG &DAG,
const SDLoc &dl) const {		const SDLoc &dl) const {
SDValue Src = Op.getOperand(0);		bool IsStrict = Op->isStrictFPOpcode();
		bool IsSigned = Op.getOpcode() == ISD::FP_TO_SINT ||
		Op.getOpcode() == ISD::STRICT_FP_TO_SINT;
		SDValue Src = Op.getOperand(IsStrict ? 1 : 0);
// FP to INT conversions are legal for f128.		// FP to INT conversions are legal for f128.
if (Src.getValueType() == MVT::f128)		if (Src.getValueType() == MVT::f128)
return Op;		return Op;

// Expand ppcf128 to i32 by hand for the benefit of llvm-gcc bootstrap on		// Expand ppcf128 to i32 by hand for the benefit of llvm-gcc bootstrap on
// PPC (the libcall is not available).		// PPC (the libcall is not available).
if (Src.getValueType() == MVT::ppcf128) {		if (Src.getValueType() == MVT::ppcf128 && !IsStrict) {
if (Op.getValueType() == MVT::i32) {		if (Op.getValueType() == MVT::i32) {
if (Op.getOpcode() == ISD::FP_TO_SINT) {		if (IsSigned) {
SDValue Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, MVT::f64, Src,		SDValue Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, MVT::f64, Src,
DAG.getIntPtrConstant(0, dl));		DAG.getIntPtrConstant(0, dl));
SDValue Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, MVT::f64, Src,		SDValue Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, MVT::f64, Src,
DAG.getIntPtrConstant(1, dl));		DAG.getIntPtrConstant(1, dl));

// Add the two halves of the long double in round-to-zero mode.		// Add the two halves of the long double in round-to-zero mode.
SDValue Res = DAG.getNode(PPCISD::FADDRTZ, dl, MVT::f64, Lo, Hi);		SDValue Res = DAG.getNode(PPCISD::FADDRTZ, dl, MVT::f64, Lo, Hi);

// Now use a smaller FP_TO_SINT.		// Now use a smaller FP_TO_SINT.
return DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, Res);		return DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, Res);
}		} else {
if (Op.getOpcode() == ISD::FP_TO_UINT) {
const uint64_t TwoE31[] = {0x41e0000000000000LL, 0};		const uint64_t TwoE31[] = {0x41e0000000000000LL, 0};
APFloat APF = APFloat(APFloat::PPCDoubleDouble(), APInt(128, TwoE31));		APFloat APF = APFloat(APFloat::PPCDoubleDouble(), APInt(128, TwoE31));
SDValue Tmp = DAG.getConstantFP(APF, dl, MVT::ppcf128);		SDValue Tmp = DAG.getConstantFP(APF, dl, MVT::ppcf128);
// X>=2^31 ? (int)(X-2^31)+0x80000000 : (int)X		// X>=2^31 ? (int)(X-2^31)+0x80000000 : (int)X
// FIXME: generated code sucks.		// FIXME: generated code sucks.
// TODO: Are there fast-math-flags to propagate to this FSUB?		// TODO: Are there fast-math-flags to propagate to this FSUB?
SDValue True = DAG.getNode(ISD::FSUB, dl, MVT::ppcf128, Src, Tmp);		SDValue True = DAG.getNode(ISD::FSUB, dl, MVT::ppcf128, Src, Tmp);
True = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, True);		True = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, True);
[30 lines not shown]	bool PPCTargetLowering::canReuseLoadAddress(SDValue Op, EVT MemVT,
SelectionDAG &DAG,		SelectionDAG &DAG,
ISD::LoadExtType ET) const {		ISD::LoadExtType ET) const {
SDLoc dl(Op);		SDLoc dl(Op);
bool ValidFPToUint = Op.getOpcode() == ISD::FP_TO_UINT &&		bool ValidFPToUint = Op.getOpcode() == ISD::FP_TO_UINT &&
(Subtarget.hasFPCVT() || Op.getValueType() == MVT::i32);		(Subtarget.hasFPCVT() || Op.getValueType() == MVT::i32);
if (ET == ISD::NON_EXTLOAD &&		if (ET == ISD::NON_EXTLOAD &&
(ValidFPToUint || Op.getOpcode() == ISD::FP_TO_SINT) &&		(ValidFPToUint || Op.getOpcode() == ISD::FP_TO_SINT) &&
isOperationLegalOrCustom(Op.getOpcode(),		isOperationLegalOrCustom(Op.getOpcode(),
Op.getOperand(0).getValueType())) {		Op.getOperand(0).getValueType())) {
		steven.zhang [Done]: The SDLoc parameter is not needed. Also, can we rename the function to something like getFpNode()? You don't need "strict" in the function name, as it is already one of the function parameters.

		steven.zhang [Done]: The Strict parameter is not needed, as you can check the value of Chain to determine it.
LowerFP_TO_INTForReuse(Op, RLI, DAG, dl);		LowerFP_TO_INTForReuse(Op, RLI, DAG, dl);
return true;		return true;
}		}
		steven.zhang [Done]: !Strict is not needed.
		qiucf (author) [Done]: If `Strict` is `false`, we can allow Chain to be null.

LoadSDNode *LD = dyn_cast<LoadSDNode>(Op);		LoadSDNode *LD = dyn_cast<LoadSDNode>(Op);
		steven.zhang [Done]: Please add a default initializer to avoid an uninitialized local variable if llvm_unreachable is off.
		nemanjai [Done]: This function appears to just take a node and produce a strict version of that node. There doesn't seem to be anything target dependent about such an operation, so it is very suspicious to me that this is in the PPC back end. If for some reason it has to be here, please explain why in a comment. If the only target-specific part of this is the `STRICT_MFVSR` node, then at least the rest can be handled by target-independent code, can't it?
		qiucf (author) [Done]: Thanks for the comments. The method (including some overloads) doesn't help much, so I removed it and wrote a simple helper to get the strict version of a PPC-specific opcode. There is a method `mutateStrictFPToFP` that does something similar, but it may not be suitable here. I'll send an NFC patch to make opcode conversion a hook so that each target can benefit from it.
if (!LD || LD->getExtensionType() != ET || LD->isVolatile() ||		if (!LD || LD->getExtensionType() != ET || LD->isVolatile() ||
LD->isNonTemporal())		LD->isNonTemporal())
return false;		return false;
		nemanjai [Done]: This seems dangerous to me. You are deciding whether to return a strict node based on whether a valid Chain is provided. I am personally against making decisions based on orthogonal concerns. If the caller wants a strict node, that should be explicit rather than this strange implicit contract of "If you want a strict node, provide a valid chain."
if (LD->getMemoryVT() != MemVT)		if (LD->getMemoryVT() != MemVT)
return false;		return false;

RLI.Ptr = LD->getBasePtr();		RLI.Ptr = LD->getBasePtr();
if (LD->isIndexed() && !LD->getOffset().isUndef()) {		if (LD->isIndexed() && !LD->getOffset().isUndef()) {
		steven.zhang [Done]: You don't need this assertion now.
assert(LD->getAddressingMode() == ISD::PRE_INC &&		assert(LD->getAddressingMode() == ISD::PRE_INC &&
"Non-pre-inc AM on PPC?");		"Non-pre-inc AM on PPC?");
RLI.Ptr = DAG.getNode(ISD::ADD, dl, RLI.Ptr.getValueType(), RLI.Ptr,		RLI.Ptr = DAG.getNode(ISD::ADD, dl, RLI.Ptr.getValueType(), RLI.Ptr,
LD->getOffset());		LD->getOffset());
}		}

RLI.Chain = LD->getChain();		RLI.Chain = LD->getChain();
RLI.MPI = LD->getPointerInfo();		RLI.MPI = LD->getPointerInfo();
RLI.IsDereferenceable = LD->isDereferenceable();		RLI.IsDereferenceable = LD->isDereferenceable();
RLI.IsInvariant = LD->isInvariant();		RLI.IsInvariant = LD->isInvariant();
RLI.Alignment = LD->getAlign();		RLI.Alignment = LD->getAlign();
RLI.AAInfo = LD->getAAInfo();		RLI.AAInfo = LD->getAAInfo();
RLI.Ranges = LD->getRanges();		RLI.Ranges = LD->getRanges();

		steven.zhang [Done]: This is not good coding practice. Please use `Strict ? 1 : 0`, or look for a suitable API in SDValue.
RLI.ResChain = SDValue(LD, LD->isIndexed() ? 2 : 1);		RLI.ResChain = SDValue(LD, LD->isIndexed() ? 2 : 1);
return true;		return true;
		steven.zhang [Done]: IsStrict, IsSigned
}		}

// Given the head of the old chain, ResChain, insert a token factor containing		// Given the head of the old chain, ResChain, insert a token factor containing
// it and NewResChain, and make users of ResChain now be users of that token		// it and NewResChain, and make users of ResChain now be users of that token
// factor.		// factor.
// TODO: Remove and use DAG::makeEquivalentMemoryOrdering() instead.		// TODO: Remove and use DAG::makeEquivalentMemoryOrdering() instead.
void PPCTargetLowering::spliceIntoChain(SDValue ResChain,		void PPCTargetLowering::spliceIntoChain(SDValue ResChain,
SDValue NewResChain,		SDValue NewResChain,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
if (!ResChain)		if (!ResChain)
return;		return;

SDLoc dl(NewResChain);		SDLoc dl(NewResChain);

SDValue TF = DAG.getNode(ISD::TokenFactor, dl, MVT::Other,		SDValue TF = DAG.getNode(ISD::TokenFactor, dl, MVT::Other,
NewResChain, DAG.getUNDEF(MVT::Other));		NewResChain, DAG.getUNDEF(MVT::Other));
assert(TF.getNode() != NewResChain.getNode() &&		assert(TF.getNode() != NewResChain.getNode() &&
"A new TF really is required here");		"A new TF really is required here");

DAG.ReplaceAllUsesOfValueWith(ResChain, TF);		DAG.ReplaceAllUsesOfValueWith(ResChain, TF);
DAG.UpdateNodeOperands(TF.getNode(), ResChain, NewResChain);		DAG.UpdateNodeOperands(TF.getNode(), ResChain, NewResChain);
}		}
		steven.zhang [Done]: The logic in LowerFP_TO_INTForReuse and LowerFP_TO_INTDirectMove is nearly the same between line 8168 ~ 8196. That is expected, as the only difference between the two is how the data is moved from FPR to GPR. So, can we add another function to do the conversion? Something like:
		  LowerFP_TO_INTDirectMove:
		    V = convertToFp()
		    MFVSR V
		  LowerFP_TO_INTForReuse:
		    V = convertToFp()
		    Store V
		    Load V
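
The refactoring proposed in this comment, one shared conversion step followed by two different ways of getting the result out of the floating-point register, can be illustrated with a standalone sketch. Everything here is an assumption made for illustration: `convertToIntBits`, `viaDirectMove` and `viaStackSlot` are hypothetical names, and the "register" and "stack slot" are ordinary C++ objects, not PowerPC state.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Shared step: convert with truncation toward zero (the rounding the
// fcti*z family uses) and keep the result as a raw 64-bit image, the
// way the real result sits in an FPR.
static uint64_t convertToIntBits(double Src) {
  int64_t Truncated = static_cast<int64_t>(Src); // rounds toward zero
  uint64_t Bits;
  std::memcpy(&Bits, &Truncated, sizeof(Bits));
  return Bits;
}

// Path 1: "direct move" - reinterpret the register image directly.
static int64_t viaDirectMove(double Src) {
  uint64_t Bits = convertToIntBits(Src);
  int64_t Result;
  std::memcpy(&Result, &Bits, sizeof(Result));
  return Result;
}

// Path 2: "for reuse" - store the image to a stack slot, then load it
// back as an integer.
static int64_t viaStackSlot(double Src) {
  unsigned char Slot[sizeof(uint64_t)];
  uint64_t Bits = convertToIntBits(Src);
  std::memcpy(Slot, &Bits, sizeof(Bits)); // store
  int64_t Result;
  std::memcpy(&Result, Slot, sizeof(Result)); // load
  return Result;
}
```

Both paths call the same helper, so a change to the conversion itself (such as adding a strict variant) needs to be made in one place only.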

/// Analyze profitability of direct move		/// Analyze profitability of direct move
/// prefer float load to int load plus direct move		/// prefer float load to int load plus direct move
/// when there is no integer use of int load		/// when there is no integer use of int load
bool PPCTargetLowering::directMoveIsProfitable(const SDValue &Op) const {		bool PPCTargetLowering::directMoveIsProfitable(const SDValue &Op) const {
SDNode *Origin = Op.getOperand(0).getNode();		SDNode *Origin = Op.getOperand(0).getNode();
if (Origin->getOpcode() != ISD::LOAD)		if (Origin->getOpcode() != ISD::LOAD)
		steven.zhang [Not Done]: So, do we have a problem if it is a strict opcode in this code path?
		qiucf (author) [Done]: Do you mean these PPC-specific opcodes are not strict? The result is either a load/store or a direct move. What we do here is keep the operands of the value consistent, so changing these opcodes to strict may be unnecessary.
return true;		return true;

// If there is no LXSIBZX/LXSIHZX, like Power8,		// If there is no LXSIBZX/LXSIHZX, like Power8,
// prefer direct move if the memory size is 1 or 2 bytes.		// prefer direct move if the memory size is 1 or 2 bytes.
MachineMemOperand *MMO = cast<LoadSDNode>(Origin)->getMemOperand();		MachineMemOperand *MMO = cast<LoadSDNode>(Origin)->getMemOperand();
if (!Subtarget.hasP9Vector() && MMO->getSize() <= 2)		if (!Subtarget.hasP9Vector() && MMO->getSize() <= 2)
return true;		return true;

[10 lines not shown]	if (UI->getOpcode() != ISD::SINT_TO_FP &&
return true;		return true;
}		}

return false;		return false;
}		}

static SDValue convertIntToFP(SDValue Op, SDValue Src, SelectionDAG &DAG,		static SDValue convertIntToFP(SDValue Op, SDValue Src, SelectionDAG &DAG,
const PPCSubtarget &Subtarget) {		const PPCSubtarget &Subtarget) {
bool IsSigned = Op.getOpcode() == ISD::SINT_TO_FP;		bool IsSigned = Op.getOpcode() == ISD::SINT_TO_FP;
		uweigand [Not Done]: This doesn't look right. The Chain produced by this strict node just vanishes; this cannot be correct.
SDLoc dl(Op);		SDLoc dl(Op);
// If we have FCFIDS, then use it when converting to single-precision.		// If we have FCFIDS, then use it when converting to single-precision.
// Otherwise, convert to double-precision and then round.		// Otherwise, convert to double-precision and then round.
bool IsSingle = Op.getValueType() == MVT::f32 && Subtarget.hasFPCVT();		bool IsSingle = Op.getValueType() == MVT::f32 && Subtarget.hasFPCVT();
unsigned ConvOpc = IsSingle ? (IsSigned ? PPCISD::FCFIDS : PPCISD::FCFIDUS)		unsigned ConvOpc = IsSingle ? (IsSigned ? PPCISD::FCFIDS : PPCISD::FCFIDUS)
: (IsSigned ? PPCISD::FCFID : PPCISD::FCFIDU);		: (IsSigned ? PPCISD::FCFID : PPCISD::FCFIDU);
EVT ConvTy = IsSingle ? MVT::f32 : MVT::f64;		EVT ConvTy = IsSingle ? MVT::f32 : MVT::f64;
return DAG.getNode(ConvOpc, dl, ConvTy, Src);		return DAG.getNode(ConvOpc, dl, ConvTy, Src);
}		}

/// Custom lowers integer to floating point conversions to use		/// Custom lowers integer to floating point conversions to use
/// the direct move instructions available in ISA 2.07 to avoid the		/// the direct move instructions available in ISA 2.07 to avoid the
/// need for load/store combinations.		/// need for load/store combinations.
SDValue PPCTargetLowering::LowerINT_TO_FPDirectMove(SDValue Op,		SDValue PPCTargetLowering::LowerINT_TO_FPDirectMove(SDValue Op,
SelectionDAG &DAG,		SelectionDAG &DAG,
const SDLoc &dl) const {		const SDLoc &dl) const {
assert((Op.getValueType() == MVT::f32 ||		assert((Op.getValueType() == MVT::f32 ||
		uweigand [Not Done]: And you need strict versions of all these conversion operations, I'd assume.
Op.getValueType() == MVT::f64) &&		Op.getValueType() == MVT::f64) &&
"Invalid floating point type as target of conversion");		"Invalid floating point type as target of conversion");
assert(Subtarget.hasFPCVT() &&		assert(Subtarget.hasFPCVT() &&
"Int to FP conversions with direct moves require FPCVT");		"Int to FP conversions with direct moves require FPCVT");
SDValue Src = Op.getOperand(0);		SDValue Src = Op.getOperand(0);
bool WordInt = Src.getSimpleValueType().SimpleTy == MVT::i32;		bool WordInt = Src.getSimpleValueType().SimpleTy == MVT::i32;
		steven.zhang [Done]: Don't use Tmp; pick a meaningful name instead.
bool Signed = Op.getOpcode() == ISD::SINT_TO_FP;		bool Signed = Op.getOpcode() == ISD::SINT_TO_FP;
		steven.zhang [Done]: Op.getValueType()?
unsigned MovOpc = (WordInt && !Signed) ? PPCISD::MTVSRZ : PPCISD::MTVSRA;		unsigned MovOpc = (WordInt && !Signed) ? PPCISD::MTVSRZ : PPCISD::MTVSRA;
SDValue Mov = DAG.getNode(MovOpc, dl, MVT::f64, Src);		SDValue Mov = DAG.getNode(MovOpc, dl, MVT::f64, Src);
return convertIntToFP(Op, Mov, DAG, Subtarget);		return convertIntToFP(Op, Mov, DAG, Subtarget);
}		}
		uweigand [Not Done]: Nothing in the function actually handles strict nodes; that cannot be right.

static SDValue widenVec(SelectionDAG &DAG, SDValue Vec, const SDLoc &dl) {		static SDValue widenVec(SelectionDAG &DAG, SDValue Vec, const SDLoc &dl) {

		steven.zhang [Done]: Move the assertion into convertFPToInt.
EVT VecVT = Vec.getValueType();		EVT VecVT = Vec.getValueType();
assert(VecVT.isVector() && "Expected a vector type.");		assert(VecVT.isVector() && "Expected a vector type.");
assert(VecVT.getSizeInBits() < 128 && "Vector is already full width.");		assert(VecVT.getSizeInBits() < 128 && "Vector is already full width.");

EVT EltVT = VecVT.getVectorElementType();		EVT EltVT = VecVT.getVectorElementType();
unsigned WideNumElts = 128 / EltVT.getSizeInBits();		unsigned WideNumElts = 128 / EltVT.getSizeInBits();
EVT WideVT = EVT::getVectorVT(*DAG.getContext(), EltVT, WideNumElts);		EVT WideVT = EVT::getVectorVT(*DAG.getContext(), EltVT, WideNumElts);

unsigned NumConcat = WideNumElts / VecVT.getVectorNumElements();		unsigned NumConcat = WideNumElts / VecVT.getVectorNumElements();
SmallVector<SDValue, 16> Ops(NumConcat);		SmallVector<SDValue, 16> Ops(NumConcat);
Ops[0] = Vec;		Ops[0] = Vec;
SDValue UndefVec = DAG.getUNDEF(VecVT);		SDValue UndefVec = DAG.getUNDEF(VecVT);
for (unsigned i = 1; i < NumConcat; ++i)		for (unsigned i = 1; i < NumConcat; ++i)
Ops[i] = UndefVec;		Ops[i] = UndefVec;

return DAG.getNode(ISD::CONCAT_VECTORS, dl, WideVT, Ops);		return DAG.getNode(ISD::CONCAT_VECTORS, dl, WideVT, Ops);
}		}
		steven.zhang [Not Done]: This is something we can improve later. If I understand the intention correctly, we should mark it as legal instead of checking it here.

SDValue PPCTargetLowering::LowerINT_TO_FPVector(SDValue Op, SelectionDAG &DAG,		SDValue PPCTargetLowering::LowerINT_TO_FPVector(SDValue Op, SelectionDAG &DAG,
const SDLoc &dl) const {		const SDLoc &dl) const {

unsigned Opc = Op.getOpcode();		unsigned Opc = Op.getOpcode();
assert((Opc == ISD::UINT_TO_FP || Opc == ISD::SINT_TO_FP) &&		assert((Opc == ISD::UINT_TO_FP || Opc == ISD::SINT_TO_FP) &&
"Unexpected conversion type");		"Unexpected conversion type");
assert((Op.getValueType() == MVT::v2f64 || Op.getValueType() == MVT::v4f32) &&		assert((Op.getValueType() == MVT::v2f64 || Op.getValueType() == MVT::v4f32) &&
[12 lines not shown]	for (unsigned i = 0; i < WideNumElts; ++i)
ShuffV.push_back(i + WideNumElts);		ShuffV.push_back(i + WideNumElts);

int Stride = FourEltRes ? WideNumElts / 4 : WideNumElts / 2;		int Stride = FourEltRes ? WideNumElts / 4 : WideNumElts / 2;
int SaveElts = FourEltRes ? 4 : 2;		int SaveElts = FourEltRes ? 4 : 2;
if (Subtarget.isLittleEndian())		if (Subtarget.isLittleEndian())
for (int i = 0; i < SaveElts; i++)		for (int i = 0; i < SaveElts; i++)
ShuffV[i * Stride] = i;		ShuffV[i * Stride] = i;
else		else
for (int i = 1; i <= SaveElts; i++)		for (int i = 1; i <= SaveElts; i++)
		uweigand [Not Done]: Why do we need a strict version of a plain move?
		qiucf (author) [Done]: Yes, a strict move doesn't look reasonable. But the original `strict_fptosi` node will be replaced by the result, so if we directly return the move, the operands will not match (there is no chain in `mfvsr`). Is there a better way here?
		uweigand [Not Done]: So this gets expanded to a PPCISD::FCTI... variant (inside convertFPToInt) followed by the MFVSR. Now, with proper chain handling, the input chain of the strict_fptosi is consumed by a strict variant of FCTI..., and the output chain of that STRICT_FCTI... is then the correct output chain for the whole operation. The data output (only) of the STRICT_FCTI... then acts as the input of the MFVSR, and the output of the MFVSR is the correct value output of the whole operation. So in short, you need to replace
		  (out-val, out-chain) = strict_fptosi (in-val, in-chain)
		by
		  (tmp-val, out-chain) = STRICT_FCTI... (in-val, in-chain)
		  out-val = MFVSR (tmp-val)
		This will probably require some ReplaceAllUses... instead of just returning a result, as is already done elsewhere with chain output instructions.
ShuffV[i * Stride - 1] = i - 1;		ShuffV[i * Stride - 1] = i - 1;

SDValue ShuffleSrc2 =		SDValue ShuffleSrc2 =
SignedConv ? DAG.getUNDEF(WideVT) : DAG.getConstant(0, dl, WideVT);		SignedConv ? DAG.getUNDEF(WideVT) : DAG.getConstant(0, dl, WideVT);
SDValue Arrange = DAG.getVectorShuffle(WideVT, dl, Wide, ShuffleSrc2, ShuffV);		SDValue Arrange = DAG.getVectorShuffle(WideVT, dl, Wide, ShuffleSrc2, ShuffV);
		uweigand [Not Done]: Again, nothing in this function actually handles strict nodes ...

SDValue Extend;		SDValue Extend;
if (SignedConv) {		if (SignedConv) {
Arrange = DAG.getBitcast(IntermediateVT, Arrange);		Arrange = DAG.getBitcast(IntermediateVT, Arrange);
EVT ExtVT = Op.getOperand(0).getValueType();		EVT ExtVT = Op.getOperand(0).getValueType();
if (Subtarget.hasP9Altivec())		if (Subtarget.hasP9Altivec())
ExtVT = EVT::getVectorVT(*DAG.getContext(), WideVT.getVectorElementType(),		ExtVT = EVT::getVectorVT(*DAG.getContext(), WideVT.getVectorElementType(),
IntermediateVT.getVectorNumElements());		IntermediateVT.getVectorNumElements());
[99 lines not shown]	if (canReuseLoadAddress(SINT, MVT::i64, RLI, DAG)) {
RLI.Alignment, RLI.MMOFlags(), RLI.AAInfo, RLI.Ranges);		RLI.Alignment, RLI.MMOFlags(), RLI.AAInfo, RLI.Ranges);
spliceIntoChain(RLI.ResChain, Bits.getValue(1), DAG);		spliceIntoChain(RLI.ResChain, Bits.getValue(1), DAG);
} else if (Subtarget.hasLFIWAX() &&		} else if (Subtarget.hasLFIWAX() &&
canReuseLoadAddress(SINT, MVT::i32, RLI, DAG, ISD::SEXTLOAD)) {		canReuseLoadAddress(SINT, MVT::i32, RLI, DAG, ISD::SEXTLOAD)) {
MachineMemOperand *MMO =		MachineMemOperand *MMO =
MF.getMachineMemOperand(RLI.MPI, MachineMemOperand::MOLoad, 4,		MF.getMachineMemOperand(RLI.MPI, MachineMemOperand::MOLoad, 4,
RLI.Alignment, RLI.AAInfo, RLI.Ranges);		RLI.Alignment, RLI.AAInfo, RLI.Ranges);
SDValue Ops[] = { RLI.Chain, RLI.Ptr };		SDValue Ops[] = { RLI.Chain, RLI.Ptr };
Bits = DAG.getMemIntrinsicNode(PPCISD::LFIWAX, dl,		Bits = DAG.getMemIntrinsicNode(PPCISD::LFIWAX, dl,
		steven.zhang [Not Done]: ConvertIntToFP and ConvertFPToInt should have the same parameters.
		qiucf (author) [Done]: FPToInt is round-then-move, while IntToFP is move-then-round. So for IntToFP we need extra information from the original `Op` besides the moved `Src`.
DAG.getVTList(MVT::f64, MVT::Other),		DAG.getVTList(MVT::f64, MVT::Other),
Ops, MVT::i32, MMO);		Ops, MVT::i32, MMO);
spliceIntoChain(RLI.ResChain, Bits.getValue(1), DAG);		spliceIntoChain(RLI.ResChain, Bits.getValue(1), DAG);
} else if (Subtarget.hasFPCVT() &&		} else if (Subtarget.hasFPCVT() &&
canReuseLoadAddress(SINT, MVT::i32, RLI, DAG, ISD::ZEXTLOAD)) {		canReuseLoadAddress(SINT, MVT::i32, RLI, DAG, ISD::ZEXTLOAD)) {
MachineMemOperand *MMO =		MachineMemOperand *MMO =
MF.getMachineMemOperand(RLI.MPI, MachineMemOperand::MOLoad, 4,		MF.getMachineMemOperand(RLI.MPI, MachineMemOperand::MOLoad, 4,
RLI.Alignment, RLI.AAInfo, RLI.Ranges);		RLI.Alignment, RLI.AAInfo, RLI.Ranges);
▲ Show 20 Lines • Show All 88 Lines • ▼ Show 20 Lines	if (Subtarget.hasLFIWAX() || Subtarget.hasFPCVT()) {
if (ReusingLoad)		if (ReusingLoad)
spliceIntoChain(RLI.ResChain, Ld.getValue(1), DAG);		spliceIntoChain(RLI.ResChain, Ld.getValue(1), DAG);
} else {		} else {
assert(Subtarget.isPPC64() &&		assert(Subtarget.isPPC64() &&
"i32->FP without LFIWAX supported only on PPC64");		"i32->FP without LFIWAX supported only on PPC64");

int FrameIdx = MFI.CreateStackObject(8, Align(8), false);		int FrameIdx = MFI.CreateStackObject(8, Align(8), false);
SDValue FIdx = DAG.getFrameIndex(FrameIdx, PtrVT);		SDValue FIdx = DAG.getFrameIndex(FrameIdx, PtrVT);

		steven.zhang (Done): Please remove this kind of change; it is not part of your patch.
SDValue Ext64 = DAG.getNode(ISD::SIGN_EXTEND, dl, MVT::i64, Src);		SDValue Ext64 = DAG.getNode(ISD::SIGN_EXTEND, dl, MVT::i64, Src);

// STD the extended value into the stack slot.		// STD the extended value into the stack slot.
SDValue Store = DAG.getStore(		SDValue Store = DAG.getStore(
DAG.getEntryNode(), dl, Ext64, FIdx,		DAG.getEntryNode(), dl, Ext64, FIdx,
MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FrameIdx));		MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FrameIdx));

// Load the value as a double.		// Load the value as a double.
▲ Show 20 Lines • Show All 1,965 Lines • ▼ Show 20 Lines	SDValue PPCTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
case ISD::EH_DWARF_CFA: return LowerEH_DWARF_CFA(Op, DAG);		case ISD::EH_DWARF_CFA: return LowerEH_DWARF_CFA(Op, DAG);
case ISD::EH_SJLJ_SETJMP: return lowerEH_SJLJ_SETJMP(Op, DAG);		case ISD::EH_SJLJ_SETJMP: return lowerEH_SJLJ_SETJMP(Op, DAG);
case ISD::EH_SJLJ_LONGJMP: return lowerEH_SJLJ_LONGJMP(Op, DAG);		case ISD::EH_SJLJ_LONGJMP: return lowerEH_SJLJ_LONGJMP(Op, DAG);

case ISD::LOAD: return LowerLOAD(Op, DAG);		case ISD::LOAD: return LowerLOAD(Op, DAG);
case ISD::STORE: return LowerSTORE(Op, DAG);		case ISD::STORE: return LowerSTORE(Op, DAG);
case ISD::TRUNCATE: return LowerTRUNCATE(Op, DAG);		case ISD::TRUNCATE: return LowerTRUNCATE(Op, DAG);
case ISD::SELECT_CC: return LowerSELECT_CC(Op, DAG);		case ISD::SELECT_CC: return LowerSELECT_CC(Op, DAG);
		case ISD::STRICT_FP_TO_UINT:
		case ISD::STRICT_FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::FP_TO_SINT: return LowerFP_TO_INT(Op, DAG, SDLoc(Op));		case ISD::FP_TO_SINT: return LowerFP_TO_INT(Op, DAG, SDLoc(Op));
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::SINT_TO_FP: return LowerINT_TO_FP(Op, DAG);		case ISD::SINT_TO_FP: return LowerINT_TO_FP(Op, DAG);
case ISD::FLT_ROUNDS_: return LowerFLT_ROUNDS_(Op, DAG);		case ISD::FLT_ROUNDS_: return LowerFLT_ROUNDS_(Op, DAG);

// Lower 64-bit shifts.		// Lower 64-bit shifts.
case ISD::SHL_PARTS: return LowerSHL_PARTS(Op, DAG);		case ISD::SHL_PARTS: return LowerSHL_PARTS(Op, DAG);
▲ Show 20 Lines • Show All 74 Lines • ▼ Show 20 Lines	case ISD::VAARG: {
if (VT == MVT::i64) {		if (VT == MVT::i64) {
SDValue NewNode = LowerVAARG(SDValue(N, 1), DAG);		SDValue NewNode = LowerVAARG(SDValue(N, 1), DAG);

Results.push_back(NewNode);		Results.push_back(NewNode);
Results.push_back(NewNode.getValue(1));		Results.push_back(NewNode.getValue(1));
}		}
return;		return;
}		}
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
// LowerFP_TO_INT() can only handle f32 and f64.		// LowerFP_TO_INT() can only handle f32 and f64.
if (N->getOperand(0).getValueType() == MVT::ppcf128)		if (N->getOperand(N->isStrictFPOpcode() ? 1 : 0).getValueType() ==
		MVT::ppcf128)
return;		return;
Results.push_back(LowerFP_TO_INT(SDValue(N, 0), DAG, dl));		Results.push_back(LowerFP_TO_INT(SDValue(N, 0), DAG, dl));
return;		return;
case ISD::TRUNCATE: {		case ISD::TRUNCATE: {
EVT TrgVT = N->getValueType(0);		EVT TrgVT = N->getValueType(0);
EVT OpVT = N->getOperand(0).getValueType();		EVT OpVT = N->getOperand(0).getValueType();
if (TrgVT.isVector() &&		if (TrgVT.isVector() &&
isOperationCustom(N->getOpcode(), TrgVT) &&		isOperationCustom(N->getOpcode(), TrgVT) &&
▲ Show 20 Lines • Show All 5,767 Lines • Show Last 20 Lines
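
Note on the `ReplaceNodeResults` change above: the operand index switches from `0` to `N->isStrictFPOpcode() ? 1 : 0` because a strict FP node carries its input chain as operand 0, which pushes the floating-point source to operand 1. The sketch below is a toy model of that layout, not LLVM code. `ToyNode`, `sourceType`, and `shouldCustomLower` are hypothetical names invented for illustration, and value types are modeled as plain strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a DAG node; this is not the real llvm::SDNode.
struct ToyNode {
  bool IsStrict;                     // models N->isStrictFPOpcode()
  std::vector<std::string> Operands; // operand value types, in order
};

// A strict node has the chain at operand 0 and the FP source at operand 1.
// A non-strict node has the FP source at operand 0.
inline const std::string &sourceType(const ToyNode &N) {
  return N.Operands[N.IsStrict ? 1 : 0];
}

// Mirrors the early-out in the diff: ppcf128 sources are left alone
// because LowerFP_TO_INT() can only handle f32 and f64.
inline bool shouldCustomLower(const ToyNode &N) {
  return sourceType(N) != "ppcf128";
}
```

With the old fixed index of 0, a strict node with a `ppcf128` source would have been inspected at its chain operand, so the `ppcf128` check would never have fired for it.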

llvm/lib/Target/PowerPC/PPCInstr64Bit.td

	Show First 20 Lines • Show All 1,451 Lines • ▼ Show 20 Lines
	defm FCTID : XForm_26r<63, 814, (outs f8rc:$frD), (ins f8rc:$frB),			defm FCTID : XForm_26r<63, 814, (outs f8rc:$frD), (ins f8rc:$frB),
	"fctid", "$frD, $frB", IIC_FPGeneral,			"fctid", "$frD, $frB", IIC_FPGeneral,
	[]>, isPPC64;			[]>, isPPC64;
	defm FCTIDU : XForm_26r<63, 942, (outs f8rc:$frD), (ins f8rc:$frB),			defm FCTIDU : XForm_26r<63, 942, (outs f8rc:$frD), (ins f8rc:$frB),
	"fctidu", "$frD, $frB", IIC_FPGeneral,			"fctidu", "$frD, $frB", IIC_FPGeneral,
	[]>, isPPC64;			[]>, isPPC64;
	defm FCTIDZ : XForm_26r<63, 815, (outs f8rc:$frD), (ins f8rc:$frB),			defm FCTIDZ : XForm_26r<63, 815, (outs f8rc:$frD), (ins f8rc:$frB),
	"fctidz", "$frD, $frB", IIC_FPGeneral,			"fctidz", "$frD, $frB", IIC_FPGeneral,
	[(set f64:$frD, (PPCfctidz f64:$frB))]>, isPPC64;			[(set f64:$frD, (PPCany_fctidz f64:$frB))]>, isPPC64;

	defm FCFIDU : XForm_26r<63, 974, (outs f8rc:$frD), (ins f8rc:$frB),			defm FCFIDU : XForm_26r<63, 974, (outs f8rc:$frD), (ins f8rc:$frB),
	"fcfidu", "$frD, $frB", IIC_FPGeneral,			"fcfidu", "$frD, $frB", IIC_FPGeneral,
	[(set f64:$frD, (PPCfcfidu f64:$frB))]>, isPPC64;			[(set f64:$frD, (PPCfcfidu f64:$frB))]>, isPPC64;
	defm FCFIDS : XForm_26r<59, 846, (outs f4rc:$frD), (ins f8rc:$frB),			defm FCFIDS : XForm_26r<59, 846, (outs f4rc:$frD), (ins f8rc:$frB),
	"fcfids", "$frD, $frB", IIC_FPGeneral,			"fcfids", "$frD, $frB", IIC_FPGeneral,
	[(set f32:$frD, (PPCfcfids f64:$frB))]>, isPPC64;			[(set f32:$frD, (PPCfcfids f64:$frB))]>, isPPC64;
	defm FCFIDUS : XForm_26r<59, 974, (outs f4rc:$frD), (ins f8rc:$frB),			defm FCFIDUS : XForm_26r<59, 974, (outs f4rc:$frD), (ins f8rc:$frB),
	"fcfidus", "$frD, $frB", IIC_FPGeneral,			"fcfidus", "$frD, $frB", IIC_FPGeneral,
	[(set f32:$frD, (PPCfcfidus f64:$frB))]>, isPPC64;			[(set f32:$frD, (PPCfcfidus f64:$frB))]>, isPPC64;
	defm FCTIDUZ : XForm_26r<63, 943, (outs f8rc:$frD), (ins f8rc:$frB),			defm FCTIDUZ : XForm_26r<63, 943, (outs f8rc:$frD), (ins f8rc:$frB),
	"fctiduz", "$frD, $frB", IIC_FPGeneral,			"fctiduz", "$frD, $frB", IIC_FPGeneral,
	[(set f64:$frD, (PPCfctiduz f64:$frB))]>, isPPC64;			[(set f64:$frD, (PPCany_fctiduz f64:$frB))]>, isPPC64;
	defm FCTIWUZ : XForm_26r<63, 143, (outs f8rc:$frD), (ins f8rc:$frB),			defm FCTIWUZ : XForm_26r<63, 143, (outs f8rc:$frD), (ins f8rc:$frB),
	"fctiwuz", "$frD, $frB", IIC_FPGeneral,			"fctiwuz", "$frD, $frB", IIC_FPGeneral,
	[(set f64:$frD, (PPCfctiwuz f64:$frB))]>, isPPC64;			[(set f64:$frD, (PPCany_fctiwuz f64:$frB))]>, isPPC64;
	}			}


	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Instruction Patterns			// Instruction Patterns
	//			//

	// Extensions and truncates to/from 32-bit regs.			// Extensions and truncates to/from 32-bit regs.
	▲ Show 20 Lines • Show All 122 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCInstrInfo.td

Show First 20 Lines • Show All 209 Lines • ▼ Show 20 Lines
def PPCsrl : SDNode<"PPCISD::SRL" , SDTIntShiftOp>;		def PPCsrl : SDNode<"PPCISD::SRL" , SDTIntShiftOp>;
def PPCsra : SDNode<"PPCISD::SRA" , SDTIntShiftOp>;		def PPCsra : SDNode<"PPCISD::SRA" , SDTIntShiftOp>;
def PPCshl : SDNode<"PPCISD::SHL" , SDTIntShiftOp>;		def PPCshl : SDNode<"PPCISD::SHL" , SDTIntShiftOp>;

def PPCfnmsub : SDNode<"PPCISD::FNMSUB" , SDTFPTernaryOp>;		def PPCfnmsub : SDNode<"PPCISD::FNMSUB" , SDTFPTernaryOp>;

def PPCextswsli : SDNode<"PPCISD::EXTSWSLI" , SDT_PPCextswsli>;		def PPCextswsli : SDNode<"PPCISD::EXTSWSLI" , SDT_PPCextswsli>;

		def PPCstrict_fctidz : SDNode<"PPCISD::STRICT_FCTIDZ",
		SDTFPUnaryOp, [SDNPHasChain]>;
		def PPCstrict_fctiwz : SDNode<"PPCISD::STRICT_FCTIWZ",
		SDTFPUnaryOp, [SDNPHasChain]>;
		def PPCstrict_fctiduz : SDNode<"PPCISD::STRICT_FCTIDUZ",
		SDTFPUnaryOp, [SDNPHasChain]>;
		def PPCstrict_fctiwuz : SDNode<"PPCISD::STRICT_FCTIWUZ",
		SDTFPUnaryOp, [SDNPHasChain]>;

		def PPCany_fctidz : PatFrags<(ops node:$op),
		[(PPCstrict_fctidz node:$op),
		(PPCfctidz node:$op)]>;
		def PPCany_fctiwz : PatFrags<(ops node:$op),
		[(PPCstrict_fctiwz node:$op),
		(PPCfctiwz node:$op)]>;
		def PPCany_fctiduz : PatFrags<(ops node:$op),
		[(PPCstrict_fctiduz node:$op),
		(PPCfctiduz node:$op)]>;
		def PPCany_fctiwuz : PatFrags<(ops node:$op),
		[(PPCstrict_fctiwuz node:$op),
		(PPCfctiwuz node:$op)]>;

// Move 2 i64 values into a VSX register		// Move 2 i64 values into a VSX register
def PPCbuild_fp128: SDNode<"PPCISD::BUILD_FP128",		def PPCbuild_fp128: SDNode<"PPCISD::BUILD_FP128",
SDTypeProfile<1, 2,		SDTypeProfile<1, 2,
[SDTCisFP<0>, SDTCisSameSizeAs<1,2>,		[SDTCisFP<0>, SDTCisSameSizeAs<1,2>,
SDTCisSameAs<1,2>]>,		SDTCisSameAs<1,2>]>,
[]>;		[]>;

def PPCbuild_spe64: SDNode<"PPCISD::BUILD_SPE64",		def PPCbuild_spe64: SDNode<"PPCISD::BUILD_SPE64",
▲ Show 20 Lines • Show All 2,401 Lines • ▼ Show 20 Lines	let Uses = [RM], mayRaiseFPException = 1, hasSideEffects = 0 in {
defm FCTIW : XForm_26r<63, 14, (outs f8rc:$frD), (ins f8rc:$frB),		defm FCTIW : XForm_26r<63, 14, (outs f8rc:$frD), (ins f8rc:$frB),
"fctiw", "$frD, $frB", IIC_FPGeneral,		"fctiw", "$frD, $frB", IIC_FPGeneral,
[]>;		[]>;
defm FCTIWU : XForm_26r<63, 142, (outs f8rc:$frD), (ins f8rc:$frB),		defm FCTIWU : XForm_26r<63, 142, (outs f8rc:$frD), (ins f8rc:$frB),
"fctiwu", "$frD, $frB", IIC_FPGeneral,		"fctiwu", "$frD, $frB", IIC_FPGeneral,
[]>;		[]>;
defm FCTIWZ : XForm_26r<63, 15, (outs f8rc:$frD), (ins f8rc:$frB),		defm FCTIWZ : XForm_26r<63, 15, (outs f8rc:$frD), (ins f8rc:$frB),
"fctiwz", "$frD, $frB", IIC_FPGeneral,		"fctiwz", "$frD, $frB", IIC_FPGeneral,
[(set f64:$frD, (PPCfctiwz f64:$frB))]>;		[(set f64:$frD, (PPCany_fctiwz f64:$frB))]>;

defm FRSP : XForm_26r<63, 12, (outs f4rc:$frD), (ins f8rc:$frB),		defm FRSP : XForm_26r<63, 12, (outs f4rc:$frD), (ins f8rc:$frB),
"frsp", "$frD, $frB", IIC_FPGeneral,		"frsp", "$frD, $frB", IIC_FPGeneral,
[(set f32:$frD, (any_fpround f64:$frB))]>;		[(set f32:$frD, (any_fpround f64:$frB))]>;

defm FSQRT : XForm_26r<63, 22, (outs f8rc:$frD), (ins f8rc:$frB),		defm FSQRT : XForm_26r<63, 22, (outs f8rc:$frD), (ins f8rc:$frB),
"fsqrt", "$frD, $frB", IIC_FPSqrtD,		"fsqrt", "$frD, $frB", IIC_FPSqrtD,
[(set f64:$frD, (any_fsqrt f64:$frB))]>;		[(set f64:$frD, (any_fsqrt f64:$frB))]>;
▲ Show 20 Lines • Show All 2,582 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCInstrVSX.td

Show First 20 Lines • Show All 763 Lines • ▼ Show 20 Lines	let hasSideEffects = 0 in {

// Conversion Instructions		// Conversion Instructions
def XSCVDPSP : XX2Form<60, 265,		def XSCVDPSP : XX2Form<60, 265,
(outs vsfrc:$XT), (ins vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XB),
"xscvdpsp $XT, $XB", IIC_VecFP, []>;		"xscvdpsp $XT, $XB", IIC_VecFP, []>;
def XSCVDPSXDS : XX2Form<60, 344,		def XSCVDPSXDS : XX2Form<60, 344,
(outs vsfrc:$XT), (ins vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XB),
"xscvdpsxds $XT, $XB", IIC_VecFP,		"xscvdpsxds $XT, $XB", IIC_VecFP,
[(set f64:$XT, (PPCfctidz f64:$XB))]>;		[(set f64:$XT, (PPCany_fctidz f64:$XB))]>;
let isCodeGenOnly = 1 in		let isCodeGenOnly = 1 in
def XSCVDPSXDSs : XX2Form<60, 344,		def XSCVDPSXDSs : XX2Form<60, 344,
(outs vssrc:$XT), (ins vssrc:$XB),		(outs vssrc:$XT), (ins vssrc:$XB),
"xscvdpsxds $XT, $XB", IIC_VecFP,		"xscvdpsxds $XT, $XB", IIC_VecFP,
[(set f32:$XT, (PPCfctidz f32:$XB))]>;		[(set f32:$XT, (PPCany_fctidz f32:$XB))]>;
def XSCVDPSXWS : XX2Form<60, 88,		def XSCVDPSXWS : XX2Form<60, 88,
(outs vsfrc:$XT), (ins vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XB),
"xscvdpsxws $XT, $XB", IIC_VecFP,		"xscvdpsxws $XT, $XB", IIC_VecFP,
[(set f64:$XT, (PPCfctiwz f64:$XB))]>;		[(set f64:$XT, (PPCany_fctiwz f64:$XB))]>;
let isCodeGenOnly = 1 in		let isCodeGenOnly = 1 in
def XSCVDPSXWSs : XX2Form<60, 88,		def XSCVDPSXWSs : XX2Form<60, 88,
(outs vssrc:$XT), (ins vssrc:$XB),		(outs vssrc:$XT), (ins vssrc:$XB),
"xscvdpsxws $XT, $XB", IIC_VecFP,		"xscvdpsxws $XT, $XB", IIC_VecFP,
[(set f32:$XT, (PPCfctiwz f32:$XB))]>;		[(set f32:$XT, (PPCany_fctiwz f32:$XB))]>;
def XSCVDPUXDS : XX2Form<60, 328,		def XSCVDPUXDS : XX2Form<60, 328,
(outs vsfrc:$XT), (ins vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XB),
"xscvdpuxds $XT, $XB", IIC_VecFP,		"xscvdpuxds $XT, $XB", IIC_VecFP,
[(set f64:$XT, (PPCfctiduz f64:$XB))]>;		[(set f64:$XT, (PPCany_fctiduz f64:$XB))]>;
let isCodeGenOnly = 1 in		let isCodeGenOnly = 1 in
def XSCVDPUXDSs : XX2Form<60, 328,		def XSCVDPUXDSs : XX2Form<60, 328,
(outs vssrc:$XT), (ins vssrc:$XB),		(outs vssrc:$XT), (ins vssrc:$XB),
"xscvdpuxds $XT, $XB", IIC_VecFP,		"xscvdpuxds $XT, $XB", IIC_VecFP,
[(set f32:$XT, (PPCfctiduz f32:$XB))]>;		[(set f32:$XT, (PPCany_fctiduz f32:$XB))]>;
def XSCVDPUXWS : XX2Form<60, 72,		def XSCVDPUXWS : XX2Form<60, 72,
(outs vsfrc:$XT), (ins vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XB),
"xscvdpuxws $XT, $XB", IIC_VecFP,		"xscvdpuxws $XT, $XB", IIC_VecFP,
[(set f64:$XT, (PPCfctiwuz f64:$XB))]>;		[(set f64:$XT, (PPCany_fctiwuz f64:$XB))]>;
let isCodeGenOnly = 1 in		let isCodeGenOnly = 1 in
def XSCVDPUXWSs : XX2Form<60, 72,		def XSCVDPUXWSs : XX2Form<60, 72,
(outs vssrc:$XT), (ins vssrc:$XB),		(outs vssrc:$XT), (ins vssrc:$XB),
"xscvdpuxws $XT, $XB", IIC_VecFP,		"xscvdpuxws $XT, $XB", IIC_VecFP,
[(set f32:$XT, (PPCfctiwuz f32:$XB))]>;		[(set f32:$XT, (PPCany_fctiwuz f32:$XB))]>;
def XSCVSPDP : XX2Form<60, 329,		def XSCVSPDP : XX2Form<60, 329,
(outs vsfrc:$XT), (ins vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XB),
"xscvspdp $XT, $XB", IIC_VecFP, []>;		"xscvspdp $XT, $XB", IIC_VecFP, []>;
def XSCVSXDDP : XX2Form<60, 376,		def XSCVSXDDP : XX2Form<60, 376,
(outs vsfrc:$XT), (ins vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XB),
"xscvsxddp $XT, $XB", IIC_VecFP,		"xscvsxddp $XT, $XB", IIC_VecFP,
[(set f64:$XT, (PPCfcfid f64:$XB))]>;		[(set f64:$XT, (PPCfcfid f64:$XB))]>;
def XSCVUXDDP : XX2Form<60, 360,		def XSCVUXDDP : XX2Form<60, 360,
▲ Show 20 Lines • Show All 661 Lines • ▼ Show 20 Lines	let mayRaiseFPException = 1 in {
// Round & Convert QP -> DP (dword[1] is set to zero)		// Round & Convert QP -> DP (dword[1] is set to zero)
def XSCVQPDP : X_VT5_XO5_VB5_VSFR<63, 20, 836, "xscvqpdp" , []>;		def XSCVQPDP : X_VT5_XO5_VB5_VSFR<63, 20, 836, "xscvqpdp" , []>;
def XSCVQPDPO : X_VT5_XO5_VB5_VSFR_Ro<63, 20, 836, "xscvqpdpo",		def XSCVQPDPO : X_VT5_XO5_VB5_VSFR_Ro<63, 20, 836, "xscvqpdpo",
[(set f64:$vT,		[(set f64:$vT,
(int_ppc_truncf128_round_to_odd		(int_ppc_truncf128_round_to_odd
f128:$vB))]>;		f128:$vB))]>;
}		}

// FIXME: Setting the hasSideEffects flag here to match current behaviour.
// Truncate & Convert QP -> (Un)Signed (D)Word (dword[1] is set to zero)		// Truncate & Convert QP -> (Un)Signed (D)Word (dword[1] is set to zero)
let hasSideEffects = 1 in {		let mayRaiseFPException = 1 in {
def XSCVQPSDZ : X_VT5_XO5_VB5<63, 25, 836, "xscvqpsdz", []>;		def XSCVQPSDZ : X_VT5_XO5_VB5<63, 25, 836, "xscvqpsdz", []>;
def XSCVQPSWZ : X_VT5_XO5_VB5<63, 9, 836, "xscvqpswz", []>;		def XSCVQPSWZ : X_VT5_XO5_VB5<63, 9, 836, "xscvqpswz", []>;
def XSCVQPUDZ : X_VT5_XO5_VB5<63, 17, 836, "xscvqpudz", []>;		def XSCVQPUDZ : X_VT5_XO5_VB5<63, 17, 836, "xscvqpudz", []>;
def XSCVQPUWZ : X_VT5_XO5_VB5<63, 1, 836, "xscvqpuwz", []>;		def XSCVQPUWZ : X_VT5_XO5_VB5<63, 1, 836, "xscvqpuwz", []>;
}		}

// Convert (Un)Signed DWord -> QP.		// Convert (Un)Signed DWord -> QP.
def XSCVSDQP : X_VT5_XO5_VB5_TyVB<63, 10, 836, "xscvsdqp", vfrc, []>;		def XSCVSDQP : X_VT5_XO5_VB5_TyVB<63, 10, 836, "xscvsdqp", vfrc, []>;
▲ Show 20 Lines • Show All 1,778 Lines • ▼ Show 20 Lines
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xoaddr:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xoaddr:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
xoaddr:$src)>;		xoaddr:$src)>;
} // HasVSX, HasP8Vector, NoP9Vector, IsBigEndian		} // HasVSX, HasP8Vector, NoP9Vector, IsBigEndian

// Little endian pre-Power9 VSX subtarget.		// Little endian pre-Power9 VSX subtarget.
let Predicates = [HasVSX, HasP8Vector, NoP9Vector, IsLittleEndian] in {		let Predicates = [HasVSX, HasP8Vector, NoP9Vector, IsLittleEndian] in {
def : Pat<(store (i64 (extractelt v2i64:$A, 0)), xoaddr:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 0)), xoaddr:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
		steven.zhang (Done): Please move them into a group that has truncating semantics. It is not a bitconvert.
xoaddr:$src)>;		xoaddr:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 0)), xoaddr:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 0)), xoaddr:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
xoaddr:$src)>;		xoaddr:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 1)), xoaddr:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 1)), xoaddr:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xoaddr:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xoaddr:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;
▲ Show 20 Lines • Show All 471 Lines • ▼ Show 20 Lines
def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi16)),		def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi16)),
(f128 (XSCVUDQP (LXSIHZX xaddr:$src)))>;		(f128 (XSCVUDQP (LXSIHZX xaddr:$src)))>;

// Convert Unsigned Byte in memory -> QP		// Convert Unsigned Byte in memory -> QP
def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi8)),		def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi8)),
(f128 (XSCVUDQP (LXSIBZX xoaddr:$src)))>;		(f128 (XSCVUDQP (LXSIBZX xoaddr:$src)))>;

// Truncate & Convert QP -> (Un)Signed (D)Word.		// Truncate & Convert QP -> (Un)Signed (D)Word.
def : Pat<(i64 (fp_to_sint f128:$src)), (i64 (MFVRD (XSCVQPSDZ $src)))>;		def : Pat<(i64 (any_fp_to_sint f128:$src)), (i64 (MFVRD (XSCVQPSDZ $src)))>;
def : Pat<(i64 (fp_to_uint f128:$src)), (i64 (MFVRD (XSCVQPUDZ $src)))>;		def : Pat<(i64 (any_fp_to_uint f128:$src)), (i64 (MFVRD (XSCVQPUDZ $src)))>;
def : Pat<(i32 (fp_to_sint f128:$src)),		def : Pat<(i32 (any_fp_to_sint f128:$src)),
(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC)))>;		(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC)))>;
def : Pat<(i32 (fp_to_uint f128:$src)),		def : Pat<(i32 (any_fp_to_uint f128:$src)),
(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC)))>;		(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC)))>;

		uweigand (Done): This also doesn't look quite correct. The XSCVQP... instructions are not (yet?) marked as mayRaiseFPException; instead they're marked as hasSideEffects. This means that the exception flag is probably not going to be automatically transferred over to the MI level. I think if the instructions are changed to set mayRaiseFPException, that should work correctly. But it would be best to have a test case that validates that the "nofpexcept" marker is transferred depending on the value of the "fpexcept." metadata in the strict intrinsic (in LLVM IR).
		qiucf (Author, Done): Thanks for the reminder. The FP exception bits in the PPC instruction definition files need to be carefully re-examined with more tests.
// Instructions for store(fptosi).		// Instructions for store(fptosi).
// The 8-byte version is repeated here due to availability of D-Form STXSD.		// The 8-byte version is repeated here due to availability of D-Form STXSD.
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), xaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), xaddrX4:$dst, 8),
(STXSDX (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),		(STXSDX (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),
xaddrX4:$dst)>;		xaddrX4:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), iaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), iaddrX4:$dst, 8),
▲ Show 20 Lines • Show All 862 Lines • Show Last 20 Lines
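
Note on uweigand's comment above: the concern is that a per-instruction "nofpexcept" marker only carries meaning when the instruction definition itself is marked `mayRaiseFPException`. The sketch below is a deliberately simplified toy model of that interaction and is not LLVM's implementation. `ToyInstrDesc`, `ToyMachineInstr`, and `emit` are hypothetical names, and the rule inside `emit` is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; these are not the real LLVM classes.
struct ToyInstrDesc {
  bool MayRaiseFPException; // static property from the .td definition
  bool HasSideEffects;      // the flag the old XSCVQP* definitions used
};

struct ToyMachineInstr {
  ToyInstrDesc Desc;
  bool NoFPExcept; // per-instruction marker, shown as "nofpexcept" in MIR
};

// Assumption: the strict node is tagged "no FP exception" only when the
// intrinsic's metadata is "fpexcept.ignore", and the marker is only kept
// on instructions whose description says they may raise an FP exception.
inline ToyMachineInstr emit(const ToyInstrDesc &Desc, bool NodeIgnoresExcept) {
  return ToyMachineInstr{Desc, Desc.MayRaiseFPException && NodeIgnoresExcept};
}

// An instruction counts as possibly trapping only if its description
// allows it and the per-instruction marker does not rule it out.
inline bool mayRaiseFPException(const ToyMachineInstr &MI) {
  return MI.Desc.MayRaiseFPException && !MI.NoFPExcept;
}
```

In this model, a definition that only sets `hasSideEffects` never receives the marker and is never reported as possibly trapping, which is the gap the suggested MIR test would expose.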

llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64-unknown-linux -mcpu=pwr8 | FileCheck %s\
				; RUN: -check-prefix=P8
				steven.zhang (Done): Please specify the option -enable-ppc-quad-precision to enable quad-precision support in PowerPC.
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr9 | FileCheck %s \
				; RUN: -check-prefix=P9
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr8 -mattr=-vsx \
				; RUN: | FileCheck %s -check-prefix=NOVSX
				; RUN: llc -mtriple=powerpc64le-unknown-linux -mcpu=pwr9 < %s -simplify-mir \
				; RUN: -stop-after=machine-cp | FileCheck %s -check-prefix=MIR

				declare i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.f128(fp128, metadata)
				declare i64 @llvm.experimental.constrained.fptoui.i64.f128(fp128, metadata)
				steven.zhang (Done): So, what happens if the type is ppc_fp128?
				declare i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128, metadata)

				declare i32 @llvm.experimental.constrained.fptosi.i32.ppcf128(ppc_fp128, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.ppcf128(ppc_fp128, metadata)
				declare i64 @llvm.experimental.constrained.fptoui.i64.ppcf128(ppc_fp128, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.ppcf128(ppc_fp128, metadata)

				declare i128 @llvm.experimental.constrained.fptosi.i128.ppcf128(ppc_fp128, metadata)
				declare i128 @llvm.experimental.constrained.fptoui.i128.ppcf128(ppc_fp128, metadata)
				declare i128 @llvm.experimental.constrained.fptosi.i128.f128(fp128, metadata)
				declare i128 @llvm.experimental.constrained.fptoui.i128.f128(fp128, metadata)

				define i128 @q_to_i128(fp128 %m) #0 {
				; P8-LABEL: q_to_i128:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixtfti
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_i128:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixtfti
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_i128:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixtfti
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i128 @llvm.experimental.constrained.fptosi.i128.f128(fp128 %m, metadata !"fpexcept.strict") #0
				ret i128 %conv
				}

				define i128 @q_to_u128(fp128 %m) #0 {
				; P8-LABEL: q_to_u128:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixunstfti
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_u128:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixunstfti
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_u128:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixunstfti
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i128 @llvm.experimental.constrained.fptoui.i128.f128(fp128 %m, metadata !"fpexcept.strict") #0
				ret i128 %conv
				}

				define i128 @ppcq_to_i128(ppc_fp128 %m) #0 {
				; P8-LABEL: ppcq_to_i128:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixtfti
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: ppcq_to_i128:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixtfti
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: ppcq_to_i128:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixtfti
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i128 @llvm.experimental.constrained.fptosi.i128.ppcf128(ppc_fp128 %m, metadata !"fpexcept.strict") #0
				ret i128 %conv
				}

				define i128 @ppcq_to_u128(ppc_fp128 %m) #0 {
				; P8-LABEL: ppcq_to_u128:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixtfti
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: ppcq_to_u128:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixtfti
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: ppcq_to_u128:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixtfti
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i128 @llvm.experimental.constrained.fptosi.i128.ppcf128(ppc_fp128 %m, metadata !"fpexcept.strict") #0
				ret i128 %conv
				}

				define signext i32 @q_to_i32(fp128 %m) #0 {
				; P8-LABEL: q_to_i32:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixkfsi
				; P8-NEXT: nop
				; P8-NEXT: extsw r3, r3
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_i32:
				; P9: # %bb.0: # %entry
				; P9-NEXT: xscvqpswz v2, v2
				; P9-NEXT: mfvsrwz r3, v2
				; P9-NEXT: extsw r3, r3
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_i32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixkfsi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: extsw r3, r3
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; MIR-LABEL: name: q_to_i32
				; MIR: renamable $v{{[0-9]+}} = XSCVQPSWZ
				; MIR-NEXT: renamable $r{{[0-9]+}} = MFVSRWZ
				entry:
				%conv = tail call i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128 %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				define i64 @q_to_i64(fp128 %m) #0 {
				; P8-LABEL: q_to_i64:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixkfdi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_i64:
				; P9: # %bb.0: # %entry
				; P9-NEXT: xscvqpsdz v2, v2
				; P9-NEXT: mfvsrd r3, v2
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_i64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixkfdi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; MIR-LABEL: name: q_to_i64
				; MIR: renamable $v{{[0-9]+}} = XSCVQPSDZ
				; MIR-NEXT: renamable $x{{[0-9]+}} = MFVRD
				entry:
				%conv = tail call i64 @llvm.experimental.constrained.fptosi.i64.f128(fp128 %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define i64 @q_to_u64(fp128 %m) #0 {
				; P8-LABEL: q_to_u64:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixunskfdi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_u64:
				; P9: # %bb.0: # %entry
				; P9-NEXT: xscvqpudz v2, v2
				; P9-NEXT: mfvsrd r3, v2
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_u64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixunskfdi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; MIR-LABEL: name: q_to_u64
				; MIR: renamable $v{{[0-9]+}} = XSCVQPUDZ
				; MIR-NEXT: renamable $x{{[0-9]+}} = MFVRD
				entry:
				%conv = tail call i64 @llvm.experimental.constrained.fptoui.i64.f128(fp128 %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define zeroext i32 @q_to_u32(fp128 %m) #0 {
				; P8-LABEL: q_to_u32:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixunskfsi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: q_to_u32:
				; P9: # %bb.0: # %entry
				; P9-NEXT: xscvqpuwz v2, v2
				; P9-NEXT: mfvsrwz r3, v2
				; P9-NEXT: clrldi r3, r3, 32
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: q_to_u32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixunskfsi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				;
				; MIR-LABEL: name: q_to_u32
				; MIR: renamable $v{{[0-9]+}} = XSCVQPUWZ
				; MIR-NEXT: renamable $r{{[0-9]+}} = MFVSRWZ
				entry:
				%conv = tail call i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128 %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				define signext i32 @ppcq_to_i32(ppc_fp128 %m) #0 {
				; P8-LABEL: ppcq_to_i32:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __gcc_qtou
				; P8-NEXT: nop
				; P8-NEXT: extsw r3, r3
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: ppcq_to_i32:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __gcc_qtou
				; P9-NEXT: nop
				; P9-NEXT: extsw r3, r3
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: ppcq_to_i32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __gcc_qtou
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: extsw r3, r3
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i32 @llvm.experimental.constrained.fptosi.i32.ppcf128(ppc_fp128 %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				define i64 @ppcq_to_i64(ppc_fp128 %m) #0 {
				; P8-LABEL: ppcq_to_i64:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixtfdi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: ppcq_to_i64:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixtfdi
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: ppcq_to_i64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixtfdi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i64 @llvm.experimental.constrained.fptosi.i64.ppcf128(ppc_fp128 %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define i64 @ppcq_to_u64(ppc_fp128 %m) #0 {
				; P8-LABEL: ppcq_to_u64:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixunstfdi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: ppcq_to_u64:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixunstfdi
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: ppcq_to_u64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixunstfdi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i64 @llvm.experimental.constrained.fptoui.i64.ppcf128(ppc_fp128 %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define zeroext i32 @ppcq_to_u32(ppc_fp128 %m) #0 {
				; P8-LABEL: ppcq_to_u32:
				; P8: # %bb.0: # %entry
				; P8-NEXT: mflr r0
				; P8-NEXT: std r0, 16(r1)
				; P8-NEXT: stdu r1, -112(r1)
				; P8-NEXT: .cfi_def_cfa_offset 112
				; P8-NEXT: .cfi_offset lr, 16
				; P8-NEXT: bl __fixunstfsi
				; P8-NEXT: nop
				; P8-NEXT: addi r1, r1, 112
				; P8-NEXT: ld r0, 16(r1)
				; P8-NEXT: mtlr r0
				; P8-NEXT: blr
				;
				; P9-LABEL: ppcq_to_u32:
				; P9: # %bb.0: # %entry
				; P9-NEXT: mflr r0
				; P9-NEXT: std r0, 16(r1)
				; P9-NEXT: stdu r1, -32(r1)
				; P9-NEXT: .cfi_def_cfa_offset 32
				; P9-NEXT: .cfi_offset lr, 16
				; P9-NEXT: bl __fixunstfsi
				; P9-NEXT: nop
				; P9-NEXT: addi r1, r1, 32
				; P9-NEXT: ld r0, 16(r1)
				; P9-NEXT: mtlr r0
				; P9-NEXT: blr
				;
				; NOVSX-LABEL: ppcq_to_u32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: mflr r0
				; NOVSX-NEXT: std r0, 16(r1)
				; NOVSX-NEXT: stdu r1, -32(r1)
				; NOVSX-NEXT: .cfi_def_cfa_offset 32
				; NOVSX-NEXT: .cfi_offset lr, 16
				; NOVSX-NEXT: bl __fixunstfsi
				; NOVSX-NEXT: nop
				; NOVSX-NEXT: addi r1, r1, 32
				; NOVSX-NEXT: ld r0, 16(r1)
				; NOVSX-NEXT: mtlr r0
				; NOVSX-NEXT: blr
				entry:
				%conv = tail call i32 @llvm.experimental.constrained.fptoui.i32.ppcf128(ppc_fp128 %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				define void @fptoint_nofpexcept(fp128 %m, i32* %addr1, i64* %addr2) {
				; MIR-LABEL: name: fptoint_nofpexcept
				; MIR: renamable $v{{[0-9]+}} = nofpexcept XSCVQPSWZ
				; MIR: renamable $v{{[0-9]+}} = nofpexcept XSCVQPUWZ
				; MIR: renamable $v{{[0-9]+}} = nofpexcept XSCVQPSDZ
				; MIR: renamable $v{{[0-9]+}} = nofpexcept XSCVQPUDZ
				entry:
				%conv1 = tail call i32 @llvm.experimental.constrained.fptosi.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				store volatile i32 %conv1, i32* %addr1, align 4
				%conv2 = tail call i32 @llvm.experimental.constrained.fptoui.i32.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				store volatile i32 %conv2, i32* %addr1, align 4
				%conv3 = tail call i64 @llvm.experimental.constrained.fptosi.i64.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				store volatile i64 %conv3, i64* %addr2, align 8
				%conv4 = tail call i64 @llvm.experimental.constrained.fptoui.i64.f128(fp128 %m, metadata !"fpexcept.ignore") #0
				store volatile i64 %conv4, i64* %addr2, align 8
				ret void
				}

				attributes #0 = { strictfp }

llvm/test/CodeGen/PowerPC/fp-strict-conv.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64-unknown-linux -mcpu=pwr8 | FileCheck %s
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr9 | FileCheck %s
				; RUN: llc -verify-machineinstrs -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
				; RUN: < %s -mtriple=powerpc64le-unknown-linux -mcpu=pwr8 -mattr=-vsx | \
				; RUN: FileCheck %s -check-prefix=NOVSX

				steven.zhang: Add a run line for the SPE target.
				declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.f64(double, metadata)
				steven.zhang: Add a test for fp128.
				declare i64 @llvm.experimental.constrained.fptoui.i64.f64(double, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)

				declare i32 @llvm.experimental.constrained.fptosi.i32.f32(float, metadata)
				declare i64 @llvm.experimental.constrained.fptosi.i64.f32(float, metadata)
				declare i64 @llvm.experimental.constrained.fptoui.i64.f32(float, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.f32(float, metadata)

				declare double @llvm.experimental.constrained.sitofp.f64.i32(i32, metadata, metadata)
				declare double @llvm.experimental.constrained.sitofp.f64.i64(i64, metadata, metadata)
				declare double @llvm.experimental.constrained.uitofp.f64.i32(i32, metadata, metadata)
				declare double @llvm.experimental.constrained.uitofp.f64.i64(i64, metadata, metadata)

				declare float @llvm.experimental.constrained.sitofp.f32.i64(i64, metadata, metadata)
				declare float @llvm.experimental.constrained.sitofp.f32.i32(i32, metadata, metadata)
				declare float @llvm.experimental.constrained.uitofp.f32.i32(i32, metadata, metadata)
				declare float @llvm.experimental.constrained.uitofp.f32.i64(i64, metadata, metadata)

				define i32 @d_to_i32(double %m) #0 {
				; CHECK-LABEL: d_to_i32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_i32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwz r3, -4(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptosi.i32.f64(double %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				define i64 @d_to_i64(double %m) #0 {
				; CHECK-LABEL: d_to_i64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_i64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctidz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptosi.i64.f64(double %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define i64 @d_to_u64(double %m) #0 {
				; CHECK-LABEL: d_to_u64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_u64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiduz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptoui.i64.f64(double %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define zeroext i32 @d_to_u32(double %m) #0 {
				; CHECK-LABEL: d_to_u32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: clrldi r3, r3, 32
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: d_to_u32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwuz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwz r3, -4(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptoui.i32.f64(double %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				define signext i32 @f_to_i32(float %m) #0 {
				; CHECK-LABEL: f_to_i32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: extsw r3, r3
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_i32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwa r3, -4(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptosi.i32.f32(float %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				define i64 @f_to_i64(float %m) #0 {
				; CHECK-LABEL: f_to_i64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpsxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_i64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctidz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptosi.i64.f32(float %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define i64 @f_to_u64(float %m) #0 {
				; CHECK-LABEL: f_to_u64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxds f0, f1
				; CHECK-NEXT: mffprd r3, f0
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_u64:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiduz f0, f1
				; NOVSX-NEXT: stfd f0, -8(r1)
				; NOVSX-NEXT: ld r3, -8(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i64 @llvm.experimental.constrained.fptoui.i64.f32(float %m, metadata !"fpexcept.strict") #0
				ret i64 %conv
				}

				define zeroext i32 @f_to_u32(float %m) #0 {
				; CHECK-LABEL: f_to_u32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscvdpuxws f0, f1
				; CHECK-NEXT: mffprwz r3, f0
				; CHECK-NEXT: clrldi r3, r3, 32
				; CHECK-NEXT: blr
				;
				; NOVSX-LABEL: f_to_u32:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fctiwuz f0, f1
				; NOVSX-NEXT: addi r3, r1, -4
				steven.zhang: Is this attribute needed?
				qiucf: "All function calls done in a function that uses constrained floating point intrinsics must have the strictfp attribute." The output won't change if we remove this attribute, but it's better to keep it, per the LangRef.
				; NOVSX-NEXT: stfiwx f0, 0, r3
				; NOVSX-NEXT: lwz r3, -4(r1)
				; NOVSX-NEXT: blr
				entry:
				%conv = call i32 @llvm.experimental.constrained.fptoui.i32.f32(float %m, metadata !"fpexcept.strict") #0
				ret i32 %conv
				}

				attributes #0 = { strictfp }