This is an archive of the discontinued LLVM Phabricator instance.

Differential D94163

[RISCV] Set dependency on floating point CSRs, 1/3
Needs Review · Public

Authored by sepavloff on Jan 6 2021, 2:31 AM.

Details

Reviewers

asb
Hsiang-Kai
craig.topper
jrtc27

Summary

Some dependencies between floating point instructions are missing from
the target description. They involve the special registers that hold the
exception flags and the rounding mode. Most FP instructions can set
accrued exception flags, so they are implicit definitions of 'fflags'.
Instructions that use the dynamic rounding mode depend on the content of
'frm', so they are implicit uses of this register. These dependencies
restrict the ordering of FP instructions and must be made known to the
compiler.

In general there can be four variants of an instruction, depending on
whether it depends on 'frm' and whether changes of 'fflags' should
be ignored. The two most important of them are:

• defines 'fflags', depends on 'frm';
• does not depend on 'frm', changes of 'fflags' are ignored.

The first is the general case, so it must always be supported.
The second corresponds to the default FP environment, which is the most
widespread case. The other two might be beneficial in some cases, but are
expected to be rarer.

This change defines these two variants for relevant FP instructions. It
is split into several parts to facilitate review. This part implements
the dependency for instructions with three input registers.
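
To make the role of implicit operands concrete, here is a minimal C++ sketch of the reordering rule that implicit definitions and uses feed into. It is not LLVM code: `Instr`, `canSwap`, the register bit names, and the four sample instructions are hypothetical stand-ins for the two variants described in the summary.

```cpp
#include <cassert>

// Hypothetical model, not LLVM's API: each instruction records the special
// registers it implicitly writes (defs) and reads (uses) as bit masks.
enum : unsigned { FRM = 1u << 0, FFLAGS = 1u << 1 };

struct Instr {
  unsigned defs;  // registers written implicitly
  unsigned uses;  // registers read implicitly
};

// Two adjacent instructions may be swapped only if neither one writes a
// register that the other reads or writes (no RAW, WAR, or WAW hazard).
constexpr bool canSwap(Instr a, Instr b) {
  return (a.defs & (b.uses | b.defs)) == 0 && (b.defs & a.uses) == 0;
}

constexpr Instr writeFrm{FRM, 0};        // e.g. a CSR write to frm
constexpr Instr readFflags{0, FFLAGS};   // e.g. a CSR read of fflags
constexpr Instr fpGeneric{FFLAGS, FRM};  // variant 1: defines fflags, uses frm
constexpr Instr fpDefault{0, 0};         // variant 2: no implicit operands

void demo() {
  assert(!canSwap(writeFrm, fpGeneric));    // the FP op reads frm
  assert(!canSwap(fpGeneric, readFflags));  // the CSR read observes fflags
  assert(canSwap(writeFrm, fpDefault));     // nothing ties them together
  assert(canSwap(fpDefault, readFflags));
}
```

Under this strict model two generic instructions cannot be swapped with each other either, because both write 'fflags'.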

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sepavloff created this revision. · Jan 6 2021, 2:31 AM

Herald added subscribers: frasercrmck, NickHung, evandro and 25 others. · Jan 6 2021, 2:31 AM

sepavloff requested review of this revision. · Jan 6 2021, 2:31 AM

Herald added a project: Restricted Project. · Jan 6 2021, 2:31 AM

Herald added a subscriber: MaskRay.

sepavloff mentioned this in D93624: [RISCV] Fix rounding mode in lowering of float operations. · Jan 6 2021, 2:35 AM

sepavloff added a child revision: D94164: [RISCV] Set dependency on floating point CSRs, 2/3. · Jan 6 2021, 2:42 AM

Harbormaster completed remote builds in B84177: Diff 314831. · Jan 6 2021, 3:27 AM

I still don't understand why the existence of static rounding modes in the ISA requires that we have to use them for the default environment. X86 doesn't have static rounding mode prior to AVX512 so uses dynamic in the default mode.

Do any other targets create 3 variants of instructions like this? X86 definitely doesn't. What makes RISC-V special that it needs to go this extreme?

In D94163#2482528, @craig.topper wrote:

I still don't understand why the existence of static rounding modes in the ISA requires that we have to use them for the default environment. X86 doesn't have static rounding mode prior to AVX512 so uses dynamic in the default mode.

It is more convenient. Instructions with a static rounding mode do not depend on frm, so they may be scheduled more freely. Besides, a function that contains only FP instructions with static rounding may be safely called from a non-default FP environment. Targets without static rounding modes do not have this possibility.

Do any other targets create 3 variants of instructions like this? X86 definitely doesn't. What makes RISC-V special that it needs to go this extreme?

It must be a target that supports both static and dynamic rounding modes. Are there other targets in the official LLVM repository that support both of them?

In D94163#2489020, @sepavloff wrote:

It is more convenient. Instructions with a static rounding mode do not depend on frm, so they may be scheduled more freely. Besides, a function that contains only FP instructions with static rounding may be safely called from a non-default FP environment. Targets without static rounding modes do not have this possibility.

If there’s no write to frm then there shouldn’t be a scheduling issue. Can you demonstrate such an issue on a target without static rounding mode?

For the exception side I thought we already solved the scheduling issue with the mayRaiseFPException bit that is treated differently than using a register def of the fflags register.

In D94163#2489101, @craig.topper wrote:

If there’s no write to frm then there shouldn’t be a scheduling issue.

Sure. The issue arises when there is a write to frm. Consider the following pseudocode:

float a = ...
for (int i = ...) {
  fesetround(FE_TOWARDZERO); // csrw frm, 1
  ...
  x[i] += floor(a); // fcvt ..., rdn
}

floor(a) is loop invariant and could be hoisted out of the loop. This is possible because fcvt uses static rounding. However, if fcvt used dynamic rounding, it would depend on frm, which is changed above, so it could not be moved out of the loop.
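
The effect of hoisting can be checked on a small model. The sketch below is an assumption-laden simulation, not RISC-V or LLVM code: `Machine`, `Rm`, `roundTo`, `fcvt`, `sumInLoop`, and `sumHoisted` are hypothetical names, and `frm` is assumed to hold round-to-nearest-even before the loop.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical rounding-mode operand: DYN means "use the mode held in frm".
enum class Rm { RNE, RTZ, RDN, RUP, DYN };

struct Machine {
  Rm frm = Rm::RNE;  // modelled dynamic rounding mode register
};

// Rounds x to an integral value under an explicit (non-DYN) mode.
double roundTo(double x, Rm mode) {
  switch (mode) {
    case Rm::RTZ: return std::trunc(x);
    case Rm::RDN: return std::floor(x);
    case Rm::RUP: return std::ceil(x);
    default: {  // RNE: round to nearest, ties to even
      double lo = std::floor(x);
      double diff = x - lo;
      if (diff < 0.5) return lo;
      if (diff > 0.5) return lo + 1.0;
      return std::fmod(lo, 2.0) == 0.0 ? lo : lo + 1.0;
    }
  }
}

// Models a conversion whose rounding operand is either static or DYN.
double fcvt(const Machine &m, double x, Rm rm) {
  return roundTo(x, rm == Rm::DYN ? m.frm : rm);
}

// Original loop: frm is written in every iteration before the conversion.
double sumInLoop(double a, int n, Rm rm) {
  Machine m;
  double sum = 0.0;
  for (int i = 0; i < n; ++i) {
    m.frm = Rm::RTZ;  // fesetround(FE_TOWARDZERO)
    sum += fcvt(m, a, rm);
  }
  return sum;
}

// Transformed loop: the conversion is hoisted above the first frm write.
double sumHoisted(double a, int n, Rm rm) {
  Machine m;
  double hoisted = fcvt(m, a, rm);  // runs while frm still holds RNE
  double sum = 0.0;
  for (int i = 0; i < n; ++i) {
    m.frm = Rm::RTZ;
    sum += hoisted;
  }
  return sum;
}

void demo() {
  // Static rdn: hoisting preserves the result.
  assert(sumInLoop(2.7, 4, Rm::RDN) == sumHoisted(2.7, 4, Rm::RDN));
  // Dynamic rounding: the hoisted copy sees RNE instead of RTZ.
  assert(sumInLoop(2.7, 4, Rm::DYN) != sumHoisted(2.7, 4, Rm::DYN));
}
```

With a statically rounded conversion both loops agree. With the DYN operand the hoisted conversion reads a different `frm` value, which is exactly the dependency that would have to be declared.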

Herald added a subscriber: vkmr. · Feb 1 2021, 9:10 PM

In D94163#2535646, @sepavloff wrote:
Sure. The issue arises when there is a write to frm. Consider the following pseudocode:

float a = ...
for (int i = ...) {
  fesetround(FE_TOWARDZERO); // csrw frm, 1
  ...
  x[i] += floor(a); // fcvt ..., rdn
}

floor(a) is loop invariant and could be hoisted out of the loop. This is possible because fcvt uses static rounding. However, if fcvt used dynamic rounding, it would depend on frm, which is changed above, so it could not be moved out of the loop.

Why wouldn't that have been hoisted out of the loop by IR LICM? Machine LICM is primarily intended to move stack reloads and constant pool loads. It only runs on the outermost loop with a preheader.

In D94163#2535653, @craig.topper wrote:
Why wouldn't that have been hoisted out of the loop by IR LICM? Machine LICM is primarily intended to move stack reloads and constant pool loads. It only runs on the outermost loop with a preheader.

This is another example:

%1:fpr32 = …
%2:fpr32 = …
...
csrw frm, rdn
…
%3:fpr32 = FADD_S killed %1:fpr32, killed %2:fpr32, rne

In this code the scheduler can move the FADD_S instruction upward, so that the live ranges of %1 and %2 become shorter and register pressure decreases. If the FADD_S instruction implicitly depended on frm, the scheduler could not move it above csrw, so this optimization would not be possible.
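
The register-pressure argument can be sketched with a toy liveness count. Everything here is hypothetical: `MI` and `maxPressure` are not LLVM types or functions, and the values %4 and %5 as well as the final consumer are invented to stand in for the elided instructions.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Hypothetical machine instruction: one optional def and a list of uses.
struct MI {
  int def;                // virtual register defined, or -1 for none
  std::vector<int> uses;  // virtual registers read
};

// Returns the maximum number of virtual registers live at once, where a
// register is live from its definition up to its last use.
int maxPressure(const std::vector<MI> &seq) {
  std::map<int, int> lastUse;
  for (int i = 0; i < static_cast<int>(seq.size()); ++i)
    for (int r : seq[i].uses) lastUse[r] = i;

  int live = 0, peak = 0;
  for (int i = 0; i < static_cast<int>(seq.size()); ++i) {
    // Operands whose last use is this instruction die here.
    for (const auto &kv : lastUse)
      if (kv.second == i) --live;
    // A def only counts if something reads it later.
    if (seq[i].def >= 0 && lastUse.count(seq[i].def)) ++live;
    peak = std::max(peak, live);
  }
  return peak;
}

void demo() {
  const MI def1{1, {}}, def2{2, {}}, def4{4, {}}, def5{5, {}};
  const MI csrw{-1, {}};          // write to frm, no virtual registers
  const MI fadd{3, {1, 2}};       // %3 = FADD_S %1, %2, rne
  const MI sink{-1, {3, 4, 5}};   // invented consumer of the results

  // FADD_S left below csrw and the invented defs of %4 and %5.
  std::vector<MI> late = {def1, def2, csrw, def4, def5, fadd, sink};
  // FADD_S moved above csrw, ending the live ranges of %1 and %2 early.
  std::vector<MI> early = {def1, def2, fadd, csrw, def4, def5, sink};

  assert(maxPressure(late) == 4);
  assert(maxPressure(early) == 3);
}
```

In this toy sequence, moving the addition above the frm write lowers the peak from four live values to three.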

Reduced number of instruction variants from 3 to 2 (generic and default)

sepavloff edited the summary of this revision. · Feb 26 2021, 2:37 AM

lenary removed a subscriber: lenary. · Feb 26 2021, 2:59 AM

Harbormaster completed remote builds in B90996: Diff 326636. · Feb 26 2021, 4:31 AM

We discussed this briefly in the RISC-V call as I noted this patchset has been sat open for some time. One thing that might be helpful is whether you could say a little bit more about the goal for this patchset. Once this lands, what's the next step? Is there some relevant RFC, or equivalent changes being made to other in-tree architectures?

We're also still unclear about the advantage of changing codegen to default to a static rounding mode (which might be a surprising change, as all software compiled to date on both GCC and LLVM has used the dynamic rounding mode by default).

In D94163#2603738, @asb wrote:

We discussed this briefly in the RISC-V call as I noted this patchset has been sat open for some time. One thing that might be helpful is whether you could say a little bit more about the goal for this patchset.

Most floating point instructions set accrued exception bits in the fflags register. If an instruction is specified with the dynamic rounding mode, it also depends on the content of the frm register. So in the following code:

csrw   frm, a1
fadd.d ft2, ft2, ft3

changing the order of the instructions is not allowed, because fadd.d depends on the content of frm, which is changed by the previous instruction. Similarly, in the code:

fadd.d ft2, ft2, ft3
csrrs t0, fflags, zero

the order of the instructions must not be changed, as csrrs reads the content of fflags, which is set by the first instruction.

Currently nothing prevents the compiler from changing the order of the instructions in these examples. The existing instruction definitions are unable to express these dependencies. This makes it impossible to write programs that use a non-default floating point environment, for example, a dynamic rounding mode.

The goal of this patchset is to establish means to express such dependencies.
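
The second ordering constraint has a direct analogue in portable C++, where the accrued flags are read through `<cfenv>`. The following is only a host-side sketch: `sawInexact` is a hypothetical helper, and the `volatile` qualifiers are an assumption-level workaround that keeps the division at run time on compilers that ignore `#pragma STDC FENV_ACCESS`.

```cpp
#include <cassert>
#include <cfenv>

// Performs an inexact division and reports whether FE_INEXACT was observed.
// If readFirst is true, the flag is sampled before the division, which
// models the reordered instruction sequence.
bool sawInexact(bool readFirst) {
  volatile double one = 1.0, three = 3.0;  // force a run-time division
  volatile double q;
  std::feclearexcept(FE_ALL_EXCEPT);
  int flags = 0;
  if (readFirst) {
    flags = std::fetestexcept(FE_INEXACT);  // flag read moved above the FP op
    q = one / three;
  } else {
    q = one / three;                        // like fadd.d: sets an accrued flag
    flags = std::fetestexcept(FE_INEXACT);  // like csrrs t0, fflags, zero
  }
  return flags != 0;
}

void demo() {
  assert(sawInexact(false));  // original order: the flag is visible
  assert(!sawInexact(true));  // swapped order: the flag is missed
}
```

Swapping the two operations changes what the program observes, so a compiler that reorders them freely breaks code of this kind.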

Once this lands, what's the next step?

This is the first and necessary step toward a full-fledged implementation of floating point support. In particular, it would allow progress on D91242 and D90854. This work was actually undertaken because the lack of dependencies prevents the implementation of llvm.set.rounding (https://reviews.llvm.org/D91242#2400476). The next steps, of course, include support for constrained intrinsics on RISC-V.

Is there some relevant RFC,

If an RFC would facilitate review of this patchset, I will prepare one.

or equivalent changes being made to other in-tree architectures?

Targets that support full-fledged floating point operations add implicit uses and definitions of FP state and control register(s). For example:
• X86: https://github.com/llvm/llvm-project/blob/bc172e532a89754d47fef1306064a26a4dc0a76b/llvm/lib/Target/X86/X86InstrFPStack.td#L728 (see let Defs = [FPSW] and let Uses = [FPCW]),
• PowerPC: https://github.com/llvm/llvm-project/blob/e7361c8eccb7663146096622549dc03240414157/llvm/lib/Target/PowerPC/PPCInstrInfo.td#L3169 (see Uses = [RM]),
• SystemZ: https://github.com/llvm/llvm-project/blob/9e28b89827a3be4ab602b40c263839665af06b4a/llvm/lib/Target/SystemZ/SystemZInstrFP.td#L434 (see let Uses = [FPC])

We're also still unclear about the advantage of changing codegen to default to a static rounding mode (which might be a surprising change, as all software compiled to date on both GCC and LLVM has used the dynamic rounding mode by default).

Assembler instructions without an explicit rounding mode still get the dynamic rounding mode, as they do now. Lowering of FP operations like fadd uses the static rounding mode RNE, because these operations assume the default floating point environment (https://llvm.org/docs/LangRef.html#floating-point-environment). Using a static rounding mode has some advantages over assuming that frm holds a particular value. Code that requires the default rounding mode does not need to set frm in a program where some pieces use a non-default rounding mode. Such code works as designed even if it is called from a region where another rounding mode is set. This simplifies the implementation of things like #pragma STDC FENV_ROUND and makes programs more robust.

In D94163#2605594, @sepavloff wrote:
In D94163#2603738, @asb wrote:

We're also still unclear about the advantage of changing codegen to default to a static rounding mode (which might be a surprising change, as all software compiled to date on both GCC and LLVM has used the dynamic rounding mode by default).

Assembler instructions without an explicit rounding mode still get the dynamic rounding mode, as they do now. Lowering of FP operations like fadd uses the static rounding mode RNE, because these operations assume the default floating point environment (https://llvm.org/docs/LangRef.html#floating-point-environment). Using a static rounding mode has some advantages over assuming that frm holds a particular value. Code that requires the default rounding mode does not need to set frm in a program where some pieces use a non-default rounding mode. Such code works as designed even if it is called from a region where another rounding mode is set. This simplifies the implementation of things like #pragma STDC FENV_ROUND and makes programs more robust.

That's going to break huge piles of C/C++ code that sets the (dynamic) rounding mode and expects it to have an effect on subsequent computations. I do not think that is a good idea.
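
The kind of code this concern refers to might look like the following host-side sketch. `oneThird` is a hypothetical helper, and the example assumes a toolchain where ordinary double division honors `fesetround`, which is the behavior existing software relies on; the `volatile` qualifiers only keep the division at run time.

```cpp
#include <cassert>
#include <cfenv>

// Divides 1.0 by 3.0 under the requested dynamic rounding mode and restores
// the previous mode before returning.
double oneThird(int mode) {
  volatile double one = 1.0, three = 3.0;  // force a run-time division
  volatile double q;
  const int saved = std::fegetround();
  std::fesetround(mode);
  q = one / three;  // must execute between the two fesetround calls
  std::fesetround(saved);
  return q;
}

void demo() {
  // The mode chosen by the caller changes the last bit of the quotient.
  assert(oneThird(FE_UPWARD) > oneThird(FE_TONEAREST));
}
```

If plain FP operations were lowered with a static RNE rounding mode, both calls would return the same value and the assertion would fail.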

In D94163#2606346, @jrtc27 wrote:

That's going to break huge piles of C/C++ code that sets the (dynamic) rounding mode and expects it to have an effect on subsequent computations. I do not think that is a good idea.

You are right, it might be dangerous. As long as RISC-V does not support constrained intrinsics, it is safer to use instructions with the dynamic rounding mode.

Updated patch for alternative CSR solution

Harbormaster completed remote builds in B96493: Diff 334391. · Mar 31 2021, 3:04 AM

sepavloff added a parent revision: D99083: [RISCV] Introduce floating point control and state registers. · Mar 31 2021, 3:04 AM

Revision Contents

Path                                         Size
llvm/lib/Target/RISCV/RISCVInstrInfoD.td     44 lines
llvm/lib/Target/RISCV/RISCVInstrInfoF.td     71 lines
llvm/lib/Target/RISCV/RISCVInstrInfoZfh.td   44 lines

Diff 334391

llvm/lib/Target/RISCV/RISCVInstrInfoD.td

	[23 unchanged lines not shown]

	def RISCVBuildPairF64 : SDNode<"RISCVISD::BuildPairF64", SDT_RISCVBuildPairF64>;			def RISCVBuildPairF64 : SDNode<"RISCVISD::BuildPairF64", SDT_RISCVBuildPairF64>;
	def RISCVSplitF64 : SDNode<"RISCVISD::SplitF64", SDT_RISCVSplitF64>;			def RISCVSplitF64 : SDNode<"RISCVISD::SplitF64", SDT_RISCVSplitF64>;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Instruction Class Templates			// Instruction Class Templates
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

- let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
- class FPFMAD_rrr_frm<RISCVOpcode opcode, string opcodestr>
-     : RVInstR4<0b01, opcode, (outs FPR64:$rd),
-                (ins FPR64:$rs1, FPR64:$rs2, FPR64:$rs3, frmarg:$funct3),
-                opcodestr, "$rd, $rs1, $rs2, $rs3, $funct3">;
-
- class FPFMADDynFrmAlias<FPFMAD_rrr_frm Inst, string OpcodeStr>
-     : InstAlias<OpcodeStr#" $rd, $rs1, $rs2, $rs3",
-                 (Inst FPR64:$rd, FPR64:$rs1, FPR64:$rs2, FPR64:$rs3, 0b111)>;
+ multiclass FPFMAD_mc<RISCVOpcode opcode, string OpcodeStr> {
+   def _gen : FPFMA_gen<opcode, OpcodeStr, 0b01, FPR64>;
+   def _def : FPFMA_def<opcode, OpcodeStr, 0b01, FPR64>;
+   def : InstAlias<OpcodeStr#" $rd, $rs1, $rs2, $rs3",
+                   (!cast<Instruction>(NAME # "_gen")
+                    FPR64:$rd, FPR64:$rs1, FPR64:$rs2, FPR64:$rs3, 0b111)>;
+ }

	let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in			let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
	class FPALUD_rr<bits<7> funct7, bits<3> funct3, string opcodestr>			class FPALUD_rr<bits<7> funct7, bits<3> funct3, string opcodestr>
	: RVInstR<funct7, funct3, OPC_OP_FP, (outs FPR64:$rd),			: RVInstR<funct7, funct3, OPC_OP_FP, (outs FPR64:$rd),
	(ins FPR64:$rs1, FPR64:$rs2), opcodestr, "$rd, $rs1, $rs2">;			(ins FPR64:$rs1, FPR64:$rs2), opcodestr, "$rd, $rs1, $rs2">;

	let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in			let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
	class FPALUD_rr_frm<bits<7> funct7, string opcodestr>			class FPALUD_rr_frm<bits<7> funct7, string opcodestr>
	[27 unchanged lines not shown]
	// reflecting the order these fields are specified in the instruction			// reflecting the order these fields are specified in the instruction
	// encoding.			// encoding.
	let hasSideEffects = 0, mayLoad = 0, mayStore = 1 in			let hasSideEffects = 0, mayLoad = 0, mayStore = 1 in
	def FSD : RVInstS<0b011, OPC_STORE_FP, (outs),			def FSD : RVInstS<0b011, OPC_STORE_FP, (outs),
	(ins FPR64:$rs2, GPR:$rs1, simm12:$imm12),			(ins FPR64:$rs2, GPR:$rs1, simm12:$imm12),
	"fsd", "$rs2, ${imm12}(${rs1})">,			"fsd", "$rs2, ${imm12}(${rs1})">,
	Sched<[WriteFST64, ReadStoreData, ReadFMemBase]>;			Sched<[WriteFST64, ReadStoreData, ReadFMemBase]>;

- def FMADD_D : FPFMAD_rrr_frm<OPC_MADD, "fmadd.d">,
+ defm FMADD_D : FPFMAD_mc<OPC_MADD, "fmadd.d">,
               Sched<[WriteFMulAdd64, ReadFMulAdd64, ReadFMulAdd64, ReadFMulAdd64]>;
- def : FPFMADDynFrmAlias<FMADD_D, "fmadd.d">;
- def FMSUB_D : FPFMAD_rrr_frm<OPC_MSUB, "fmsub.d">,
+ defm FMSUB_D : FPFMAD_mc<OPC_MSUB, "fmsub.d">,
               Sched<[WriteFMulSub64, ReadFMulSub64, ReadFMulSub64, ReadFMulSub64]>;
- def : FPFMADDynFrmAlias<FMSUB_D, "fmsub.d">;
- def FNMSUB_D : FPFMAD_rrr_frm<OPC_NMSUB, "fnmsub.d">,
+ defm FNMSUB_D : FPFMAD_mc<OPC_NMSUB, "fnmsub.d">,
               Sched<[WriteFMulSub64, ReadFMulSub64, ReadFMulSub64, ReadFMulSub64]>;
- def : FPFMADDynFrmAlias<FNMSUB_D, "fnmsub.d">;
- def FNMADD_D : FPFMAD_rrr_frm<OPC_NMADD, "fnmadd.d">,
+ defm FNMADD_D : FPFMAD_mc<OPC_NMADD, "fnmadd.d">,
               Sched<[WriteFMulAdd64, ReadFMulAdd64, ReadFMulAdd64, ReadFMulAdd64]>;
- def : FPFMADDynFrmAlias<FNMADD_D, "fnmadd.d">;

	def FADD_D : FPALUD_rr_frm<0b0000001, "fadd.d">,			def FADD_D : FPALUD_rr_frm<0b0000001, "fadd.d">,
	Sched<[WriteFALU64, ReadFALU64, ReadFALU64]>;			Sched<[WriteFALU64, ReadFALU64, ReadFALU64]>;
	def : FPALUDDynFrmAlias<FADD_D, "fadd.d">;			def : FPALUDDynFrmAlias<FADD_D, "fadd.d">;
	def FSUB_D : FPALUD_rr_frm<0b0000101, "fsub.d">,			def FSUB_D : FPALUD_rr_frm<0b0000101, "fsub.d">,
	Sched<[WriteFALU64, ReadFALU64, ReadFALU64]>;			Sched<[WriteFALU64, ReadFALU64, ReadFALU64]>;
	def : FPALUDDynFrmAlias<FSUB_D, "fsub.d">;			def : FPALUDDynFrmAlias<FSUB_D, "fsub.d">;
	def FMUL_D : FPALUD_rr_frm<0b0001001, "fmul.d">,			def FMUL_D : FPALUD_rr_frm<0b0001001, "fmul.d">,
	[158 unchanged lines not shown]
	def : PatFpr64Fpr64<fcopysign, FSGNJ_D>;			def : PatFpr64Fpr64<fcopysign, FSGNJ_D>;
	def : Pat<(fcopysign FPR64:$rs1, (fneg FPR64:$rs2)), (FSGNJN_D $rs1, $rs2)>;			def : Pat<(fcopysign FPR64:$rs1, (fneg FPR64:$rs2)), (FSGNJN_D $rs1, $rs2)>;
	def : Pat<(fcopysign FPR64:$rs1, FPR32:$rs2), (FSGNJ_D $rs1, (FCVT_D_S $rs2))>;			def : Pat<(fcopysign FPR64:$rs1, FPR32:$rs2), (FSGNJ_D $rs1, (FCVT_D_S $rs2))>;
	def : Pat<(fcopysign FPR32:$rs1, FPR64:$rs2), (FSGNJ_S $rs1, (FCVT_S_D $rs2,			def : Pat<(fcopysign FPR32:$rs1, FPR64:$rs2), (FSGNJ_S $rs1, (FCVT_S_D $rs2,
	0b111))>;			0b111))>;

	// fmadd: rs1 * rs2 + rs3			// fmadd: rs1 * rs2 + rs3
	def : Pat<(fma FPR64:$rs1, FPR64:$rs2, FPR64:$rs3),			def : Pat<(fma FPR64:$rs1, FPR64:$rs2, FPR64:$rs3),
	(FMADD_D $rs1, $rs2, $rs3, 0b111)>;			(FMADD_D_def $rs1, $rs2, $rs3)>;

	// fmsub: rs1 * rs2 - rs3			// fmsub: rs1 * rs2 - rs3
	def : Pat<(fma FPR64:$rs1, FPR64:$rs2, (fneg FPR64:$rs3)),			def : Pat<(fma FPR64:$rs1, FPR64:$rs2, (fneg FPR64:$rs3)),
	(FMSUB_D FPR64:$rs1, FPR64:$rs2, FPR64:$rs3, 0b111)>;			(FMSUB_D_def FPR64:$rs1, FPR64:$rs2, FPR64:$rs3)>;

	// fnmsub: -rs1 * rs2 + rs3			// fnmsub: -rs1 * rs2 + rs3
	def : Pat<(fma (fneg FPR64:$rs1), FPR64:$rs2, FPR64:$rs3),			def : Pat<(fma (fneg FPR64:$rs1), FPR64:$rs2, FPR64:$rs3),
	(FNMSUB_D FPR64:$rs1, FPR64:$rs2, FPR64:$rs3, 0b111)>;			(FNMSUB_D_def FPR64:$rs1, FPR64:$rs2, FPR64:$rs3)>;

	// fnmadd: -rs1 * rs2 - rs3			// fnmadd: -rs1 * rs2 - rs3
	def : Pat<(fma (fneg FPR64:$rs1), FPR64:$rs2, (fneg FPR64:$rs3)),			def : Pat<(fma (fneg FPR64:$rs1), FPR64:$rs2, (fneg FPR64:$rs3)),
	(FNMADD_D FPR64:$rs1, FPR64:$rs2, FPR64:$rs3, 0b111)>;			(FNMADD_D_def FPR64:$rs1, FPR64:$rs2, FPR64:$rs3)>;

	// The RISC-V 2.2 user-level ISA spec defines fmin and fmax as returning the			// The RISC-V 2.2 user-level ISA spec defines fmin and fmax as returning the
	// canonical NaN when giving a signaling NaN. This doesn't match the LLVM			// canonical NaN when giving a signaling NaN. This doesn't match the LLVM
	// behaviour (see https://bugs.llvm.org/show_bug.cgi?id=27363). However, the			// behaviour (see https://bugs.llvm.org/show_bug.cgi?id=27363). However, the
	// draft 2.3 ISA spec changes the definition of fmin and fmax in a way that			// draft 2.3 ISA spec changes the definition of fmin and fmax in a way that
	// matches LLVM's fminnum and fmaxnum			// matches LLVM's fminnum and fmaxnum
	// <https://github.com/riscv/riscv-isa-manual/commit/cd20cee7efd9bac7c5aa127ec3b451749d2b3cce>.			// <https://github.com/riscv/riscv-isa-manual/commit/cd20cee7efd9bac7c5aa127ec3b451749d2b3cce>.
	def : PatFpr64Fpr64<fminnum, FMIN_D>;			def : PatFpr64Fpr64<fminnum, FMIN_D>;
	[78 unchanged lines not shown]

llvm/lib/Target/RISCV/RISCVInstrInfoF.td

[41 unchanged lines not shown]
def frmarg : Operand<XLenVT> {		def frmarg : Operand<XLenVT> {
let PrintMethod = "printFRMArg";		let PrintMethod = "printFRMArg";
let DecoderMethod = "decodeFRMArg";		let DecoderMethod = "decodeFRMArg";
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Instruction class templates		// Instruction class templates
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

- let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
- class FPFMAS_rrr_frm<RISCVOpcode opcode, string opcodestr>
-     : RVInstR4<0b00, opcode, (outs FPR32:$rd),
-                (ins FPR32:$rs1, FPR32:$rs2, FPR32:$rs3, frmarg:$funct3),
-                opcodestr, "$rd, $rs1, $rs2, $rs3, $funct3">;
-
- class FPFMASDynFrmAlias<FPFMAS_rrr_frm Inst, string OpcodeStr>
-     : InstAlias<OpcodeStr#" $rd, $rs1, $rs2, $rs3",
-                 (Inst FPR32:$rd, FPR32:$rs1, FPR32:$rs2, FPR32:$rs3, 0b111)>;
+ // Instructions that may depend on dynamic rounding mode in FRM or set accrued
+ // flags in FFLAGs exist in two forms:
+ // - generic, which depends on FRM and defined FFLAGS. Constrained intrinsics
+ //   are lowered to this form.
+ // - default, which ignores dependency on FRM and influence on FFLAGS. It is
+ //   used to lower regular IR FP operations.
+
+ let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in {
+ class FPFMA_gen<RISCVOpcode opcode, string opcodestr,
+                 bits<2> kind, RegisterClass rty>
+     : RVInstR4<kind, opcode, (outs rty:$rd),
+                (ins rty:$rs1, rty:$rs2, rty:$rs3, frmarg:$funct3),
+                opcodestr, "$rd, $rs1, $rs2, $rs3, $funct3"> {
+   let Defs = [FFLAGS];
+   let Uses = [FRM];
+ }
+
+ class FPFMA_def<RISCVOpcode opcode, string opcodestr,
+                 bits<2> kind, RegisterClass rty>
+     : RVInstR4<kind, opcode, (outs rty:$rd),
+                (ins rty:$rs1, rty:$rs2, rty:$rs3),
+                opcodestr, "$rd, $rs1, $rs2, $rs3, dyn"> {
+   let isCodeGenOnly = 1;
+   // FIXME: Until constrained intrinsics are supported by RISCV codegen, use
+   // dynamic rounding mode.
+   let funct3 = 0b111;
+ }
+ }
+
+ multiclass FPFMAS_mc<RISCVOpcode opcode, string OpcodeStr> {
+   def _gen : FPFMA_gen<opcode, OpcodeStr, 0b00, FPR32>;
+   def _def : FPFMA_def<opcode, OpcodeStr, 0b00, FPR32>;
+   def : InstAlias<OpcodeStr#" $rd, $rs1, $rs2, $rs3",
+                   (!cast<Instruction>(NAME # "_gen")
+                    FPR32:$rd, FPR32:$rs1, FPR32:$rs2, FPR32:$rs3, 0b111)>;
+ }

let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in		let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class FPALUS_rr<bits<7> funct7, bits<3> funct3, string opcodestr>		class FPALUS_rr<bits<7> funct7, bits<3> funct3, string opcodestr>
: RVInstR<funct7, funct3, OPC_OP_FP, (outs FPR32:$rd),		: RVInstR<funct7, funct3, OPC_OP_FP, (outs FPR32:$rd),
(ins FPR32:$rs1, FPR32:$rs2), opcodestr, "$rd, $rs1, $rs2">;		(ins FPR32:$rs1, FPR32:$rs2), opcodestr, "$rd, $rs1, $rs2">;

let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in		let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class FPALUS_rr_frm<bits<7> funct7, string opcodestr>		class FPALUS_rr_frm<bits<7> funct7, string opcodestr>
[44 unchanged lines not shown]
// reflecting the order these fields are specified in the instruction		// reflecting the order these fields are specified in the instruction
// encoding.		// encoding.
let hasSideEffects = 0, mayLoad = 0, mayStore = 1 in		let hasSideEffects = 0, mayLoad = 0, mayStore = 1 in
def FSW : RVInstS<0b010, OPC_STORE_FP, (outs),		def FSW : RVInstS<0b010, OPC_STORE_FP, (outs),
(ins FPR32:$rs2, GPR:$rs1, simm12:$imm12),		(ins FPR32:$rs2, GPR:$rs1, simm12:$imm12),
"fsw", "$rs2, ${imm12}(${rs1})">,		"fsw", "$rs2, ${imm12}(${rs1})">,
Sched<[WriteFST32, ReadStoreData, ReadFMemBase]>;		Sched<[WriteFST32, ReadStoreData, ReadFMemBase]>;

- def FMADD_S : FPFMAS_rrr_frm<OPC_MADD, "fmadd.s">,
+ defm FMADD_S : FPFMAS_mc<OPC_MADD, "fmadd.s">,
               Sched<[WriteFMulAdd32, ReadFMulAdd32, ReadFMulAdd32, ReadFMulAdd32]>;
- def : FPFMASDynFrmAlias<FMADD_S, "fmadd.s">;
- def FMSUB_S : FPFMAS_rrr_frm<OPC_MSUB, "fmsub.s">,
+ defm FMSUB_S : FPFMAS_mc<OPC_MSUB, "fmsub.s">,
               Sched<[WriteFMulSub32, ReadFMulSub32, ReadFMulSub32, ReadFMulSub32]>;
- def : FPFMASDynFrmAlias<FMSUB_S, "fmsub.s">;
- def FNMSUB_S : FPFMAS_rrr_frm<OPC_NMSUB, "fnmsub.s">,
+ defm FNMSUB_S : FPFMAS_mc<OPC_NMSUB, "fnmsub.s">,
               Sched<[WriteFMulSub32, ReadFMulSub32, ReadFMulSub32, ReadFMulSub32]>;
- def : FPFMASDynFrmAlias<FNMSUB_S, "fnmsub.s">;
- def FNMADD_S : FPFMAS_rrr_frm<OPC_NMADD, "fnmadd.s">,
+ defm FNMADD_S : FPFMAS_mc<OPC_NMADD, "fnmadd.s">,
               Sched<[WriteFMulAdd32, ReadFMulAdd32, ReadFMulAdd32, ReadFMulAdd32]>;
- def : FPFMASDynFrmAlias<FNMADD_S, "fnmadd.s">;

def FADD_S : FPALUS_rr_frm<0b0000000, "fadd.s">,		def FADD_S : FPALUS_rr_frm<0b0000000, "fadd.s">,
Sched<[WriteFALU32, ReadFALU32, ReadFALU32]>;		Sched<[WriteFALU32, ReadFALU32, ReadFALU32]>;
def : FPALUSDynFrmAlias<FADD_S, "fadd.s">;		def : FPALUSDynFrmAlias<FADD_S, "fadd.s">;
def FSUB_S : FPALUS_rr_frm<0b0000100, "fsub.s">,		def FSUB_S : FPALUS_rr_frm<0b0000100, "fsub.s">,
Sched<[WriteFALU32, ReadFALU32, ReadFALU32]>;		Sched<[WriteFALU32, ReadFALU32, ReadFALU32]>;
def : FPALUSDynFrmAlias<FSUB_S, "fsub.s">;		def : FPALUSDynFrmAlias<FSUB_S, "fsub.s">;
def FMUL_S : FPALUS_rr_frm<0b0001000, "fmul.s">,		def FMUL_S : FPALUS_rr_frm<0b0001000, "fmul.s">,
[... 179 lines not shown ...]
def : Pat<(fneg FPR32:$rs1), (FSGNJN_S $rs1, $rs1)>;		def : Pat<(fneg FPR32:$rs1), (FSGNJN_S $rs1, $rs1)>;
def : Pat<(fabs FPR32:$rs1), (FSGNJX_S $rs1, $rs1)>;		def : Pat<(fabs FPR32:$rs1), (FSGNJX_S $rs1, $rs1)>;

def : PatFpr32Fpr32<fcopysign, FSGNJ_S>;		def : PatFpr32Fpr32<fcopysign, FSGNJ_S>;
def : Pat<(fcopysign FPR32:$rs1, (fneg FPR32:$rs2)), (FSGNJN_S $rs1, $rs2)>;		def : Pat<(fcopysign FPR32:$rs1, (fneg FPR32:$rs2)), (FSGNJN_S $rs1, $rs2)>;

// fmadd: rs1 * rs2 + rs3		// fmadd: rs1 * rs2 + rs3
def : Pat<(fma FPR32:$rs1, FPR32:$rs2, FPR32:$rs3),		def : Pat<(fma FPR32:$rs1, FPR32:$rs2, FPR32:$rs3),
(FMADD_S $rs1, $rs2, $rs3, 0b111)>;		(FMADD_S_def $rs1, $rs2, $rs3)>;

// fmsub: rs1 * rs2 - rs3		// fmsub: rs1 * rs2 - rs3
def : Pat<(fma FPR32:$rs1, FPR32:$rs2, (fneg FPR32:$rs3)),		def : Pat<(fma FPR32:$rs1, FPR32:$rs2, (fneg FPR32:$rs3)),
(FMSUB_S FPR32:$rs1, FPR32:$rs2, FPR32:$rs3, 0b111)>;		(FMSUB_S_def FPR32:$rs1, FPR32:$rs2, FPR32:$rs3)>;

// fnmsub: -rs1 * rs2 + rs3		// fnmsub: -rs1 * rs2 + rs3
def : Pat<(fma (fneg FPR32:$rs1), FPR32:$rs2, FPR32:$rs3),		def : Pat<(fma (fneg FPR32:$rs1), FPR32:$rs2, FPR32:$rs3),
(FNMSUB_S FPR32:$rs1, FPR32:$rs2, FPR32:$rs3, 0b111)>;		(FNMSUB_S_def FPR32:$rs1, FPR32:$rs2, FPR32:$rs3)>;

// fnmadd: -rs1 * rs2 - rs3		// fnmadd: -rs1 * rs2 - rs3
def : Pat<(fma (fneg FPR32:$rs1), FPR32:$rs2, (fneg FPR32:$rs3)),		def : Pat<(fma (fneg FPR32:$rs1), FPR32:$rs2, (fneg FPR32:$rs3)),
(FNMADD_S FPR32:$rs1, FPR32:$rs2, FPR32:$rs3, 0b111)>;		(FNMADD_S_def FPR32:$rs1, FPR32:$rs2, FPR32:$rs3)>;

// The RISC-V 2.2 user-level ISA spec defines fmin and fmax as returning the		// The RISC-V 2.2 user-level ISA spec defines fmin and fmax as returning the
// canonical NaN when given a signaling NaN. This doesn't match the LLVM		// canonical NaN when given a signaling NaN. This doesn't match the LLVM
// behaviour (see https://bugs.llvm.org/show_bug.cgi?id=27363). However, the		// behaviour (see https://bugs.llvm.org/show_bug.cgi?id=27363). However, the
// draft 2.3 ISA spec changes the definition of fmin and fmax in a way that		// draft 2.3 ISA spec changes the definition of fmin and fmax in a way that
// matches LLVM's fminnum and fmaxnum		// matches LLVM's fminnum and fmaxnum
// <https://github.com/riscv/riscv-isa-manual/commit/cd20cee7efd9bac7c5aa127ec3b451749d2b3cce>.		// <https://github.com/riscv/riscv-isa-manual/commit/cd20cee7efd9bac7c5aa127ec3b451749d2b3cce>.
def : PatFpr32Fpr32<fminnum, FMIN_S>;		def : PatFpr32Fpr32<fminnum, FMIN_S>;
[... 61 lines not shown ...]

llvm/lib/Target/RISCV/RISCVInstrInfoZfh.td

[... 25 lines not shown ...]
def riscv_fmv_h_x		def riscv_fmv_h_x
: SDNode<"RISCVISD::FMV_H_X", SDT_RISCVFMV_H_X>;		: SDNode<"RISCVISD::FMV_H_X", SDT_RISCVFMV_H_X>;
def riscv_fmv_x_anyexth		def riscv_fmv_x_anyexth
: SDNode<"RISCVISD::FMV_X_ANYEXTH", SDT_RISCVFMV_X_ANYEXTH>;		: SDNode<"RISCVISD::FMV_X_ANYEXTH", SDT_RISCVFMV_X_ANYEXTH>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Instruction class templates		// Instruction class templates
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in		multiclass FPFMAH_mc<RISCVOpcode opcode, string OpcodeStr> {
class FPFMAH_rrr_frm<RISCVOpcode opcode, string opcodestr>		def _gen : FPFMA_gen<opcode, OpcodeStr, 0b10, FPR16>;
: RVInstR4<0b10, opcode, (outs FPR16:$rd),		def _def : FPFMA_def<opcode, OpcodeStr, 0b10, FPR16>;
(ins FPR16:$rs1, FPR16:$rs2, FPR16:$rs3, frmarg:$funct3),		def : InstAlias<OpcodeStr#" $rd, $rs1, $rs2, $rs3",
opcodestr, "$rd, $rs1, $rs2, $rs3, $funct3">;		(!cast<Instruction>(NAME # "_gen")
		FPR16:$rd, FPR16:$rs1, FPR16:$rs2, FPR16:$rs3, 0b111)>;
class FPFMAHDynFrmAlias<FPFMAH_rrr_frm Inst, string OpcodeStr>		}
: InstAlias<OpcodeStr#" $rd, $rs1, $rs2, $rs3",
(Inst FPR16:$rd, FPR16:$rs1, FPR16:$rs2, FPR16:$rs3, 0b111)>;

let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in		let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class FPALUH_rr<bits<7> funct7, bits<3> funct3, string opcodestr>		class FPALUH_rr<bits<7> funct7, bits<3> funct3, string opcodestr>
: RVInstR<funct7, funct3, OPC_OP_FP, (outs FPR16:$rd),		: RVInstR<funct7, funct3, OPC_OP_FP, (outs FPR16:$rd),
(ins FPR16:$rs1, FPR16:$rs2), opcodestr, "$rd, $rs1, $rs2">;		(ins FPR16:$rs1, FPR16:$rs2), opcodestr, "$rd, $rs1, $rs2">;

let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in		let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class FPALUH_rr_frm<bits<7> funct7, string opcodestr>		class FPALUH_rr_frm<bits<7> funct7, string opcodestr>
[... 26 lines not shown ...]
// reflecting the order these fields are specified in the instruction		// reflecting the order these fields are specified in the instruction
// encoding.		// encoding.
let hasSideEffects = 0, mayLoad = 0, mayStore = 1 in		let hasSideEffects = 0, mayLoad = 0, mayStore = 1 in
def FSH : RVInstS<0b001, OPC_STORE_FP, (outs),		def FSH : RVInstS<0b001, OPC_STORE_FP, (outs),
(ins FPR16:$rs2, GPR:$rs1, simm12:$imm12),		(ins FPR16:$rs2, GPR:$rs1, simm12:$imm12),
"fsh", "$rs2, ${imm12}(${rs1})">,		"fsh", "$rs2, ${imm12}(${rs1})">,
Sched<[WriteFST16, ReadStoreData, ReadFMemBase]>;		Sched<[WriteFST16, ReadStoreData, ReadFMemBase]>;

def FMADD_H : FPFMAH_rrr_frm<OPC_MADD, "fmadd.h">,		defm FMADD_H : FPFMAH_mc<OPC_MADD, "fmadd.h">,
Sched<[WriteFMulAdd16, ReadFMulAdd16, ReadFMulAdd16, ReadFMulAdd16]>;		Sched<[WriteFMulAdd16, ReadFMulAdd16, ReadFMulAdd16, ReadFMulAdd16]>;
def : FPFMAHDynFrmAlias<FMADD_H, "fmadd.h">;		defm FMSUB_H : FPFMAH_mc<OPC_MSUB, "fmsub.h">,
def FMSUB_H : FPFMAH_rrr_frm<OPC_MSUB, "fmsub.h">,
Sched<[WriteFMulSub16, ReadFMulSub16, ReadFMulSub16, ReadFMulSub16]>;		Sched<[WriteFMulSub16, ReadFMulSub16, ReadFMulSub16, ReadFMulSub16]>;
def : FPFMAHDynFrmAlias<FMSUB_H, "fmsub.h">;		defm FNMSUB_H : FPFMAH_mc<OPC_NMSUB, "fnmsub.h">,
def FNMSUB_H : FPFMAH_rrr_frm<OPC_NMSUB, "fnmsub.h">,
Sched<[WriteFMulSub16, ReadFMulSub16, ReadFMulSub16, ReadFMulSub16]>;		Sched<[WriteFMulSub16, ReadFMulSub16, ReadFMulSub16, ReadFMulSub16]>;
def : FPFMAHDynFrmAlias<FNMSUB_H, "fnmsub.h">;		defm FNMADD_H : FPFMAH_mc<OPC_NMADD, "fnmadd.h">,
def FNMADD_H : FPFMAH_rrr_frm<OPC_NMADD, "fnmadd.h">,
Sched<[WriteFMulAdd16, ReadFMulAdd16, ReadFMulAdd16, ReadFMulAdd16]>;		Sched<[WriteFMulAdd16, ReadFMulAdd16, ReadFMulAdd16, ReadFMulAdd16]>;
def : FPFMAHDynFrmAlias<FNMADD_H, "fnmadd.h">;

def FADD_H : FPALUH_rr_frm<0b0000010, "fadd.h">,		def FADD_H : FPALUH_rr_frm<0b0000010, "fadd.h">,
Sched<[WriteFALU16, ReadFALU16, ReadFALU16]>;		Sched<[WriteFALU16, ReadFALU16, ReadFALU16]>;
def : FPALUHDynFrmAlias<FADD_H, "fadd.h">;		def : FPALUHDynFrmAlias<FADD_H, "fadd.h">;
def FSUB_H : FPALUH_rr_frm<0b0000110, "fsub.h">,		def FSUB_H : FPALUH_rr_frm<0b0000110, "fsub.h">,
Sched<[WriteFALU16, ReadFALU16, ReadFALU16]>;		Sched<[WriteFALU16, ReadFALU16, ReadFALU16]>;
def : FPALUHDynFrmAlias<FSUB_H, "fsub.h">;		def : FPALUHDynFrmAlias<FSUB_H, "fsub.h">;
def FMUL_H : FPALUH_rr_frm<0b0001010, "fmul.h">,		def FMUL_H : FPALUH_rr_frm<0b0001010, "fmul.h">,
[... 173 lines not shown ...]
def : PatFpr16Fpr16<fcopysign, FSGNJ_H>;		def : PatFpr16Fpr16<fcopysign, FSGNJ_H>;
def : Pat<(fcopysign FPR16:$rs1, (fneg FPR16:$rs2)), (FSGNJN_H $rs1, $rs2)>;		def : Pat<(fcopysign FPR16:$rs1, (fneg FPR16:$rs2)), (FSGNJN_H $rs1, $rs2)>;
def : Pat<(fcopysign FPR16:$rs1, FPR32:$rs2),		def : Pat<(fcopysign FPR16:$rs1, FPR32:$rs2),
(FSGNJ_H $rs1, (FCVT_H_S $rs2, 0b111))>;		(FSGNJ_H $rs1, (FCVT_H_S $rs2, 0b111))>;
def : Pat<(fcopysign FPR32:$rs1, FPR16:$rs2), (FSGNJ_S $rs1, (FCVT_S_H $rs2))>;		def : Pat<(fcopysign FPR32:$rs1, FPR16:$rs2), (FSGNJ_S $rs1, (FCVT_S_H $rs2))>;

// fmadd: rs1 * rs2 + rs3		// fmadd: rs1 * rs2 + rs3
def : Pat<(fma FPR16:$rs1, FPR16:$rs2, FPR16:$rs3),		def : Pat<(fma FPR16:$rs1, FPR16:$rs2, FPR16:$rs3),
(FMADD_H $rs1, $rs2, $rs3, 0b111)>;		(FMADD_H_def $rs1, $rs2, $rs3)>;

// fmsub: rs1 * rs2 - rs3		// fmsub: rs1 * rs2 - rs3
def : Pat<(fma FPR16:$rs1, FPR16:$rs2, (fneg FPR16:$rs3)),		def : Pat<(fma FPR16:$rs1, FPR16:$rs2, (fneg FPR16:$rs3)),
(FMSUB_H FPR16:$rs1, FPR16:$rs2, FPR16:$rs3, 0b111)>;		(FMSUB_H_def FPR16:$rs1, FPR16:$rs2, FPR16:$rs3)>;

// fnmsub: -rs1 * rs2 + rs3		// fnmsub: -rs1 * rs2 + rs3
def : Pat<(fma (fneg FPR16:$rs1), FPR16:$rs2, FPR16:$rs3),		def : Pat<(fma (fneg FPR16:$rs1), FPR16:$rs2, FPR16:$rs3),
(FNMSUB_H FPR16:$rs1, FPR16:$rs2, FPR16:$rs3, 0b111)>;		(FNMSUB_H_def FPR16:$rs1, FPR16:$rs2, FPR16:$rs3)>;

// fnmadd: -rs1 * rs2 - rs3		// fnmadd: -rs1 * rs2 - rs3
def : Pat<(fma (fneg FPR16:$rs1), FPR16:$rs2, (fneg FPR16:$rs3)),		def : Pat<(fma (fneg FPR16:$rs1), FPR16:$rs2, (fneg FPR16:$rs3)),
(FNMADD_H FPR16:$rs1, FPR16:$rs2, FPR16:$rs3, 0b111)>;		(FNMADD_H_def FPR16:$rs1, FPR16:$rs2, FPR16:$rs3)>;

def : PatFpr16Fpr16<fminnum, FMIN_H>;		def : PatFpr16Fpr16<fminnum, FMIN_H>;
def : PatFpr16Fpr16<fmaxnum, FMAX_H>;		def : PatFpr16Fpr16<fmaxnum, FMAX_H>;

/// Setcc		/// Setcc

def : PatFpr16Fpr16<seteq, FEQ_H>;		def : PatFpr16Fpr16<seteq, FEQ_H>;
def : PatFpr16Fpr16<setoeq, FEQ_H>;		def : PatFpr16Fpr16<setoeq, FEQ_H>;
[... 66 lines not shown ...]