This is an archive of the discontinued LLVM Phabricator instance.

Differential D141560

[RISCV][CodeGen] Add codegen pattern for FLI instruction in experimental zfa extension
Closed · Public

Authored by joshua-arch1 on Jan 11 2023, 5:49 PM.

Details

Reviewers

craig.topper
sunshaoce
kito-cheng
alextsao1999
reames
HsiangKai

Commits

rGada264146067: [RISCV][CodeGen] Add codegen pattern for FLI instruction in experimental zfa…

Summary

This patch implements experimental support for the RISCV Zfa extension as specified here: https://github.com/riscv/riscv-isa-manual/releases/download/draft-20221119-5234c63/riscv-spec.pdf, Ch. 25. This extension has not been ratified. Once ratified, it'll move out of experimental status.

This change adds codegen support for load-immediate instructions (fli.s/fli.d/fli.h).
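As a rough illustration of what selecting a load-immediate involves, the sketch below maps a single-precision value to a 5-bit FLI table index and returns -1 when the value is not representable, mirroring the -1 convention the review mentions for getLoadFP32Imm. The function name `lookupFliIndex` is hypothetical and is not LLVM's implementation; the table is written from the draft Zfa specification as recalled here and should be checked against the specification. The canonical NaN entry (index 31) is left out for simplicity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper, NOT LLVM's RISCVLoadFPImm::getLoadFP32Imm.
// Returns the FLI table index for `value`, or -1 if fli.s cannot produce it.
// Assumption: table order follows the draft Zfa spec; index 31 (canonical
// NaN) is omitted, so any NaN input yields -1 in this sketch.
int lookupFliIndex(float value) {
  static const float kTable[] = {
      -1.0f,
      std::numeric_limits<float>::min(),  // minimum positive normal
      std::ldexp(1.0f, -16), std::ldexp(1.0f, -15),
      std::ldexp(1.0f, -8),  std::ldexp(1.0f, -7),
      0.0625f, 0.125f, 0.25f, 0.3125f, 0.375f, 0.4375f,
      0.5f,    0.625f, 0.75f, 0.875f,
      1.0f,    1.25f,  1.5f,  1.75f,   2.0f,   2.5f,
      3.0f,    4.0f,   8.0f,  16.0f,   128.0f, 256.0f,
      32768.0f, 65536.0f,
      std::numeric_limits<float>::infinity(),
  };
  const std::size_t count = sizeof(kTable) / sizeof(kTable[0]);
  for (std::size_t i = 0; i < count; ++i) {
    // +0.0 and -0.0 never match because the table holds no zero entry.
    if (value == kTable[i])
      return static_cast<int>(i);
  }
  return -1;
}
```

Under these assumptions, 0.0625f and 256.0f map to table entries, while 255.0f, +0.0f and -0.0f return -1 and would have to be materialized some other way.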

Event Timeline

joshua-arch1 created this revision. · Jan 11 2023, 5:49 PM

Herald added a project: Restricted Project. · Jan 11 2023, 5:49 PM

Herald added subscribers: VincentWu, vkmr, frasercrmck and 26 others.

joshua-arch1 requested review of this revision. · Jan 11 2023, 5:49 PM

Herald added subscribers: llvm-commits, pcwang-thead, eopXD, MaskRay.

Harbormaster completed remote builds in B207258: Diff 488444. · Jan 11 2023, 5:50 PM

joshua-arch1 updated this revision to Diff 493508. · Jan 31 2023, 12:42 AM

Herald added a subscriber: luke. · Jan 31 2023, 12:42 AM

Harbormaster completed remote builds in B210925: Diff 493508. · Jan 31 2023, 12:43 AM

joshua-arch1 edited the summary of this revision. · Jan 31 2023, 12:43 AM

jrtc27 added a parent revision: D140460: [RISCV][MC] Add FLI instruction support for the experimental zfa extension. · Jan 31 2023, 7:31 PM

joshua-arch1 edited the summary of this revision. · Jan 31 2023, 9:54 PM

joshua-arch1 edited the summary of this revision.

Since the assembly support patch for Zfa instructions except FLI has been accepted, could anyone please review this codegen patch?

craig.topper added inline comments. · Feb 12 2023, 3:24 PM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1501–1514	I think we can't reach this line when Zfa is enabled now? getLoadFP32Imm/getLoadFP64Imm/getLoadFP16Imm return -1 for 0.0 and -0.0 right?
2025	FCVTMOD does not saturate. It returns the lower bits of the overflowed value. These instructions cannot be used to lower saturating conversions.
3837	Does this need to be qualified with Zfa?
8055	Does this need to be qualified with Zfa?
llvm/lib/Target/RISCV/RISCVISelLowering.h
114 ↗	(On Diff #493508)	This description is incorrect. The FCVTMOD instructions do not saturate.
llvm/lib/Target/RISCV/RISCVInstrInfoZfa.td
138	This name is misleading. There is no fmvh.x.w instruction. FMV_X_W_FPR64 would be better.
174	Why do you need a COPY_TO_REGCLASS?
191	What is giving these patterns priority over the ones in RISCVInstrInfoF.td?
llvm/test/CodeGen/RISCV/double-zfa.ll
108	This doesn't look like the minimum value for normal value for double. I would expect the mantissa bits to be 0. So it should be 0x0010000000000000 I think?
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitcast.ll
422 ↗	(On Diff #493508)	Why does this test change? It doesn't use Zfa.

craig.topper added inline comments. · Feb 12 2023, 7:30 PM

llvm/lib/Target/RISCV/RISCVInstrInfoZfa.td
191	I think this works because patterns with UsesCustomInserter=1 have lower priority than patterns that don't use CustomInserter.

craig.topper added inline comments. · Feb 12 2023, 7:32 PM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1502	This needs to be rebased.
llvm/lib/Target/RISCV/RISCVInstrInfoZfa.td
160	This isn't a "bitcast". It's an encoding conversion. fpimm is now handled with custom code in RISCVISelDAGToDAG.cpp so this needs to be moved there.

In D141560#4120971, @joshua-arch1 wrote:

Since the assembly support patch for Zfa instructions except FLI has been accepted, could anyone please review this codegen patch?

I'd encourage you to land the underlying approved review. I can't speak for others, but I certainly prioritize review for changes which have fewer outstanding blocking issues. You could also split this into an FLI and non-FLI part with the same reasoning.

In D141560#4122948, @reames wrote:

I'd encourage you to land the underlying approved review. I can't speak for others, but I certainly prioritize review for changes which have fewer outstanding blocking issues. You could also split this into an FLI and non-FLI part with the same reasoning.

Unlike the assembly support, I think there is nothing special about the codegen pattern for the FLI instruction. It's unnecessary to split the codegen patch into an FLI and a non-FLI part.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2025	So what's the difference between FCVTMOD and normal FP_TO_INT? Is NaN converted to zero in normal FP_TO_INT?
2025	What instruction sequence does FCVTMOD need to replace? I'm a bit confused.
llvm/test/CodeGen/RISCV/double-zfa.ll
108	It is IR syntax. I have written a simple test case to confirm that this value is used to express 0x0010000000000000 at the IR level.
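The bit pattern discussed in this exchange can be checked independently of LLVM. The sketch below assumes IEEE 754 binary64 doubles and uses a hypothetical helper name, `bitsOfDouble`; it shows that the smallest positive normal double has a biased exponent of 1 and an all-zero mantissa, which is the 0x0010000000000000 value the reviewer expected.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: returns the raw bit pattern of a double.
// Assumes the platform uses IEEE 754 binary64 for double.
std::uint64_t bitsOfDouble(double value) {
  static_assert(sizeof(std::uint64_t) == sizeof(double),
                "double must be 64 bits wide");
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));  // well-defined type punning
  return bits;
}
```

Calling `bitsOfDouble(std::numeric_limits<double>::min())` on such a platform yields the pattern with only bit 52 set.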

In D141560#4124863, @joshua-arch1 wrote:

Unlike the assembly support, I think there is nothing special about the codegen pattern for the FLI instruction. It's unnecessary to split the codegen patch into an FLI and a non-FLI part.

The suggestion to split is to unblock parts of the patch. We would rather have small pieces make progress than stalling a whole patch when one part of it that can be isolated needs additional work. Smaller patches are much easier to review and re-review.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2025	FCVTMOD follows the semantics of JavaScript. It’s there to make it possible to write JavaScript JITs for RISC-V. It’s not useful for C or Rust.
llvm/test/CodeGen/RISCV/double-zfa.ll
108	The IR syntax for double is just the hex representation of the double.
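To make the FCVTMOD remark concrete, here is a minimal sketch contrasting a modular conversion (truncate, then keep the low 32 bits, in the style of JavaScript's ToInt32) with a saturating one. Both function names are hypothetical, the code models only the result value and not the instruction's exception flags, and mapping NaN to 0 in the saturating version is an assumption borrowed from LLVM's saturating conversion intrinsics rather than from this review.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical model of a modular conversion: round toward zero, then wrap
// modulo 2^32. NaN and infinities map to 0.
std::int32_t toInt32Modular(double value) {
  if (!std::isfinite(value))
    return 0;
  const double kTwo32 = 4294967296.0;
  double wrapped = std::fmod(std::trunc(value), kTwo32);  // in (-2^32, 2^32)
  if (wrapped < 0)
    wrapped += kTwo32;                                    // now in [0, 2^32)
  std::int64_t low = static_cast<std::int64_t>(wrapped);
  if (low >= 2147483648LL)
    low -= 4294967296LL;                                  // reinterpret as signed
  return static_cast<std::int32_t>(low);
}

// Hypothetical model of a saturating conversion: out-of-range inputs clamp
// to the nearest representable bound; NaN maps to 0 (assumption).
std::int32_t toInt32Saturating(double value) {
  if (std::isnan(value))
    return 0;
  if (value <= -2147483648.0)
    return std::numeric_limits<std::int32_t>::min();
  if (value >= 2147483647.0)
    return std::numeric_limits<std::int32_t>::max();
  return static_cast<std::int32_t>(value);  // in range, truncates toward zero
}
```

For an input such as 4294967297.0 (2^32 + 1) the two functions disagree: the modular version keeps the low bits, the saturating version clamps. That difference is why one cannot be used to lower the other.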

In D141560#4124967, @craig.topper wrote:

The suggestion to split is to unblock parts of the patch. We would rather have small pieces make progress than stalling a whole patch when one part of it that can be isolated needs additional work. Smaller patches are much easier to review and re-review.

I see. I will split this patch into three smaller ones: one for FLI, one for FCVTMOD, and another for the other instructions.

The codegen patch for instructions other than FLI and FCVTMOD has been committed: https://reviews.llvm.org/D143982

In D141560#4125173, @joshua-arch1 wrote:

The codegen patch for instructions other than FLI and FCVTMOD has been committed: https://reviews.llvm.org/D143982

Important English point here. The patch has been *posted* not *committed*. Saying it's committed means that it was landed to the git repository. That appears to not be what you meant, and was actively confusing in this case.

reames added a parent revision: D143982: [RISCV][CodeGen] Add codegen pattern for experimental zfa extension (FLI and FCVTMOD not included). · Feb 14 2023, 8:11 AM

joshua-arch1 updated this revision to Diff 497964. · Feb 16 2023, 4:37 AM

Harbormaster completed remote builds in B214121: Diff 497964. · Feb 16 2023, 4:37 AM

joshua-arch1 retitled this revision from [RISCV][CodeGen] Add codegen pattern for experimental zfa extension to [RISCV][CodeGen] Add codegen pattern for FLI instruction in experimental zfa extension. · Feb 16 2023, 4:39 AM

joshua-arch1 edited the summary of this revision.

craig.topper added inline comments. · Feb 16 2023, 10:22 AM

llvm/test/CodeGen/RISCV/float-zfa.ll
123	Why 255.0? Is this a negative test?

Ping.

llvm/test/CodeGen/RISCV/float-zfa.ll
123	Yep. I just want to ensure that loading floating-point immediates with other values does not generate FLI.

LGTM

This revision is now accepted and ready to land. · Feb 19 2023, 6:37 PM

In D141560#4137896, @craig.topper wrote:

LGTM

This CodeGen patch depends on some functions in https://reviews.llvm.org/D140460.

This revision was landed with ongoing or failed builds. · Mar 6 2023, 10:27 PM

Closed by commit rGada264146067: [RISCV][CodeGen] Add codegen pattern for FLI instruction in experimental zfa… (authored by joshua-arch1).

This revision was automatically updated to reflect the committed changes.

joshua-arch1 added a commit: rGada264146067: [RISCV][CodeGen] Add codegen pattern for FLI instruction in experimental zfa….

craig.topper added inline comments. · Mar 8 2023, 9:50 PM

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
711	Why was this isPosZero check added? It wasn't there when the patch was approved.

Does anyone know how to generate FLI from C code? If I compile the following program, I cannot get FLI. The ConstantFP is converted to a Constant in the DAG.

void foo_double64 ()
{
  volatile double a;
  a = 0.0625;
}

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
711	Just in order to make sure loading +0.0 will not use fli.s.

Herald added a subscriber: jobnoorman. · Mar 15 2023, 6:48 PM

In D141560#4198220, @joshua-arch1 wrote:
Does anyone know how to generate FLI from C code? If I compile the following program, I cannot get FLI. The ConstantFP is converted to a Constant in the DAG.
void foo_double64 ()
{
  volatile double a;
  a = 0.0625;
}

You just need to use it to do some floating point arithmetic

void foo_double64 (double x)
{
  volatile double a;
  a = x + 0.0625;
}

or return a floating point value

double foo_double64 ()
{
  return 0.0625;
}

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
711	But why treat fli.s differently from fli.h or fli.d?

In D141560#4198233, @craig.topper wrote:
You just need to use it to do some floating point arithmetic
void foo_double64 (double x)
{
  volatile double a;
  a = x + 0.0625;
}
or return a floating point value
double foo_double64 ()
{
  return 0.0625;
}

Is that because we cannot directly store a ConstantFP in the DAG?

In D141560#4198252, @joshua-arch1 wrote:
Is that because we cannot directly store a ConstantFP in DAG?

The change is done in DAGCombiner::replaceStoreOfFPConstant. I think the idea is that FP constants are usually harder to create in registers than integer constants. Also, CPUs usually have more integer resources than FP resources. It doesn't look like it can be disabled currently. I think all of the constants FLI handles can be materialized in 1 or 2 integer instructions, so I'm not very concerned about this. If it were more than that I might be concerned.
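The effect described above can be pictured in plain C++: storing an FP constant and storing its bit pattern as an integer leave the same bytes in memory, so the store can be fed from an integer register instead of an FP register. The sketch below is only an analogy for the DAG rewrite, not LLVM code; `storeDoubleViaInteger` and `loadBits` are hypothetical names, and IEEE 754 binary64 doubles are assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical analogy for the rewritten store: write `value` to `slot`
// through an integer copy of its bit pattern instead of an FP store.
void storeDoubleViaInteger(volatile double *slot, double value) {
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));  // the constant as an integer
  double restored;
  std::memcpy(&restored, &bits, sizeof(restored));
  *slot = restored;                          // same bytes reach memory
}

// Hypothetical helper: reads back the stored bit pattern.
std::uint64_t loadBits(const volatile double *slot) {
  double value = *slot;
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}
```

For the 0.0625 constant from the example program, the stored pattern is 0x3FB0000000000000 (biased exponent 1019, zero mantissa), which is the integer value the store would carry after the rewrite.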

In D141560#4198257, @craig.topper wrote:
The change is done in DAGCombiner::replaceStoreOfFPConstant. I think the idea is that FP constants are usually harder to create in registers than integer constants. Also, CPUs usually have more integer resources than FP resources. It doesn't look like it can be disabled currently. I think all of the constants FLI handles can be materialized in 1 or 2 integer instructions, so I'm not very concerned about this. If it were more than that I might be concerned.

Will one FLI instruction take fewer cycles and perform better than two integer instructions?

In D141560#4201276, @joshua-arch1 wrote:
Will one FLI instruction take fewer cycles and perform better than two integer instructions?

Maybe, but it will probably be microarchitecture dependent.

Delaying a store by an extra cycle doesn't seem like a big deal. Other instructions don't directly depend on a store except later loads. The compiler would often be able to see the load is to the same location as the store and forward the data without the load.

If we're storing the same value in a loop we should be hoisting the 2 integer instructions out of the loop. So they wouldn't be executed many times.

Maybe, but it will probably be microarchitecture dependent.

Delaying a store by an extra cycle doesn't seem like a big deal. Other instructions don't directly depend on a store except later loads. The compiler would often be able to see the load is to the same location as the store and forward the data without the load.

If we're storing the same value in a loop we should be hoisting the 2 integer instructions out of the loop. So they wouldn't be executed many times.

Even if we disable replaceStoreOfFPConstant in DAGCombiner, the FP constant will still be converted to an integer in OptimizeFloatStore() during legalization.

In D141560#4212001, @joshua-arch1 wrote:
For rv64 at O0, replaceStoreOfFPConstant does not convert FP constants into integers. However, for rv32, FP constants are converted to integers at both O0 and O2. Is there a special reason for this?

I can see that it won't ever do it for f32 on rv64 because it doesn't consider the possibility of using an i32 store when i32 isn't a legal type.

Beyond that I would need to see your test cases.

Revision Contents

Path                                           Size
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp    7 lines
llvm/lib/Target/RISCV/RISCVISelLowering.cpp    7 lines
llvm/lib/Target/RISCV/RISCVInstrInfoZfa.td     18 lines
llvm/test/CodeGen/RISCV/double-zfa.ll          119 lines
llvm/test/CodeGen/RISCV/float-zfa.ll           119 lines
llvm/test/CodeGen/RISCV/half-zfa.ll            119 lines

Diff 497964

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

[698 lines not shown] case ISD::Constant: {
      if (!isInt<32>(Imm) && isUInt<32>(Imm) && hasAllWUsers(Node))
        Imm = SignExtend64<32>(Imm);

      ReplaceNode(Node, selectImm(CurDAG, DL, VT, Imm, *Subtarget).getNode());
      return;
    }
    case ISD::ConstantFP: {
      const APFloat &APF = cast<ConstantFPSDNode>(Node)->getValueAPF();
+     if (Subtarget->hasStdExtZfa()) {
+       if ((VT == MVT::f32 && RISCVLoadFPImm::getLoadFP32Imm(APF) != -1) ||
+           (VT == MVT::f64 && RISCVLoadFPImm::getLoadFP64Imm(APF) != -1) ||
+           (VT == MVT::f16 && RISCVLoadFPImm::getLoadFP16Imm(APF) != -1))
+         break;
[Inline comment] craig.topper (Not Done): Why was this isPosZero check added? It wasn't there when the patch was approved.
[Inline comment] joshua-arch1, author (Done): Just in order to make sure loading +0.0 will not use fli.s.
[Inline comment] craig.topper (Not Done): But why treat fli.s different than fli.h or fli.d?
+     }

      bool NegZeroF64 = APF.isNegZero() && VT == MVT::f64;
      SDValue Imm;
      // For +0.0 or f64 -0.0 we need to start from X0. For all others, we will
      // create an integer immediate.
      if (APF.isPosZero() || NegZeroF64)
        Imm = CurDAG->getRegister(RISCV::X0, XLenVT);
      else
        Imm = selectImm(CurDAG, DL, XLenVT, APF.bitcastToAPInt().getSExtValue(),
[2,307 lines not shown]

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

[1,479 lines not shown] bool RISCVTargetLowering::isOffsetFoldingLegal(
    // keep a separate ADD node for the global address offset instead of folding
    // it in the global address node. Later peephole optimisations may choose to
    // fold it back in when profitable.
    return false;
  }

  bool RISCVTargetLowering::isFPImmLegal(const APFloat &Imm, EVT VT,
                                         bool ForCodeSize) const {
+   if (Subtarget.hasStdExtZfa()) {
+     if ((VT == MVT::f32 && RISCVLoadFPImm::getLoadFP32Imm(Imm) != -1) ||
+         (VT == MVT::f64 && RISCVLoadFPImm::getLoadFP64Imm(Imm) != -1) ||
+         (VT == MVT::f16 && RISCVLoadFPImm::getLoadFP16Imm(Imm) != -1))
+       return true;
+   }

    if (VT == MVT::f16 && !Subtarget.hasStdExtZfhOrZfhmin())
      return false;
    if (VT == MVT::f32 && !Subtarget.hasStdExtF())
      return false;
    if (VT == MVT::f64 && !Subtarget.hasStdExtD())
      return false;
    // Cannot create a 64 bit floating-point immediate value for rv32.
    if (Subtarget.getXLen() < VT.getScalarSizeInBits()) {
[Inline comment] craig.topper (Done): This needs to be rebased.
      // td can handle +0.0 or -0.0 already.
      // -0.0 can be created by fmv + fneg.
      return Imm.isZero();
    }
    // Special case: the cost for -0.0 is 1.
    int Cost = Imm.isNegZero()
                   ? 1
                   : RISCVMatInt::getIntMatCost(Imm.bitcastToAPInt(),
                                                Subtarget.getXLen(),
                                                Subtarget.getFeatureBits());
    // If the constantpool data is already in cache, only Cost 1 is cheaper.
    return Cost < FPImmCost;
[Inline comment] craig.topper (Done): I think we can't reach this line when Zfa is enabled now? getLoadFP32Imm/getLoadFP64Imm/getLoadFP16Imm return -1 for 0.0 and -0.0 right?
  }

  // TODO: This is very conservative.
  bool RISCVTargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
                                                    unsigned Index) const {
    if (!isOperationLegalOrCustom(ISD::EXTRACT_SUBVECTOR, ResVT))
      return false;

[494 lines not shown] if (!DstVT.isVector()) {
      if (Src.getSimpleValueType() == MVT::f16 && !Subtarget.hasStdExtZfh()) {
        Src = DAG.getNode(ISD::FP_EXTEND, SDLoc(Op), MVT::f32, Src);
      }

      unsigned Opc;
      if (SatVT == DstVT)
        Opc = IsSigned ? RISCVISD::FCVT_X : RISCVISD::FCVT_XU;
      else if (DstVT == MVT::i64 && SatVT == MVT::i32)
        Opc = IsSigned ? RISCVISD::FCVT_W_RV64 : RISCVISD::FCVT_WU_RV64;
[Inline comment] craig.topper (Not Done): FCVTMOD does not saturate. It returns the lower bits of the overflowed value. These instruction cannot be used to lower saturating conversion.
[Inline comment] joshua-arch1, author (Not Done): So what's the difference between FCVTMOD and normal FP_TO_INT? Is NaN converted to zero in normal FP_TO_INT?
[Inline comment] joshua-arch1, author (Done): What instruction sequence does FCVTMOD need to replace? I'm a bit confused.
[Inline comment] craig.topper (Not Done): FCVTMOD follows the semantics of JavaScript. It’s for being able writing JavaScript JITs for RISC-V. It’s not useful for C or Rust.
      else
        return SDValue();
      // FIXME: Support other SatVTs by clamping before or after the conversion.

      SDLoc DL(Op);
      SDValue FpToInt = DAG.getNode(
          Opc, DL, DstVT, Src,
          DAG.getTargetConstant(RISCVFPRndMode::RTZ, DL, Subtarget.getXLenVT()));
[1,795 lines not shown] case ISD::BITCAST: {
      }
      if (VT == MVT::f32 && Op0VT == MVT::i32 && Subtarget.is64Bit() &&
          Subtarget.hasStdExtF()) {
        SDValue NewOp0 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op0);
        SDValue FPConv =
            DAG.getNode(RISCVISD::FMV_W_X_RV64, DL, MVT::f32, NewOp0);
        return FPConv;
      }
      if (VT == MVT::f64 && Op0VT == MVT::i64 && XLenVT == MVT::i32 &&
[Inline comment] craig.topper (Not Done): Does this need to be qualified with Zfa?
          Subtarget.hasStdExtZfa()) {
        SDValue Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, MVT::i32, Op0,
                                 DAG.getConstant(0, DL, MVT::i32));
        SDValue Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, MVT::i32, Op0,
                                 DAG.getConstant(1, DL, MVT::i32));
        SDValue RetReg =
            DAG.getNode(RISCVISD::BuildPairF64, DL, MVT::f64, Lo, Hi);
        return RetReg;
[4,201 lines not shown] if (VT == MVT::i16 && Op0VT == MVT::f16 &&
          Subtarget.hasStdExtZfhOrZfhmin()) {
        SDValue FPConv = DAG.getNode(RISCVISD::FMV_X_ANYEXTH, DL, XLenVT, Op0);
        Results.push_back(DAG.getNode(ISD::TRUNCATE, DL, MVT::i16, FPConv));
      } else if (VT == MVT::i32 && Op0VT == MVT::f32 && Subtarget.is64Bit() &&
                 Subtarget.hasStdExtF()) {
        SDValue FPConv =
            DAG.getNode(RISCVISD::FMV_X_ANYEXTW_RV64, DL, MVT::i64, Op0);
        Results.push_back(DAG.getNode(ISD::TRUNCATE, DL, MVT::i32, FPConv));
      } else if (VT == MVT::i64 && Op0VT == MVT::f64 && XLenVT == MVT::i32 &&
[Inline comment] craig.topper (Done): Does this need to be qualified with Zfa?
                 Subtarget.hasStdExtZfa()) {
        SDValue NewReg = DAG.getNode(RISCVISD::SplitF64, DL,
                                     DAG.getVTList(MVT::i32, MVT::i32), Op0);
        SDValue Lo = NewReg.getValue(0);
        SDValue Hi = NewReg.getValue(1);
        SDValue RetReg = DAG.getNode(ISD::BUILD_PAIR, DL, MVT::i64, Lo, Hi);
        Results.push_back(RetReg);
      } else if (!VT.isVector() && Op0VT.isFixedLengthVector() &&
[6,292 lines not shown]

llvm/lib/Target/RISCV/RISCVInstrInfoZfa.td

	Show First 20 Lines • Show All 129 Lines • ▼ Show 20 Lines
	def FMVH_X_D : FPUnaryOp_r<0b1110001, 0b00001, 0b000, GPR, FPR64, "fmvh.x.d">,			def FMVH_X_D : FPUnaryOp_r<0b1110001, 0b00001, 0b000, GPR, FPR64, "fmvh.x.d">,
	Sched<[WriteFMovF32ToI32, ReadFMovF32ToI32]>;			Sched<[WriteFMovF32ToI32, ReadFMovF32ToI32]>;
	def FMVP_D_X : FPBinaryOp_rr<0b1011001, 0b000, FPR64, GPR, "fmvp.d.x">,			def FMVP_D_X : FPBinaryOp_rr<0b1011001, 0b000, FPR64, GPR, "fmvp.d.x">,
	Sched<[WriteFMovI32ToF32, ReadFMovI32ToF32]>;			Sched<[WriteFMovI32ToF32, ReadFMovI32ToF32]>;
	let isCodeGenOnly = 1, mayRaiseFPException = 0 in {			let isCodeGenOnly = 1, mayRaiseFPException = 0 in {
	def FMV_X_W_FPR64 : FPUnaryOp_r<0b1110000, 0b00000, 0b000, GPR, FPR64, "fmv.x.w">,			def FMV_X_W_FPR64 : FPUnaryOp_r<0b1110000, 0b00000, 0b000, GPR, FPR64, "fmv.x.w">,
	Sched<[WriteFMovF32ToI32, ReadFMovF32ToI32]>;			Sched<[WriteFMovF32ToI32, ReadFMovF32ToI32]>;
	}			}
	} // Predicates = [HasStdExtZfa, HasStdExtD, IsRV32]			} // Predicates = [HasStdExtZfa, HasStdExtD, IsRV32]
				craig.topperUnsubmitted Done Reply Inline Actions This name is misleading. There is no fmvh.x.w instruction. FMV_X_W_FPR64 would be better. craig.topper: This name is misleading. There is no fmvh.x.w instruction. FMV_X_W_FPR64 would be better.

	let Predicates = [HasStdExtZfa, HasStdExtZfh] in {
	def FLI_H : FPUnaryOp_imm<0b1111010, 0b00001, 0b000, OPC_OP_FP, (outs FPR16:$rd),
	(ins loadfp16imm:$imm), "fli.h", "$rd, $imm">,
	Sched<[WriteFMovI16ToF16, ReadFMovI16ToF16]>;

	def FMINM_H: FPALU_rr<0b0010110, 0b010, "fminm.h", FPR16, /*Commutable*/ 1>;
	def FMAXM_H: FPALU_rr<0b0010110, 0b011, "fmaxm.h", FPR16, /*Commutable*/ 1>;

	def FROUND_H : FPUnaryOp_r_frm<0b0100010, 0b00100, FPR16, FPR16, "fround.h">;
	def FROUNDNX_H : FPUnaryOp_r_frm<0b0100010, 0b00101, FPR16, FPR16, "froundnx.h">;

	def FLTQ_H : FPCmp_rr<0b1010010, 0b101, "fltq.h", FPR16, /*Commutable*/ 1>;
	def FLEQ_H : FPCmp_rr<0b1010010, 0b100, "fleq.h", FPR16, /*Commutable*/ 1>;
	} // Predicates = [HasStdExtZfa, HasStdExtZfh]


	//===----------------------------------------------------------------------===//
	// Codegen patterns
	//===----------------------------------------------------------------------===//

				def fp32imm_to_loadfpimm : SDNodeXForm<fpimm, [{
				craig.topper [Not Done]: This isn't a "bitcast". It's an encoding conversion. fpimm is now handled with custom code in RISCVISelDAGToDAG.cpp so this needs to be moved there.
				return CurDAG->getTargetConstant(RISCVLoadFPImm::getLoadFP32Imm(N->getValueAPF()),
				SDLoc(N), Subtarget->getXLenVT());}]>;
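
The review note on this transform distinguishes an encoding conversion from a bitcast: the FLI operand is a small table index, not the constant's own bits. A minimal C++ sketch of that idea follows. The helper names `floatBits`, `lookupLoadFPImm`, and `exampleTable` are hypothetical, and the table holds only a few constants taken from this revision's tests, so the positions it returns are not the indices defined by the Zfa spec or by `RISCVLoadFPImm::getLoadFP32Imm`.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Return the raw IEEE-754 bit pattern of a float without changing its value.
inline uint32_t floatBits(float f) {
  static_assert(sizeof(uint32_t) == sizeof(float), "float must be 32 bits");
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// Hypothetical helper: position of `value` in `table`, or std::nullopt when
// the constant is not in the table and must be materialized another way.
// Comparing bit patterns keeps +0.0 and -0.0 distinct and lets a NaN entry
// match an identical NaN bit pattern.
inline std::optional<unsigned> lookupLoadFPImm(const std::vector<float> &table,
                                               float value) {
  const uint32_t want = floatBits(value);
  for (unsigned i = 0; i < table.size(); ++i)
    if (floatBits(table[i]) == want)
      return i;
  return std::nullopt;
}

// Placeholder table (assumption): constants exercised by the tests in this
// revision. The ordering is illustrative only, not the spec's ordering.
inline std::vector<float> exampleTable() {
  return {0.0625f, 0.75f, 1.25f, 3.0f, 256.0f};
}
```

With this placeholder table, `lookupLoadFPImm(exampleTable(), 0.75f)` yields a position, while `255.0f` yields `std::nullopt`, which mirrors the case where a constant falls back to a different lowering.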

				def fp64imm_to_loadfpimm : SDNodeXForm<fpimm, [{
				return CurDAG->getTargetConstant(RISCVLoadFPImm::getLoadFP64Imm(N->getValueAPF()),
				SDLoc(N), Subtarget->getXLenVT());}]>;

				def fp16imm_to_loadfpimm : SDNodeXForm<fpimm, [{
				return CurDAG->getTargetConstant(RISCVLoadFPImm::getLoadFP16Imm(N->getValueAPF()),
				SDLoc(N), Subtarget->getXLenVT());}]>;

	let Predicates = [HasStdExtZfa] in {
				def : Pat<(f32 fpimm:$imm), (FLI_S (fp32imm_to_loadfpimm fpimm:$imm))>;

				craig.topper [Done]: Why do you need a COPY_TO_REGCLASS?
	def: PatFprFpr<fminimum, FMINM_S, FPR32>;
	def: PatFprFpr<fmaximum, FMAXM_S, FPR32>;

	// frint rounds according to the current rounding mode and detects
	// inexact conditions.
	def: Pat<(any_frint FPR32 : $rs1), (FROUNDNX_S FPR32 : $rs1, 0b111)>;

	// fnearbyint is like frint but does not detect inexact conditions.
	def: Pat<(any_fnearbyint FPR32 : $rs1), (FROUND_S FPR32 : $rs1, 0b111)>;

	def: Pat<(any_fround FPR32 : $rs1), (FROUND_S FPR32 : $rs1, 0b100)>;
	def: Pat<(any_ffloor FPR32 : $rs1), (FROUND_S FPR32 : $rs1, 0b011)>;
	def: Pat<(any_fceil FPR32 : $rs1), (FROUND_S FPR32 : $rs1, 0b010)>;
	def: Pat<(any_ftrunc FPR32 : $rs1), (FROUND_S FPR32 : $rs1, 0b001)>;

	def: PatSetCC<FPR32, strict_fsetcc, SETLT, FLTQ_S>;
	def: PatSetCC<FPR32, strict_fsetcc, SETOLT, FLTQ_S>;
				craig.topper [Done]: What is giving these patterns priority over the ones in RISCVInstrInfoF.td?
				craig.topper [Done]: I think this works because patterns with UsesCustomInserter=1 have lower priority than patterns that don't use CustomInserter.
	def: PatSetCC<FPR32, strict_fsetcc, SETLE, FLEQ_S>;
	def: PatSetCC<FPR32, strict_fsetcc, SETOLE, FLEQ_S>;
	} // Predicates = [HasStdExtZfa]

	let Predicates = [HasStdExtZfa, HasStdExtD] in {
				def : Pat<(f64 fpimm:$imm), (FLI_D (fp64imm_to_loadfpimm fpimm:$imm))>;

	def: PatFprFpr<fminimum, FMINM_D, FPR64>;
	def: PatFprFpr<fmaximum, FMAXM_D, FPR64>;

	// frint rounds according to the current rounding mode and detects
	// inexact conditions.
	def: Pat<(any_frint FPR64 : $rs1), (FROUNDNX_D FPR64 : $rs1, 0b111)>;

	// fnearbyint is like frint but does not detect inexact conditions.
	def: Pat<(any_fnearbyint FPR64 : $rs1), (FROUND_D FPR64 : $rs1, 0b111)>;

	def: Pat<(any_fround FPR64 : $rs1), (FROUND_D FPR64 : $rs1, 0b100)>;
	def: Pat<(any_froundeven FPR64 : $rs1), (FROUND_D FPR64 : $rs1, 0b000)>;
	def: Pat<(any_ffloor FPR64 : $rs1), (FROUND_D FPR64 : $rs1, 0b011)>;
	def: Pat<(any_fceil FPR64 : $rs1), (FROUND_D FPR64 : $rs1, 0b010)>;
	def: Pat<(any_ftrunc FPR64 : $rs1), (FROUND_D FPR64 : $rs1, 0b001)>;

	def: PatSetCC<FPR64, strict_fsetcc, SETLT, FLTQ_D>;
	def: PatSetCC<FPR64, strict_fsetcc, SETOLT, FLTQ_D>;
	def: PatSetCC<FPR64, strict_fsetcc, SETLE, FLEQ_D>;
	def: PatSetCC<FPR64, strict_fsetcc, SETOLE, FLEQ_D>;
	} // Predicates = [HasStdExtZfa, HasStdExtD]

	let Predicates = [HasStdExtZfa, HasStdExtZfh] in {
				def : Pat<(f16 fpimm:$imm), (FLI_H (fp16imm_to_loadfpimm fpimm:$imm))>;

	def: PatFprFpr<fminimum, FMINM_H, FPR16>;
	def: PatFprFpr<fmaximum, FMAXM_H, FPR16>;

	// frint rounds according to the current rounding mode and detects
	// inexact conditions.
	def: Pat<(any_frint FPR16 : $rs1), (FROUNDNX_H FPR16 : $rs1, 0b111)>;

	// fnearbyint is like frint but does not detect inexact conditions.
	[13 lines not shown]

llvm/test/CodeGen/RISCV/double-zfa.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv32 -target-abi ilp32d -mattr=+experimental-zfa,+d < %s \
	; RUN:   | FileCheck --check-prefix=RV32IDZFA %s
	; RUN: llc -mtriple=riscv64 -target-abi lp64d -mattr=+experimental-zfa,+d < %s \
	; RUN:   | FileCheck --check-prefix=RV64DZFA %s

				define double @loadfpimm1() {
				; RV32IDZFA-LABEL: loadfpimm1:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, 6.250000e-02
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm1:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, 6.250000e-02
				; RV64DZFA-NEXT: ret
				ret double 0.0625
				}

				define double @loadfpimm2() {
				; RV32IDZFA-LABEL: loadfpimm2:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, 7.500000e-01
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm2:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, 7.500000e-01
				; RV64DZFA-NEXT: ret
				ret double 0.75
				}

				define double @loadfpimm3() {
				; RV32IDZFA-LABEL: loadfpimm3:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, 1.250000e+00
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm3:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, 1.250000e+00
				; RV64DZFA-NEXT: ret
				ret double 1.25
				}

				define double @loadfpimm4() {
				; RV32IDZFA-LABEL: loadfpimm4:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, 3.000000e+00
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm4:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, 3.000000e+00
				; RV64DZFA-NEXT: ret
				ret double 3.0
				}

				define double @loadfpimm5() {
				; RV32IDZFA-LABEL: loadfpimm5:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, 2.560000e+02
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm5:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, 2.560000e+02
				; RV64DZFA-NEXT: ret
				ret double 256.0
				}

				define double @loadfpimm6() {
				; RV32IDZFA-LABEL: loadfpimm6:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, inf
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm6:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, inf
				; RV64DZFA-NEXT: ret
				ret double 0x7FF0000000000000
				}

				define double @loadfpimm7() {
				; RV32IDZFA-LABEL: loadfpimm7:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, nan
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm7:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, nan
				; RV64DZFA-NEXT: ret
				ret double 0x7FF8000000000000
				}

				define double @loadfpimm8() {
				; RV32IDZFA-LABEL: loadfpimm8:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: fli.d fa0, min
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm8:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: fli.d fa0, min
				; RV64DZFA-NEXT: ret
				ret double 0x0010000000000000
				craig.topper [Done]: This doesn't look like the minimum value for normal value for double. I would expect the mantissa bits to be 0. So it should be 0x0010000000000000 I think?
				joshua-arch1 (author) [Done]: It is IR syntax. I have written a simple case to confirm that this value is used to express 0x0010000000000000 in IR level.
				craig.topper [Done]: It’s the IR syntax for double just the hex representation of double.
				}
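
The exchange above turns on how LLVM IR spells hexadecimal floating-point constants: for `float` and `double`, the `0x...` literal is the bit pattern of a double, so a `float` constant is written as its value widened to double. The sketch below reproduces that arithmetic in plain C++. The helper names are hypothetical and no LLVM API is involved.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Bit pattern of a double, i.e. the digits that follow "0x" in
// `ret double 0x...`.
inline uint64_t irHexForDouble(double d) {
  static_assert(sizeof(uint64_t) == sizeof(double), "double must be 64 bits");
  uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  return bits;
}

// A float constant is spelled as the bit pattern of the same value widened
// to double. The widening is exact, so no precision is lost.
inline uint64_t irHexForFloat(float f) {
  return irHexForDouble(static_cast<double>(f));
}

// Smallest positive normal values, matching the `min` operand in the tests.
inline uint64_t minNormalFloatHex() {
  return irHexForFloat(std::numeric_limits<float>::min());
}
inline uint64_t minNormalDoubleHex() {
  return irHexForDouble(std::numeric_limits<double>::min());
}
```

Under these assumptions, `minNormalDoubleHex()` gives 0x0010000000000000 and `minNormalFloatHex()` gives 0x3810000000000000, which are the literals used by the `loadfpimm8` tests in double-zfa.ll and float-zfa.ll.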

				define double @loadfpimm9() {
				; RV32IDZFA-LABEL: loadfpimm9:
				; RV32IDZFA: # %bb.0:
				; RV32IDZFA-NEXT: lui a0, %hi(.LCPI8_0)
				; RV32IDZFA-NEXT: fld fa0, %lo(.LCPI8_0)(a0)
				; RV32IDZFA-NEXT: ret
				;
				; RV64DZFA-LABEL: loadfpimm9:
				; RV64DZFA: # %bb.0:
				; RV64DZFA-NEXT: lui a0, %hi(.LCPI8_0)
				; RV64DZFA-NEXT: fld fa0, %lo(.LCPI8_0)(a0)
				; RV64DZFA-NEXT: ret
				ret double 255.0
				}

	declare double @llvm.minimum.f64(double, double)

	define double @fminm_d(double %a, double %b) nounwind {
	; RV32IDZFA-LABEL: fminm_d:
	; RV32IDZFA: # %bb.0:
	; RV32IDZFA-NEXT: fminm.d fa0, fa0, fa1
	; RV32IDZFA-NEXT: ret
	;
	[185 lines not shown]

llvm/test/CodeGen/RISCV/float-zfa.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv32 -target-abi ilp32f -mattr=+experimental-zfa < %s \
	; RUN:   | FileCheck --check-prefix=RV32IZFA %s
	; RUN: llc -mtriple=riscv64 -target-abi lp64f -mattr=+experimental-zfa < %s \
	; RUN:   | FileCheck --check-prefix=RV64IZFA %s

				define float @loadfpimm1() {
				; RV32IZFA-LABEL: loadfpimm1:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, 6.250000e-02
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm1:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, 6.250000e-02
				; RV64IZFA-NEXT: ret
				ret float 0.0625
				}

				define float @loadfpimm2() {
				; RV32IZFA-LABEL: loadfpimm2:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, 7.500000e-01
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm2:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, 7.500000e-01
				; RV64IZFA-NEXT: ret
				ret float 0.75
				}

				define float @loadfpimm3() {
				; RV32IZFA-LABEL: loadfpimm3:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, 1.250000e+00
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm3:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, 1.250000e+00
				; RV64IZFA-NEXT: ret
				ret float 1.25
				}

				define float @loadfpimm4() {
				; RV32IZFA-LABEL: loadfpimm4:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, 3.000000e+00
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm4:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, 3.000000e+00
				; RV64IZFA-NEXT: ret
				ret float 3.0
				}

				define float @loadfpimm5() {
				; RV32IZFA-LABEL: loadfpimm5:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, 2.560000e+02
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm5:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, 2.560000e+02
				; RV64IZFA-NEXT: ret
				ret float 256.0
				}

				define float @loadfpimm6() {
				; RV32IZFA-LABEL: loadfpimm6:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, inf
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm6:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, inf
				; RV64IZFA-NEXT: ret
				ret float 0x7FF0000000000000
				}

				define float @loadfpimm7() {
				; RV32IZFA-LABEL: loadfpimm7:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, nan
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm7:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, nan
				; RV64IZFA-NEXT: ret
				ret float 0x7FF8000000000000
				}

				define float @loadfpimm8() {
				; RV32IZFA-LABEL: loadfpimm8:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: fli.s fa0, min
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm8:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: fli.s fa0, min
				; RV64IZFA-NEXT: ret
				ret float 0x3810000000000000
				}

				define float @loadfpimm9() {
				; RV32IZFA-LABEL: loadfpimm9:
				; RV32IZFA: # %bb.0:
				; RV32IZFA-NEXT: lui a0, 276464
				; RV32IZFA-NEXT: fmv.w.x fa0, a0
				; RV32IZFA-NEXT: ret
				;
				; RV64IZFA-LABEL: loadfpimm9:
				; RV64IZFA: # %bb.0:
				; RV64IZFA-NEXT: lui a0, 276464
				; RV64IZFA-NEXT: fmv.w.x fa0, a0
				; RV64IZFA-NEXT: ret
				ret float 255.0
				craig.topper [Not Done]: Why 255.0? Is this a negative test?
				joshua-arch1 (author) [Done]: Yep. I just want to ensure load floating-point immediates of other values will not generate FLI.
				}
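
In this negative test the expected fallback is `lui a0, 276464` followed by `fmv.w.x`. The immediate can be checked by hand: `lui` places its 20-bit operand in bits 31:12, so it can build any single-precision bit pattern whose low 12 bits are zero. A small C++ sketch of that check, using a hypothetical helper name `luiImmForFloat`:

```cpp
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical helper: if the float's bit pattern has its low 12 bits clear,
// return the 20-bit value that `lui` would need so that a following
// `fmv.w.x` moves exactly that pattern into an FP register. Otherwise return
// std::nullopt, meaning a single `lui` cannot build the constant.
inline std::optional<uint32_t> luiImmForFloat(float f) {
  static_assert(sizeof(uint32_t) == sizeof(float), "float must be 32 bits");
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  if ((bits & 0xFFFu) != 0)
    return std::nullopt;
  return bits >> 12;
}
```

For 255.0f the bit pattern is 0x437F0000, and 0x437F0 is 276464 in decimal, which matches the CHECK lines for `loadfpimm9`.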

	declare float @llvm.minimum.f32(float, float)

	define float @fminm_s(float %a, float %b) nounwind {
	; RV32IZFA-LABEL: fminm_s:
	; RV32IZFA: # %bb.0:
	; RV32IZFA-NEXT: fminm.s fa0, fa0, fa1
	; RV32IZFA-NEXT: ret
	;
	[157 lines not shown]

llvm/test/CodeGen/RISCV/half-zfa.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv32 -target-abi ilp32f -mattr=+experimental-zfa,+zfh < %s \
	; RUN:   | FileCheck --check-prefix=RV32IHZFA %s
	; RUN: llc -mtriple=riscv64 -target-abi lp64f -mattr=+experimental-zfa,+zfh < %s \
	; RUN:   | FileCheck --check-prefix=RV64HZFA %s

				define half @loadfpimm1() {
				; RV32IHZFA-LABEL: loadfpimm1:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, 6.250000e-02
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm1:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, 6.250000e-02
				; RV64HZFA-NEXT: ret
				ret half 0.0625
				}

				define half @loadfpimm2() {
				; RV32IHZFA-LABEL: loadfpimm2:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, 7.500000e-01
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm2:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, 7.500000e-01
				; RV64HZFA-NEXT: ret
				ret half 0.75
				}

				define half @loadfpimm3() {
				; RV32IHZFA-LABEL: loadfpimm3:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, 1.250000e+00
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm3:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, 1.250000e+00
				; RV64HZFA-NEXT: ret
				ret half 1.25
				}

				define half @loadfpimm4() {
				; RV32IHZFA-LABEL: loadfpimm4:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, 3.000000e+00
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm4:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, 3.000000e+00
				; RV64HZFA-NEXT: ret
				ret half 3.0
				}

				define half @loadfpimm5() {
				; RV32IHZFA-LABEL: loadfpimm5:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, 2.560000e+02
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm5:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, 2.560000e+02
				; RV64HZFA-NEXT: ret
				ret half 256.0
				}

				define half @loadfpimm6() {
				; RV32IHZFA-LABEL: loadfpimm6:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, inf
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm6:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, inf
				; RV64HZFA-NEXT: ret
				ret half 0xH7C00
				}

				define half @loadfpimm7() {
				; RV32IHZFA-LABEL: loadfpimm7:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, nan
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm7:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, nan
				; RV64HZFA-NEXT: ret
				ret half 0xH7E00
				}

				define half @loadfpimm8() {
				; RV32IHZFA-LABEL: loadfpimm8:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: fli.h fa0, min
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm8:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: fli.h fa0, min
				; RV64HZFA-NEXT: ret
				ret half 0xH0400
				}

				define half @loadfpimm9() {
				; RV32IHZFA-LABEL: loadfpimm9:
				; RV32IHZFA: # %bb.0:
				; RV32IHZFA-NEXT: lui a0, %hi(.LCPI8_0)
				; RV32IHZFA-NEXT: flh fa0, %lo(.LCPI8_0)(a0)
				; RV32IHZFA-NEXT: ret
				;
				; RV64HZFA-LABEL: loadfpimm9:
				; RV64HZFA: # %bb.0:
				; RV64HZFA-NEXT: lui a0, %hi(.LCPI8_0)
				; RV64HZFA-NEXT: flh fa0, %lo(.LCPI8_0)(a0)
				; RV64HZFA-NEXT: ret
				ret half 255.0
				}

	declare half @llvm.minimum.f16(half, half)

	define half @fminm_h(half %a, half %b) nounwind {
	; RV32IHZFA-LABEL: fminm_h:
	; RV32IHZFA: # %bb.0:
	; RV32IHZFA-NEXT: fminm.h fa0, fa0, fa1
	; RV32IHZFA-NEXT: ret
	;
	[156 lines not shown]


Diff 497964
