This is an archive of the discontinued LLVM Phabricator instance.

Differential D114946

[AArch64] Add instruction selection for strict FP
ClosedPublic

Authored by john.brawn on Dec 2 2021, 5:06 AM.

Details

Reviewers

t.p.northover
kpn
dmgreen
bsmith

Commits

rGd4342efb6959: [AArch64] Add instruction selection for strict FP

Summary

This consists of marking the various strict opcodes as legal, and adjusting instruction selection patterns so that 'op' is 'any_op'.

FP16 and vector instructions additionally require some extra work in lowering and legalization, so we can't set IsStrictFPEnabled just yet. More work also needs to be done for full strict FP support (marking instructions that can raise exceptions as such, and modelling FPCR use for controlling rounding).
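
For readers less familiar with why rounding control has to be modelled at all, the following is a minimal standalone C++ sketch of the user-visible behaviour that strict FP has to preserve. It is not code from this patch: roundWithMode is a hypothetical helper, and the sketch assumes a hosted C++ implementation whose <cfenv> provides the usual FE_* rounding-mode macros. The same rounding call produces different results depending on the dynamic rounding mode, so the compiler must not fold or reorder it across a rounding-mode change.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper (not part of LLVM): round X to an integral value under
// the dynamic rounding mode `Mode` (one of the FE_* rounding macros), then
// restore whatever mode was active before the call.
double roundWithMode(double X, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  // The volatile accesses keep the computation between the two fesetround
  // calls and stop it from being folded at compile time, where the default
  // rounding mode would be assumed.
  volatile double Input = X;
  volatile double Result = std::nearbyint(Input);
  std::fesetround(Saved);
  return Result;
}
```

Under round-to-nearest the helper returns 2.0 for an input of 2.5, while under FE_UPWARD it returns 3.0. Strict FP nodes exist so that this kind of dependence on the floating-point environment survives code generation.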

Diff Detail

Unit Tests: Failed

	Time	Test
	1,970 ms	x64 debian > Clang.CodeGen::aarch64-neon-scalar-x-indexed-elem-constrained.c
	2,580 ms	x64 debian > Clang.CodeGen::aarch64-v8.2a-neon-intrinsics-constrained.c
	60,030 ms	x64 debian > MLIR.Examples/standalone::test.toy
	60,030 ms	x64 debian > libFuzzer.libFuzzer::large.test

Event Timeline

john.brawn created this revision. Dec 2 2021, 5:06 AM

Herald added subscribers: hiraditya, kristof.beyls. Dec 2 2021, 5:06 AM

john.brawn requested review of this revision. Dec 2 2021, 5:06 AM

Herald added a project: Restricted Project. Dec 2 2021, 5:06 AM

Harbormaster completed remote builds in B137109: Diff 391275. Dec 2 2021, 5:54 AM

That flag tells front ends that strictfp will generate correct instructions. Don't set it if more work is needed for correct instructions. The support should at least have parity with X86 and SystemZ.

In D114946#3167497, @kpn wrote:

That flag tells front ends that strictfp will generate correct instructions. Don't set it if more work is needed for correct instructions. The support should at least have parity with X86 and SystemZ.

I think IsStrictFPEnabled is just the SelectionDAG flag. The frontend has a separate flag HasStrictFP.

In D114946#3168771, @craig.topper wrote:

In D114946#3167497, @kpn wrote:

That flag tells front ends that strictfp will generate correct instructions. Don't set it if more work is needed for correct instructions. The support should at least have parity with X86 and SystemZ.

I think IsStrictFPEnabled is just the SelectionDAG flag. The frontend has a separate flag HasStrictFP.

Yes, without HasStrictFP in the clang TargetInfo the constrained fp intrinsics won't be emitted (unless you use -fexperimental-strict-floating-point).

On an unrelated note, I've noticed that for globalisel (used at -O0 in aarch64) to work I also need to adjust a few things.

If you could split things up a bit whilst you are at it, that might make it easier to get through the review bit at a time too.

llvm/test/CodeGen/AArch64/fp-intrinsics.ll
1–2	Using update_llc_test_checks would be good, I think.

It turns out that though global isel is enabled at -O0, it also falls back to non-global isel when it sees things it can't handle, and there are several things unrelated to strict fp (or that are related but aren't aarch64-specific) that cause this to happen. I've added a RUN line to run the test with -global-isel=true anyway, but with the option that gives us this fallback behaviour. I'll post some follow-on patches later to fix some of these global isel things.

john.brawn added inline comments. Dec 6 2021, 6:40 AM

llvm/test/CodeGen/AArch64/fp-intrinsics.ll
1–2	It doesn't do a good job of handling the slight differences between the non-global-isel and global-isel output. It generates check lines that work for one but fail with the other.

Harbormaster completed remote builds in B137648: Diff 392044. Dec 6 2021, 7:25 AM

john.brawn added a child revision: D115352: [AArch64] Add mayRaiseFPException to appropriate instructions. Dec 8 2021, 8:08 AM

This is still a pretty big patch. Is it possible to break it up into some logically separate parts?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1398	Is it better to have a section like this which is all the "Strict-fp handling", or should all this code be next to the existing operations? We already have ISD::STRICT_FP_TO_SINT above next to ISD::FP_TO_SINT. And if SVE was added I would expect it to be in the SVE section (not that this is super well laid out). Should the ISD::STRICT_FCOS be next to the ISD::FCOS?
llvm/test/CodeGen/AArch64/fp-intrinsics.ll
1–2	That might be because the order of the check prefixes matters when it generates the checks: they should go from most general to least general. Someone is bound to update this file with the scripts at some point, since that is the only way to keep files like this maintainable, so we might as well do it now. The file is very big though, and looks like it would be better as a few different test files.

In D114946#3182147, @dmgreen wrote:

This is still a pretty big patch. Is it possible to break it up into some logically separate parts?

The best I've been able to do is break out the parts related to fp16 legalization and lowering into a separate patch, which I'll have ready later today.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1398	After fiddling about with things a bit it does look like it's probably better to have the handling of OP to be next to STRICT_OP, so I'll do that.

Split off fp16 handling to a separate patch, moved things round so STRICT_OP is grouped with OP.
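
The grouping of STRICT_OP with OP can be pictured with a small standalone model. This is only a sketch under stated assumptions: Op, VT, Action, ActionTable and buildTable are hypothetical stand-ins written for illustration, not LLVM's real ISD, MVT or TargetLoweringBase definitions, and the "everything starts as Expand" step is a simplification of the model rather than a description of LLVM's defaults.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <utility>

// Hypothetical stand-ins for ISD opcodes, MVT value types and legalize
// actions. None of these are LLVM's real definitions.
enum class Op { FADD, STRICT_FADD, FSQRT, STRICT_FSQRT, FCOS, STRICT_FCOS };
enum class VT { f16, f32, f64 };
enum class Action { Legal, Expand, Custom };

class ActionTable {
  std::map<std::pair<Op, VT>, Action> Actions;

public:
  void setOperationAction(Op O, VT T, Action A) {
    Actions[std::make_pair(O, T)] = A;
  }

  // In this model an entry that was never set reads back as Legal.
  Action getOperationAction(Op O, VT T) const {
    auto It = Actions.find(std::make_pair(O, T));
    return It == Actions.end() ? Action::Legal : It->second;
  }
};

ActionTable buildTable(bool HasFullFP16) {
  ActionTable Table;

  // Model simplification: start with every operation expanded.
  for (Op O : {Op::FADD, Op::STRICT_FADD, Op::FSQRT, Op::STRICT_FSQRT,
               Op::FCOS, Op::STRICT_FCOS})
    for (VT T : {VT::f16, VT::f32, VT::f64})
      Table.setOperationAction(O, T, Action::Expand);

  // Each OP sits next to its STRICT_OP in one list, so a single loop marks
  // both forms legal for the same set of types. f16 is gated on a feature
  // flag, mirroring the hasFullFP16() checks in the patch.
  for (Op O : {Op::FADD, Op::STRICT_FADD, Op::FSQRT, Op::STRICT_FSQRT}) {
    for (VT T : {VT::f32, VT::f64})
      Table.setOperationAction(O, T, Action::Legal);
    if (HasFullFP16)
      Table.setOperationAction(O, VT::f16, Action::Legal);
  }
  return Table;
}
```

Keeping both forms in one list means a strict opcode cannot silently fall out of step with its non-strict counterpart when the list of supported types changes.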

john.brawn edited the summary of this revision. Dec 10 2021, 9:06 AM

john.brawn set the repository for this revision to rG LLVM Github Monorepo.

Harbormaster completed remote builds in B138688: Diff 393514. Dec 10 2021, 9:59 AM

john.brawn added a child revision: D115620: [AArch64] Lowering and legalization of strict FP16. Dec 13 2021, 3:26 AM

I'm looking at the Clang.CodeGen::aarch64-neon-scalar-x-indexed-elem-constrained.c failure, and it looks like there are other constrained fp tests that are currently xfailed but maybe should actually pass with this patch.

Matt added a subscriber: Matt. Jan 4 2022, 9:24 AM

Added some more setOperationAction lines to fix failures in tests. However, looking at the tests, there are some XFAILed tests that still fail because more work is needed on the handling of vector types. I'll do that in a separate patch.

Harbormaster completed remote builds in B142101: Diff 398167. Jan 7 2022, 9:47 AM

john.brawn added a child revision: D117795: [AArch64] Add some missing strict FP vector lowering. Jan 20 2022, 8:48 AM

Ping. I've also uploaded D117795 for fixes related to vector instructions.

Sorry for the delay. This patch is quite big, and some of the details in constrained intrinsics can be... subtle. That makes them difficult to review.

The best I've been able to do is break out the parts related to fp16 legalization and lowering into a separate patch, which I'll have ready later today.

I'm not sure I see why all these parts need to be in a single patch. Smaller patches are much more preferable for review. Is it because the entire test file needs to be made to work at the same time? That file looks like it's testing too many separate intrinsics.

But I've tried to take a look through. Comments inline

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
655	Was LLROUND of f16 previously Legal by default? So this doesn't alter anything by explicitly marking it?
942–964	The formatting is apparently off here. I used to like the old code because you could do a search for ISD::FROUND and see all the setOperationAction(ISD::FROUND, VT, Legal) lines come up. Alas.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
4991	Are optimizations like this desirable (or always valid?) for strict nodes? Same for the loads below. Do we have test cases for them?
6293	Do these make a lot of sense, with a strict fma but a non-strict fneg?

I'll split off the parts that are purely patterns into separate patches.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
655	Yes, TargetLoweringBase::initActions declares a bunch of operations as Expand for all floating-point types except f16 which makes them Legal by default (possibly a bug in TargetLoweringBase::initActions).
942–964	clang-format is a bit strange about how it likes to format these kinds of for loops, in that it has three different schemes (entries in list closely packed; entries in list vertical-aligned using whitespace; each entry on its own line) that it chooses between, and adding or removing elements can cause the whole thing to be dramatically reformatted. It wants here to use the "each entry on its own line" scheme which defeats the point of using a for loop (making the thing more compact so you can see the "these are the operations that are expanded for v1f64" part on one screen) so I've ignored it.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
4991	Anything where there's a one-to-one mapping between a strict selectiondag node and a floating-point instruction is fine (or more generally where we have an instruction sequence that can cause the same floating-point exceptions in the same order and respects rounding modes). These aren't tested, no. Actually it probably makes sense to move these kinds of patterns into separate patches and add tests for them there, as that can be done separately from the basic isel stuff.
6293	fneg is purely a bit-flipping operation that doesn't have strict/non-strict versions and combining it doesn't change exception/rounding behaviour.
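
The point about fneg being a pure bit operation can be illustrated outside LLVM with a short standalone C++ sketch. negateBySignBit and onlySignDiffers are hypothetical helpers written for this illustration, and the sketch assumes that double uses the 64-bit IEEE-754 binary64 encoding, which holds on AArch64 and other common targets.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: negate a double by toggling only the sign bit (bit 63
// of the binary64 encoding). No floating-point arithmetic is performed, so
// the rounding mode is never consulted and nothing is rounded.
double negateBySignBit(double X) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits);
  Bits ^= std::uint64_t{1} << 63;
  std::memcpy(&X, &Bits, sizeof Bits);
  return X;
}

// Hypothetical helper: true when A and B have equal magnitude and opposite
// sign bits, which also distinguishes +0.0 from -0.0.
bool onlySignDiffers(double A, double B) {
  return std::fabs(A) == std::fabs(B) && std::signbit(A) != std::signbit(B);
}
```

Because the result differs from the input only in the sign bit, folding such a negation into a neighbouring instruction cannot add or remove a rounding step, which is consistent with the reasoning given for the fneg patterns.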

Removed patterns that aren't strictly required for isel, I'll create new patches for those.

Remove non-essential indexed mul/mla patterns that I missed in the last version.

The non-essential patterns are now D118485, D118487, D118489. There were also patterns involving a load followed by scvtf/ucvtf where it's not clear where they're tested, and I haven't been able to trigger them with a quick test, so I may just drop those.

john.brawn added child revisions: D118489: [AArch64] Allow strict opcodes in faddp patterns, D118487: [AArch64] Allow strict opcodes in indexed fmul and fma patterns, D118485: [AArch64] Allow strict opcodes in fp->int->fp patterns. Jan 31 2022, 2:42 AM

john.brawn mentioned this in D117795: [AArch64] Add some missing strict FP vector lowering. Feb 1 2022, 9:45 AM

Moved a couple of setOperationAction lines from D117795 to here.

Harbormaster completed remote builds in B147091: Diff 405206. Feb 2 2022, 3:56 AM

Sorry, I thought I had already left this comment from the last time I tried to review these.

Is a f128 fminimum / fmaximum something that should be handled? I don't see any tests for fminimum in general.

Moved a couple of setOperationAction lines from D117795 to here.

New patches are free. Feel free to use them :) The developer policy is quite clear on trying to keep patches small: https://llvm.org/docs/DeveloperPolicy.html#incremental-development.
This is fine for now so long as you answer the question inline though, and from what I can tell this patch appears OK, minus the nitpicking about fminimum and formatting.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
942–964	I find the loop quite hard to read, to be honest. It is difficult to see which operations the loop is acting on without a clearer structure. Maybe it's trying to do too much all at once? We should make sure all new code is clang-formatted. (I know there is still some old code that isn't formatted yet.) But I do acknowledge that I sometimes have opinions about these kinds of things that are not shared by other coders. Simpler is better in my opinion, and simpler isn't always smaller. I think that if we add this, it will either be clang-formatted now or it will be formatted by someone in the future. So it's best to come up with something that looks OK when formatted. Does splitting the loop into strict and nonstrict help?
1485	FCmp are a pretty common operation. Is it not something we would want to support, without expanding? Don't we have intrinsics that would expect to become a single vector operation?

Is a f128 fminimum / fmaximum something that should be handled? I don't see any tests for fminimum in general.

Yes, it should, I'll do that and add tests.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
942–964	I had a bit of a look at how clang-format decides how to format things here. It turns out that, when formatting things in columns, it just tries different numbers of columns, puts the items into columns in the order it sees them, then discards any layout with over-long lines. So it won't try having one item span multiple columns, or break a line early when it would be over-long. How it decides between column layout and tightly packed I didn't manage to figure out. Anyway, the upshot is that rearranging the items slightly means clang-format can successfully find a layout, so I'll be doing that.
1485	Yes, it's something we would want to support, hence the FIXME. And with regards to intrinsics, generating code that's correct is more important than using a single instruction.
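
The column search described in the 942–964 comment above can be sketched as a small standalone C++ model. This is not clang-format's actual code: layoutWidth and pickColumns are hypothetical names, the ", " separator width and the preference for the widest layout that fits are assumptions made for illustration, and only the behaviour stated in the comment (try several column counts, fill in source order, reject over-long lines) is modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Width of a full row when Items are placed into Columns columns in source
// order, each column padded to its widest entry and separated by ", ".
std::size_t layoutWidth(const std::vector<std::string> &Items,
                        std::size_t Columns) {
  std::vector<std::size_t> ColWidth(Columns, 0);
  for (std::size_t I = 0; I < Items.size(); ++I)
    ColWidth[I % Columns] = std::max(ColWidth[I % Columns], Items[I].size());
  std::size_t Total = 0;
  for (std::size_t W : ColWidth)
    Total += W;
  return Total + 2 * (Columns - 1);
}

// Try column counts from widest to narrowest and keep the first layout whose
// rows fit within Limit. Returns 0 when not even one column fits.
std::size_t pickColumns(const std::vector<std::string> &Items,
                        std::size_t Limit) {
  for (std::size_t Columns = Items.size(); Columns >= 1; --Columns)
    if (layoutWidth(Items, Columns) <= Limit)
      return Columns;
  return 0;
}
```

In this model the order of the items changes the outcome. With a 35-character limit, the list {ISD::FADD, ISD::STRICT_FNEARBYINT, ISD::FSUB, ISD::STRICT_FROUNDEVEN} fits in two columns because the short names share one column and the long names share the other. Swapping the last two items puts a long name in each column, and the search falls back to a single column, which is the kind of sensitivity to item order that the comment reports.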

Adjusted loops so that the formatting matches what clang-format prefers. It turns out that non-strict f128 fmaximum/fminimum doesn't work, so I've added a FIXME comment about that and just added tests for non-f128.

Harbormaster completed remote builds in B148029: Diff 406514. Feb 7 2022, 12:14 PM

dmgreen added inline comments. Feb 8 2022, 11:18 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
942–964	OK that's good. Clang format can sure be funny at times. Thanks for doing that.
1485	In my experience, FIXMEs are things people leave in the code because they have no plan to fix them. Correct code is of course important, but people also have an expectation of a certain level of performance from the intrinsics.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
4038	Is there a test for FNMADD (and FNMSUB)? I don't see it in D118487 either.
4230	None of the tests below seem to be Neon. Are they tested somewhere else?

john.brawn added inline comments. Feb 9 2022, 9:26 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1485	Yes, I'm not planning on fixing this. My primary goal here is to have code generation that is valid, and secondarily when we have some existing ISel patterns that apply to non-strict and would also be valid for strict adjust them to work with strict. Vector STRICT_FSETCC/STRICT_FSETCCS are more complex than this, hence leaving them as Expand (which is the default for strict vector operations anyway).
llvm/lib/Target/AArch64/AArch64InstrInfo.td
4230	D117795 adds neon tests.

Added tests for fnmadd and fnmsub.

Harbormaster completed remote builds in B148502: Diff 407189. Feb 9 2022, 10:52 AM

OK. From what I can tell, this looks correct.

This revision is now accepted and ready to land. Feb 15 2022, 12:47 AM

This revision was landed with ongoing or failed builds. Feb 17 2022, 5:12 AM

Closed by commit rGd4342efb6959: [AArch64] Add instruction selection for strict FP (authored by john.brawn).

This revision was automatically updated to reflect the committed changes.

john.brawn added a commit: rGd4342efb6959: [AArch64] Add instruction selection for strict FP.

Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp	199 lines
llvm/lib/Target/AArch64/AArch64InstrFormats.td	9 lines
llvm/lib/Target/AArch64/AArch64InstrInfo.td	138 lines
llvm/test/CodeGen/AArch64/fp-intrinsics.ll	35 lines

Diff 406514

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

[398 lines not shown; context: AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,]
   setOperationAction(ISD::FTRUNC, MVT::f128, Expand);
   setOperationAction(ISD::SETCC, MVT::f128, Custom);
   setOperationAction(ISD::STRICT_FSETCC, MVT::f128, Custom);
   setOperationAction(ISD::STRICT_FSETCCS, MVT::f128, Custom);
   setOperationAction(ISD::BR_CC, MVT::f128, Custom);
   setOperationAction(ISD::SELECT, MVT::f128, Custom);
   setOperationAction(ISD::SELECT_CC, MVT::f128, Custom);
   setOperationAction(ISD::FP_EXTEND, MVT::f128, Custom);
+  // FIXME: f128 FMINIMUM and FMAXIMUM (including STRICT versions) currently
+  // aren't handled.

   // Lowering for many of the conversions is actually specified by the non-f128
   // type. The LowerXXX function will be trivial when f128 isn't involved.
   setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
   setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);
   setOperationAction(ISD::FP_TO_SINT, MVT::i128, Custom);
   setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Custom);
   setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Custom);
[227 lines not shown; context: if (!Subtarget->hasFullFP16()) {]
     setOperationAction(ISD::SETCC, MVT::v8f16, Expand);
     setOperationAction(ISD::BR_CC, MVT::v8f16, Expand);
     setOperationAction(ISD::SELECT, MVT::v8f16, Expand);
     setOperationAction(ISD::SELECT_CC, MVT::v8f16, Expand);
     setOperationAction(ISD::FP_EXTEND, MVT::v8f16, Expand);
   }

   // AArch64 has implementations of a lot of rounding-like FP operations.
-  for (MVT Ty : {MVT::f32, MVT::f64}) {
-    setOperationAction(ISD::FFLOOR, Ty, Legal);
-    setOperationAction(ISD::FNEARBYINT, Ty, Legal);
-    setOperationAction(ISD::FCEIL, Ty, Legal);
-    setOperationAction(ISD::FRINT, Ty, Legal);
-    setOperationAction(ISD::FTRUNC, Ty, Legal);
-    setOperationAction(ISD::FROUND, Ty, Legal);
-    setOperationAction(ISD::FROUNDEVEN, Ty, Legal);
-    setOperationAction(ISD::FMINNUM, Ty, Legal);
-    setOperationAction(ISD::FMAXNUM, Ty, Legal);
-    setOperationAction(ISD::FMINIMUM, Ty, Legal);
-    setOperationAction(ISD::FMAXIMUM, Ty, Legal);
-    setOperationAction(ISD::LROUND, Ty, Legal);
-    setOperationAction(ISD::LLROUND, Ty, Legal);
-    setOperationAction(ISD::LRINT, Ty, Legal);
-    setOperationAction(ISD::LLRINT, Ty, Legal);
+  for (auto Op :
+       {ISD::FFLOOR, ISD::FNEARBYINT, ISD::FCEIL,
+        ISD::FRINT, ISD::FTRUNC, ISD::FROUND,
+        ISD::FROUNDEVEN, ISD::FMINNUM, ISD::FMAXNUM,
+        ISD::FMINIMUM, ISD::FMAXIMUM, ISD::LROUND,
+        ISD::LLROUND, ISD::LRINT, ISD::LLRINT,
+        ISD::STRICT_FFLOOR, ISD::STRICT_FCEIL, ISD::STRICT_FNEARBYINT,
+        ISD::STRICT_FRINT, ISD::STRICT_FTRUNC, ISD::STRICT_FROUNDEVEN,
+        ISD::STRICT_FROUND, ISD::STRICT_FMINNUM, ISD::STRICT_FMAXNUM,
+        ISD::STRICT_FMINIMUM, ISD::STRICT_FMAXIMUM, ISD::STRICT_LROUND,
+        ISD::STRICT_LLROUND, ISD::STRICT_LRINT, ISD::STRICT_LLRINT}) {
+    for (MVT Ty : {MVT::f32, MVT::f64})
+      setOperationAction(Op, Ty, Legal);
+    if (Subtarget->hasFullFP16())
+      setOperationAction(Op, MVT::f16, Legal);
   }

-  if (Subtarget->hasFullFP16()) {
-    setOperationAction(ISD::FNEARBYINT, MVT::f16, Legal);
-    setOperationAction(ISD::FFLOOR, MVT::f16, Legal);
-    setOperationAction(ISD::FCEIL, MVT::f16, Legal);
-    setOperationAction(ISD::FRINT, MVT::f16, Legal);
-    setOperationAction(ISD::FTRUNC, MVT::f16, Legal);
-    setOperationAction(ISD::FROUND, MVT::f16, Legal);
-    setOperationAction(ISD::FROUNDEVEN, MVT::f16, Legal);
-    setOperationAction(ISD::FMINNUM, MVT::f16, Legal);
-    setOperationAction(ISD::FMAXNUM, MVT::f16, Legal);
-    setOperationAction(ISD::FMINIMUM, MVT::f16, Legal);
-    setOperationAction(ISD::FMAXIMUM, MVT::f16, Legal);
+  // Basic strict FP operations are legal
+  for (auto Op : {ISD::STRICT_FADD, ISD::STRICT_FSUB, ISD::STRICT_FMUL,
+                  ISD::STRICT_FDIV, ISD::STRICT_FMA, ISD::STRICT_FSQRT}) {
+    for (MVT Ty : {MVT::f32, MVT::f64})
+      setOperationAction(Op, Ty, Legal);
+    if (Subtarget->hasFullFP16())
+      setOperationAction(Op, MVT::f16, Legal);
   }

+  // Strict conversion to a larger type is legal
+  for (auto VT : {MVT::f32, MVT::f64})
+    setOperationAction(ISD::STRICT_FP_EXTEND, VT, Legal);
+
   setOperationAction(ISD::PREFETCH, MVT::Other, Custom);

   setOperationAction(ISD::FLT_ROUNDS_, MVT::i32, Custom);
   setOperationAction(ISD::SET_ROUNDING, MVT::Other, Custom);

   setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i128, Custom);
   setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Custom);
   setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i64, Custom);
[244 lines not shown; context: #undef LCALLNAME5]

   setHasExtractBitsInsn(true);

   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::Other, Custom);

   if (Subtarget->hasNEON()) {
     // FIXME: v1f64 shouldn't be legal if we can avoid it, because it leads to
     // silliness like this:
-    setOperationAction(ISD::FABS, MVT::v1f64, Expand);
-    setOperationAction(ISD::FADD, MVT::v1f64, Expand);
-    setOperationAction(ISD::FCEIL, MVT::v1f64, Expand);
-    setOperationAction(ISD::FCOPYSIGN, MVT::v1f64, Expand);
-    setOperationAction(ISD::FCOS, MVT::v1f64, Expand);
-    setOperationAction(ISD::FDIV, MVT::v1f64, Expand);
-    setOperationAction(ISD::FFLOOR, MVT::v1f64, Expand);
-    setOperationAction(ISD::FMA, MVT::v1f64, Expand);
-    setOperationAction(ISD::FMUL, MVT::v1f64, Expand);
-    setOperationAction(ISD::FNEARBYINT, MVT::v1f64, Expand);
-    setOperationAction(ISD::FNEG, MVT::v1f64, Expand);
-    setOperationAction(ISD::FPOW, MVT::v1f64, Expand);
-    setOperationAction(ISD::FREM, MVT::v1f64, Expand);
-    setOperationAction(ISD::FROUND, MVT::v1f64, Expand);
-    setOperationAction(ISD::FROUNDEVEN, MVT::v1f64, Expand);
-    setOperationAction(ISD::FRINT, MVT::v1f64, Expand);
-    setOperationAction(ISD::FSIN, MVT::v1f64, Expand);
-    setOperationAction(ISD::FSINCOS, MVT::v1f64, Expand);
-    setOperationAction(ISD::FSQRT, MVT::v1f64, Expand);
-    setOperationAction(ISD::FSUB, MVT::v1f64, Expand);
-    setOperationAction(ISD::FTRUNC, MVT::v1f64, Expand);
-    setOperationAction(ISD::SETCC, MVT::v1f64, Expand);
-    setOperationAction(ISD::BR_CC, MVT::v1f64, Expand);
-    setOperationAction(ISD::SELECT, MVT::v1f64, Expand);
-    setOperationAction(ISD::SELECT_CC, MVT::v1f64, Expand);
-    setOperationAction(ISD::FP_EXTEND, MVT::v1f64, Expand);
-
-    setOperationAction(ISD::FP_TO_SINT, MVT::v1i64, Expand);
-    setOperationAction(ISD::FP_TO_UINT, MVT::v1i64, Expand);
-    setOperationAction(ISD::SINT_TO_FP, MVT::v1i64, Expand);
-    setOperationAction(ISD::UINT_TO_FP, MVT::v1i64, Expand);
-    setOperationAction(ISD::FP_ROUND, MVT::v1f64, Expand);
-
-    setOperationAction(ISD::FP_TO_SINT_SAT, MVT::v1i64, Expand);
-    setOperationAction(ISD::FP_TO_UINT_SAT, MVT::v1i64, Expand);
-
-    setOperationAction(ISD::MUL, MVT::v1i64, Expand);
+    for (auto Op :
+         {ISD::SELECT, ISD::SELECT_CC, ISD::SETCC,
+          ISD::BR_CC, ISD::FADD, ISD::FSUB,
+          ISD::FMUL, ISD::FDIV, ISD::FMA,
+          ISD::FNEG, ISD::FABS, ISD::FCEIL,
+          ISD::FSQRT, ISD::FFLOOR, ISD::FNEARBYINT,
+          ISD::FRINT, ISD::FROUND, ISD::FROUNDEVEN,
+          ISD::FTRUNC, ISD::FMINNUM, ISD::FMAXNUM,
+          ISD::FMINIMUM, ISD::FMAXIMUM, ISD::STRICT_FADD,
+          ISD::STRICT_FSUB, ISD::STRICT_FMUL, ISD::STRICT_FDIV,
+          ISD::STRICT_FMA, ISD::STRICT_FCEIL, ISD::STRICT_FFLOOR,
+          ISD::STRICT_FSQRT, ISD::STRICT_FRINT, ISD::STRICT_FNEARBYINT,
+          ISD::STRICT_FROUND, ISD::STRICT_FTRUNC, ISD::STRICT_FROUNDEVEN,
+          ISD::STRICT_FMINNUM, ISD::STRICT_FMAXNUM, ISD::STRICT_FMINIMUM,
+          ISD::STRICT_FMAXIMUM})
+      setOperationAction(Op, MVT::v1f64, Expand);
+
+    for (auto Op :
+         {ISD::FP_TO_SINT, ISD::FP_TO_UINT, ISD::SINT_TO_FP, ISD::UINT_TO_FP,
+          ISD::FP_ROUND, ISD::FP_TO_SINT_SAT, ISD::FP_TO_UINT_SAT, ISD::MUL,
+          ISD::STRICT_FP_TO_SINT, ISD::STRICT_FP_TO_UINT,
+          ISD::STRICT_SINT_TO_FP, ISD::STRICT_UINT_TO_FP, ISD::STRICT_FP_ROUND})
+      setOperationAction(Op, MVT::v1i64, Expand);

     // AArch64 doesn't have a direct vector ->f32 conversion instructions for
     // elements smaller than i32, so promote the input to i32 first.
     setOperationPromotedToType(ISD::UINT_TO_FP, MVT::v4i8, MVT::v4i32);
     setOperationPromotedToType(ISD::SINT_TO_FP, MVT::v4i8, MVT::v4i32);

     // Similarly, there is no direct i32 -> f64 vector conversion instruction.
-    setOperationAction(ISD::SINT_TO_FP, MVT::v2i32, Custom);
-    setOperationAction(ISD::UINT_TO_FP, MVT::v2i32, Custom);
-    setOperationAction(ISD::SINT_TO_FP, MVT::v2i64, Custom);
-    setOperationAction(ISD::UINT_TO_FP, MVT::v2i64, Custom);
     // Or, direct i32 -> f16 vector conversion. Set it so custom, so the
     // conversion happens in two steps: v4i32 -> v4f32 -> v4f16
-    setOperationAction(ISD::SINT_TO_FP, MVT::v4i32, Custom);
-    setOperationAction(ISD::UINT_TO_FP, MVT::v4i32, Custom);
+    for (auto Op : {ISD::SINT_TO_FP, ISD::UINT_TO_FP, ISD::STRICT_SINT_TO_FP,
+                    ISD::STRICT_UINT_TO_FP})
+      for (auto VT : {MVT::v2i32, MVT::v2i64, MVT::v4i32})
+        setOperationAction(Op, VT, Custom);

     if (Subtarget->hasFullFP16()) {
       setOperationAction(ISD::SINT_TO_FP, MVT::v8i8, Custom);
       setOperationAction(ISD::UINT_TO_FP, MVT::v8i8, Custom);
       setOperationAction(ISD::SINT_TO_FP, MVT::v16i8, Custom);
       setOperationAction(ISD::UINT_TO_FP, MVT::v16i8, Custom);
       setOperationAction(ISD::SINT_TO_FP, MVT::v4i16, Custom);
       setOperationAction(ISD::UINT_TO_FP, MVT::v4i16, Custom);
[93 lines not shown; context: for (MVT VT : MVT::fixedlen_vector_valuetypes()) {]
         setTruncStoreAction(VT, InnerVT, Expand);
         setLoadExtAction(ISD::SEXTLOAD, VT, InnerVT, Expand);
         setLoadExtAction(ISD::ZEXTLOAD, VT, InnerVT, Expand);
         setLoadExtAction(ISD::EXTLOAD, VT, InnerVT, Expand);
       }
     }

     // AArch64 has implementations of a lot of rounding-like FP operations.
-    for (MVT Ty : {MVT::v2f32, MVT::v4f32, MVT::v2f64}) {
-      setOperationAction(ISD::FFLOOR, Ty, Legal);
-      setOperationAction(ISD::FNEARBYINT, Ty, Legal);
-      setOperationAction(ISD::FCEIL, Ty, Legal);
-      setOperationAction(ISD::FRINT, Ty, Legal);
-      setOperationAction(ISD::FTRUNC, Ty, Legal);
-      setOperationAction(ISD::FROUND, Ty, Legal);
-      setOperationAction(ISD::FROUNDEVEN, Ty, Legal);
-    }
-
-    if (Subtarget->hasFullFP16()) {
-      for (MVT Ty : {MVT::v4f16, MVT::v8f16}) {
-        setOperationAction(ISD::FFLOOR, Ty, Legal);
-        setOperationAction(ISD::FNEARBYINT, Ty, Legal);
-        setOperationAction(ISD::FCEIL, Ty, Legal);
-        setOperationAction(ISD::FRINT, Ty, Legal);
-        setOperationAction(ISD::FTRUNC, Ty, Legal);
-        setOperationAction(ISD::FROUND, Ty, Legal);
-        setOperationAction(ISD::FROUNDEVEN, Ty, Legal);
-      }
+    for (auto Op :
+         {ISD::FFLOOR, ISD::FNEARBYINT, ISD::FCEIL, ISD::FRINT, ISD::FTRUNC,
+          ISD::FROUND, ISD::FROUNDEVEN, ISD::STRICT_FFLOOR,
+          ISD::STRICT_FNEARBYINT, ISD::STRICT_FCEIL, ISD::STRICT_FRINT,
+          ISD::STRICT_FTRUNC, ISD::STRICT_FROUND, ISD::STRICT_FROUNDEVEN}) {
+      for (MVT Ty : {MVT::v2f32, MVT::v4f32, MVT::v2f64})
+        setOperationAction(Op, Ty, Legal);
+      if (Subtarget->hasFullFP16())
+        for (MVT Ty : {MVT::v4f16, MVT::v8f16})
+          setOperationAction(Op, Ty, Legal);
     }

     setTruncStoreAction(MVT::v4i16, MVT::v4i8, Custom);

     setLoadExtAction(ISD::EXTLOAD, MVT::v4i16, MVT::v4i8, Custom);
     setLoadExtAction(ISD::SEXTLOAD, MVT::v4i16, MVT::v4i8, Custom);
     setLoadExtAction(ISD::ZEXTLOAD, MVT::v4i16, MVT::v4i8, Custom);
     setLoadExtAction(ISD::EXTLOAD, MVT::v4i32, MVT::v4i8, Custom);
[285 lines not shown]	#undef LCALLNAME5

if (Subtarget->hasMOPS() && Subtarget->hasMTE()) {		if (Subtarget->hasMOPS() && Subtarget->hasMTE()) {
// Only required for llvm.aarch64.mops.memset.tag		// Only required for llvm.aarch64.mops.memset.tag
setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::i8, Custom);		setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::i8, Custom);
}		}

PredictableSelectIsExpensive = Subtarget->predictableSelectIsExpensive();		PredictableSelectIsExpensive = Subtarget->predictableSelectIsExpensive();
}		}

dmgreen (inline comment): Is it better to have a section like this which is all the "Strict-fp handling", or should all this code be next to the existing operations? We already have ISD::STRICT_FP_TO_SINT above next to ISD::FP_TO_SINT. And if SVE was added I would expect it to be in the SVE section (not that this is super well laid out). Should the ISD::STRICT_FCOS be next to the ISD::FCOS?
john.brawn (author, inline reply): After fiddling about with things a bit it does look like it's probably better to have the handling of OP to be next to STRICT_OP, so I'll do that.
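
The layout agreed on in this exchange (one loop that registers OP and STRICT_OP together) can be sketched outside of LLVM with a toy action table. Everything below is a hypothetical stand-in: `Op`, `Ty`, `Action` and `ToyLowering` are illustration-only names, not the real `ISD`, `MVT` or `TargetLowering` types, and the default of `Expand` is an assumption made for this sketch.

```cpp
#include <initializer_list>
#include <map>
#include <utility>

// Hypothetical stand-ins for ISD opcodes, MVT value types and legalize actions.
enum class Op { FFLOOR, STRICT_FFLOOR, FCEIL, STRICT_FCEIL };
enum class Ty { v2f32, v4f32, v2f64, v4f16 };
enum class Action { Expand, Legal };

class ToyLowering {
  std::map<std::pair<Op, Ty>, Action> Table;

public:
  ToyLowering() {
    // Each strict opcode sits right next to its non-strict counterpart, so
    // adding a type to the inner list updates both at the same time.
    for (Op O : {Op::FFLOOR, Op::STRICT_FFLOOR, Op::FCEIL, Op::STRICT_FCEIL})
      for (Ty T : {Ty::v2f32, Ty::v4f32, Ty::v2f64})
        setOperationAction(O, T, Action::Legal);
  }

  void setOperationAction(Op O, Ty T, Action A) { Table[{O, T}] = A; }

  // Anything never registered falls back to Expand in this toy model.
  Action getOperationAction(Op O, Ty T) const {
    auto It = Table.find({O, T});
    return It == Table.end() ? Action::Expand : It->second;
  }
};
```

With this shape, forgetting the strict variant of an operation is harder than with a separate "strict-fp handling" block, because both opcodes share one list and one set of types.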
void AArch64TargetLowering::addTypeForNEON(MVT VT) {		void AArch64TargetLowering::addTypeForNEON(MVT VT) {
assert(VT.isVector() && "VT should be a vector type");		assert(VT.isVector() && "VT should be a vector type");

if (VT.isFloatingPoint()) {		if (VT.isFloatingPoint()) {
MVT PromoteTo = EVT(VT).changeVectorElementTypeToInteger().getSimpleVT();		MVT PromoteTo = EVT(VT).changeVectorElementTypeToInteger().getSimpleVT();
setOperationPromotedToType(ISD::LOAD, VT, PromoteTo);		setOperationPromotedToType(ISD::LOAD, VT, PromoteTo);
setOperationPromotedToType(ISD::STORE, VT, PromoteTo);		setOperationPromotedToType(ISD::STORE, VT, PromoteTo);
}		}
[38 lines not shown]	if (VT != MVT::v8i8 && VT != MVT::v16i8)
setOperationAction(ISD::CTPOP, VT, Custom);		setOperationAction(ISD::CTPOP, VT, Custom);

setOperationAction(ISD::UDIV, VT, Expand);		setOperationAction(ISD::UDIV, VT, Expand);
setOperationAction(ISD::SDIV, VT, Expand);		setOperationAction(ISD::SDIV, VT, Expand);
setOperationAction(ISD::UREM, VT, Expand);		setOperationAction(ISD::UREM, VT, Expand);
setOperationAction(ISD::SREM, VT, Expand);		setOperationAction(ISD::SREM, VT, Expand);
setOperationAction(ISD::FREM, VT, Expand);		setOperationAction(ISD::FREM, VT, Expand);

setOperationAction(ISD::FP_TO_SINT, VT, Custom);		for (unsigned Opcode :
setOperationAction(ISD::FP_TO_UINT, VT, Custom);		{ISD::FP_TO_SINT, ISD::FP_TO_UINT, ISD::FP_TO_SINT_SAT,
setOperationAction(ISD::FP_TO_SINT_SAT, VT, Custom);		ISD::FP_TO_UINT_SAT, ISD::STRICT_FP_TO_SINT, ISD::STRICT_FP_TO_UINT})
setOperationAction(ISD::FP_TO_UINT_SAT, VT, Custom);		setOperationAction(Opcode, VT, Custom);

if (!VT.isFloatingPoint())		if (!VT.isFloatingPoint())
setOperationAction(ISD::ABS, VT, Legal);		setOperationAction(ISD::ABS, VT, Legal);

// [SU][MIN|MAX] are available for all NEON types apart from i64.		// [SU][MIN|MAX] are available for all NEON types apart from i64.
if (!VT.isFloatingPoint() && VT != MVT::v2i64 && VT != MVT::v1i64)		if (!VT.isFloatingPoint() && VT != MVT::v2i64 && VT != MVT::v1i64)
for (unsigned Opcode : {ISD::SMIN, ISD::SMAX, ISD::UMIN, ISD::UMAX})		for (unsigned Opcode : {ISD::SMIN, ISD::SMAX, ISD::UMIN, ISD::UMAX})
setOperationAction(Opcode, VT, Legal);		setOperationAction(Opcode, VT, Legal);

// F[MIN|MAX][NUM|NAN] are available for all FP NEON types.		// F[MIN|MAX][NUM|NAN] and simple strict operations are available for all FP
		// NEON types.
if (VT.isFloatingPoint() &&		if (VT.isFloatingPoint() &&
VT.getVectorElementType() != MVT::bf16 &&		VT.getVectorElementType() != MVT::bf16 &&
(VT.getVectorElementType() != MVT::f16 || Subtarget->hasFullFP16()))		(VT.getVectorElementType() != MVT::f16 || Subtarget->hasFullFP16()))
for (unsigned Opcode :		for (unsigned Opcode :
{ISD::FMINIMUM, ISD::FMAXIMUM, ISD::FMINNUM, ISD::FMAXNUM})		{ISD::FMINIMUM, ISD::FMAXIMUM, ISD::FMINNUM, ISD::FMAXNUM,
		ISD::STRICT_FMINIMUM, ISD::STRICT_FMAXIMUM, ISD::STRICT_FMINNUM,
		ISD::STRICT_FMAXNUM, ISD::STRICT_FADD, ISD::STRICT_FSUB,
		ISD::STRICT_FMUL, ISD::STRICT_FDIV, ISD::STRICT_FMA,
		ISD::STRICT_FSQRT})
setOperationAction(Opcode, VT, Legal);		setOperationAction(Opcode, VT, Legal);

		// Strict fp extend and trunc are legal
		if (VT.isFloatingPoint() && VT.getScalarSizeInBits() != 16)
		setOperationAction(ISD::STRICT_FP_EXTEND, VT, Legal);
		if (VT.isFloatingPoint() && VT.getScalarSizeInBits() != 64)
		setOperationAction(ISD::STRICT_FP_ROUND, VT, Legal);

		// FIXME: We could potentially make use of the vector comparison instructions
dmgreen (inline comment): FCmp are a pretty common operation. Is it not something we would want to support, without expanding? Don't we have intrinsics that would expect to become a single vector operation?
john.brawn (author, inline reply): Yes, it's something we would want to support, hence the FIXME. And with regards to intrinsics, generating code that's correct is more important than using a single instruction.
dmgreen (inline reply): In my experience, FIXME's are things people leave in the code because they have no plan of fixing them. Correct code is of course important, but people also have an expectation of a certain level of performance from the intrinsics.
john.brawn (author, inline reply): Yes, I'm not planning on fixing this. My primary goal here is to have code generation that is valid, and secondarily when we have some existing ISel patterns that apply to non-strict and would also be valid for strict adjust them to work with strict. Vector STRICT_FSETCC/STRICT_FSETCCS are more complex than this, hence leaving them as Expand (which is the default for strict vector operations anyway).
		// for STRICT_FSETCC and STRICT_FSETCCS, but there's a number of
		// complications:
		// * FCMPEQ/NE are quiet comparisons, the rest are signalling comparisons,
		// so we would need to expand when the condition code doesn't match the
		// kind of comparison.
		// * Some kinds of comparison require more than one FCMXY instruction so
		// would need to be expanded instead.
		// * The lowering of the non-strict versions involves target-specific ISD
		// nodes so we would likely need to add strict versions of all of them and
		// handle them appropriately.
		setOperationAction(ISD::STRICT_FSETCC, VT, Expand);
		setOperationAction(ISD::STRICT_FSETCCS, VT, Expand);

if (Subtarget->isLittleEndian()) {		if (Subtarget->isLittleEndian()) {
for (unsigned im = (unsigned)ISD::PRE_INC;		for (unsigned im = (unsigned)ISD::PRE_INC;
im != (unsigned)ISD::LAST_INDEXED_MODE; ++im) {		im != (unsigned)ISD::LAST_INDEXED_MODE; ++im) {
setIndexedLoadAction(im, VT, Legal);		setIndexedLoadAction(im, VT, Legal);
setIndexedStoreAction(im, VT, Legal);		setIndexedStoreAction(im, VT, Legal);
}		}
}		}
}		}
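
The first complication listed in the FIXME above (quiet versus signalling comparisons) can be illustrated with a small decision function. This is a toy model only, under the assumption stated in that comment that the equality-style vector compares are quiet and the ordering compares are signalling. `Cond`, `instructionIsSignalling` and `needsExpansion` are hypothetical names, and the sketch ignores the other two complications (multi-instruction conditions and target-specific ISD nodes).

```cpp
// Hypothetical condition codes; not the real ISD::CondCode enumeration.
enum class Cond { EQ, NE, GT, GE, LT, LE };

// Assumption taken from the FIXME: EQ/NE map to quiet compare instructions,
// every other condition maps to a signalling one.
bool instructionIsSignalling(Cond CC) {
  return CC != Cond::EQ && CC != Cond::NE;
}

// NodeIsSignalling is false for a STRICT_FSETCC-like (quiet) node and true
// for a STRICT_FSETCCS-like (signalling) node. A direct vector compare would
// only be usable when the instruction's behaviour matches what the node asks
// for; otherwise the node would still have to be expanded.
bool needsExpansion(bool NodeIsSignalling, Cond CC) {
  return NodeIsSignalling != instructionIsSignalling(CC);
}
```

Under this model, half of the node/condition combinations still expand, which is consistent with the author's decision to leave both opcodes as Expand for now.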
[9,991 lines not shown]

llvm/lib/Target/AArch64/AArch64InstrFormats.td


[4,957 lines not shown]	def HDr : BaseFPConversion<0b01, 0b11, FPR16, FPR64, asm,
[(set (f16 FPR16:$Rd), (any_fpround FPR64:$Rn))]>;		[(set (f16 FPR16:$Rd), (any_fpround FPR64:$Rn))]>;

// Double-precision to Single-precision		// Double-precision to Single-precision
def SDr : BaseFPConversion<0b01, 0b00, FPR32, FPR64, asm,		def SDr : BaseFPConversion<0b01, 0b00, FPR32, FPR64, asm,
[(set FPR32:$Rd, (any_fpround FPR64:$Rn))]>;		[(set FPR32:$Rd, (any_fpround FPR64:$Rn))]>;

// Half-precision to Double-precision		// Half-precision to Double-precision
def DHr : BaseFPConversion<0b11, 0b01, FPR64, FPR16, asm,		def DHr : BaseFPConversion<0b11, 0b01, FPR64, FPR16, asm,
[(set FPR64:$Rd, (fpextend (f16 FPR16:$Rn)))]>;		[(set FPR64:$Rd, (any_fpextend (f16 FPR16:$Rn)))]>;

// Half-precision to Single-precision		// Half-precision to Single-precision
def SHr : BaseFPConversion<0b11, 0b00, FPR32, FPR16, asm,		def SHr : BaseFPConversion<0b11, 0b00, FPR32, FPR16, asm,
[(set FPR32:$Rd, (fpextend (f16 FPR16:$Rn)))]>;		[(set FPR32:$Rd, (any_fpextend (f16 FPR16:$Rn)))]>;

// Single-precision to Double-precision		// Single-precision to Double-precision
def DSr : BaseFPConversion<0b00, 0b01, FPR64, FPR32, asm,		def DSr : BaseFPConversion<0b00, 0b01, FPR64, FPR32, asm,
[(set FPR64:$Rd, (fpextend FPR32:$Rn))]>;		[(set FPR64:$Rd, (any_fpextend FPR32:$Rn))]>;

// Single-precision to Half-precision		// Single-precision to Half-precision
def HSr : BaseFPConversion<0b00, 0b11, FPR16, FPR32, asm,		def HSr : BaseFPConversion<0b00, 0b11, FPR16, FPR32, asm,
[(set (f16 FPR16:$Rd), (any_fpround FPR32:$Rn))]>;		[(set (f16 FPR16:$Rd), (any_fpround FPR32:$Rn))]>;
}		}

//---		//---
// Single operand floating point data processing		// Single operand floating point data processing
[87 lines not shown]	multiclass TwoOperandFPData<bits<4> opcode, string asm,

def Drr : BaseTwoOperandFPData<opcode, FPR64, asm,		def Drr : BaseTwoOperandFPData<opcode, FPR64, asm,
[(set (f64 FPR64:$Rd),		[(set (f64 FPR64:$Rd),
(node (f64 FPR64:$Rn), (f64 FPR64:$Rm)))]> {		(node (f64 FPR64:$Rn), (f64 FPR64:$Rm)))]> {
let Inst{23-22} = 0b01; // 64-bit size flag		let Inst{23-22} = 0b01; // 64-bit size flag
}		}
}		}

multiclass TwoOperandFPDataNeg<bits<4> opcode, string asm, SDNode node> {		multiclass TwoOperandFPDataNeg<bits<4> opcode, string asm,
		SDPatternOperator node> {
def Hrr : BaseTwoOperandFPData<opcode, FPR16, asm,		def Hrr : BaseTwoOperandFPData<opcode, FPR16, asm,
[(set (f16 FPR16:$Rd), (fneg (node (f16 FPR16:$Rn), (f16 FPR16:$Rm))))]> {		[(set (f16 FPR16:$Rd), (fneg (node (f16 FPR16:$Rn), (f16 FPR16:$Rm))))]> {
let Inst{23-22} = 0b11; // 16-bit size flag		let Inst{23-22} = 0b11; // 16-bit size flag
let Predicates = [HasFullFP16];		let Predicates = [HasFullFP16];
}		}

def Srr : BaseTwoOperandFPData<opcode, FPR32, asm,		def Srr : BaseTwoOperandFPData<opcode, FPR32, asm,
[(set FPR32:$Rd, (fneg (node FPR32:$Rn, (f32 FPR32:$Rm))))]> {		[(set FPR32:$Rd, (fneg (node FPR32:$Rn, (f32 FPR32:$Rm))))]> {
[6,418 lines not shown]

llvm/lib/Target/AArch64/AArch64InstrInfo.td


[3,889 lines not shown]
defm : FPToIntegerPats<fp_to_sint, fp_to_sint_sat, ftrunc, "FCVTZS">;		defm : FPToIntegerPats<fp_to_sint, fp_to_sint_sat, ftrunc, "FCVTZS">;
defm : FPToIntegerPats<fp_to_uint, fp_to_uint_sat, ftrunc, "FCVTZU">;		defm : FPToIntegerPats<fp_to_uint, fp_to_uint_sat, ftrunc, "FCVTZU">;
defm : FPToIntegerPats<fp_to_sint, fp_to_sint_sat, fround, "FCVTAS">;		defm : FPToIntegerPats<fp_to_sint, fp_to_sint_sat, fround, "FCVTAS">;
defm : FPToIntegerPats<fp_to_uint, fp_to_uint_sat, fround, "FCVTAU">;		defm : FPToIntegerPats<fp_to_uint, fp_to_uint_sat, fround, "FCVTAU">;



let Predicates = [HasFullFP16] in {		let Predicates = [HasFullFP16] in {
def : Pat<(i32 (lround f16:$Rn)),		def : Pat<(i32 (any_lround f16:$Rn)),
(!cast<Instruction>(FCVTASUWHr) f16:$Rn)>;		(!cast<Instruction>(FCVTASUWHr) f16:$Rn)>;
def : Pat<(i64 (lround f16:$Rn)),		def : Pat<(i64 (any_lround f16:$Rn)),
(!cast<Instruction>(FCVTASUXHr) f16:$Rn)>;		(!cast<Instruction>(FCVTASUXHr) f16:$Rn)>;
def : Pat<(i64 (llround f16:$Rn)),		def : Pat<(i64 (any_llround f16:$Rn)),
(!cast<Instruction>(FCVTASUXHr) f16:$Rn)>;		(!cast<Instruction>(FCVTASUXHr) f16:$Rn)>;
}		}
def : Pat<(i32 (lround f32:$Rn)),		def : Pat<(i32 (any_lround f32:$Rn)),
(!cast<Instruction>(FCVTASUWSr) f32:$Rn)>;		(!cast<Instruction>(FCVTASUWSr) f32:$Rn)>;
def : Pat<(i32 (lround f64:$Rn)),		def : Pat<(i32 (any_lround f64:$Rn)),
(!cast<Instruction>(FCVTASUWDr) f64:$Rn)>;		(!cast<Instruction>(FCVTASUWDr) f64:$Rn)>;
def : Pat<(i64 (lround f32:$Rn)),		def : Pat<(i64 (any_lround f32:$Rn)),
(!cast<Instruction>(FCVTASUXSr) f32:$Rn)>;		(!cast<Instruction>(FCVTASUXSr) f32:$Rn)>;
def : Pat<(i64 (lround f64:$Rn)),		def : Pat<(i64 (any_lround f64:$Rn)),
(!cast<Instruction>(FCVTASUXDr) f64:$Rn)>;		(!cast<Instruction>(FCVTASUXDr) f64:$Rn)>;
def : Pat<(i64 (llround f32:$Rn)),		def : Pat<(i64 (any_llround f32:$Rn)),
(!cast<Instruction>(FCVTASUXSr) f32:$Rn)>;		(!cast<Instruction>(FCVTASUXSr) f32:$Rn)>;
def : Pat<(i64 (llround f64:$Rn)),		def : Pat<(i64 (any_llround f64:$Rn)),
(!cast<Instruction>(FCVTASUXDr) f64:$Rn)>;		(!cast<Instruction>(FCVTASUXDr) f64:$Rn)>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Scaled integer to floating point conversion instructions.		// Scaled integer to floating point conversion instructions.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

defm SCVTF : IntegerToFP<0, "scvtf", any_sint_to_fp>;		defm SCVTF : IntegerToFP<0, "scvtf", any_sint_to_fp>;
defm UCVTF : IntegerToFP<1, "ucvtf", any_uint_to_fp>;		defm UCVTF : IntegerToFP<1, "ucvtf", any_uint_to_fp>;
[27 lines not shown]

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Floating point single operand instructions.		// Floating point single operand instructions.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

defm FABS : SingleOperandFPData<0b0001, "fabs", fabs>;		defm FABS : SingleOperandFPData<0b0001, "fabs", fabs>;
defm FMOV : SingleOperandFPData<0b0000, "fmov">;		defm FMOV : SingleOperandFPData<0b0000, "fmov">;
defm FNEG : SingleOperandFPData<0b0010, "fneg", fneg>;		defm FNEG : SingleOperandFPData<0b0010, "fneg", fneg>;
defm FRINTA : SingleOperandFPData<0b1100, "frinta", fround>;		defm FRINTA : SingleOperandFPData<0b1100, "frinta", any_fround>;
defm FRINTI : SingleOperandFPData<0b1111, "frinti", fnearbyint>;		defm FRINTI : SingleOperandFPData<0b1111, "frinti", any_fnearbyint>;
defm FRINTM : SingleOperandFPData<0b1010, "frintm", ffloor>;		defm FRINTM : SingleOperandFPData<0b1010, "frintm", any_ffloor>;
defm FRINTN : SingleOperandFPData<0b1000, "frintn", froundeven>;		defm FRINTN : SingleOperandFPData<0b1000, "frintn", any_froundeven>;
defm FRINTP : SingleOperandFPData<0b1001, "frintp", fceil>;		defm FRINTP : SingleOperandFPData<0b1001, "frintp", any_fceil>;

defm FRINTX : SingleOperandFPData<0b1110, "frintx", frint>;		defm FRINTX : SingleOperandFPData<0b1110, "frintx", any_frint>;
defm FRINTZ : SingleOperandFPData<0b1011, "frintz", ftrunc>;		defm FRINTZ : SingleOperandFPData<0b1011, "frintz", any_ftrunc>;

let SchedRW = [WriteFDiv] in {		let SchedRW = [WriteFDiv] in {
defm FSQRT : SingleOperandFPData<0b0011, "fsqrt", fsqrt>;		defm FSQRT : SingleOperandFPData<0b0011, "fsqrt", any_fsqrt>;
}		}

let Predicates = [HasFRInt3264] in {		let Predicates = [HasFRInt3264] in {
defm FRINT32Z : FRIntNNT<0b00, "frint32z", int_aarch64_frint32z>;		defm FRINT32Z : FRIntNNT<0b00, "frint32z", int_aarch64_frint32z>;
defm FRINT64Z : FRIntNNT<0b10, "frint64z", int_aarch64_frint64z>;		defm FRINT64Z : FRIntNNT<0b10, "frint64z", int_aarch64_frint64z>;
defm FRINT32X : FRIntNNT<0b01, "frint32x", int_aarch64_frint32x>;		defm FRINT32X : FRIntNNT<0b01, "frint32x", int_aarch64_frint32x>;
defm FRINT64X : FRIntNNT<0b11, "frint64x", int_aarch64_frint64x>;		defm FRINT64X : FRIntNNT<0b11, "frint64x", int_aarch64_frint64x>;
} // HasFRInt3264		} // HasFRInt3264

		// Emitting strict_lrint as two instructions is valid as any exceptions that
		// occur will happen in exactly one of the instructions (e.g. if the input is
		// not an integer the inexact exception will happen in the FRINTX but not then
		// in the FCVTZS as the output of FRINTX is an integer).
let Predicates = [HasFullFP16] in {		let Predicates = [HasFullFP16] in {
def : Pat<(i32 (lrint f16:$Rn)),		def : Pat<(i32 (any_lrint f16:$Rn)),
(FCVTZSUWHr (!cast<Instruction>(FRINTXHr) f16:$Rn))>;		(FCVTZSUWHr (!cast<Instruction>(FRINTXHr) f16:$Rn))>;
def : Pat<(i64 (lrint f16:$Rn)),		def : Pat<(i64 (any_lrint f16:$Rn)),
(FCVTZSUXHr (!cast<Instruction>(FRINTXHr) f16:$Rn))>;		(FCVTZSUXHr (!cast<Instruction>(FRINTXHr) f16:$Rn))>;
def : Pat<(i64 (llrint f16:$Rn)),		def : Pat<(i64 (any_llrint f16:$Rn)),
(FCVTZSUXHr (!cast<Instruction>(FRINTXHr) f16:$Rn))>;		(FCVTZSUXHr (!cast<Instruction>(FRINTXHr) f16:$Rn))>;
}		}
def : Pat<(i32 (lrint f32:$Rn)),		def : Pat<(i32 (any_lrint f32:$Rn)),
(FCVTZSUWSr (!cast<Instruction>(FRINTXSr) f32:$Rn))>;		(FCVTZSUWSr (!cast<Instruction>(FRINTXSr) f32:$Rn))>;
def : Pat<(i32 (lrint f64:$Rn)),		def : Pat<(i32 (any_lrint f64:$Rn)),
(FCVTZSUWDr (!cast<Instruction>(FRINTXDr) f64:$Rn))>;		(FCVTZSUWDr (!cast<Instruction>(FRINTXDr) f64:$Rn))>;
def : Pat<(i64 (lrint f32:$Rn)),		def : Pat<(i64 (any_lrint f32:$Rn)),
(FCVTZSUXSr (!cast<Instruction>(FRINTXSr) f32:$Rn))>;		(FCVTZSUXSr (!cast<Instruction>(FRINTXSr) f32:$Rn))>;
def : Pat<(i64 (lrint f64:$Rn)),		def : Pat<(i64 (any_lrint f64:$Rn)),
(FCVTZSUXDr (!cast<Instruction>(FRINTXDr) f64:$Rn))>;		(FCVTZSUXDr (!cast<Instruction>(FRINTXDr) f64:$Rn))>;
def : Pat<(i64 (llrint f32:$Rn)),		def : Pat<(i64 (any_llrint f32:$Rn)),
(FCVTZSUXSr (!cast<Instruction>(FRINTXSr) f32:$Rn))>;		(FCVTZSUXSr (!cast<Instruction>(FRINTXSr) f32:$Rn))>;
def : Pat<(i64 (llrint f64:$Rn)),		def : Pat<(i64 (any_llrint f64:$Rn)),
(FCVTZSUXDr (!cast<Instruction>(FRINTXDr) f64:$Rn))>;		(FCVTZSUXDr (!cast<Instruction>(FRINTXDr) f64:$Rn))>;
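
The argument in the comment above these patterns (a round-to-integral step followed by a convert step raises each exception in at most one of the two steps) can be observed from portable C++ with `<cfenv>`. This is a rough sketch, not AArch64 code: `std::rint` stands in for FRINTX, the cast stands in for FCVTZS, and `TwoStepResult` and `roundThenConvert` are hypothetical names. It assumes the default round-to-nearest mode and a toolchain that keeps floating-point status flags observable; strictly speaking that requires `#pragma STDC FENV_ACCESS ON`, which not every compiler honours, so the `volatile` variables are only a best-effort guard against folding and reordering.

```cpp
#include <cfenv>
#include <cmath>

struct TwoStepResult {
  long Value;                // final integer result
  bool RoundRaisedInexact;   // FE_INEXACT seen during the rounding step
  bool ConvertRaisedInexact; // FE_INEXACT seen during the conversion step
};

TwoStepResult roundThenConvert(double Input) {
  volatile double In = Input; // volatile to discourage constant folding
  volatile double Rounded;
  volatile long Converted;

  // Step 1: round to an integral value in the current rounding mode.
  std::feclearexcept(FE_ALL_EXCEPT);
  Rounded = std::rint(In);
  bool RoundInexact = std::fetestexcept(FE_INEXACT) != 0;

  // Step 2: convert the already-integral value to an integer type.
  std::feclearexcept(FE_ALL_EXCEPT);
  Converted = static_cast<long>(Rounded);
  bool ConvertInexact = std::fetestexcept(FE_INEXACT) != 0;

  return {Converted, RoundInexact, ConvertInexact};
}
```

Calling `roundThenConvert(2.5)` is expected to report the inexact flag from the rounding step only, because the value handed to the conversion is already an integer; an input such as `3.0` should raise it in neither step.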

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Floating point two operand instructions.		// Floating point two operand instructions.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

defm FADD : TwoOperandFPData<0b0010, "fadd", fadd>;		defm FADD : TwoOperandFPData<0b0010, "fadd", any_fadd>;
let SchedRW = [WriteFDiv] in {		let SchedRW = [WriteFDiv] in {
defm FDIV : TwoOperandFPData<0b0001, "fdiv", fdiv>;		defm FDIV : TwoOperandFPData<0b0001, "fdiv", any_fdiv>;
}		}
defm FMAXNM : TwoOperandFPData<0b0110, "fmaxnm", fmaxnum>;		defm FMAXNM : TwoOperandFPData<0b0110, "fmaxnm", any_fmaxnum>;
defm FMAX : TwoOperandFPData<0b0100, "fmax", fmaximum>;		defm FMAX : TwoOperandFPData<0b0100, "fmax", any_fmaximum>;
defm FMINNM : TwoOperandFPData<0b0111, "fminnm", fminnum>;		defm FMINNM : TwoOperandFPData<0b0111, "fminnm", any_fminnum>;
defm FMIN : TwoOperandFPData<0b0101, "fmin", fminimum>;		defm FMIN : TwoOperandFPData<0b0101, "fmin", any_fminimum>;
let SchedRW = [WriteFMul] in {		let SchedRW = [WriteFMul] in {
defm FMUL : TwoOperandFPData<0b0000, "fmul", fmul>;		defm FMUL : TwoOperandFPData<0b0000, "fmul", any_fmul>;
defm FNMUL : TwoOperandFPDataNeg<0b1000, "fnmul", fmul>;		defm FNMUL : TwoOperandFPDataNeg<0b1000, "fnmul", any_fmul>;
}		}
defm FSUB : TwoOperandFPData<0b0011, "fsub", fsub>;		defm FSUB : TwoOperandFPData<0b0011, "fsub", any_fsub>;

def : Pat<(v1f64 (fmaximum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),		def : Pat<(v1f64 (fmaximum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),
(FMAXDrr FPR64:$Rn, FPR64:$Rm)>;		(FMAXDrr FPR64:$Rn, FPR64:$Rm)>;
def : Pat<(v1f64 (fminimum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),		def : Pat<(v1f64 (fminimum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),
(FMINDrr FPR64:$Rn, FPR64:$Rm)>;		(FMINDrr FPR64:$Rn, FPR64:$Rm)>;
def : Pat<(v1f64 (fmaxnum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),		def : Pat<(v1f64 (fmaxnum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),
(FMAXNMDrr FPR64:$Rn, FPR64:$Rm)>;		(FMAXNMDrr FPR64:$Rn, FPR64:$Rm)>;
def : Pat<(v1f64 (fminnum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),		def : Pat<(v1f64 (fminnum (v1f64 FPR64:$Rn), (v1f64 FPR64:$Rm))),
(FMINNMDrr FPR64:$Rn, FPR64:$Rm)>;		(FMINNMDrr FPR64:$Rn, FPR64:$Rm)>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Floating point three operand instructions.		// Floating point three operand instructions.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

defm FMADD : ThreeOperandFPData<0, 0, "fmadd", fma>;		defm FMADD : ThreeOperandFPData<0, 0, "fmadd", any_fma>;
defm FMSUB : ThreeOperandFPData<0, 1, "fmsub",		defm FMSUB : ThreeOperandFPData<0, 1, "fmsub",
TriOpFrag<(fma node:$LHS, (fneg node:$MHS), node:$RHS)> >;		TriOpFrag<(any_fma node:$LHS, (fneg node:$MHS), node:$RHS)> >;
defm FNMADD : ThreeOperandFPData<1, 0, "fnmadd",		defm FNMADD : ThreeOperandFPData<1, 0, "fnmadd",
dmgreen (inline comment): Is there a test for FNMADD (and FNMSUB)? I don't see it in D118487 either.
TriOpFrag<(fneg (fma node:$LHS, node:$MHS, node:$RHS))> >;		TriOpFrag<(fneg (any_fma node:$LHS, node:$MHS, node:$RHS))> >;
defm FNMSUB : ThreeOperandFPData<1, 1, "fnmsub",		defm FNMSUB : ThreeOperandFPData<1, 1, "fnmsub",
TriOpFrag<(fma node:$LHS, node:$MHS, (fneg node:$RHS))> >;		TriOpFrag<(any_fma node:$LHS, node:$MHS, (fneg node:$RHS))> >;

// The following def pats catch the case where the LHS of an FMA is negated.		// The following def pats catch the case where the LHS of an FMA is negated.
// The TriOpFrag above catches the case where the middle operand is negated.		// The TriOpFrag above catches the case where the middle operand is negated.

// N.b. FMSUB etc have the accumulator at the end of (outs), unlike		// N.b. FMSUB etc have the accumulator at the end of (outs), unlike
// the NEON variant.		// the NEON variant.

// Here we handle first -(a + b*c) for FNMADD:		// Here we handle first -(a + b*c) for FNMADD:
[172 lines not shown]
defm FCVTAS : SIMDTwoVectorFPToInt<0,0,0b11100, "fcvtas",int_aarch64_neon_fcvtas>;		defm FCVTAS : SIMDTwoVectorFPToInt<0,0,0b11100, "fcvtas",int_aarch64_neon_fcvtas>;
defm FCVTAU : SIMDTwoVectorFPToInt<1,0,0b11100, "fcvtau",int_aarch64_neon_fcvtau>;		defm FCVTAU : SIMDTwoVectorFPToInt<1,0,0b11100, "fcvtau",int_aarch64_neon_fcvtau>;
defm FCVTL : SIMDFPWidenTwoVector<0, 0, 0b10111, "fcvtl">;		defm FCVTL : SIMDFPWidenTwoVector<0, 0, 0b10111, "fcvtl">;
def : Pat<(v4f32 (int_aarch64_neon_vcvthf2fp (v4i16 V64:$Rn))),		def : Pat<(v4f32 (int_aarch64_neon_vcvthf2fp (v4i16 V64:$Rn))),
(FCVTLv4i16 V64:$Rn)>;		(FCVTLv4i16 V64:$Rn)>;
def : Pat<(v4f32 (int_aarch64_neon_vcvthf2fp (extract_subvector (v8i16 V128:$Rn),		def : Pat<(v4f32 (int_aarch64_neon_vcvthf2fp (extract_subvector (v8i16 V128:$Rn),
(i64 4)))),		(i64 4)))),
(FCVTLv8i16 V128:$Rn)>;		(FCVTLv8i16 V128:$Rn)>;
def : Pat<(v2f64 (fpextend (v2f32 V64:$Rn))), (FCVTLv2i32 V64:$Rn)>;		def : Pat<(v2f64 (any_fpextend (v2f32 V64:$Rn))), (FCVTLv2i32 V64:$Rn)>;
dmgreen (inline comment): None of the tests below seem to be Neon. Are they tested somewhere else?
john.brawn (author, inline reply): D117795 adds neon tests.

def : Pat<(v4f32 (fpextend (v4f16 V64:$Rn))), (FCVTLv4i16 V64:$Rn)>;		def : Pat<(v4f32 (any_fpextend (v4f16 V64:$Rn))), (FCVTLv4i16 V64:$Rn)>;

defm FCVTMS : SIMDTwoVectorFPToInt<0,0,0b11011, "fcvtms",int_aarch64_neon_fcvtms>;		defm FCVTMS : SIMDTwoVectorFPToInt<0,0,0b11011, "fcvtms",int_aarch64_neon_fcvtms>;
defm FCVTMU : SIMDTwoVectorFPToInt<1,0,0b11011, "fcvtmu",int_aarch64_neon_fcvtmu>;		defm FCVTMU : SIMDTwoVectorFPToInt<1,0,0b11011, "fcvtmu",int_aarch64_neon_fcvtmu>;
defm FCVTNS : SIMDTwoVectorFPToInt<0,0,0b11010, "fcvtns",int_aarch64_neon_fcvtns>;		defm FCVTNS : SIMDTwoVectorFPToInt<0,0,0b11010, "fcvtns",int_aarch64_neon_fcvtns>;
defm FCVTNU : SIMDTwoVectorFPToInt<1,0,0b11010, "fcvtnu",int_aarch64_neon_fcvtnu>;		defm FCVTNU : SIMDTwoVectorFPToInt<1,0,0b11010, "fcvtnu",int_aarch64_neon_fcvtnu>;
defm FCVTN : SIMDFPNarrowTwoVector<0, 0, 0b10110, "fcvtn">;		defm FCVTN : SIMDFPNarrowTwoVector<0, 0, 0b10110, "fcvtn">;
def : Pat<(v4i16 (int_aarch64_neon_vcvtfp2hf (v4f32 V128:$Rn))),		def : Pat<(v4i16 (int_aarch64_neon_vcvtfp2hf (v4f32 V128:$Rn))),
(FCVTNv4i16 V128:$Rn)>;		(FCVTNv4i16 V128:$Rn)>;
def : Pat<(concat_vectors V64:$Rd,		def : Pat<(concat_vectors V64:$Rd,
(v4i16 (int_aarch64_neon_vcvtfp2hf (v4f32 V128:$Rn)))),		(v4i16 (int_aarch64_neon_vcvtfp2hf (v4f32 V128:$Rn)))),
(FCVTNv8i16 (INSERT_SUBREG (IMPLICIT_DEF), V64:$Rd, dsub), V128:$Rn)>;		(FCVTNv8i16 (INSERT_SUBREG (IMPLICIT_DEF), V64:$Rd, dsub), V128:$Rn)>;
def : Pat<(v2f32 (fpround (v2f64 V128:$Rn))), (FCVTNv2i32 V128:$Rn)>;		def : Pat<(v2f32 (any_fpround (v2f64 V128:$Rn))), (FCVTNv2i32 V128:$Rn)>;
def : Pat<(v4f16 (fpround (v4f32 V128:$Rn))), (FCVTNv4i16 V128:$Rn)>;		def : Pat<(v4f16 (any_fpround (v4f32 V128:$Rn))), (FCVTNv4i16 V128:$Rn)>;
def : Pat<(concat_vectors V64:$Rd, (v2f32 (fpround (v2f64 V128:$Rn)))),		def : Pat<(concat_vectors V64:$Rd, (v2f32 (any_fpround (v2f64 V128:$Rn)))),
(FCVTNv4i32 (INSERT_SUBREG (IMPLICIT_DEF), V64:$Rd, dsub), V128:$Rn)>;		(FCVTNv4i32 (INSERT_SUBREG (IMPLICIT_DEF), V64:$Rd, dsub), V128:$Rn)>;
defm FCVTPS : SIMDTwoVectorFPToInt<0,1,0b11010, "fcvtps",int_aarch64_neon_fcvtps>;		defm FCVTPS : SIMDTwoVectorFPToInt<0,1,0b11010, "fcvtps",int_aarch64_neon_fcvtps>;
defm FCVTPU : SIMDTwoVectorFPToInt<1,1,0b11010, "fcvtpu",int_aarch64_neon_fcvtpu>;		defm FCVTPU : SIMDTwoVectorFPToInt<1,1,0b11010, "fcvtpu",int_aarch64_neon_fcvtpu>;
defm FCVTXN : SIMDFPInexactCvtTwoVector<1, 0, 0b10110, "fcvtxn",		defm FCVTXN : SIMDFPInexactCvtTwoVector<1, 0, 0b10110, "fcvtxn",
int_aarch64_neon_fcvtxn>;		int_aarch64_neon_fcvtxn>;
defm FCVTZS : SIMDTwoVectorFPToInt<0, 1, 0b11011, "fcvtzs", fp_to_sint>;		defm FCVTZS : SIMDTwoVectorFPToInt<0, 1, 0b11011, "fcvtzs", any_fp_to_sint>;
defm FCVTZU : SIMDTwoVectorFPToInt<1, 1, 0b11011, "fcvtzu", fp_to_uint>;		defm FCVTZU : SIMDTwoVectorFPToInt<1, 1, 0b11011, "fcvtzu", any_fp_to_uint>;

// AArch64's FCVT instructions saturate when out of range.		// AArch64's FCVT instructions saturate when out of range.
multiclass SIMDTwoVectorFPToIntSatPats<SDNode to_int_sat, string INST> {		multiclass SIMDTwoVectorFPToIntSatPats<SDNode to_int_sat, string INST> {
def : Pat<(v4i16 (to_int_sat v4f16:$Rn, i16)),		def : Pat<(v4i16 (to_int_sat v4f16:$Rn, i16)),
(!cast<Instruction>(INST # v4f16) v4f16:$Rn)>;		(!cast<Instruction>(INST # v4f16) v4f16:$Rn)>;
def : Pat<(v8i16 (to_int_sat v8f16:$Rn, i16)),		def : Pat<(v8i16 (to_int_sat v8f16:$Rn, i16)),
(!cast<Instruction>(INST # v8f16) v8f16:$Rn)>;		(!cast<Instruction>(INST # v8f16) v8f16:$Rn)>;
def : Pat<(v2i32 (to_int_sat v2f32:$Rn, i32)),		def : Pat<(v2i32 (to_int_sat v2f32:$Rn, i32)),
[15 lines not shown]
def : Pat<(v4i16 (int_aarch64_neon_fcvtzu v4f16:$Rn)), (FCVTZUv4f16 $Rn)>;		def : Pat<(v4i16 (int_aarch64_neon_fcvtzu v4f16:$Rn)), (FCVTZUv4f16 $Rn)>;
def : Pat<(v8i16 (int_aarch64_neon_fcvtzu v8f16:$Rn)), (FCVTZUv8f16 $Rn)>;		def : Pat<(v8i16 (int_aarch64_neon_fcvtzu v8f16:$Rn)), (FCVTZUv8f16 $Rn)>;
def : Pat<(v2i32 (int_aarch64_neon_fcvtzu v2f32:$Rn)), (FCVTZUv2f32 $Rn)>;		def : Pat<(v2i32 (int_aarch64_neon_fcvtzu v2f32:$Rn)), (FCVTZUv2f32 $Rn)>;
def : Pat<(v4i32 (int_aarch64_neon_fcvtzu v4f32:$Rn)), (FCVTZUv4f32 $Rn)>;		def : Pat<(v4i32 (int_aarch64_neon_fcvtzu v4f32:$Rn)), (FCVTZUv4f32 $Rn)>;
def : Pat<(v2i64 (int_aarch64_neon_fcvtzu v2f64:$Rn)), (FCVTZUv2f64 $Rn)>;		def : Pat<(v2i64 (int_aarch64_neon_fcvtzu v2f64:$Rn)), (FCVTZUv2f64 $Rn)>;

defm FNEG : SIMDTwoVectorFP<1, 1, 0b01111, "fneg", fneg>;		defm FNEG : SIMDTwoVectorFP<1, 1, 0b01111, "fneg", fneg>;
defm FRECPE : SIMDTwoVectorFP<0, 1, 0b11101, "frecpe", int_aarch64_neon_frecpe>;		defm FRECPE : SIMDTwoVectorFP<0, 1, 0b11101, "frecpe", int_aarch64_neon_frecpe>;
defm FRINTA : SIMDTwoVectorFP<1, 0, 0b11000, "frinta", fround>;		defm FRINTA : SIMDTwoVectorFP<1, 0, 0b11000, "frinta", any_fround>;
defm FRINTI : SIMDTwoVectorFP<1, 1, 0b11001, "frinti", fnearbyint>;		defm FRINTI : SIMDTwoVectorFP<1, 1, 0b11001, "frinti", any_fnearbyint>;
defm FRINTM : SIMDTwoVectorFP<0, 0, 0b11001, "frintm", ffloor>;		defm FRINTM : SIMDTwoVectorFP<0, 0, 0b11001, "frintm", any_ffloor>;
defm FRINTN : SIMDTwoVectorFP<0, 0, 0b11000, "frintn", froundeven>;		defm FRINTN : SIMDTwoVectorFP<0, 0, 0b11000, "frintn", any_froundeven>;
defm FRINTP : SIMDTwoVectorFP<0, 1, 0b11000, "frintp", fceil>;		defm FRINTP : SIMDTwoVectorFP<0, 1, 0b11000, "frintp", any_fceil>;
defm FRINTX : SIMDTwoVectorFP<1, 0, 0b11001, "frintx", frint>;		defm FRINTX : SIMDTwoVectorFP<1, 0, 0b11001, "frintx", any_frint>;
defm FRINTZ : SIMDTwoVectorFP<0, 1, 0b11001, "frintz", ftrunc>;		defm FRINTZ : SIMDTwoVectorFP<0, 1, 0b11001, "frintz", any_ftrunc>;

let Predicates = [HasFRInt3264] in {		let Predicates = [HasFRInt3264] in {
defm FRINT32Z : FRIntNNTVector<0, 0, "frint32z", int_aarch64_neon_frint32z>;		defm FRINT32Z : FRIntNNTVector<0, 0, "frint32z", int_aarch64_neon_frint32z>;
defm FRINT64Z : FRIntNNTVector<0, 1, "frint64z", int_aarch64_neon_frint64z>;		defm FRINT64Z : FRIntNNTVector<0, 1, "frint64z", int_aarch64_neon_frint64z>;
defm FRINT32X : FRIntNNTVector<1, 0, "frint32x", int_aarch64_neon_frint32x>;		defm FRINT32X : FRIntNNTVector<1, 0, "frint32x", int_aarch64_neon_frint32x>;
defm FRINT64X : FRIntNNTVector<1, 1, "frint64x", int_aarch64_neon_frint64x>;		defm FRINT64X : FRIntNNTVector<1, 1, "frint64x", int_aarch64_neon_frint64x>;
} // HasFRInt3264		} // HasFRInt3264

defm FRSQRTE: SIMDTwoVectorFP<1, 1, 0b11101, "frsqrte", int_aarch64_neon_frsqrte>;		defm FRSQRTE: SIMDTwoVectorFP<1, 1, 0b11101, "frsqrte", int_aarch64_neon_frsqrte>;
defm FSQRT : SIMDTwoVectorFP<1, 1, 0b11111, "fsqrt", fsqrt>;		defm FSQRT : SIMDTwoVectorFP<1, 1, 0b11111, "fsqrt", any_fsqrt>;
defm NEG : SIMDTwoVectorBHSD<1, 0b01011, "neg",		defm NEG : SIMDTwoVectorBHSD<1, 0b01011, "neg",
UnOpFrag<(sub immAllZerosV, node:$LHS)> >;		UnOpFrag<(sub immAllZerosV, node:$LHS)> >;
defm NOT : SIMDTwoVectorB<1, 0b00, 0b00101, "not", vnot>;		defm NOT : SIMDTwoVectorB<1, 0b00, 0b00101, "not", vnot>;
// Aliases for MVN -> NOT.		// Aliases for MVN -> NOT.
def : InstAlias<"mvn{ $Vd.8b, $Vn.8b|.8b $Vd, $Vn}",		def : InstAlias<"mvn{ $Vd.8b, $Vn.8b|.8b $Vd, $Vn}",
(NOTv8i8 V64:$Vd, V64:$Vn)>;		(NOTv8i8 V64:$Vd, V64:$Vn)>;
def : InstAlias<"mvn{ $Vd.16b, $Vn.16b|.16b $Vd, $Vn}",		def : InstAlias<"mvn{ $Vd.16b, $Vn.16b|.16b $Vd, $Vn}",
(NOTv16i8 V128:$Vd, V128:$Vn)>;		(NOTv16i8 V128:$Vd, V128:$Vn)>;

def : Pat<(vnot (v4i16 V64:$Rn)), (NOTv8i8 V64:$Rn)>;		def : Pat<(vnot (v4i16 V64:$Rn)), (NOTv8i8 V64:$Rn)>;
def : Pat<(vnot (v8i16 V128:$Rn)), (NOTv16i8 V128:$Rn)>;		def : Pat<(vnot (v8i16 V128:$Rn)), (NOTv16i8 V128:$Rn)>;
def : Pat<(vnot (v2i32 V64:$Rn)), (NOTv8i8 V64:$Rn)>;		def : Pat<(vnot (v2i32 V64:$Rn)), (NOTv8i8 V64:$Rn)>;
def : Pat<(vnot (v4i32 V128:$Rn)), (NOTv16i8 V128:$Rn)>;		def : Pat<(vnot (v4i32 V128:$Rn)), (NOTv16i8 V128:$Rn)>;
def : Pat<(vnot (v1i64 V64:$Rn)), (NOTv8i8 V64:$Rn)>;		def : Pat<(vnot (v1i64 V64:$Rn)), (NOTv8i8 V64:$Rn)>;
def : Pat<(vnot (v2i64 V128:$Rn)), (NOTv16i8 V128:$Rn)>;		def : Pat<(vnot (v2i64 V128:$Rn)), (NOTv16i8 V128:$Rn)>;

defm RBIT : SIMDTwoVectorB<1, 0b01, 0b00101, "rbit", bitreverse>;		defm RBIT : SIMDTwoVectorB<1, 0b01, 0b00101, "rbit", bitreverse>;
defm REV16 : SIMDTwoVectorB<0, 0b00, 0b00001, "rev16", AArch64rev16>;		defm REV16 : SIMDTwoVectorB<0, 0b00, 0b00001, "rev16", AArch64rev16>;
defm REV32 : SIMDTwoVectorBH<1, 0b00000, "rev32", AArch64rev32>;		defm REV32 : SIMDTwoVectorBH<1, 0b00000, "rev32", AArch64rev32>;
defm REV64 : SIMDTwoVectorBHS<0, 0b00000, "rev64", AArch64rev64>;		defm REV64 : SIMDTwoVectorBHS<0, 0b00000, "rev64", AArch64rev64>;
defm SADALP : SIMDLongTwoVectorTied<0, 0b00110, "sadalp",		defm SADALP : SIMDLongTwoVectorTied<0, 0b00110, "sadalp",
BinOpFrag<(add node:$LHS, (AArch64saddlp node:$RHS))> >;		BinOpFrag<(add node:$LHS, (AArch64saddlp node:$RHS))> >;
defm SADDLP : SIMDLongTwoVector<0, 0b00010, "saddlp", AArch64saddlp>;		defm SADDLP : SIMDLongTwoVector<0, 0b00010, "saddlp", AArch64saddlp>;
defm SCVTF : SIMDTwoVectorIntToFP<0, 0, 0b11101, "scvtf", sint_to_fp>;		defm SCVTF : SIMDTwoVectorIntToFP<0, 0, 0b11101, "scvtf", any_sint_to_fp>;
defm SHLL : SIMDVectorLShiftLongBySizeBHS;		defm SHLL : SIMDVectorLShiftLongBySizeBHS;
defm SQABS : SIMDTwoVectorBHSD<0, 0b00111, "sqabs", int_aarch64_neon_sqabs>;		defm SQABS : SIMDTwoVectorBHSD<0, 0b00111, "sqabs", int_aarch64_neon_sqabs>;
defm SQNEG : SIMDTwoVectorBHSD<1, 0b00111, "sqneg", int_aarch64_neon_sqneg>;		defm SQNEG : SIMDTwoVectorBHSD<1, 0b00111, "sqneg", int_aarch64_neon_sqneg>;
defm SQXTN : SIMDMixedTwoVector<0, 0b10100, "sqxtn", int_aarch64_neon_sqxtn>;		defm SQXTN : SIMDMixedTwoVector<0, 0b10100, "sqxtn", int_aarch64_neon_sqxtn>;
defm SQXTUN : SIMDMixedTwoVector<1, 0b10010, "sqxtun", int_aarch64_neon_sqxtun>;		defm SQXTUN : SIMDMixedTwoVector<1, 0b10010, "sqxtun", int_aarch64_neon_sqxtun>;
defm SUQADD : SIMDTwoVectorBHSDTied<0, 0b00011, "suqadd",int_aarch64_neon_suqadd>;		defm SUQADD : SIMDTwoVectorBHSDTied<0, 0b00011, "suqadd",int_aarch64_neon_suqadd>;
defm UADALP : SIMDLongTwoVectorTied<1, 0b00110, "uadalp",		defm UADALP : SIMDLongTwoVectorTied<1, 0b00110, "uadalp",
BinOpFrag<(add node:$LHS, (AArch64uaddlp node:$RHS))> >;		BinOpFrag<(add node:$LHS, (AArch64uaddlp node:$RHS))> >;
defm UADDLP : SIMDLongTwoVector<1, 0b00010, "uaddlp", AArch64uaddlp>;		defm UADDLP : SIMDLongTwoVector<1, 0b00010, "uaddlp", AArch64uaddlp>;
defm UCVTF : SIMDTwoVectorIntToFP<1, 0, 0b11101, "ucvtf", uint_to_fp>;		defm UCVTF : SIMDTwoVectorIntToFP<1, 0, 0b11101, "ucvtf", any_uint_to_fp>;
defm UQXTN : SIMDMixedTwoVector<1, 0b10100, "uqxtn", int_aarch64_neon_uqxtn>;		defm UQXTN : SIMDMixedTwoVector<1, 0b10100, "uqxtn", int_aarch64_neon_uqxtn>;
defm URECPE : SIMDTwoVectorS<0, 1, 0b11100, "urecpe", int_aarch64_neon_urecpe>;		defm URECPE : SIMDTwoVectorS<0, 1, 0b11100, "urecpe", int_aarch64_neon_urecpe>;
defm URSQRTE: SIMDTwoVectorS<1, 1, 0b11100, "ursqrte", int_aarch64_neon_ursqrte>;		defm URSQRTE: SIMDTwoVectorS<1, 1, 0b11100, "ursqrte", int_aarch64_neon_ursqrte>;
defm USQADD : SIMDTwoVectorBHSDTied<1, 0b00011, "usqadd",int_aarch64_neon_usqadd>;		defm USQADD : SIMDTwoVectorBHSDTied<1, 0b00011, "usqadd",int_aarch64_neon_usqadd>;
defm XTN : SIMDMixedTwoVector<0, 0b10010, "xtn", trunc>;		defm XTN : SIMDMixedTwoVector<0, 0b10010, "xtn", trunc>;

def : Pat<(v4f16 (AArch64rev32 V64:$Rn)), (REV32v4i16 V64:$Rn)>;		def : Pat<(v4f16 (AArch64rev32 V64:$Rn)), (REV32v4i16 V64:$Rn)>;
def : Pat<(v4f16 (AArch64rev64 V64:$Rn)), (REV64v4i16 V64:$Rn)>;		def : Pat<(v4f16 (AArch64rev64 V64:$Rn)), (REV64v4i16 V64:$Rn)>;
[... 107 lines not shown ...]
}		}
let Predicates = [HasNEON, HasFullFP16] in {		let Predicates = [HasNEON, HasFullFP16] in {
foreach VT = [ v4f16, v8f16 ] in		foreach VT = [ v4f16, v8f16 ] in
def : Pat<(fabs (fsub VT:$Rn, VT:$Rm)), (!cast<Instruction>("FABD"#VT) VT:$Rn, VT:$Rm)>;		def : Pat<(fabs (fsub VT:$Rn, VT:$Rm)), (!cast<Instruction>("FABD"#VT) VT:$Rn, VT:$Rm)>;
}		}
defm FACGE : SIMDThreeSameVectorFPCmp<1,0,0b101,"facge",int_aarch64_neon_facge>;		defm FACGE : SIMDThreeSameVectorFPCmp<1,0,0b101,"facge",int_aarch64_neon_facge>;
defm FACGT : SIMDThreeSameVectorFPCmp<1,1,0b101,"facgt",int_aarch64_neon_facgt>;		defm FACGT : SIMDThreeSameVectorFPCmp<1,1,0b101,"facgt",int_aarch64_neon_facgt>;
defm FADDP : SIMDThreeSameVectorFP<1,0,0b010,"faddp",int_aarch64_neon_faddp>;		defm FADDP : SIMDThreeSameVectorFP<1,0,0b010,"faddp",int_aarch64_neon_faddp>;
defm FADD : SIMDThreeSameVectorFP<0,0,0b010,"fadd", fadd>;		defm FADD : SIMDThreeSameVectorFP<0,0,0b010,"fadd", any_fadd>;
defm FCMEQ : SIMDThreeSameVectorFPCmp<0, 0, 0b100, "fcmeq", AArch64fcmeq>;		defm FCMEQ : SIMDThreeSameVectorFPCmp<0, 0, 0b100, "fcmeq", AArch64fcmeq>;
defm FCMGE : SIMDThreeSameVectorFPCmp<1, 0, 0b100, "fcmge", AArch64fcmge>;		defm FCMGE : SIMDThreeSameVectorFPCmp<1, 0, 0b100, "fcmge", AArch64fcmge>;
defm FCMGT : SIMDThreeSameVectorFPCmp<1, 1, 0b100, "fcmgt", AArch64fcmgt>;		defm FCMGT : SIMDThreeSameVectorFPCmp<1, 1, 0b100, "fcmgt", AArch64fcmgt>;
defm FDIV : SIMDThreeSameVectorFP<1,0,0b111,"fdiv", fdiv>;		defm FDIV : SIMDThreeSameVectorFP<1,0,0b111,"fdiv", any_fdiv>;
defm FMAXNMP : SIMDThreeSameVectorFP<1,0,0b000,"fmaxnmp", int_aarch64_neon_fmaxnmp>;		defm FMAXNMP : SIMDThreeSameVectorFP<1,0,0b000,"fmaxnmp", int_aarch64_neon_fmaxnmp>;
defm FMAXNM : SIMDThreeSameVectorFP<0,0,0b000,"fmaxnm", fmaxnum>;		defm FMAXNM : SIMDThreeSameVectorFP<0,0,0b000,"fmaxnm", any_fmaxnum>;
defm FMAXP : SIMDThreeSameVectorFP<1,0,0b110,"fmaxp", int_aarch64_neon_fmaxp>;		defm FMAXP : SIMDThreeSameVectorFP<1,0,0b110,"fmaxp", int_aarch64_neon_fmaxp>;
defm FMAX : SIMDThreeSameVectorFP<0,0,0b110,"fmax", fmaximum>;		defm FMAX : SIMDThreeSameVectorFP<0,0,0b110,"fmax", any_fmaximum>;
defm FMINNMP : SIMDThreeSameVectorFP<1,1,0b000,"fminnmp", int_aarch64_neon_fminnmp>;		defm FMINNMP : SIMDThreeSameVectorFP<1,1,0b000,"fminnmp", int_aarch64_neon_fminnmp>;
defm FMINNM : SIMDThreeSameVectorFP<0,1,0b000,"fminnm", fminnum>;		defm FMINNM : SIMDThreeSameVectorFP<0,1,0b000,"fminnm", any_fminnum>;
defm FMINP : SIMDThreeSameVectorFP<1,1,0b110,"fminp", int_aarch64_neon_fminp>;		defm FMINP : SIMDThreeSameVectorFP<1,1,0b110,"fminp", int_aarch64_neon_fminp>;
defm FMIN : SIMDThreeSameVectorFP<0,1,0b110,"fmin", fminimum>;		defm FMIN : SIMDThreeSameVectorFP<0,1,0b110,"fmin", any_fminimum>;

// NOTE: The operands of the PatFrag are reordered on FMLA/FMLS because the		// NOTE: The operands of the PatFrag are reordered on FMLA/FMLS because the
// instruction expects the addend first, while the fma intrinsic puts it last.		// instruction expects the addend first, while the fma intrinsic puts it last.
defm FMLA : SIMDThreeSameVectorFPTied<0, 0, 0b001, "fmla",		defm FMLA : SIMDThreeSameVectorFPTied<0, 0, 0b001, "fmla",
TriOpFrag<(fma node:$RHS, node:$MHS, node:$LHS)> >;		TriOpFrag<(any_fma node:$RHS, node:$MHS, node:$LHS)> >;
defm FMLS : SIMDThreeSameVectorFPTied<0, 1, 0b001, "fmls",		defm FMLS : SIMDThreeSameVectorFPTied<0, 1, 0b001, "fmls",
TriOpFrag<(fma node:$MHS, (fneg node:$RHS), node:$LHS)> >;		TriOpFrag<(any_fma node:$MHS, (fneg node:$RHS), node:$LHS)> >;

defm FMULX : SIMDThreeSameVectorFP<0,0,0b011,"fmulx", int_aarch64_neon_fmulx>;		defm FMULX : SIMDThreeSameVectorFP<0,0,0b011,"fmulx", int_aarch64_neon_fmulx>;
defm FMUL : SIMDThreeSameVectorFP<1,0,0b011,"fmul", fmul>;		defm FMUL : SIMDThreeSameVectorFP<1,0,0b011,"fmul", any_fmul>;
defm FRECPS : SIMDThreeSameVectorFP<0,0,0b111,"frecps", int_aarch64_neon_frecps>;		defm FRECPS : SIMDThreeSameVectorFP<0,0,0b111,"frecps", int_aarch64_neon_frecps>;
defm FRSQRTS : SIMDThreeSameVectorFP<0,1,0b111,"frsqrts", int_aarch64_neon_frsqrts>;		defm FRSQRTS : SIMDThreeSameVectorFP<0,1,0b111,"frsqrts", int_aarch64_neon_frsqrts>;
defm FSUB : SIMDThreeSameVectorFP<0,1,0b010,"fsub", fsub>;		defm FSUB : SIMDThreeSameVectorFP<0,1,0b010,"fsub", any_fsub>;

// MLA and MLS are generated in MachineCombine		// MLA and MLS are generated in MachineCombine
defm MLA : SIMDThreeSameVectorBHSTied<0, 0b10010, "mla", null_frag>;		defm MLA : SIMDThreeSameVectorBHSTied<0, 0b10010, "mla", null_frag>;
defm MLS : SIMDThreeSameVectorBHSTied<1, 0b10010, "mls", null_frag>;		defm MLS : SIMDThreeSameVectorBHSTied<1, 0b10010, "mls", null_frag>;

defm MUL : SIMDThreeSameVectorBHS<0, 0b10011, "mul", mul>;		defm MUL : SIMDThreeSameVectorBHS<0, 0b10011, "mul", mul>;
defm PMUL : SIMDThreeSameVectorB<1, 0b10011, "pmul", int_aarch64_neon_pmul>;		defm PMUL : SIMDThreeSameVectorB<1, 0b10011, "pmul", int_aarch64_neon_pmul>;
defm SABA : SIMDThreeSameVectorBHSTied<0, 0b01111, "saba",		defm SABA : SIMDThreeSameVectorBHSTied<0, 0b01111, "saba",
[... 490 lines not shown ...]
def : Pat<(f64 (AArch64frsqrts (f64 FPR64:$Rn), (f64 FPR64:$Rm))),		def : Pat<(f64 (AArch64frsqrts (f64 FPR64:$Rn), (f64 FPR64:$Rm))),
(FRSQRTS64 FPR64:$Rn, FPR64:$Rm)>;		(FRSQRTS64 FPR64:$Rn, FPR64:$Rm)>;
def : Pat<(v2f64 (AArch64frsqrts (v2f64 FPR128:$Rn), (v2f64 FPR128:$Rm))),		def : Pat<(v2f64 (AArch64frsqrts (v2f64 FPR128:$Rn), (v2f64 FPR128:$Rm))),
(FRSQRTSv2f64 FPR128:$Rn, FPR128:$Rm)>;		(FRSQRTSv2f64 FPR128:$Rn, FPR128:$Rm)>;

// Some float -> int -> float conversion patterns for which we want to keep the		// Some float -> int -> float conversion patterns for which we want to keep the
// int values in FP registers using the corresponding NEON instructions to		// int values in FP registers using the corresponding NEON instructions to
// avoid more costly int <-> fp register transfers.		// avoid more costly int <-> fp register transfers.
let Predicates = [HasNEON] in {		let Predicates = [HasNEON] in {
def : Pat<(f64 (sint_to_fp (i64 (fp_to_sint f64:$Rn)))),		def : Pat<(f64 (sint_to_fp (i64 (fp_to_sint f64:$Rn)))),
		dmgreen: Are optimizations like this desirable (or always valid?) for strict nodes? Same for the loads below. Do we have test cases for them?
		john.brawn: Anything where there's a one-to-one mapping between a strict selectiondag node and a floating-point instruction is fine (or more generally where we have an instruction sequence that can cause the same floating-point exceptions in the same order and respects rounding modes). These aren't tested, no. Actually it probably makes sense to move these kinds of patterns into separate patches and add tests for them there, as that can be done separately from the basic isel stuff.
(SCVTFv1i64 (i64 (FCVTZSv1i64 f64:$Rn)))>;		(SCVTFv1i64 (i64 (FCVTZSv1i64 f64:$Rn)))>;
def : Pat<(f32 (sint_to_fp (i32 (fp_to_sint f32:$Rn)))),		def : Pat<(f32 (sint_to_fp (i32 (fp_to_sint f32:$Rn)))),
(SCVTFv1i32 (i32 (FCVTZSv1i32 f32:$Rn)))>;		(SCVTFv1i32 (i32 (FCVTZSv1i32 f32:$Rn)))>;
def : Pat<(f64 (uint_to_fp (i64 (fp_to_uint f64:$Rn)))),		def : Pat<(f64 (uint_to_fp (i64 (fp_to_uint f64:$Rn)))),
(UCVTFv1i64 (i64 (FCVTZUv1i64 f64:$Rn)))>;		(UCVTFv1i64 (i64 (FCVTZUv1i64 f64:$Rn)))>;
def : Pat<(f32 (uint_to_fp (i32 (fp_to_uint f32:$Rn)))),		def : Pat<(f32 (uint_to_fp (i32 (fp_to_uint f32:$Rn)))),
(UCVTFv1i32 (i32 (FCVTZUv1i32 f32:$Rn)))>;		(UCVTFv1i32 (i32 (FCVTZUv1i32 f32:$Rn)))>;
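
A test for the float -> int -> float patterns discussed in the inline comments above might take roughly the following shape. This is only a sketch: the function name is hypothetical, the rounding-mode operand is an arbitrary choice, and no CHECK lines are given because the review does not state which instructions are selected for the strict nodes here.

```llvm
; Hypothetical test: a strict f64 -> i64 -> f64 round trip.
define double @fptosi_sitofp_f64(double %x) #0 {
  ; The constrained fptosi takes only the exception-behaviour operand.
  %i = call i64 @llvm.experimental.constrained.fptosi.i64.f64(double %x, metadata !"fpexcept.strict") #0
  ; The constrained sitofp takes rounding-mode and exception-behaviour operands.
  %f = call double @llvm.experimental.constrained.sitofp.f64.i64(i64 %i, metadata !"round.tonearest", metadata !"fpexcept.strict") #0
  ret double %f
}

declare i64 @llvm.experimental.constrained.fptosi.i64.f64(double, metadata)
declare double @llvm.experimental.constrained.sitofp.f64.i64(i64, metadata, metadata)

attributes #0 = { strictfp }
```

Both calls can raise floating-point exceptions, so whatever instruction sequence is selected would need to preserve their order, in line with the criterion john.brawn gives above.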

[... 1,285 lines not shown ...]
// On the other hand, there are quite a few valid combinatorial options due to		// On the other hand, there are quite a few valid combinatorial options due to
// the commutativity of multiplication and the fact that (-x) * y = x * (-y).		// the commutativity of multiplication and the fact that (-x) * y = x * (-y).
defm : SIMDFPIndexedTiedPatterns<"FMLA",		defm : SIMDFPIndexedTiedPatterns<"FMLA",
TriOpFrag<(fma node:$RHS, node:$MHS, node:$LHS)>>;		TriOpFrag<(fma node:$RHS, node:$MHS, node:$LHS)>>;
defm : SIMDFPIndexedTiedPatterns<"FMLA",		defm : SIMDFPIndexedTiedPatterns<"FMLA",
TriOpFrag<(fma node:$MHS, node:$RHS, node:$LHS)>>;		TriOpFrag<(fma node:$MHS, node:$RHS, node:$LHS)>>;

defm : SIMDFPIndexedTiedPatterns<"FMLS",		defm : SIMDFPIndexedTiedPatterns<"FMLS",
TriOpFrag<(fma node:$MHS, (fneg node:$RHS), node:$LHS)> >;		TriOpFrag<(fma node:$MHS, (fneg node:$RHS), node:$LHS)> >;
		dmgreen: Do these make a lot of sense, with a strict fma but a non-strict fneg?
		john.brawn: fneg is purely a bit-flipping operation that doesn't have strict/non-strict versions and combining it doesn't change exception/rounding behaviour.
defm : SIMDFPIndexedTiedPatterns<"FMLS",		defm : SIMDFPIndexedTiedPatterns<"FMLS",
TriOpFrag<(fma node:$RHS, (fneg node:$MHS), node:$LHS)> >;		TriOpFrag<(fma node:$RHS, (fneg node:$MHS), node:$LHS)> >;
defm : SIMDFPIndexedTiedPatterns<"FMLS",		defm : SIMDFPIndexedTiedPatterns<"FMLS",
TriOpFrag<(fma (fneg node:$RHS), node:$MHS, node:$LHS)> >;		TriOpFrag<(fma (fneg node:$RHS), node:$MHS, node:$LHS)> >;
defm : SIMDFPIndexedTiedPatterns<"FMLS",		defm : SIMDFPIndexedTiedPatterns<"FMLS",
TriOpFrag<(fma (fneg node:$MHS), node:$RHS, node:$LHS)> >;		TriOpFrag<(fma (fneg node:$MHS), node:$RHS, node:$LHS)> >;
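
To illustrate john.brawn's remark that fneg has no strict counterpart: at the IR level there is no constrained fneg intrinsic, so a strict multiply-subtract is written as an ordinary `fneg` feeding a constrained `fma`. The sketch below uses a scalar for brevity (the patterns above are the indexed vector forms), and the function name and rounding-mode operand are assumptions made for the example.

```llvm
; Hypothetical example: acc - a * b written as fma(a, -b, acc) under strictfp.
define float @fmls_like_f32(float %a, float %b, float %acc) #0 {
  ; fneg only flips the sign bit, so it is the same instruction in strict
  ; and non-strict code.
  %neg = fneg float %b
  %r = call float @llvm.experimental.constrained.fma.f32(float %a, float %neg, float %acc, metadata !"round.tonearest", metadata !"fpexcept.strict") #0
  ret float %r
}

declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)

attributes #0 = { strictfp }
```

Only the `fma` call carries rounding and exception semantics, which is why folding the negation into the selected instruction leaves the observable floating-point behaviour unchanged.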

multiclass FMLSIndexedAfterNegPatterns<SDPatternOperator OpNode> {		multiclass FMLSIndexedAfterNegPatterns<SDPatternOperator OpNode> {
[... 73 lines not shown ...]
}		}

defm : FMLSIndexedAfterNegPatterns<		defm : FMLSIndexedAfterNegPatterns<
TriOpFrag<(fma node:$RHS, node:$MHS, node:$LHS)> >;		TriOpFrag<(fma node:$RHS, node:$MHS, node:$LHS)> >;
defm : FMLSIndexedAfterNegPatterns<		defm : FMLSIndexedAfterNegPatterns<
TriOpFrag<(fma node:$MHS, node:$RHS, node:$LHS)> >;		TriOpFrag<(fma node:$MHS, node:$RHS, node:$LHS)> >;

defm FMULX : SIMDFPIndexed<1, 0b1001, "fmulx", int_aarch64_neon_fmulx>;		defm FMULX : SIMDFPIndexed<1, 0b1001, "fmulx", int_aarch64_neon_fmulx>;
defm FMUL : SIMDFPIndexed<0, 0b1001, "fmul", fmul>;		defm FMUL : SIMDFPIndexed<0, 0b1001, "fmul", any_fmul>;

def : Pat<(v2f32 (fmul V64:$Rn, (AArch64dup (f32 FPR32:$Rm)))),		def : Pat<(v2f32 (fmul V64:$Rn, (AArch64dup (f32 FPR32:$Rm)))),
(FMULv2i32_indexed V64:$Rn,		(FMULv2i32_indexed V64:$Rn,
(INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), FPR32:$Rm, ssub),		(INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), FPR32:$Rm, ssub),
(i64 0))>;		(i64 0))>;
def : Pat<(v4f32 (fmul V128:$Rn, (AArch64dup (f32 FPR32:$Rm)))),		def : Pat<(v4f32 (fmul V128:$Rn, (AArch64dup (f32 FPR32:$Rm)))),
(FMULv4i32_indexed V128:$Rn,		(FMULv4i32_indexed V128:$Rn,
(INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), FPR32:$Rm, ssub),		(INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), FPR32:$Rm, ssub),
[... 2,035 lines not shown ...]

llvm/test/CodeGen/AArch64/fp-intrinsics.ll

	; RUN: llc -mtriple=aarch64-none-eabi %s -o - | FileCheck %s			; RUN: llc -mtriple=aarch64-none-eabi %s -disable-strictnode-mutation -o - | FileCheck %s
				; RUN: llc -mtriple=aarch64-none-eabi -global-isel=true -global-isel-abort=2 -disable-strictnode-mutation %s -o - | FileCheck %s
				dmgreen: Using update_llc_test_checks would be good, I think.
				john.brawn: It doesn't do a good job of handling the slight differences between the non-global-isel and global-isel output. It generates check lines that work for one but fail with the other.
				dmgreen: That might be because the order of the check prefixes matters when it generates the checks, from most general to least general. This file is bound to get updated by someone at some point to run the scripts; it's the only way to keep files like this maintainable, so we might as well do it now. The file is very big though, and looks like it would be better as a few different test files.
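
One way to follow dmgreen's suggestion would be to give each RUN line a shared prefix followed by a more specific one, ordered from most general to least general. The sketch below is an assumption about how that could look: the prefix names `CHECK-SD` and `CHECK-GI` and the single test function are chosen for illustration and are not taken from the patch.

```llvm
; RUN: llc -mtriple=aarch64-none-eabi %s -disable-strictnode-mutation -o - | FileCheck %s --check-prefixes=CHECK,CHECK-SD
; RUN: llc -mtriple=aarch64-none-eabi -global-isel=true -global-isel-abort=2 -disable-strictnode-mutation %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-GI

; Hypothetical function for the check-generation script to annotate.
define float @add_f32(float %x, float %y) #0 {
  %val = call float @llvm.experimental.constrained.fadd.f32(float %x, float %y, metadata !"round.tonearest", metadata !"fpexcept.strict") #0
  ret float %val
}

declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)

attributes #0 = { strictfp }
```

With prefixes arranged like this, update_llc_test_checks.py would be expected to place output common to both runs under `CHECK` and any diverging output under the run-specific prefix, which should avoid check lines that pass for one selector but fail for the other.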

	; Check that constrained fp intrinsics are correctly lowered.			; Check that constrained fp intrinsics are correctly lowered.


	; Single-precision intrinsics			; Single-precision intrinsics

	; CHECK-LABEL: add_f32:			; CHECK-LABEL: add_f32:
	; CHECK: fadd s0, s0, s1			; CHECK: fadd s0, s0, s1
	[... 216 lines not shown ...]

	; CHECK-LABEL: minnum_f32:			; CHECK-LABEL: minnum_f32:
	; CHECK: fminnm s0, s0, s1			; CHECK: fminnm s0, s0, s1
	define float @minnum_f32(float %x, float %y) #0 {			define float @minnum_f32(float %x, float %y) #0 {
	%val = call float @llvm.experimental.constrained.minnum.f32(float %x, float %y, metadata !"fpexcept.strict") #0			%val = call float @llvm.experimental.constrained.minnum.f32(float %x, float %y, metadata !"fpexcept.strict") #0
	ret float %val			ret float %val
	}			}

				; CHECK-LABEL: maximum_f32:
				; CHECK: fmax s0, s0, s1
				define float @maximum_f32(float %x, float %y) #0 {
				%val = call float @llvm.experimental.constrained.maximum.f32(float %x, float %y, metadata !"fpexcept.strict") #0
				ret float %val
				}

				; CHECK-LABEL: minimum_f32:
				; CHECK: fmin s0, s0, s1
				define float @minimum_f32(float %x, float %y) #0 {
				%val = call float @llvm.experimental.constrained.minimum.f32(float %x, float %y, metadata !"fpexcept.strict") #0
				ret float %val
				}

	; CHECK-LABEL: ceil_f32:			; CHECK-LABEL: ceil_f32:
	; CHECK: frintp s0, s0			; CHECK: frintp s0, s0
	define float @ceil_f32(float %x) #0 {			define float @ceil_f32(float %x) #0 {
	%val = call float @llvm.experimental.constrained.ceil.f32(float %x, metadata !"fpexcept.strict") #0			%val = call float @llvm.experimental.constrained.ceil.f32(float %x, metadata !"fpexcept.strict") #0
	ret float %val			ret float %val
	}			}

	; CHECK-LABEL: floor_f32:			; CHECK-LABEL: floor_f32:
	[... 454 lines not shown ...]

	; CHECK-LABEL: minnum_f64:			; CHECK-LABEL: minnum_f64:
	; CHECK: fminnm d0, d0, d1			; CHECK: fminnm d0, d0, d1
	define double @minnum_f64(double %x, double %y) #0 {			define double @minnum_f64(double %x, double %y) #0 {
	%val = call double @llvm.experimental.constrained.minnum.f64(double %x, double %y, metadata !"fpexcept.strict") #0			%val = call double @llvm.experimental.constrained.minnum.f64(double %x, double %y, metadata !"fpexcept.strict") #0
	ret double %val			ret double %val
	}			}

				; CHECK-LABEL: maximum_f64:
				; CHECK: fmax d0, d0, d1
				define double @maximum_f64(double %x, double %y) #0 {
				%val = call double @llvm.experimental.constrained.maximum.f64(double %x, double %y, metadata !"fpexcept.strict") #0
				ret double %val
				}

				; CHECK-LABEL: minimum_f64:
				; CHECK: fmin d0, d0, d1
				define double @minimum_f64(double %x, double %y) #0 {
				%val = call double @llvm.experimental.constrained.minimum.f64(double %x, double %y, metadata !"fpexcept.strict") #0
				ret double %val
				}

	; CHECK-LABEL: ceil_f64:			; CHECK-LABEL: ceil_f64:
	; CHECK: frintp d0, d0			; CHECK: frintp d0, d0
	define double @ceil_f64(double %x) #0 {			define double @ceil_f64(double %x) #0 {
	%val = call double @llvm.experimental.constrained.ceil.f64(double %x, metadata !"fpexcept.strict") #0			%val = call double @llvm.experimental.constrained.ceil.f64(double %x, metadata !"fpexcept.strict") #0
	ret double %val			ret double %val
	}			}

	; CHECK-LABEL: floor_f64:			; CHECK-LABEL: floor_f64:
	[... 766 lines not shown ...]
	declare float @llvm.experimental.constrained.exp.f32(float, metadata, metadata)			declare float @llvm.experimental.constrained.exp.f32(float, metadata, metadata)
	declare float @llvm.experimental.constrained.exp2.f32(float, metadata, metadata)			declare float @llvm.experimental.constrained.exp2.f32(float, metadata, metadata)
	declare float @llvm.experimental.constrained.rint.f32(float, metadata, metadata)			declare float @llvm.experimental.constrained.rint.f32(float, metadata, metadata)
	declare float @llvm.experimental.constrained.nearbyint.f32(float, metadata, metadata)			declare float @llvm.experimental.constrained.nearbyint.f32(float, metadata, metadata)
	declare i32 @llvm.experimental.constrained.lrint.f32(float, metadata, metadata)			declare i32 @llvm.experimental.constrained.lrint.f32(float, metadata, metadata)
	declare i64 @llvm.experimental.constrained.llrint.f32(float, metadata, metadata)			declare i64 @llvm.experimental.constrained.llrint.f32(float, metadata, metadata)
	declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)			declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)
	declare float @llvm.experimental.constrained.minnum.f32(float, float, metadata)			declare float @llvm.experimental.constrained.minnum.f32(float, float, metadata)
				declare float @llvm.experimental.constrained.maximum.f32(float, float, metadata)
				declare float @llvm.experimental.constrained.minimum.f32(float, float, metadata)
	declare float @llvm.experimental.constrained.ceil.f32(float, metadata)			declare float @llvm.experimental.constrained.ceil.f32(float, metadata)
	declare float @llvm.experimental.constrained.floor.f32(float, metadata)			declare float @llvm.experimental.constrained.floor.f32(float, metadata)
	declare i32 @llvm.experimental.constrained.lround.f32(float, metadata)			declare i32 @llvm.experimental.constrained.lround.f32(float, metadata)
	declare i64 @llvm.experimental.constrained.llround.f32(float, metadata)			declare i64 @llvm.experimental.constrained.llround.f32(float, metadata)
	declare float @llvm.experimental.constrained.round.f32(float, metadata)			declare float @llvm.experimental.constrained.round.f32(float, metadata)
	declare float @llvm.experimental.constrained.roundeven.f32(float, metadata)			declare float @llvm.experimental.constrained.roundeven.f32(float, metadata)
	declare float @llvm.experimental.constrained.trunc.f32(float, metadata)			declare float @llvm.experimental.constrained.trunc.f32(float, metadata)
	declare i1 @llvm.experimental.constrained.fcmps.f32(float, float, metadata, metadata)			declare i1 @llvm.experimental.constrained.fcmps.f32(float, float, metadata, metadata)
	[... 26 lines not shown ...]
	declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
	declare i32 @llvm.experimental.constrained.lrint.f64(double, metadata, metadata)			declare i32 @llvm.experimental.constrained.lrint.f64(double, metadata, metadata)
	declare i64 @llvm.experimental.constrained.llrint.f64(double, metadata, metadata)			declare i64 @llvm.experimental.constrained.llrint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.maxnum.f64(double, double, metadata)			declare double @llvm.experimental.constrained.maxnum.f64(double, double, metadata)
	declare double @llvm.experimental.constrained.minnum.f64(double, double, metadata)			declare double @llvm.experimental.constrained.minnum.f64(double, double, metadata)
				declare double @llvm.experimental.constrained.maximum.f64(double, double, metadata)
				declare double @llvm.experimental.constrained.minimum.f64(double, double, metadata)
	declare double @llvm.experimental.constrained.ceil.f64(double, metadata)			declare double @llvm.experimental.constrained.ceil.f64(double, metadata)
	declare double @llvm.experimental.constrained.floor.f64(double, metadata)			declare double @llvm.experimental.constrained.floor.f64(double, metadata)
	declare i32 @llvm.experimental.constrained.lround.f64(double, metadata)			declare i32 @llvm.experimental.constrained.lround.f64(double, metadata)
	declare i64 @llvm.experimental.constrained.llround.f64(double, metadata)			declare i64 @llvm.experimental.constrained.llround.f64(double, metadata)
	declare double @llvm.experimental.constrained.round.f64(double, metadata)			declare double @llvm.experimental.constrained.round.f64(double, metadata)
	declare double @llvm.experimental.constrained.roundeven.f64(double, metadata)			declare double @llvm.experimental.constrained.roundeven.f64(double, metadata)
	declare double @llvm.experimental.constrained.trunc.f64(double, metadata)			declare double @llvm.experimental.constrained.trunc.f64(double, metadata)
	declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)			declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)
	[... 49 lines not shown ...]