This is an archive of the discontinued LLVM Phabricator instance.

Differential D135447

[AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU
ClosedPublic

Authored by JanekvO on Oct 7 2022, 7:24 AM.

Details

Reviewers

arsenm

Summary

Uses existing SelectionDAG lowering of the llvm.amdgcn.class intrinsic for llvm.is.fpclass
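For readers unfamiliar with the intrinsic, the following is a minimal software model of what an `llvm.is.fpclass`-style test computes for a 32-bit float. The bit assignments follow the test-mask layout documented for `llvm.is.fpclass` in the LLVM LangRef; the names `FPClassBit`, `classifyF32` and `isFPClass` are hypothetical and appear neither in the patch nor in LLVM. It is a sketch for illustration, not the lowering this revision implements.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Test-mask bits, following the layout documented for llvm.is.fpclass.
// The enumerator names are illustrative only.
enum FPClassBit : uint32_t {
  SNan = 1u << 0,
  QNan = 1u << 1,
  NegInf = 1u << 2,
  NegNormal = 1u << 3,
  NegSubnormal = 1u << 4,
  NegZero = 1u << 5,
  PosZero = 1u << 6,
  PosSubnormal = 1u << 7,
  PosNormal = 1u << 8,
  PosInf = 1u << 9,
};

// Hypothetical helper: return the single class bit of an IEEE binary32 value
// by inspecting its sign, exponent and mantissa fields.
uint32_t classifyF32(float x) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  const bool neg = (bits >> 31) != 0;
  const uint32_t exp = (bits >> 23) & 0xFF;
  const uint32_t mant = bits & 0x7FFFFF;
  if (exp == 0xFF) {
    if (mant != 0)
      return (mant & 0x400000) ? QNan : SNan; // quiet bit is the mantissa MSB
    return neg ? NegInf : PosInf;
  }
  if (exp == 0) {
    if (mant == 0)
      return neg ? NegZero : PosZero;
    return neg ? NegSubnormal : PosSubnormal;
  }
  return neg ? NegNormal : PosNormal;
}

// True if the value belongs to any of the classes selected by the mask.
bool isFPClass(float x, uint32_t mask) {
  return (classifyF32(x) & mask) != 0;
}
```

For example, `isFPClass(x, SNan | QNan)` models an "is NaN" query, and `isFPClass(x, PosInf | NegInf)` models an "is infinite" query.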

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

JanekvO created this revision. Oct 7 2022, 7:24 AM

Herald added a project: Restricted Project. Oct 7 2022, 7:24 AM

Herald added subscribers: kosarev, foad, kerbowa and 7 others.

JanekvO requested review of this revision. Oct 7 2022, 7:24 AM

Herald added a project: Restricted Project. Oct 7 2022, 7:24 AM

Herald added subscribers: llvm-commits, wdng.

Harbormaster completed remote builds in B190943: Diff 466071. Oct 7 2022, 8:20 AM

Can do it as a follow-up commit, but the existing combines we have for AMDGPU::FP_CLASS should be ported to use the generic intrinsic. Also, llvm.amdgcn.class should get bitcode upgraded to the generic intrinsic.

llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll
183	Should add some vector cases too

arsenm added inline comments. Oct 7 2022, 9:15 AM

llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll
4	Should also test/handle globalisel

In D135447#3843046, @arsenm wrote:

Can do it as a follow-up commit, but the existing combines we have for AMDGPU::FP_CLASS should be ported to use the generic intrinsic. Also, llvm.amdgcn.class should get bitcode upgraded to the generic intrinsic.

I just realized the amdgpu intrinsic allows non-immediate arguments, but is_fpclass does not, so these are not equivalent.

JanekvO planned changes to this revision. Oct 10 2022, 7:14 AM

SelectionDAG fpclass vector support
GlobalISel llvm.is.fpclass support for AMDGPU
rebase

Harbormaster completed remote builds in B195929: Diff 472930. Nov 3 2022, 8:09 AM

JanekvO retitled this revision from "[AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support for AMDGPU" to "[AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU". Nov 3 2022, 10:20 AM

arsenm added inline comments. Nov 3 2022, 11:49 AM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2320	This should get an IRTranslator test to make sure the flags are passed through
2324–2325	getUniqueInteger is unnecessarily fancy, can just cast to ConstantInt directly
2332	Do you really need the float type operand? I know bfloat16 isn't going to work without it, but I thought the plan was to introduce FP types to LLT
llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td
133	Should avoid defining an AMDGPU node for this and move this to generic code
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
924	I don't see why you need to manually select this (maybe sharing the pattern between the existing intrinsic is annoying because the new intrinsic uses immarg?)
934–937	Should be no reason to check this here
948–956	You can just unconditionally materialize the constant into a register and let SIFoldOperands sort out the constant bus restriction
959–960	You shouldn't need to special case the result constraint
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
3945	Pretty sure this default constructs to null
llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll
3	Should use some shared prefixes, a lot of these functions are the same. Also needs gfx7 and gfx8 run lines for the half promotion
1923	v3f16 and v4f16 are also potentially interesting

SelectionDAG fpclass vector support
GlobalISel llvm.is.fpclass support for AMDGPU
Address comments, add custom half promotion for gfx7

JanekvO added inline comments. Nov 8 2022, 7:48 AM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2320	Not sure if I completely hit the mark with my added test, but to me it seemed that not all flags were possible (e.g., `nnan` flag didn't work as it required a fp return type). For now I've added flag related tests that explicitly test the addition of `nofpexcept`. Do let me know if there's something missing or whether this `copyFlagsFromInstruction` is better omitted.
2332	I believe it's not necessary for amdgpu but required for the `G_IS_FPCLASS` target opcode. Leaving it out results in verifier errors (I also am unaware about introducing FP types and LLT).
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
924	I did look on whether I could re-use some of the existing tablegen but I couldn't get it quite into the right shape for it to match. `llvm.is.fpclass` requires the mask to be an immarg as you mentioned so materializing the immediate into a register anywhere before this function results in a verifier error.
llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll
3	I'm not that well versed in how gfx7 should do half promotion. I feel like either gfx7 selectiondag or gfx7 globalisel half promotion tests are incorrect (and if not, selectiondag version does seem suboptimal).

arsenm added inline comments. Nov 8 2022, 8:05 AM

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2450	This will do the wrong thing for sNaNs, and denormal inputs are also flushed

Harbormaster completed remote builds in B196711: Diff 474002. Nov 8 2022, 8:15 AM

arsenm added a child revision: D137811: InstCombine: Perform basic isnan combines on llvm.is.fpclass. Nov 10 2022, 7:13 PM

arsenm added inline comments. Nov 10 2022, 7:17 PM

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2450	I also don't see the corresponding DAG legalization. It's such a special case I think this should be split into a separate patch anyway.

arsenm added inline comments. Nov 10 2022, 7:19 PM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2320	I'd consider that a pre-existing bug in intrinsics. The IR is annoyingly strict about what things are allowed to have flags
2332	What do you mean verifier errors?

foad added inline comments. Nov 10 2022, 11:27 PM

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6492–6493 ↗	(On Diff #474002)	Why did this change?
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
318–320	It seems annoying to have such a long list of types here - it'll need updating whenever we introduce a new one. Can you use something like FloatVectorTypes instead?

JanekvO marked 2 inline comments as done. Nov 11 2022, 6:15 AM

JanekvO added inline comments.

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	Sorry, I meant that the MachineVerifier will fail for the `G_IS_FPCLASS` instruction.
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2450	> I also don't see the corresponding DAG legalization.
I put the corresponding SelectionDAG type widening code for `IS_FPCLASS` in the target custom function `LowerIS_FPCLASS`, as I couldn't bypass expansion in SelectionDAGBuilder.cpp when marking the action for the instruction with f16 as `promote` (i.e., it would call the `IS_FPCLASS` expansion code even when trying to promote).
> It's such a special case I think this should be split into a separate patch anyway.
As in, the widening code, or `IS_FPCLASS` support for amdgpu gfx7?
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6492–6493 ↗	(On Diff #474002)	`isOperationLegalOrCustom` will return false if the type is considered illegal regardless of whether the instruction's type is marked legal or custom whereas `isOperationCustom` won't explicitly check for type legality and returns whether the action was set to custom. I basically just wanted it to go through to target custom code. (May revert this in favor of using the expand code for f16 in case there is no f16 fp class instruction for amdgpu)

For this patch I'd like to drop all the attempts to handle legalizing the f16 case and move that to a separate patch. It's a much more complicated edge case that doesn't have much in common with the base handling

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	Right, it's there in the operand list. I mean more abstractly, why is it there?
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2450	OK, there are several issues here. None of this should be done in target code. I also don't approve of doing this expansion in the DAG builder, but see that's a pre-existing issue. GlobalISel does need to do the same expansion.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6492–6493 ↗	(On Diff #474002)	For the no f16 case, I think we need to do software expansion to get correct results for denormal values
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
318–320	This should be unnecessary, we have no vector class instructions. These should just expand into scalars
2774	This doesn't work correctly for denormals. The f16 denormal value won't be denormal after casting to f32 (if it wasn't flushed to zero under DAZ or FTZ modes)
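The denormal concern can be checked numerically. The sketch below assumes an exact, non-flushing half-to-float conversion (the hypothetical helper `halfBitsToFloat`, which is not LLVM code) and shows that the smallest f16 denormal becomes an ordinary normal number once widened to f32, so a class test applied after the extension can no longer report it as subnormal. NaN payloads are not preserved by this helper, so it says nothing about the signaling-NaN issue raised elsewhere in the review.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: exact conversion of an IEEE binary16 bit pattern to
// float, with no flushing. NaN payloads are not preserved.
float halfBitsToFloat(uint16_t h) {
  const uint32_t sign = (h >> 15) & 1;
  const uint32_t exp = (h >> 10) & 0x1F;
  const uint32_t mant = h & 0x3FF;
  float mag;
  if (exp == 0x1F)
    mag = mant ? NAN : INFINITY;
  else if (exp == 0)
    mag = std::ldexp(static_cast<float>(mant), -24); // mant * 2^-24
  else
    mag = std::ldexp(static_cast<float>(mant | 0x400),
                     static_cast<int>(exp) - 25); // (1024 + mant) * 2^(exp-25)
  return sign ? -mag : mag;
}

// Subnormal in binary16: zero exponent field, non-zero mantissa.
bool isHalfSubnormal(uint16_t h) {
  return ((h >> 10) & 0x1F) == 0 && (h & 0x3FF) != 0;
}

// Subnormal in binary32, as reported by the C++ standard library.
bool isFloatSubnormal(float f) { return std::fpclassify(f) == FP_SUBNORMAL; }
```

With these helpers, `isHalfSubnormal(0x0001)` holds, while `isFloatSubnormal(halfBitsToFloat(0x0001))` does not: the value 2^-24 is far above the smallest normal f32.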

JanekvO added inline comments. Nov 11 2022, 11:06 AM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	I can see the semantics being used in the target independent expansion of `IS_FPCLASS` in SelectionDAG (e.g., for retrieving `inf` of a particular fp semantic). I'm inferring that the rationale could be: GlobalISel will require a similar implementation and therefore requires the semantics. I haven't looked into whether any alternatives exist that don't require passing of the semantics through the operand, though.
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2450	I was looking at implementing the SelectionDAG target independent expansion for GlobalISel `lower()`. I'll first remove f16 legalizing for cases where there is no f16 instructions available for amdgpu for this diff and move the GlobalISel's expansion/lower to another diff.
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
318–320	If not set as custom (or legal), these will get expanded through the target-independent expansion. Bypassing said target-independent expansion does result in the desired scalarizing.

arsenm added inline comments. Nov 11 2022, 11:14 AM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	Currently the LLT directly implies the semantics for every operation
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
318–320	This is one of the problems with doing this kind of expansion in SelectionDAGBuilder. This should go through the usual legalization paths
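As a rough picture of what "expand into scalars" means for a vector class test, the sketch below applies a scalar predicate to each lane and reassembles the boolean results. Both `scalarIsNan` and `scalarizeClassTest` are hypothetical stand-ins chosen for illustration; the real scalarization happens inside the SelectionDAG and GlobalISel legalizers, not in code like this.

```cpp
#include <cmath>
#include <vector>

// Stand-in scalar predicate (an assumption for illustration): "is NaN".
bool scalarIsNan(float x) { return std::isnan(x); }

// Hypothetical model of scalarizing a vector class test: the scalar test is
// applied to each lane independently and the per-lane results are collected.
std::vector<bool> scalarizeClassTest(const std::vector<float> &lanes,
                                     bool (*scalarTest)(float)) {
  std::vector<bool> result;
  result.reserve(lanes.size());
  for (float lane : lanes)
    result.push_back(scalarTest(lane));
  return result;
}
```

The result has one boolean per input lane, matching the "same number of elements" constraint the patch places on the result and operand types.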

Remove IS_FPCLASS amdgpu f16 legalization, split tests into f16 and not f16 cases, temporarily disable gfx7 glisel tests
Rebase

JanekvO added a subscriber: sepavloff. Nov 14 2022, 5:37 AM

JanekvO added inline comments.

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	@sepavloff Do you happen to recall the rationale of the fp semantic operand for `G_IS_FPCLASS`? My knowledge about it is a bit shallow but perhaps it can be removed

Harbormaster completed remote builds in B197510: Diff 475117. Nov 14 2022, 6:18 AM

sepavloff added inline comments. Nov 14 2022, 7:45 AM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	It is used to work around a limitation of GlobalISel: the lack of floating-point types. Without this operand it is impossible to distinguish between `half` and `bfloat16`, or between different flavors of 8-bit floats. If LLT supported floating-point types, this operand could be removed.
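To make the ambiguity concrete: a 16-bit scalar carries no format information, and the same bit pattern classifies differently under `half` (5 exponent bits, 10 mantissa bits) and `bfloat16` (8 exponent bits, 7 mantissa bits). The sketch below is a standalone illustration; `FloatLayout`, `Kind` and `classifyBits` are hypothetical names and not part of LLVM.

```cpp
#include <cstdint>

// Field widths of a 16-bit binary floating-point format (sign bit implied).
struct FloatLayout {
  unsigned expBits;
  unsigned mantBits;
};

constexpr FloatLayout Half{5, 10};    // IEEE binary16
constexpr FloatLayout BFloat16{8, 7}; // bfloat16

enum class Kind { Zero, Subnormal, Normal, Inf, NaN };

// Classify a raw 16-bit pattern under the given layout, ignoring the sign.
Kind classifyBits(uint16_t bits, FloatLayout layout) {
  const uint32_t mantMask = (1u << layout.mantBits) - 1;
  const uint32_t expMask = (1u << layout.expBits) - 1;
  const uint32_t mant = bits & mantMask;
  const uint32_t exp = (bits >> layout.mantBits) & expMask;
  if (exp == expMask)
    return mant ? Kind::NaN : Kind::Inf;
  if (exp == 0)
    return mant ? Kind::Subnormal : Kind::Zero;
  return Kind::Normal;
}
```

The pattern `0x7C00` is +infinity when read as `half` but an ordinary normal number when read as `bfloat16`, which is why a class test needs to know the format from somewhere.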

arsenm added inline comments. Nov 14 2022, 9:11 AM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	But this is a problem for every single operation, not just this one. We don't have a decided upon strategy for dealing with this, so it doesn't make sense to me to try to deal with it here

sepavloff added inline comments. Nov 14 2022, 10:43 PM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2332	Sounds reasonable. Let's remove it, in separate commit.
2332	See https://reviews.llvm.org/D138004.

Remove fpsem operand construction in irtranslator for G_IS_FPCLASS

Harbormaster completed remote builds in B197732: Diff 475421. Nov 15 2022, 4:55 AM

arsenm added inline comments. Nov 16 2022, 2:54 PM

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
314	Can you add a fixme that we just want scalarization?
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
924	You might need to split it into a different pattern instantiation, but you would just need the S_MOV_B32 from the mask to the constant (although I actually would expect it to work if you directly folded the constant anyway, since the operand should have been copied to VGPR anyway). Something like:
    class ClassPat<Instruction inst, ValueType vt> : GCNPat <
      (fp_class (VOP3Mods vt:$src0, i32:$src0_mods), (i32 timm:mask))
      (inst $src0_mods, VSrc_b32:$src0, $src0_mods, (S_MOV_B32 $mask))
    >;
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
984	I think this clampScalar isn't doing anything and can be dropped
llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll
12	Can you also add some cases where the input will be an SGPR?
107	s/float/f32 in these function names

arsenm requested changes to this revision. Nov 16 2022, 2:56 PM

arsenm added inline comments.

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-fpclass-flags.ll
2	-global-isel to front, also generate these checks
17	Needs additional checks with other flags besides the one just set

This revision now requires changes to proceed. Nov 16 2022, 2:56 PM

JanekvO added inline comments. Nov 18 2022, 6:57 AM

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-fpclass-flags.ll
17	I've been wondering whether the flag copy from the IR intrinsic to G_IS_FPCLASS in IRTranslator should be removed altogether. I'd have to weaken the flags' constraints as they all require scalar or vector fp return types. Additionally, any use of fast math flags outside of existing uses will most likely require amending langref. E.g., current descriptions of some fast math flags describe how input can result in a poison value but this wouldn't be possible for G_IS_FPCLASS as it's a bool return. Let me know what you think, I can see some of the flags being useful by folding into constant bool values (e.g., not a nan flag + G_IS_FPCLASS test for nans) but I may be a bit naïve on useful cases beyond said folding.
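The folding idea mentioned in this comment can be sketched as plain mask arithmetic. The code below is a hypothetical illustration only: it assumes the `llvm.is.fpclass` mask layout in which the two lowest bits select signaling and quiet NaN, and it assumes some external fact that the input is never NaN. As the comment notes, fast-math flags are not currently defined for this operation, so nothing here reflects existing LLVM behavior, and `foldKnownNotNan` is an invented name.

```cpp
#include <cstdint>

constexpr uint32_t kNanBits = 0x3;   // sNaN | qNaN in the is.fpclass mask
constexpr uint32_t kAllBits = 0x3FF; // all ten class bits

enum class Fold { AlwaysFalse, AlwaysTrue, NoFold };

// Hypothetical fold: given a test mask and the assumption that the input is
// never NaN, decide whether the class test is a compile-time constant.
Fold foldKnownNotNan(uint32_t mask) {
  const uint32_t reachable = kAllBits & ~kNanBits; // classes still possible
  const uint32_t effective = mask & reachable;
  if (effective == 0)
    return Fold::AlwaysFalse; // only NaN classes were being tested
  if (effective == reachable)
    return Fold::AlwaysTrue; // every remaining class is accepted
  return Fold::NoFold;
}
```

Under that assumption, a pure NaN test (`mask == 0x3`) folds to false, and a "not NaN" test (`mask == 0x3FC`) folds to true; any other mask still needs the runtime check.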

arsenm added inline comments. Nov 18 2022, 3:53 PM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
2320	You can remove the flag copy if you want, although the flags may be introduced in the future
2333–2334	I just realized there's no point in doing this. G_IS_FPCLASS is not marked as mayRaiseFPException, so the flag is implied
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-fpclass-flags.ll
17	OK, might as well drop this test if we have end to end tests and there's nothing unique to test in the IRTranslator

Remove patfrag dependency for is_fpclass
Add dedicated patterns
Remove globalisel manual selection and depend on selectiondag tablegen
Address comments

JanekvO marked 4 inline comments as done. Nov 25 2022, 8:07 AM

Harbormaster completed remote builds in B199558: Diff 477969. Nov 25 2022, 8:52 AM

LGTM with nits. You have some dead checks and dead code.

llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td
132–133	Whole file is now whitespace only changes which can be dropped
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
913–922	Dead code
llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.f16.ll
95–96	This is broken for signaling nans. You dropped this from the patch but left these dead checks around

This revision is now accepted and ready to land. Nov 28 2022, 9:34 AM

Remove dead code and tests

arsenm accepted this revision. Nov 28 2022, 12:17 PM

Sorry, haven't gotten github access yet: could you (or somebody in AMDGPU group) land this for me? 😅

322966f8f8aa2ee1146c40eabe52c9ebeb91dab7

Harbormaster completed remote builds in B199830: Diff 478322. Nov 28 2022, 2:15 PM

Revision Contents

Path	Size
llvm/include/llvm/Target/TargetSelectionDAG.td	5 lines
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp	21 lines
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp	2 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp	11 lines
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp	12 lines
llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td	4 lines
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h	1 line
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp	43 lines
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp	5 lines
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp	8 lines
llvm/lib/Target/AMDGPU/SIISelLowering.cpp	2 lines
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-fpclass-flags.ll	17 lines
llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.f16.ll	632 lines
llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll	1345 lines

Diff 475117

llvm/include/llvm/Target/TargetSelectionDAG.td

@@ ... @@ def SDTFPUnaryOp : SDTypeProfile<1, 1, [ // fneg, fsqrt, etc
   SDTCisSameAs<0, 1>, SDTCisFP<0>
 ]>;
 def SDTFPRoundOp : SDTypeProfile<1, 1, [ // fpround
   SDTCisFP<0>, SDTCisFP<1>, SDTCisOpSmallerThanOp<0, 1>, SDTCisSameNumEltsAs<0, 1>
 ]>;
 def SDTFPExtendOp : SDTypeProfile<1, 1, [ // fpextend
   SDTCisFP<0>, SDTCisFP<1>, SDTCisOpSmallerThanOp<1, 0>, SDTCisSameNumEltsAs<0, 1>
 ]>;
+def SDIsFPClassOp : SDTypeProfile<1, 2, [ // is_fpclass
+  SDTCisInt<0>, SDTCisFP<1>, SDTCisInt<2>, SDTCisSameNumEltsAs<0, 1>
+]>;
 def SDTIntToFPOp : SDTypeProfile<1, 1, [ // [su]int_to_fp
   SDTCisFP<0>, SDTCisInt<1>, SDTCisSameNumEltsAs<0, 1>
 ]>;
 def SDTFPToIntOp : SDTypeProfile<1, 1, [ // fp_to_[su]int
   SDTCisInt<0>, SDTCisFP<1>, SDTCisSameNumEltsAs<0, 1>
 ]>;
 def SDTFPToIntSatOp : SDTypeProfile<1, 2, [ // fp_to_[su]int_sat
   SDTCisInt<0>, SDTCisFP<1>, SDTCisSameNumEltsAs<0, 1>, SDTCisVT<2, OtherVT>
@@ ... @@
 def llround : SDNode<"ISD::LLROUND" , SDTFPToIntOp>;
 def lrint : SDNode<"ISD::LRINT" , SDTFPToIntOp>;
 def llrint : SDNode<"ISD::LLRINT" , SDTFPToIntOp>;

 def fpround : SDNode<"ISD::FP_ROUND" , SDTFPRoundOp>;
 def fpextend : SDNode<"ISD::FP_EXTEND" , SDTFPExtendOp>;
 def fcopysign : SDNode<"ISD::FCOPYSIGN" , SDTFPSignOp>;

+def is_fpclass : SDNode<"ISD::IS_FPCLASS" , SDIsFPClassOp>;
+
 def sint_to_fp : SDNode<"ISD::SINT_TO_FP" , SDTIntToFPOp>;
 def uint_to_fp : SDNode<"ISD::UINT_TO_FP" , SDTIntToFPOp>;
 def fp_to_sint : SDNode<"ISD::FP_TO_SINT" , SDTFPToIntOp>;
 def fp_to_uint : SDNode<"ISD::FP_TO_UINT" , SDTFPToIntOp>;
 def fp_to_sint_sat : SDNode<"ISD::FP_TO_SINT_SAT" , SDTFPToIntSatOp>;
 def fp_to_uint_sat : SDNode<"ISD::FP_TO_UINT_SAT" , SDTFPToIntSatOp>;
 def f16_to_fp : SDNode<"ISD::FP16_TO_FP" , SDTIntToFPOp>;
 def fp_to_f16 : SDNode<"ISD::FP_TO_FP16" , SDTFPToIntOp>;
@@ ... @@

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp

@@ ... @@ case Intrinsic::fptrunc_round: {
     MIRBuilder
         .buildInstr(TargetOpcode::G_INTRINSIC_FPTRUNC_ROUND,
                     {getOrCreateVReg(CI)},
                     {getOrCreateVReg(*CI.getArgOperand(0))}, Flags)
         .addImm((int)*RoundMode);

     return true;
   }
+  case Intrinsic::is_fpclass: {
+    unsigned Flags = MachineInstr::copyFlagsFromInstruction(CI);
+
+    Value *FpValue = CI.getOperand(0);
+    Type *FpEltTy = FpValue->getType()->getScalarType();
+    ConstantInt *TestMaskValue = cast<ConstantInt>(CI.getOperand(1));
+    const fltSemantics &FpSem = FpEltTy->getFltSemantics();
+
+    const MachineInstrBuilder &IsFpclass =
+        MIRBuilder
+            .buildInstr(TargetOpcode::G_IS_FPCLASS, {getOrCreateVReg(CI)},
+                        {getOrCreateVReg(*FpValue)}, Flags)
+            .addImm(TestMaskValue->getZExtValue())
+            .addImm((unsigned)APFloat::SemanticsToEnum(FpSem));
+
+    const Function *F = CI.getFunction();
+    if (!F->getAttributes().hasFnAttr(llvm::Attribute::StrictFP))
+      IsFpclass.setMIFlag(MachineInstr::NoFPExcept);
+
+    return true;
+  }
 #define INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC) \
   case Intrinsic::INTRINSIC:
 #include "llvm/IR/ConstrainedOps.def"
     return translateConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(CI),
                                            MIRBuilder);

   }
   return false;
@@ ... @@

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp

@@ ... @@ else {
       widenScalarSrc(MI, WideTy, 2, ExtOpcode);
       widenScalarSrc(MI, WideTy, 3, ExtOpcode);
     }
     Observer.changedInstr(MI);
     return Legalized;

   case TargetOpcode::G_PTR_ADD:
     assert(TypeIdx == 1 && "unable to legalize pointer of G_PTR_ADD");
     Observer.changingInstr(MI);
     widenScalarSrc(MI, WideTy, 2, TargetOpcode::G_SEXT);
     Observer.changedInstr(MI);
     return Legalized;

   case TargetOpcode::G_PHI: {
     assert(TypeIdx == 0 && "Expecting only Idx 0");

     Observer.changingInstr(MI);
@@ ... @@ LegalizerHelper::fewerElementsVector(MachineInstr &MI, unsigned TypeIdx,
   case G_SADDO:
   case G_SSUBO:
   case G_SADDE:
   case G_SSUBE:
     return fewerElementsVectorMultiEltType(GMI, NumElts);
   case G_ICMP:
   case G_FCMP:
     return fewerElementsVectorMultiEltType(GMI, NumElts, {1 /*cpm predicate*/});
+  case G_IS_FPCLASS:
+    return fewerElementsVectorMultiEltType(GMI, NumElts, {2, 3 /*mask,fpsem*/});
   case G_SELECT:
     if (MRI.getType(MI.getOperand(1).getReg()).isVector())
       return fewerElementsVectorMultiEltType(GMI, NumElts);
     return fewerElementsVectorMultiEltType(GMI, NumElts, {1 /*scalar cond*/});
   case G_PHI:
     return fewerElementsVectorPhi(GMI, NumElts);
   case G_UNMERGE_VALUES:
     return fewerElementsVectorUnmergeValues(MI, TypeIdx, NarrowTy);
@@ ... @@

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

@@ ... @@ void DAGTypeLegalizer::SplitVecRes_FCOPYSIGN(SDNode *N, SDValue &Lo,
   Hi = DAG.getNode(ISD::FCOPYSIGN, DL, LHSHi.getValueType(), LHSHi, RHSHi);
 }

 void DAGTypeLegalizer::SplitVecRes_IS_FPCLASS(SDNode *N, SDValue &Lo,
                                               SDValue &Hi) {
   SDLoc DL(N);
   SDValue ArgLo, ArgHi;
   SDValue Test = N->getOperand(1);
-  GetSplitVector(N->getOperand(0), ArgLo, ArgHi);
+  SDValue FpValue = N->getOperand(0);
+  if (getTypeAction(FpValue.getValueType()) == TargetLowering::TypeSplitVector)
+    GetSplitVector(FpValue, ArgLo, ArgHi);
+  else
+    std::tie(ArgLo, ArgHi) = DAG.SplitVector(FpValue, SDLoc(FpValue));
   EVT LoVT, HiVT;
   std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));

   Lo = DAG.getNode(ISD::IS_FPCLASS, DL, LoVT, ArgLo, Test, N->getFlags());
   Hi = DAG.getNode(ISD::IS_FPCLASS, DL, HiVT, ArgHi, Test, N->getFlags());
 }

 void DAGTypeLegalizer::SplitVecRes_InregOp(SDNode *N, SDValue &Lo,
@@ ... @@ if (N->getOperand(0).getValueType() == N->getOperand(1).getValueType())
     return WidenVecRes_BinaryCanTrap(N);

   // If the types are different, fall back to unrolling.
   EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
   return DAG.UnrollVectorOp(N, WidenVT.getVectorNumElements());
 }

 SDValue DAGTypeLegalizer::WidenVecRes_IS_FPCLASS(SDNode *N) {
		SDValue FpValue = N->getOperand(0);
EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
SDValue Arg = GetWidenedVector(N->getOperand(0));		if (getTypeAction(FpValue.getValueType()) != TargetLowering::TypeWidenVector)
		return DAG.UnrollVectorOp(N, WidenVT.getVectorNumElements());
		SDValue Arg = GetWidenedVector(FpValue);
return DAG.getNode(N->getOpcode(), SDLoc(N), WidenVT, {Arg, N->getOperand(1)},		return DAG.getNode(N->getOpcode(), SDLoc(N), WidenVT, {Arg, N->getOperand(1)},
N->getFlags());		N->getFlags());
}		}

SDValue DAGTypeLegalizer::WidenVecRes_POWI(SDNode *N) {		SDValue DAGTypeLegalizer::WidenVecRes_POWI(SDNode *N) {
EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
SDValue InOp = GetWidenedVector(N->getOperand(0));		SDValue InOp = GetWidenedVector(N->getOperand(0));
SDValue ShOp = N->getOperand(1);		SDValue ShOp = N->getOperand(1);

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp

Show First 20 Lines • Show All 300 Lines • ▼ Show 20 Lines	AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FROUND, {MVT::f32, MVT::f64}, Custom);		setOperationAction(ISD::FROUND, {MVT::f32, MVT::f64}, Custom);

setOperationAction({ISD::FLOG, ISD::FLOG10, ISD::FEXP}, MVT::f32, Custom);		setOperationAction({ISD::FLOG, ISD::FLOG10, ISD::FEXP}, MVT::f32, Custom);

setOperationAction(ISD::FNEARBYINT, {MVT::f16, MVT::f32, MVT::f64}, Custom);		setOperationAction(ISD::FNEARBYINT, {MVT::f16, MVT::f32, MVT::f64}, Custom);

setOperationAction(ISD::FREM, {MVT::f16, MVT::f32, MVT::f64}, Custom);		setOperationAction(ISD::FREM, {MVT::f16, MVT::f32, MVT::f64}, Custom);

		if (Subtarget->has16BitInsts())
		setOperationAction(ISD::IS_FPCLASS, {MVT::f16, MVT::f32, MVT::f64}, Legal);
		else
		setOperationAction(ISD::IS_FPCLASS, {MVT::f32, MVT::f64}, Legal);

		setOperationAction(
		arsenm (Done): Can you add a fixme that we just want scalarization?
		ISD::IS_FPCLASS,
		{MVT::v2f16, MVT::v3f16, MVT::v4f16, MVT::v16f16, MVT::v2f32, MVT::v3f32,
		MVT::v4f32, MVT::v5f32, MVT::v6f32, MVT::v7f32, MVT::v8f32, MVT::v16f32,
		MVT::v2f64, MVT::v3f64, MVT::v4f64, MVT::v8f64, MVT::v16f64},
		Custom);

		foad (Not Done): It seems annoying to have such a long list of types here - it'll need updating whenever we introduce a new one. Can you use something like FloatVectorTypes instead?
		arsenm (Not Done): This should be unnecessary; we have no vector class instructions. These should just expand into scalars.
		JanekvO (Author, Done): If not set as custom (or legal), these'll get expanded through the target independent expansion. Bypassing said target independent expansion does result in the desired scalarizing.
		arsenm (Not Done): This is one of the problems with doing this kind of expansion in SelectionDAGBuilder. This should go through the usual legalization paths.
// Expand to fneg + fadd.		// Expand to fneg + fadd.
setOperationAction(ISD::FSUB, MVT::f64, Expand);		setOperationAction(ISD::FSUB, MVT::f64, Expand);

setOperationAction(ISD::CONCAT_VECTORS,		setOperationAction(ISD::CONCAT_VECTORS,
{MVT::v3i32, MVT::v3f32, MVT::v4i32, MVT::v4f32,		{MVT::v3i32, MVT::v3f32, MVT::v4i32, MVT::v4f32,
MVT::v5i32, MVT::v5f32, MVT::v6i32, MVT::v6f32,		MVT::v5i32, MVT::v5f32, MVT::v6i32, MVT::v6f32,
MVT::v7i32, MVT::v7f32, MVT::v8i32, MVT::v8f32},		MVT::v7i32, MVT::v7f32, MVT::v8i32, MVT::v8f32},
Custom);		Custom);
▲ Show 20 Lines • Show All 2,437 Lines • ▼ Show 20 Lines	SDValue AMDGPUTargetLowering::LowerFP_TO_INT(SDValue Op,
return SDValue();		return SDValue();
}		}

SDValue AMDGPUTargetLowering::LowerSIGN_EXTEND_INREG(SDValue Op,		SDValue AMDGPUTargetLowering::LowerSIGN_EXTEND_INREG(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
EVT ExtraVT = cast<VTSDNode>(Op.getOperand(1))->getVT();		EVT ExtraVT = cast<VTSDNode>(Op.getOperand(1))->getVT();
MVT VT = Op.getSimpleValueType();		MVT VT = Op.getSimpleValueType();
MVT ScalarVT = VT.getScalarType();		MVT ScalarVT = VT.getScalarType();

		arsenm (Not Done): This doesn't work correctly for denormals. The f16 denormal value won't be denormal after casting to f32 (if it wasn't flushed to zero under DAZ or FTZ modes)
assert(VT.isVector());		assert(VT.isVector());

SDValue Src = Op.getOperand(0);		SDValue Src = Op.getOperand(0);
SDLoc DL(Op);		SDLoc DL(Op);

// TODO: Don't scalarize on Evergreen?		// TODO: Don't scalarize on Evergreen?
unsigned NElts = VT.getVectorNumElements();		unsigned NElts = VT.getVectorNumElements();
SmallVector<SDValue, 8> Args;		SmallVector<SDValue, 8> Args;
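The denormal concern raised in arsenm's inline comment on this file can be illustrated outside of LLVM. The following is a minimal sketch, not code from the patch: `halfBitsToFloat`, `isHalfSubnormal` and `subnormalLostByExtension` are hypothetical helper names, and the conversion is done by hand so that it does not depend on hardware half support. It shows that every f16 subnormal, once extended to f32, lands in the f32 normal range (the smallest f16 subnormal is 2^-24, while the smallest f32 normal is 2^-126), so a class test performed on the extended value can no longer report "subnormal".

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: decode an IEEE-754 binary16 bit pattern into a float.
static float halfBitsToFloat(uint16_t H) {
  const int Sign = (H >> 15) & 1;
  const int Exp = (H >> 10) & 0x1f;
  const int Mant = H & 0x3ff;
  float Mag;
  if (Exp == 0)
    Mag = std::ldexp(static_cast<float>(Mant), -24); // zero or subnormal
  else if (Exp == 0x1f)
    Mag = Mant ? NAN : INFINITY;
  else
    Mag = std::ldexp(static_cast<float>(Mant | 0x400), Exp - 25); // normal
  return Sign ? -Mag : Mag;
}

// True if the binary16 bit pattern encodes a subnormal (denormal) value.
static bool isHalfSubnormal(uint16_t H) {
  return ((H >> 10) & 0x1f) == 0 && (H & 0x3ff) != 0;
}

// True if extending the f16 subnormal H to f32 loses the "subnormal" class.
static bool subnormalLostByExtension(uint16_t H) {
  assert(isHalfSubnormal(H) && "only meaningful for f16 subnormal inputs");
  return std::fpclassify(halfBitsToFloat(H)) != FP_SUBNORMAL;
}
```

Under these assumptions, `subnormalLostByExtension` holds for every f16 subnormal input, which is why testing the class of the f32-extended value is not equivalent to testing the original f16 value.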

llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td

Show First 20 Lines • Show All 123 Lines • ▼ Show 20 Lines
def AMDGPUldexp_impl : SDNode<"AMDGPUISD::LDEXP", AMDGPULdExpOp>;		def AMDGPUldexp_impl : SDNode<"AMDGPUISD::LDEXP", AMDGPULdExpOp>;

def AMDGPUpkrtz_f16_f32_impl : SDNode<"AMDGPUISD::CVT_PKRTZ_F16_F32", AMDGPUFPPackOp>;		def AMDGPUpkrtz_f16_f32_impl : SDNode<"AMDGPUISD::CVT_PKRTZ_F16_F32", AMDGPUFPPackOp>;
def AMDGPUpknorm_i16_f32_impl : SDNode<"AMDGPUISD::CVT_PKNORM_I16_F32", AMDGPUFPPackOp>;		def AMDGPUpknorm_i16_f32_impl : SDNode<"AMDGPUISD::CVT_PKNORM_I16_F32", AMDGPUFPPackOp>;
def AMDGPUpknorm_u16_f32_impl : SDNode<"AMDGPUISD::CVT_PKNORM_U16_F32", AMDGPUFPPackOp>;		def AMDGPUpknorm_u16_f32_impl : SDNode<"AMDGPUISD::CVT_PKNORM_U16_F32", AMDGPUFPPackOp>;
def AMDGPUpk_i16_i32_impl : SDNode<"AMDGPUISD::CVT_PK_I16_I32", AMDGPUIntPackOp>;		def AMDGPUpk_i16_i32_impl : SDNode<"AMDGPUISD::CVT_PK_I16_I32", AMDGPUIntPackOp>;
def AMDGPUpk_u16_u32_impl : SDNode<"AMDGPUISD::CVT_PK_U16_U32", AMDGPUIntPackOp>;		def AMDGPUpk_u16_u32_impl : SDNode<"AMDGPUISD::CVT_PK_U16_U32", AMDGPUIntPackOp>;
def AMDGPUfp_to_f16 : SDNode<"AMDGPUISD::FP_TO_FP16" , SDTFPToIntOp>;		def AMDGPUfp_to_f16 : SDNode<"AMDGPUISD::FP_TO_FP16" , SDTFPToIntOp>;


def AMDGPUfp_class_impl : SDNode<"AMDGPUISD::FP_CLASS", AMDGPUFPClassOp>;		def AMDGPUfp_class_impl : SDNode<"AMDGPUISD::FP_CLASS", AMDGPUFPClassOp>;
		arsenm (Done): Should avoid defining an AMDGPU node for this and move this to generic code
		arsenm (Done): Whole file is now whitespace only changes which can be dropped

// out = max(a, b) a and b are floats, where a nan comparison fails.		// out = max(a, b) a and b are floats, where a nan comparison fails.
// This is not commutative because this gives the second operand:		// This is not commutative because this gives the second operand:
// x < nan ? x : nan -> nan		// x < nan ? x : nan -> nan
// nan < x ? nan : x -> x		// nan < x ? nan : x -> x
def AMDGPUfmax_legacy : SDNode<"AMDGPUISD::FMAX_LEGACY", SDTFPBinOp,		def AMDGPUfmax_legacy : SDNode<"AMDGPUISD::FMAX_LEGACY", SDTFPBinOp,
[]		[]
>;		>;
▲ Show 20 Lines • Show All 240 Lines • ▼ Show 20 Lines	def AMDGPUfract : PatFrags<(ops node:$src), [(int_amdgcn_fract node:$src),
(AMDGPUfract_impl node:$src)]>;		(AMDGPUfract_impl node:$src)]>;

def AMDGPUldexp : PatFrags<(ops node:$src0, node:$src1),		def AMDGPUldexp : PatFrags<(ops node:$src0, node:$src1),
[(int_amdgcn_ldexp node:$src0, node:$src1),		[(int_amdgcn_ldexp node:$src0, node:$src1),
(AMDGPUldexp_impl node:$src0, node:$src1)]>;		(AMDGPUldexp_impl node:$src0, node:$src1)]>;

def AMDGPUfp_class : PatFrags<(ops node:$src0, node:$src1),		def AMDGPUfp_class : PatFrags<(ops node:$src0, node:$src1),
[(int_amdgcn_class node:$src0, node:$src1),		[(int_amdgcn_class node:$src0, node:$src1),
(AMDGPUfp_class_impl node:$src0, node:$src1)]>;		(AMDGPUfp_class_impl node:$src0, node:$src1),
		(is_fpclass node:$src0, node:$src1)]>;

def AMDGPUfmed3 : PatFrags<(ops node:$src0, node:$src1, node:$src2),		def AMDGPUfmed3 : PatFrags<(ops node:$src0, node:$src1, node:$src2),
[(int_amdgcn_fmed3 node:$src0, node:$src1, node:$src2),		[(int_amdgcn_fmed3 node:$src0, node:$src1, node:$src2),
(AMDGPUfmed3_impl node:$src0, node:$src1, node:$src2)]>;		(AMDGPUfmed3_impl node:$src0, node:$src1, node:$src2)]>;

def AMDGPUdiv_fixup : PatFrags<(ops node:$src0, node:$src1, node:$src2),		def AMDGPUdiv_fixup : PatFrags<(ops node:$src0, node:$src1, node:$src2),
[(int_amdgcn_div_fixup node:$src0, node:$src1, node:$src2),		[(int_amdgcn_div_fixup node:$src0, node:$src1, node:$src2),
(AMDGPUdiv_fixup_impl node:$src0, node:$src1, node:$src2)]>;		(AMDGPUdiv_fixup_impl node:$src0, node:$src1, node:$src2)]>;

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h

Show First 20 Lines • Show All 99 Lines • ▼ Show 20 Lines	private:
bool selectG_FMA_FMAD(MachineInstr &I) const;		bool selectG_FMA_FMAD(MachineInstr &I) const;
bool selectG_MERGE_VALUES(MachineInstr &I) const;		bool selectG_MERGE_VALUES(MachineInstr &I) const;
bool selectG_UNMERGE_VALUES(MachineInstr &I) const;		bool selectG_UNMERGE_VALUES(MachineInstr &I) const;
bool selectG_BUILD_VECTOR(MachineInstr &I) const;		bool selectG_BUILD_VECTOR(MachineInstr &I) const;
bool selectG_PTR_ADD(MachineInstr &I) const;		bool selectG_PTR_ADD(MachineInstr &I) const;
bool selectG_IMPLICIT_DEF(MachineInstr &I) const;		bool selectG_IMPLICIT_DEF(MachineInstr &I) const;
bool selectG_INSERT(MachineInstr &I) const;		bool selectG_INSERT(MachineInstr &I) const;
bool selectG_SBFX_UBFX(MachineInstr &I) const;		bool selectG_SBFX_UBFX(MachineInstr &I) const;
		bool selectG_IS_FPCLASS(MachineInstr &I) const;

bool selectInterpP1F16(MachineInstr &MI) const;		bool selectInterpP1F16(MachineInstr &MI) const;
bool selectWritelane(MachineInstr &MI) const;		bool selectWritelane(MachineInstr &MI) const;
bool selectDivScale(MachineInstr &MI) const;		bool selectDivScale(MachineInstr &MI) const;
bool selectIntrinsicIcmp(MachineInstr &MI) const;		bool selectIntrinsicIcmp(MachineInstr &MI) const;
bool selectBallot(MachineInstr &I) const;		bool selectBallot(MachineInstr &I) const;
bool selectRelocConstant(MachineInstr &I) const;		bool selectRelocConstant(MachineInstr &I) const;
bool selectGroupStaticSize(MachineInstr &I) const;		bool selectGroupStaticSize(MachineInstr &I) const;

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp

Show First 20 Lines • Show All 904 Lines • ▼ Show 20 Lines	bool AMDGPUInstructionSelector::selectG_SBFX_UBFX(MachineInstr &MI) const {
auto MIB = BuildMI(*MBB, &MI, DL, TII.get(Opc), DstReg)		auto MIB = BuildMI(*MBB, &MI, DL, TII.get(Opc), DstReg)
.addReg(SrcReg)		.addReg(SrcReg)
.addReg(OffsetReg)		.addReg(OffsetReg)
.addReg(WidthReg);		.addReg(WidthReg);
MI.eraseFromParent();		MI.eraseFromParent();
return constrainSelectedInstRegOperands(*MIB, TII, TRI, RBI);		return constrainSelectedInstRegOperands(*MIB, TII, TRI, RBI);
}		}

		static int getV_CMP_CLASSOpcode(unsigned size, bool hasTrue16BitInsts) {
		switch(size) {
		default: return -1;
		case 16:
		return hasTrue16BitInsts ? AMDGPU::V_CMP_CLASS_F16_t16_e64
		: AMDGPU::V_CMP_CLASS_F16_e64;
		case 32: return AMDGPU::V_CMP_CLASS_F32_e64;
		case 64: return AMDGPU::V_CMP_CLASS_F64_e64;
		}
		}
		arsenm (Done): Dead code

		bool AMDGPUInstructionSelector::selectG_IS_FPCLASS(MachineInstr &I) const {
		arsenm (Not Done): I don't see why you need to manually select this (maybe sharing the pattern between the existing intrinsic is annoying because the new intrinsic uses immarg?)
		JanekvO (Author, Done): I did look into whether I could reuse some of the existing tablegen, but I couldn't get it into quite the right shape for it to match. `llvm.is.fpclass` requires the mask to be an immarg, as you mentioned, so materializing the immediate into a register anywhere before this function results in a verifier error.
		arsenm (Done): You might need to split it into a different pattern instantiation, but you would just need the S_MOV_B32 from the mask to the constant (although I actually would expect it to work if you directly folded the constant anyway, since the operand should have been copied to VGPR anyway). Something like:
		class ClassPat<Instruction inst, ValueType vt> : GCNPat <
		  (fp_class (VOP3Mods vt:$src0, i32:$src0_mods), (i32 timm:mask))
		  (inst $src0_mods, VSrc_b32:$src0, $src0_mods, (S_MOV_B32 $mask))
		>;
		MachineBasicBlock *BB = I.getParent();
		const DebugLoc &DL = I.getDebugLoc();

		Register CCReg = I.getOperand(0).getReg();
		Register SrcReg;
		unsigned Mods;
		std::tie(SrcReg, Mods) = selectVOP3ModsImpl(I.getOperand(1));
		unsigned Mask = I.getOperand(2).getImm();
		unsigned Size = RBI.getSizeInBits(SrcReg, *MRI, TRI);

		int Opcode = getV_CMP_CLASSOpcode(Size, STI.hasTrue16BitInsts());
		if (Opcode == -1)
		return false;
		arsenm (Done): Should be no reason to check this here

		Register ConstantReg = MRI->createVirtualRegister(&AMDGPU::VGPR_32RegClass);
		BuildMI(*BB, &I, DL, TII.get(AMDGPU::V_MOV_B32_e32), ConstantReg)
		.addImm(Mask);
		MachineInstrBuilder CmpClassBuilder =
		BuildMI(*BB, &I, DL, TII.get(Opcode), CCReg)
		.addImm(Mods)
		.addReg(SrcReg)
		.addReg(ConstantReg);

		MachineInstr *CmpClass = CmpClassBuilder;
		bool Ret = constrainSelectedInstRegOperands(*CmpClass, TII, TRI, RBI);
		I.eraseFromParent();
		return Ret;
		}

bool AMDGPUInstructionSelector::selectInterpP1F16(MachineInstr &MI) const {		bool AMDGPUInstructionSelector::selectInterpP1F16(MachineInstr &MI) const {
if (STI.getLDSBankCount() != 16)		if (STI.getLDSBankCount() != 16)
return selectImpl(MI, *CoverageInfo);		return selectImpl(MI, *CoverageInfo);
		arsenm (Done): You can just unconditionally materialize the constant into a register and let SIFoldOperands sort out the constant bus restriction

Register Dst = MI.getOperand(0).getReg();		Register Dst = MI.getOperand(0).getReg();
Register Src0 = MI.getOperand(2).getReg();		Register Src0 = MI.getOperand(2).getReg();
Register M0Val = MI.getOperand(6).getReg();		Register M0Val = MI.getOperand(6).getReg();
		arsenm (Done): You shouldn't need to special case the result constraint
if (!RBI.constrainGenericRegister(M0Val, AMDGPU::SReg_32RegClass, *MRI) ||		if (!RBI.constrainGenericRegister(M0Val, AMDGPU::SReg_32RegClass, *MRI) ||
!RBI.constrainGenericRegister(Dst, AMDGPU::VGPR_32RegClass, *MRI) ||		!RBI.constrainGenericRegister(Dst, AMDGPU::VGPR_32RegClass, *MRI) ||
!RBI.constrainGenericRegister(Src0, AMDGPU::VGPR_32RegClass, *MRI))		!RBI.constrainGenericRegister(Src0, AMDGPU::VGPR_32RegClass, *MRI))
return false;		return false;

// This requires 2 instructions. It is possible to write a pattern to support		// This requires 2 instructions. It is possible to write a pattern to support
// this, but the generated isel emitter doesn't correctly deal with multiple		// this, but the generated isel emitter doesn't correctly deal with multiple
// output instructions using the same physical register input. The copy to m0		// output instructions using the same physical register input. The copy to m0
▲ Show 20 Lines • Show All 2,380 Lines • ▼ Show 20 Lines	bool AMDGPUInstructionSelector::select(MachineInstr &I) {
case TargetOpcode::G_INTRINSIC:		case TargetOpcode::G_INTRINSIC:
return selectG_INTRINSIC(I);		return selectG_INTRINSIC(I);
case TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS:		case TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS:
return selectG_INTRINSIC_W_SIDE_EFFECTS(I);		return selectG_INTRINSIC_W_SIDE_EFFECTS(I);
case TargetOpcode::G_ICMP:		case TargetOpcode::G_ICMP:
if (selectG_ICMP(I))		if (selectG_ICMP(I))
return true;		return true;
return selectImpl(I, *CoverageInfo);		return selectImpl(I, *CoverageInfo);
		case TargetOpcode::G_IS_FPCLASS:
		return selectG_IS_FPCLASS(I);
case TargetOpcode::G_LOAD:		case TargetOpcode::G_LOAD:
case TargetOpcode::G_STORE:		case TargetOpcode::G_STORE:
case TargetOpcode::G_ATOMIC_CMPXCHG:		case TargetOpcode::G_ATOMIC_CMPXCHG:
case TargetOpcode::G_ATOMICRMW_XCHG:		case TargetOpcode::G_ATOMICRMW_XCHG:
case TargetOpcode::G_ATOMICRMW_ADD:		case TargetOpcode::G_ATOMICRMW_ADD:
case TargetOpcode::G_ATOMICRMW_SUB:		case TargetOpcode::G_ATOMICRMW_SUB:
case TargetOpcode::G_ATOMICRMW_AND:		case TargetOpcode::G_ATOMICRMW_AND:
case TargetOpcode::G_ATOMICRMW_OR:		case TargetOpcode::G_ATOMICRMW_OR:

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp

Show First 20 Lines • Show All 972 Lines • ▼ Show 20 Lines	AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
getActionDefinitionsBuilder(G_CTPOP)		getActionDefinitionsBuilder(G_CTPOP)
.legalFor({{S32, S32}, {S32, S64}})		.legalFor({{S32, S32}, {S32, S64}})
.clampScalar(0, S32, S32)		.clampScalar(0, S32, S32)
.widenScalarToNextPow2(1, 32)		.widenScalarToNextPow2(1, 32)
.clampScalar(1, S32, S64)		.clampScalar(1, S32, S64)
.scalarize(0)		.scalarize(0)
.widenScalarToNextPow2(0, 32);		.widenScalarToNextPow2(0, 32);

		getActionDefinitionsBuilder(G_IS_FPCLASS)
		.legalForCartesianProduct({S1}, ST.has16BitInsts() ? FPTypes16 : FPTypesBase)
		.widenScalarToNextPow2(1)
		.clampScalar(1, S32, S64)
		arsenm (Done): I think this clampScalar isn't doing anything and can be dropped
		.scalarize(0);

// The hardware instructions return a different result on 0 than the generic		// The hardware instructions return a different result on 0 than the generic
// instructions expect. The hardware produces -1, but these produce the		// instructions expect. The hardware produces -1, but these produce the
// bitwidth.		// bitwidth.
getActionDefinitionsBuilder({G_CTLZ, G_CTTZ})		getActionDefinitionsBuilder({G_CTLZ, G_CTTZ})
.scalarize(0)		.scalarize(0)
.clampScalar(0, S32, S32)		.clampScalar(0, S32, S32)
.clampScalar(1, S32, S64)		.clampScalar(1, S32, S64)

llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp

Show First 20 Lines • Show All 3,930 Lines • ▼ Show 20 Lines	AMDGPURegisterBankInfo::getInstrMapping(const MachineInstr &MI) const {
case AMDGPU::G_FCMP: {		case AMDGPU::G_FCMP: {
unsigned Size = MRI.getType(MI.getOperand(2).getReg()).getSizeInBits();		unsigned Size = MRI.getType(MI.getOperand(2).getReg()).getSizeInBits();
OpdsMapping[0] = AMDGPU::getValueMapping(AMDGPU::VCCRegBankID, 1);		OpdsMapping[0] = AMDGPU::getValueMapping(AMDGPU::VCCRegBankID, 1);
OpdsMapping[1] = nullptr; // Predicate Operand.		OpdsMapping[1] = nullptr; // Predicate Operand.
OpdsMapping[2] = AMDGPU::getValueMapping(AMDGPU::VGPRRegBankID, Size);		OpdsMapping[2] = AMDGPU::getValueMapping(AMDGPU::VGPRRegBankID, Size);
OpdsMapping[3] = AMDGPU::getValueMapping(AMDGPU::VGPRRegBankID, Size);		OpdsMapping[3] = AMDGPU::getValueMapping(AMDGPU::VGPRRegBankID, Size);
break;		break;
}		}
		case AMDGPU::G_IS_FPCLASS: {
		Register SrcReg = MI.getOperand(1).getReg();
		unsigned SrcSize = MRI.getType(SrcReg).getSizeInBits();
		unsigned DstSize = MRI.getType(MI.getOperand(0).getReg()).getSizeInBits();
		OpdsMapping[0] = AMDGPU::getValueMapping(AMDGPU::VCCRegBankID, DstSize);
		OpdsMapping[1] = AMDGPU::getValueMapping(AMDGPU::VGPRRegBankID, SrcSize);
		break;
		arsenm (Done): Pretty sure this default constructs to null
		}
case AMDGPU::G_STORE: {		case AMDGPU::G_STORE: {
assert(MI.getOperand(0).isReg());		assert(MI.getOperand(0).isReg());
unsigned Size = MRI.getType(MI.getOperand(0).getReg()).getSizeInBits();		unsigned Size = MRI.getType(MI.getOperand(0).getReg()).getSizeInBits();

// FIXME: We need to specify a different reg bank once scalar stores are		// FIXME: We need to specify a different reg bank once scalar stores are
// supported.		// supported.
const ValueMapping *ValMapping =		const ValueMapping *ValMapping =
AMDGPU::getValueMapping(AMDGPU::VGPRRegBankID, Size);		AMDGPU::getValueMapping(AMDGPU::VGPRRegBankID, Size);

llvm/lib/Target/AMDGPU/SIISelLowering.cpp


Show First 20 Lines • Show All 250 Lines • ▼ Show 20 Lines	for (unsigned Op = 0; Op < ISD::BUILTIN_OP_END; ++Op) {
case ISD::STORE:		case ISD::STORE:
case ISD::BUILD_VECTOR:		case ISD::BUILD_VECTOR:
case ISD::BITCAST:		case ISD::BITCAST:
case ISD::UNDEF:		case ISD::UNDEF:
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
case ISD::INSERT_VECTOR_ELT:		case ISD::INSERT_VECTOR_ELT:
case ISD::EXTRACT_SUBVECTOR:		case ISD::EXTRACT_SUBVECTOR:
case ISD::SCALAR_TO_VECTOR:		case ISD::SCALAR_TO_VECTOR:
		case ISD::IS_FPCLASS:
break;		break;
case ISD::INSERT_SUBVECTOR:		case ISD::INSERT_SUBVECTOR:
case ISD::CONCAT_VECTORS:		case ISD::CONCAT_VECTORS:
setOperationAction(Op, VT, Custom);		setOperationAction(Op, VT, Custom);
break;		break;
default:		default:
setOperationAction(Op, VT, Expand);		setOperationAction(Op, VT, Expand);
break;		break;
▲ Show 20 Lines • Show All 253 Lines • ▼ Show 20 Lines	for (MVT VT : {MVT::v2i16, MVT::v2f16, MVT::v4i16, MVT::v4f16, MVT::v8i16,
case ISD::BUILD_VECTOR:		case ISD::BUILD_VECTOR:
case ISD::BITCAST:		case ISD::BITCAST:
case ISD::UNDEF:		case ISD::UNDEF:
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
case ISD::INSERT_VECTOR_ELT:		case ISD::INSERT_VECTOR_ELT:
case ISD::INSERT_SUBVECTOR:		case ISD::INSERT_SUBVECTOR:
case ISD::EXTRACT_SUBVECTOR:		case ISD::EXTRACT_SUBVECTOR:
case ISD::SCALAR_TO_VECTOR:		case ISD::SCALAR_TO_VECTOR:
		case ISD::IS_FPCLASS:
break;		break;
case ISD::CONCAT_VECTORS:		case ISD::CONCAT_VECTORS:
setOperationAction(Op, VT, Custom);		setOperationAction(Op, VT, Custom);
break;		break;
default:		default:
setOperationAction(Op, VT, Expand);		setOperationAction(Op, VT, Expand);
break;		break;
}		}

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-fpclass-flags.ll

This file was added.

				; RUN: llc -march=amdgcn -mcpu=gfx1030 -O0 -stop-after=irtranslator -global-isel %s -o - | FileCheck %s

				arsenm (Not Done): -global-isel to front, also generate these checks
				; CHECK-LABEL: name: fpclass_has_nofpexcept
				; CHECK: nofpexcept G_IS_FPCLASS
				define i1 @fpclass_has_nofpexcept(float %x) {
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 3)
				ret i1 %1
				}

				; CHECK-LABEL: name: strict_fpclass
				; CHECK-NOT: nofpexcept
				define i1 @strict_fpclass(float %x) strictfp {
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 3)
				ret i1 %1
				}

				declare i1 @llvm.is.fpclass.f32(float, i32)
				arsenm (Not Done): Needs additional checks with other flags besides the one just set
				JanekvO (Author, Done): I've been wondering whether the flag copy from the IR intrinsic to G_IS_FPCLASS in IRTranslator should be removed altogether. I'd have to weaken the flags' constraints as they all require scalar or vector fp return types. Additionally, any use of fast math flags outside of existing uses will most likely require amending langref. E.g., current descriptions of some fast math flags describe how input can result in a poison value, but this wouldn't be possible for G_IS_FPCLASS as it's a bool return. Let me know what you think. I can see some of the flags being useful by folding into constant bool values (e.g., not a nan flag + G_IS_FPCLASS test for nans), but I may be a bit naïve on useful cases beyond said folding.
				arsenm (Done): OK, might as well drop this test if we have end to end tests and there's nothing unique to test in the IRTranslator
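For readers unfamiliar with the `i32 3` operand used in these tests, the sketch below models the `llvm.is.fpclass` test mask on a plain `float`. The bit order follows the LLVM LangRef (bit 0 is signaling NaN, bit 9 is positive infinity). Everything else is an assumption made for illustration: `FPClassBits`, `classOfBits`, `isFPClass` and `maskAssumingNoNans` are hypothetical names, not LLVM APIs, and the quiet/signaling split assumes the common convention that a set top mantissa bit means a quiet NaN. The last helper only illustrates the folding idea mentioned in the thread above (a NaN-only test under a no-NaNs assumption reduces to an empty mask); it does not describe what the patch implements.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Test-mask bits of llvm.is.fpclass, in LangRef order (bit 0 upward).
enum FPClassBits : uint32_t {
  SNan = 1u << 0,
  QNan = 1u << 1,
  NegInf = 1u << 2,
  NegNormal = 1u << 3,
  NegSubnormal = 1u << 4,
  NegZero = 1u << 5,
  PosZero = 1u << 6,
  PosSubnormal = 1u << 7,
  PosNormal = 1u << 8,
  PosInf = 1u << 9,
  AnyNan = SNan | QNan, // the "i32 3" used by the tests
  AllClasses = 0x3ff
};

// Hypothetical helper: map a binary32 bit pattern to exactly one class bit.
// Assumes a set top mantissa bit marks a quiet NaN.
static uint32_t classOfBits(uint32_t Bits) {
  const bool Neg = (Bits >> 31) != 0;
  const uint32_t Exp = (Bits >> 23) & 0xff;
  const uint32_t Mant = Bits & 0x7fffff;
  if (Exp == 0xff) {
    if (Mant == 0)
      return Neg ? NegInf : PosInf;
    return (Mant & 0x400000) ? QNan : SNan;
  }
  if (Exp == 0) {
    if (Mant == 0)
      return Neg ? NegZero : PosZero;
    return Neg ? NegSubnormal : PosSubnormal;
  }
  return Neg ? NegNormal : PosNormal;
}

// Model of the class test: true if X belongs to any class selected by Mask.
static bool isFPClass(float X, uint32_t Mask) {
  assert((Mask & ~static_cast<uint32_t>(AllClasses)) == 0 &&
         "mask uses only the low 10 bits");
  uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  return (classOfBits(Bits) & Mask) != 0;
}

// If NaNs are assumed absent, the NaN bits of the mask can never match.
// A result of 0 means the whole test would fold to false.
static uint32_t maskAssumingNoNans(uint32_t Mask) {
  return Mask & ~static_cast<uint32_t>(AnyNan);
}
```

With this model, `isFPClass(x, 3)` corresponds to the "is NaN" query used by `fpclass_has_nofpexcept` and `strict_fpclass`, and `maskAssumingNoNans(3)` is 0.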

llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.f16.ll

This file was added.

				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx704 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX7SELDAG %s
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX8SELDAG,GFX8CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX8GLISEL,GFX8CHECK %s
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX9SELDAG,GFX9CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX9GLISEL,GFX9CHECK %s
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1031 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX10SELDAG,GFX10CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1031 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX10GLISEL,GFX10CHECK %s
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX11SELDAG,GFX11CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX11GLISEL,GFX11CHECK %s

				define i1 @isnan_half(half %x) nounwind {
				; GFX7SELDAG-LABEL: isnan_half:
				; GFX7SELDAG: ; %bb.0:
				; GFX7SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v0, v0
				; GFX7SELDAG-NEXT: s_movk_i32 s4, 0x7c00
				; GFX7SELDAG-NEXT: v_and_b32_e32 v0, 0x7fff, v0
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v0
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX7GLISEL-LABEL: isnan_half:
				; GFX7GLISEL: ; %bb.0:
				; GFX7GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v0, v0
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_half:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_half:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_half:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f16_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_half:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f16(half %x, i32 3) ; nan
				ret i1 %1
				}
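The GFX7SELDAG checks for `isnan_half` above do not use a class instruction. They clear the sign bit with `v_and_b32 ... 0x7fff` and then compare `0x7c00 < value`, i.e. the value is a NaN when its magnitude bits are greater than the f16 infinity encoding. The sketch below restates that integer test in C++ and checks it against the textbook definition (exponent all ones, mantissa nonzero) for every 16-bit pattern. The function names are hypothetical and chosen for this illustration only.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t HalfAbsMask = 0x7fff; // clears the sign bit
constexpr uint16_t HalfInfBits = 0x7c00; // +infinity in binary16

// Integer-only NaN test mirroring the and/compare sequence in the checks.
static bool isNanHalfBits(uint16_t Bits) {
  return (Bits & HalfAbsMask) > HalfInfBits;
}

// Reference definition: exponent field all ones and a nonzero mantissa.
static bool isNanHalfReference(uint16_t Bits) {
  return ((Bits >> 10) & 0x1f) == 0x1f && (Bits & 0x3ff) != 0;
}

// Exhaustively compare both definitions over all binary16 bit patterns.
static void checkAllHalfPatterns() {
  for (uint32_t B = 0; B <= 0xffff; ++B) {
    const uint16_t H = static_cast<uint16_t>(B);
    assert(isNanHalfBits(H) == isNanHalfReference(H));
  }
}
```

Because the comparison looks only at the magnitude bits, it accepts quiet and signaling NaN encodings alike and rejects both infinities.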

				define <2 x i1> @isnan_v2half(<2 x half> %x) nounwind {
				; GFX7SELDAG-LABEL: isnan_v2half:
				; GFX7SELDAG: ; %bb.0:
				; GFX7SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v0, v0
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v1, v1
				; GFX7SELDAG-NEXT: s_movk_i32 s4, 0x7c00
				; GFX7SELDAG-NEXT: v_and_b32_e32 v0, 0x7fff, v0
				; GFX7SELDAG-NEXT: v_and_b32_e32 v1, 0x7fff, v1
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v0
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v1
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v1, 0, 1, vcc
				; GFX7SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX7GLISEL-LABEL: isnan_v2half:
				; GFX7GLISEL: ; %bb.0:
				; GFX7GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v0, v0
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v1, v1
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8SELDAG-LABEL: isnan_v2half:
				; GFX8SELDAG: ; %bb.0:
				; GFX8SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8SELDAG-NEXT: v_lshrrev_b32_e32 v1, 16, v0
				; GFX8SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				arsenm: This is broken for signaling NaNs. You dropped this from the patch but left these dead checks around.
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8GLISEL-LABEL: isnan_v2half:
				; GFX8GLISEL: ; %bb.0:
				; GFX8GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8GLISEL-NEXT: v_lshrrev_b32_e32 v1, 16, v0
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9SELDAG-LABEL: isnan_v2half:
				; GFX9SELDAG: ; %bb.0:
				; GFX9SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9SELDAG-NEXT: v_mov_b32_e32 v1, 3
				; GFX9SELDAG-NEXT: v_cmp_class_f16_sdwa s[4:5], v0, v1 src0_sel:WORD_1 src1_sel:DWORD
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9GLISEL-LABEL: isnan_v2half:
				; GFX9GLISEL: ; %bb.0:
				; GFX9GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v1, 3
				; GFX9GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_cmp_class_f16_sdwa s[4:5], v0, v1 src0_sel:WORD_1 src1_sel:DWORD
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v0, v2
				; GFX9GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v2half:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_mov_b32_e32 v1, 3
				; GFX10CHECK-NEXT: v_cmp_class_f16_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f16_sdwa s4, v0, v1 src0_sel:WORD_1 src1_sel:DWORD
				; GFX10CHECK-NEXT: v_mov_b32_e32 v0, v2
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v2half:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_lshrrev_b32_e32 v1, 16, v0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_3)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v1, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <2 x i1> @llvm.is.fpclass.v2f16(<2 x half> %x, i32 3) ; nan
				ret <2 x i1> %1
				}

				define <3 x i1> @isnan_v3half(<3 x half> %x) nounwind {
				; GFX7SELDAG-LABEL: isnan_v3half:
				; GFX7SELDAG: ; %bb.0:
				; GFX7SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v0, v0
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v1, v1
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v2, v2
				; GFX7SELDAG-NEXT: s_movk_i32 s4, 0x7c00
				; GFX7SELDAG-NEXT: v_and_b32_e32 v0, 0x7fff, v0
				; GFX7SELDAG-NEXT: v_and_b32_e32 v1, 0x7fff, v1
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v0
				; GFX7SELDAG-NEXT: v_and_b32_e32 v2, 0x7fff, v2
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v1
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v1, 0, 1, vcc
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v2
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, vcc
				; GFX7SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX7GLISEL-LABEL: isnan_v3half:
				; GFX7GLISEL: ; %bb.0:
				; GFX7GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v0, v0
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v1, v1
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v2, v2
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8SELDAG-LABEL: isnan_v3half:
				; GFX8SELDAG: ; %bb.0:
				; GFX8SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8SELDAG-NEXT: v_lshrrev_b32_e32 v2, 16, v0
				; GFX8SELDAG-NEXT: v_cmp_u_f16_e32 vcc, v2, v2
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, vcc
				; GFX8SELDAG-NEXT: v_cmp_u_f16_e32 vcc, v0, v0
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8SELDAG-NEXT: v_cmp_u_f16_e32 vcc, v1, v1
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, vcc
				; GFX8SELDAG-NEXT: v_mov_b32_e32 v1, v3
				; GFX8SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8GLISEL-LABEL: isnan_v3half:
				; GFX8GLISEL: ; %bb.0:
				; GFX8GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8GLISEL-NEXT: v_lshrrev_b32_e32 v2, 16, v0
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v2, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_mov_b32_e32 v1, v3
				; GFX8GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9SELDAG-LABEL: isnan_v3half:
				; GFX9SELDAG: ; %bb.0:
				; GFX9SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9SELDAG-NEXT: v_cmp_u_f16_sdwa s[4:5], v0, v0 src0_sel:WORD_1 src1_sel:WORD_1
				; GFX9SELDAG-NEXT: v_cmp_u_f16_e32 vcc, v0, v0
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9SELDAG-NEXT: v_cmp_u_f16_e32 vcc, v1, v1
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, vcc
				; GFX9SELDAG-NEXT: v_mov_b32_e32 v1, v3
				; GFX9SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9GLISEL-LABEL: isnan_v3half:
				; GFX9GLISEL: ; %bb.0:
				; GFX9GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v2, 3
				; GFX9GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_cmp_class_f16_sdwa s[4:5], v0, v2 src0_sel:WORD_1 src1_sel:DWORD
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v0, v4
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v1, v3
				; GFX9GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10SELDAG-LABEL: isnan_v3half:
				; GFX10SELDAG: ; %bb.0:
				; GFX10SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10SELDAG-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10SELDAG-NEXT: v_cmp_u_f16_sdwa s4, v0, v0 src0_sel:WORD_1 src1_sel:WORD_1
				; GFX10SELDAG-NEXT: v_cmp_u_f16_e32 vcc_lo, v0, v0
				; GFX10SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc_lo
				; GFX10SELDAG-NEXT: v_cmp_u_f16_e32 vcc_lo, v1, v1
				; GFX10SELDAG-NEXT: v_mov_b32_e32 v1, v3
				; GFX10SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, vcc_lo
				; GFX10SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10GLISEL-LABEL: isnan_v3half:
				; GFX10GLISEL: ; %bb.0:
				; GFX10GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10GLISEL-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10GLISEL-NEXT: v_mov_b32_e32 v2, 3
				; GFX10GLISEL-NEXT: v_cmp_class_f16_e64 s4, v0, 3
				; GFX10GLISEL-NEXT: v_cndmask_b32_e64 v4, 0, 1, s4
				; GFX10GLISEL-NEXT: v_cmp_class_f16_sdwa s4, v0, v2 src0_sel:WORD_1 src1_sel:DWORD
				; GFX10GLISEL-NEXT: v_mov_b32_e32 v0, v4
				; GFX10GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10GLISEL-NEXT: v_cmp_class_f16_e64 s4, v1, 3
				; GFX10GLISEL-NEXT: v_mov_b32_e32 v1, v3
				; GFX10GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11SELDAG-LABEL: isnan_v3half:
				; GFX11SELDAG: ; %bb.0:
				; GFX11SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11SELDAG-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11SELDAG-NEXT: v_lshrrev_b32_e32 v2, 16, v0
				; GFX11SELDAG-NEXT: v_cmp_u_f16_e32 vcc_lo, v0, v0
				; GFX11SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc_lo
				; GFX11SELDAG-NEXT: s_delay_alu instid0(VALU_DEP_3) | instskip(SKIP_2) | instid1(VALU_DEP_2)
				; GFX11SELDAG-NEXT: v_cmp_u_f16_e32 vcc_lo, v2, v2
				; GFX11SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, vcc_lo
				; GFX11SELDAG-NEXT: v_cmp_u_f16_e32 vcc_lo, v1, v1
				; GFX11SELDAG-NEXT: v_mov_b32_e32 v1, v3
				; GFX11SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, vcc_lo
				; GFX11SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11GLISEL-LABEL: isnan_v3half:
				; GFX11GLISEL: ; %bb.0:
				; GFX11GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11GLISEL-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11GLISEL-NEXT: v_lshrrev_b32_e32 v2, 16, v0
				; GFX11GLISEL-NEXT: v_cmp_class_f16_e64 s0, v0, 3
				; GFX11GLISEL-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_3)
				; GFX11GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11GLISEL-NEXT: v_cmp_class_f16_e64 s0, v2, 3
				; GFX11GLISEL-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_2)
				; GFX11GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11GLISEL-NEXT: v_cmp_class_f16_e64 s0, v1, 3
				; GFX11GLISEL-NEXT: v_mov_b32_e32 v1, v3
				; GFX11GLISEL-NEXT: s_delay_alu instid0(VALU_DEP_2)
				; GFX11GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11GLISEL-NEXT: s_setpc_b64 s[30:31]
				%1 = call <3 x i1> @llvm.is.fpclass.v3f16(<3 x half> %x, i32 3) ; nan
				ret <3 x i1> %1
				}

				define <4 x i1> @isnan_v4half(<4 x half> %x) nounwind {
				; GFX7SELDAG-LABEL: isnan_v4half:
				; GFX7SELDAG: ; %bb.0:
				; GFX7SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v0, v0
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v1, v1
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v2, v2
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v3, v3
				; GFX7SELDAG-NEXT: s_movk_i32 s4, 0x7c00
				; GFX7SELDAG-NEXT: v_and_b32_e32 v0, 0x7fff, v0
				; GFX7SELDAG-NEXT: v_and_b32_e32 v1, 0x7fff, v1
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v0
				; GFX7SELDAG-NEXT: v_and_b32_e32 v2, 0x7fff, v2
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v1
				; GFX7SELDAG-NEXT: v_and_b32_e32 v3, 0x7fff, v3
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v1, 0, 1, vcc
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v2
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, vcc
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v3
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, vcc
				; GFX7SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX7GLISEL-LABEL: isnan_v4half:
				; GFX7GLISEL: ; %bb.0:
				; GFX7GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v0, v0
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v1, v1
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v2, v2
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v3, v3
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8SELDAG-LABEL: isnan_v4half:
				; GFX8SELDAG: ; %bb.0:
				; GFX8SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8SELDAG-NEXT: v_lshrrev_b32_e32 v2, 16, v0
				; GFX8SELDAG-NEXT: v_lshrrev_b32_e32 v3, 16, v1
				; GFX8SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v2, 3
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX8SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v3, 3
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX8SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8SELDAG-NEXT: v_mov_b32_e32 v1, v4
				; GFX8SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8GLISEL-LABEL: isnan_v4half:
				; GFX8GLISEL: ; %bb.0:
				; GFX8GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8GLISEL-NEXT: v_lshrrev_b32_e32 v2, 16, v0
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v2, 3
				; GFX8GLISEL-NEXT: v_lshrrev_b32_e32 v3, 16, v1
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v3, 3
				; GFX8GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8GLISEL-NEXT: v_mov_b32_e32 v1, v4
				; GFX8GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9SELDAG-LABEL: isnan_v4half:
				; GFX9SELDAG: ; %bb.0:
				; GFX9SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9SELDAG-NEXT: v_mov_b32_e32 v2, 3
				; GFX9SELDAG-NEXT: v_cmp_class_f16_sdwa s[4:5], v0, v2 src0_sel:WORD_1 src1_sel:DWORD
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9SELDAG-NEXT: v_cmp_class_f16_sdwa s[4:5], v1, v2 src0_sel:WORD_1 src1_sel:DWORD
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9SELDAG-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX9SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9SELDAG-NEXT: v_mov_b32_e32 v1, v4
				; GFX9SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9GLISEL-LABEL: isnan_v4half:
				; GFX9GLISEL: ; %bb.0:
				; GFX9GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v3, 3
				; GFX9GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_cmp_class_f16_sdwa s[4:5], v0, v3 src0_sel:WORD_1 src1_sel:DWORD
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_cmp_class_f16_e64 s[4:5], v1, 3
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_cmp_class_f16_sdwa s[4:5], v1, v3 src0_sel:WORD_1 src1_sel:DWORD
				; GFX9GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v0, v4
				; GFX9GLISEL-NEXT: v_mov_b32_e32 v1, v5
				; GFX9GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10SELDAG-LABEL: isnan_v4half:
				; GFX10SELDAG: ; %bb.0:
				; GFX10SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10SELDAG-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10SELDAG-NEXT: v_mov_b32_e32 v2, 3
				; GFX10SELDAG-NEXT: v_cmp_class_f16_sdwa s4, v1, v2 src0_sel:WORD_1 src1_sel:DWORD
				; GFX10SELDAG-NEXT: v_cmp_class_f16_sdwa s5, v0, v2 src0_sel:WORD_1 src1_sel:DWORD
				; GFX10SELDAG-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10SELDAG-NEXT: v_cndmask_b32_e64 v4, 0, 1, s5
				; GFX10SELDAG-NEXT: v_cmp_class_f16_e64 s5, v0, 3
				; GFX10SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, s5
				; GFX10SELDAG-NEXT: v_cmp_class_f16_e64 s5, v1, 3
				; GFX10SELDAG-NEXT: v_mov_b32_e32 v1, v4
				; GFX10SELDAG-NEXT: v_cndmask_b32_e64 v2, 0, 1, s5
				; GFX10SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10GLISEL-LABEL: isnan_v4half:
				; GFX10GLISEL: ; %bb.0:
				; GFX10GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10GLISEL-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10GLISEL-NEXT: v_mov_b32_e32 v3, 3
				; GFX10GLISEL-NEXT: v_cmp_class_f16_e64 s4, v0, 3
				; GFX10GLISEL-NEXT: v_cndmask_b32_e64 v4, 0, 1, s4
				; GFX10GLISEL-NEXT: v_cmp_class_f16_sdwa s4, v0, v3 src0_sel:WORD_1 src1_sel:DWORD
				; GFX10GLISEL-NEXT: v_mov_b32_e32 v0, v4
				; GFX10GLISEL-NEXT: v_cndmask_b32_e64 v5, 0, 1, s4
				; GFX10GLISEL-NEXT: v_cmp_class_f16_e64 s4, v1, 3
				; GFX10GLISEL-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10GLISEL-NEXT: v_cmp_class_f16_sdwa s4, v1, v3 src0_sel:WORD_1 src1_sel:DWORD
				; GFX10GLISEL-NEXT: v_mov_b32_e32 v1, v5
				; GFX10GLISEL-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v4half:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v0, 3
				; GFX11CHECK-NEXT: v_lshrrev_b32_e32 v3, 16, v0
				; GFX11CHECK-NEXT: v_lshrrev_b32_e32 v4, 16, v1
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_3) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v3, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v4, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <4 x i1> @llvm.is.fpclass.v4f16(<4 x half> %x, i32 3) ; nan
				ret <4 x i1> %1
				}

				define i1 @isnan_half_strictfp(half %x) strictfp nounwind {
				; GFX7SELDAG-LABEL: isnan_half_strictfp:
				; GFX7SELDAG: ; %bb.0:
				; GFX7SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v0, v0
				; GFX7SELDAG-NEXT: s_movk_i32 s4, 0x7c00
				; GFX7SELDAG-NEXT: v_and_b32_e32 v0, 0x7fff, v0
				; GFX7SELDAG-NEXT: v_cmp_lt_i32_e32 vcc, s4, v0
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX7GLISEL-LABEL: isnan_half_strictfp:
				; GFX7GLISEL: ; %bb.0:
				; GFX7GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v0, v0
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_half_strictfp:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_half_strictfp:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f16_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_half_strictfp:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f16_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_half_strictfp:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f16(half %x, i32 3) ; nan
				ret i1 %1
				}

				define i1 @isinf_half(half %x) nounwind {
				; GFX7SELDAG-LABEL: isinf_half:
				; GFX7SELDAG: ; %bb.0:
				; GFX7SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v0, v0
				; GFX7SELDAG-NEXT: s_movk_i32 s4, 0x7c00
				; GFX7SELDAG-NEXT: v_and_b32_e32 v0, 0x7fff, v0
				; GFX7SELDAG-NEXT: v_cmp_eq_u32_e32 vcc, s4, v0
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX7GLISEL-LABEL: isinf_half:
				; GFX7GLISEL: ; %bb.0:
				; GFX7GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v0, v0
				; GFX7GLISEL-NEXT: v_mov_b32_e32 v1, 0x204
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isinf_half:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v1, 0x204
				; GFX8CHECK-NEXT: v_cmp_class_f16_e32 vcc, v0, v1
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isinf_half:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v1, 0x204
				; GFX9CHECK-NEXT: v_cmp_class_f16_e32 vcc, v0, v1
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isinf_half:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f16_e64 s4, v0, 0x204
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isinf_half:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v0, 0x204
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f16(half %x, i32 516) ; 0x204 = "inf"
				ret i1 %1
				}

				define i1 @isfinite_half(half %x) nounwind {
				; GFX7SELDAG-LABEL: isfinite_half:
				; GFX7SELDAG: ; %bb.0:
				; GFX7SELDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7SELDAG-NEXT: v_cvt_f16_f32_e32 v0, v0
				; GFX7SELDAG-NEXT: s_movk_i32 s4, 0x7c00
				; GFX7SELDAG-NEXT: v_and_b32_e32 v0, 0x7fff, v0
				; GFX7SELDAG-NEXT: v_cmp_gt_i32_e32 vcc, s4, v0
				; GFX7SELDAG-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7SELDAG-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX7GLISEL-LABEL: isfinite_half:
				; GFX7GLISEL: ; %bb.0:
				; GFX7GLISEL-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7GLISEL-NEXT: v_cvt_f32_f16_e32 v0, v0
				; GFX7GLISEL-NEXT: v_mov_b32_e32 v1, 0x1f8
				; GFX7GLISEL-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX7GLISEL-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7GLISEL-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isfinite_half:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v1, 0x1f8
				; GFX8CHECK-NEXT: v_cmp_class_f16_e32 vcc, v0, v1
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isfinite_half:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v1, 0x1f8
				; GFX9CHECK-NEXT: v_cmp_class_f16_e32 vcc, v0, v1
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isfinite_half:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f16_e64 s4, v0, 0x1f8
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isfinite_half:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f16_e64 s0, v0, 0x1f8
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f16(half %x, i32 504) ; 0x1f8 = "finite"
				ret i1 %1
				}
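The mask constants passed to `llvm.is.fpclass` in the calls above (3, 516 = 0x204, 504 = 0x1f8) are bitmasks over ten floating-point classes. As a rough aid for reading these tests, the sketch below decodes a mask into class names. It assumes the bit layout of LLVM's `FPClassTest` enum; the helper name `decode_fpclass_mask` and the short class labels are made up for illustration and are not part of the patch.

```python
# Bit layout of the llvm.is.fpclass test mask, assumed to follow
# LLVM's FPClassTest enum (bit 0 = signaling NaN ... bit 9 = +infinity).
FP_CLASS_BITS = {
    0x001: "snan",
    0x002: "qnan",
    0x004: "-inf",
    0x008: "-normal",
    0x010: "-subnormal",
    0x020: "-zero",
    0x040: "+zero",
    0x080: "+subnormal",
    0x100: "+normal",
    0x200: "+inf",
}


def decode_fpclass_mask(mask: int) -> list[str]:
    """Return the class names selected by a 10-bit is.fpclass mask.

    Hypothetical helper for reading the tests; not an LLVM API.
    """
    if mask & ~0x3FF:
        raise ValueError(f"mask {mask:#x} has bits outside the 10 defined classes")
    # Dict insertion order gives the names from lowest bit to highest bit.
    return [name for bit, name in FP_CLASS_BITS.items() if mask & bit]


if __name__ == "__main__":
    # The three masks used by the half tests: nan, inf, finite.
    for mask in (3, 516, 504):
        print(f"{mask:#05x}: {', '.join(decode_fpclass_mask(mask))}")
```

Read this way, mask 3 selects both NaN kinds, 0x204 selects the two infinities, and 0x1f8 selects every zero, subnormal and normal class, which matches the `; nan`, `"inf"` and `"finite"` annotations on the calls.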

				declare i1 @llvm.is.fpclass.f16(half, i32)
				declare <2 x i1> @llvm.is.fpclass.v2f16(<2 x half>, i32)
				declare <3 x i1> @llvm.is.fpclass.v3f16(<3 x half>, i32)
				declare <4 x i1> @llvm.is.fpclass.v4f16(<4 x half>, i32)
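The GFX7SELDAG checks in this file do not use a class instruction for `half`. They mask off the sign bit with `0x7fff` and compare the result against `0x7c00` as an integer (greater-than for NaN, equal for infinity, less-than for finite). The sketch below mirrors that bit test on the host so the three comparisons can be checked by hand. The function names are hypothetical, and it models only the comparison step, not the `v_cvt_f16_f32` conversion that precedes it.

```python
import struct

F16_ABS_MASK = 0x7FFF  # clears the sign bit (the v_and_b32 with 0x7fff)
F16_INF_BITS = 0x7C00  # exponent all ones, mantissa zero (the s_movk_i32 constant)


def half_bits(value: float) -> int:
    """Encode a Python float as IEEE half precision and return its 16 raw bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def is_nan_bits(bits: int) -> bool:
    # Mirrors v_cmp_lt_i32 vcc, 0x7c00, abs: NaN encodings sit above infinity.
    return (bits & F16_ABS_MASK) > F16_INF_BITS


def is_inf_bits(bits: int) -> bool:
    # Mirrors v_cmp_eq_u32 vcc, 0x7c00, abs.
    return (bits & F16_ABS_MASK) == F16_INF_BITS


def is_finite_bits(bits: int) -> bool:
    # Mirrors v_cmp_gt_i32 vcc, 0x7c00, abs: everything below infinity is finite.
    return (bits & F16_ABS_MASK) < F16_INF_BITS


if __name__ == "__main__":
    for name, value in [("1.0", 1.0), ("-inf", float("-inf")), ("nan", float("nan"))]:
        b = half_bits(value)
        print(name, hex(b), is_nan_bits(b), is_inf_bits(b), is_finite_bits(b))
```

Because the comparison is done on integer bits, a signaling NaN pattern such as `0x7d00` is classified the same way as a quiet one.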

llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll

This file was added.

				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx704 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX7CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx704 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX7CHECK %s
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX8CHECK %s
				arsenm: Should use some shared prefixes; a lot of these functions are the same. Also needs gfx7 and gfx8 run lines for the half promotion.
				JanekvO (author): I'm not that well versed in how gfx7 should do half promotion. I feel like either the gfx7 SelectionDAG or the gfx7 GlobalISel half promotion tests are incorrect (and if not, the SelectionDAG version does seem suboptimal).
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX8CHECK %s
				arsenm: Should also test/handle GlobalISel.
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX9CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX9CHECK %s
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1031 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX10CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1031 -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX10CHECK %s
				; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck --check-prefix=GFX11CHECK %s
				; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck --check-prefix=GFX11CHECK %s

				define i1 @isnan_float(float %x) nounwind {
				arsenm: Can you also add some cases where the input will be an SGPR?
				; GFX7CHECK-LABEL: isnan_float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 3) ; nan
				ret i1 %1
				}

				define <2 x i1> @isnan_v2float(<2 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v2float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v2float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v2float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v2float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v2float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <2 x i1> @llvm.is.fpclass.v2f32(<2 x float> %x, i32 3) ; nan
				ret <2 x i1> %1
				}

				define <3 x i1> @isnan_v3float(<3 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v3float:
				arsenm: s/float/f32 in these function names
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v3float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v3float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v3float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v2, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v3float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v2, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <3 x i1> @llvm.is.fpclass.v3f32(<3 x float> %x, i32 3) ; nan
				ret <3 x i1> %1
				}

				define <4 x i1> @isnan_v4float(<4 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v4float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v4float:
				arsenm: Should add some vector cases too
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v4float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v4float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v2, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v3, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v4float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v2, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v3, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> %x, i32 3) ; nan
				ret <4 x i1> %1
				}

				define <5 x i1> @isnan_v5float(<5 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v5float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v5float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v5float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v5float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v2, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v3, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v4, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v5float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v2, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v3, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v4, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <5 x i1> @llvm.is.fpclass.v5f32(<5 x float> %x, i32 3) ; nan
				ret <5 x i1> %1
				}

				define <6 x i1> @isnan_v6float(<6 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v6float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v6float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v6float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v6float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v2, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v3, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v4, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v5, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v6float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v2, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v3, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v4, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v5, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <6 x i1> @llvm.is.fpclass.v6f32(<6 x float> %x, i32 3) ; nan
				ret <6 x i1> %1
				}

				define <7 x i1> @isnan_v7float(<7 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v7float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v7float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v7float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v7float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v2, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v3, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v4, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v5, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v6, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v7float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v2, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v3, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v4, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v5, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v6, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <7 x i1> @llvm.is.fpclass.v7f32(<7 x float> %x, i32 3) ; nan
				ret <7 x i1> %1
				}

				define <8 x i1> @isnan_v8float(<8 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v8float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v7, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v8float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v7, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v8float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v7, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v8float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v2, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v3, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v4, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v5, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v6, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v7, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v8float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v2, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v3, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v4, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v5, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v6, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v7, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <8 x i1> @llvm.is.fpclass.v8f32(<8 x float> %x, i32 3) ; nan
				ret <8 x i1> %1
				}

				define <16 x i1> @isnan_v16float(<16 x float> %x) nounwind {
				; GFX7CHECK-LABEL: isnan_v16float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v7, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v8, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v8, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v9, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v9, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v10, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v10, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v11, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v11, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v12, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v12, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v13, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v13, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v14, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v14, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v15, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v15, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_v16float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v7, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v8, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v8, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v9, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v9, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v10, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v10, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v11, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v11, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v12, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v12, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v13, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v13, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v14, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v14, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v15, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v15, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_v16float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v1, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v2, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v3, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v4, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v5, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v6, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v7, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v8, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v8, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v9, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v9, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v10, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v10, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v11, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v11, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v12, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v12, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v13, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v13, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v14, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v14, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v15, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v15, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_v16float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v1, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v2, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v3, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v4, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v5, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v6, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v7, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v8, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v8, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v9, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v9, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v10, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v10, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v11, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v11, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v12, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v12, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v13, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v13, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v14, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v14, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v15, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v15, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_v16float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v1, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v2, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v2, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v3, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v4, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v4, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v5, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v5, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v6, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v6, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v7, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v7, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v8, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v8, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v9, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v9, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v10, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v10, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v11, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v11, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v12, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v12, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v13, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v13, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v14, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v14, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v15, 3
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v15, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <16 x i1> @llvm.is.fpclass.v16f32(<16 x float> %x, i32 3) ; nan
				ret <16 x i1> %1
				}

				define i1 @isnan_double(double %x) nounwind {
				; GFX7CHECK-LABEL: isnan_double:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f64_e64 s[4:5], v[0:1], 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_double:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f64_e64 s[4:5], v[0:1], 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_double:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f64_e64 s[4:5], v[0:1], 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_double:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f64_e64 s4, v[0:1], 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_double:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f64_e64 s0, v[0:1], 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f64(double %x, i32 3) ; nan
				ret i1 %1
				}

				define i1 @isnan_float_strictfp(float %x) strictfp nounwind {
				; GFX7CHECK-LABEL: isnan_float_strictfp:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_float_strictfp:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_float_strictfp:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f32_e64 s[4:5], v0, 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_float_strictfp:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_float_strictfp:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 3) ; nan
				ret i1 %1
				}

				define i1 @isnan_double_strictfp(double %x) strictfp nounwind {
				; GFX7CHECK-LABEL: isnan_double_strictfp:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_cmp_class_f64_e64 s[4:5], v[0:1], 3
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnan_double_strictfp:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_cmp_class_f64_e64 s[4:5], v[0:1], 3
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnan_double_strictfp:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_cmp_class_f64_e64 s[4:5], v[0:1], 3
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnan_double_strictfp:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f64_e64 s4, v[0:1], 3
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnan_double_strictfp:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f64_e64 s0, v[0:1], 3
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f64(double %x, i32 3) ; nan
				ret i1 %1
				}

				define i1 @isinf_float(float %x) nounwind {
				; GFX7CHECK-LABEL: isinf_float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v1, 0x204
				; GFX7CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isinf_float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v1, 0x204
				; GFX8CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isinf_float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v1, 0x204
				; GFX9CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isinf_float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 0x204
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isinf_float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 0x204
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 516) ; 0x204 = "inf"
				ret i1 %1
				}

				define i1 @isinf_double(double %x) nounwind {
				; GFX7CHECK-LABEL: isinf_double:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v2, 0x204
				; GFX7CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v2
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isinf_double:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v2, 0x204
				; GFX8CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v2
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isinf_double:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v2, 0x204
				; GFX9CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v2
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isinf_double:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f64_e64 s4, v[0:1], 0x204
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isinf_double:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f64_e64 s0, v[0:1], 0x204
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f64(double %x, i32 516) ; 0x204 = "inf"
				ret i1 %1
				}

				define i1 @isfinite_float(float %x) nounwind {
				; GFX7CHECK-LABEL: isfinite_float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v1, 0x1f8
				; GFX7CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isfinite_float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v1, 0x1f8
				; GFX8CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isfinite_float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v1, 0x1f8
				; GFX9CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isfinite_float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 0x1f8
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isfinite_float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 0x1f8
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 504) ; 0x1f8 = "finite"
				ret i1 %1
				}

				define i1 @isfinite_double(double %x) nounwind {
				; GFX7CHECK-LABEL: isfinite_double:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v2, 0x1f8
				; GFX7CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v2
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isfinite_double:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v2, 0x1f8
				; GFX8CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v2
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isfinite_double:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v2, 0x1f8
				; GFX9CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v2
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isfinite_double:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f64_e64 s4, v[0:1], 0x1f8
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isfinite_double:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f64_e64 s0, v[0:1], 0x1f8
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f64(double %x, i32 504) ; 0x1f8 = "finite"
				ret i1 %1
				}

				define i1 @isnormal_float(float %x) nounwind {
				; GFX7CHECK-LABEL: isnormal_float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v1, 0x108
				; GFX7CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnormal_float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v1, 0x108
				; GFX8CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnormal_float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v1, 0x108
				; GFX9CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnormal_float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 0x108
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnormal_float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 0x108
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 264) ; 0x108 = "normal"
				ret i1 %1
				}

				define <2 x i1> @isnormal_v2double(<2 x double> %x) nounwind {
				; GFX7CHECK-LABEL: isnormal_v2double:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v4, 0x108
				; GFX7CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v4
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[2:3], v4
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: isnormal_v2double:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v4, 0x108
				; GFX8CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v4
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[2:3], v4
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: isnormal_v2double:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v4, 0x108
				; GFX9CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[0:1], v4
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: v_cmp_class_f64_e32 vcc, v[2:3], v4
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: isnormal_v2double:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f64_e64 s4, v[0:1], 0x108
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: v_cmp_class_f64_e64 s4, v[2:3], 0x108
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: isnormal_v2double:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f64_e64 s0, v[0:1], 0x108
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(SKIP_1) | instid1(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: v_cmp_class_f64_e64 s0, v[2:3], 0x108
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v1, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call <2 x i1> @llvm.is.fpclass.v2f64(<2 x double> %x, i32 264) ; 0x108 = "normal"
				ret <2 x i1> %1
				}

				define i1 @issubnormal_float(float %x) nounwind {
				; GFX7CHECK-LABEL: issubnormal_float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v1, 0x90
				; GFX7CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: issubnormal_float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v1, 0x90
				; GFX8CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: issubnormal_float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v1, 0x90
				; GFX9CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: issubnormal_float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 0x90
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: issubnormal_float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 0x90
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 144) ; 0x90 = "subnormal"
				ret i1 %1
				}

				define i1 @iszero_float(float %x) nounwind {
				; GFX7CHECK-LABEL: iszero_float:
				; GFX7CHECK: ; %bb.0:
				; GFX7CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX7CHECK-NEXT: v_mov_b32_e32 v1, 0x60
				; GFX7CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX7CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX7CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX8CHECK-LABEL: iszero_float:
				; GFX8CHECK: ; %bb.0:
				; GFX8CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX8CHECK-NEXT: v_mov_b32_e32 v1, 0x60
				; GFX8CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX8CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX8CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX9CHECK-LABEL: iszero_float:
				; GFX9CHECK: ; %bb.0:
				; GFX9CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX9CHECK-NEXT: v_mov_b32_e32 v1, 0x60
				; GFX9CHECK-NEXT: v_cmp_class_f32_e32 vcc, v0, v1
				; GFX9CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, vcc
				; GFX9CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX10CHECK-LABEL: iszero_float:
				; GFX10CHECK: ; %bb.0:
				; GFX10CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX10CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX10CHECK-NEXT: v_cmp_class_f32_e64 s4, v0, 0x60
				; GFX10CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s4
				; GFX10CHECK-NEXT: s_setpc_b64 s[30:31]
				;
				; GFX11CHECK-LABEL: iszero_float:
				; GFX11CHECK: ; %bb.0:
				; GFX11CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
				; GFX11CHECK-NEXT: s_waitcnt_vscnt null, 0x0
				; GFX11CHECK-NEXT: v_cmp_class_f32_e64 s0, v0, 0x60
				; GFX11CHECK-NEXT: s_delay_alu instid0(VALU_DEP_1)
				; GFX11CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s0
				; GFX11CHECK-NEXT: s_setpc_b64 s[30:31]
				%1 = call i1 @llvm.is.fpclass.f32(float %x, i32 96) ; 0x60 = "zero"
				ret i1 %1
				}

				declare i1 @llvm.is.fpclass.f32(float, i32)
				declare i1 @llvm.is.fpclass.f64(double, i32)
				declare <2 x i1> @llvm.is.fpclass.v2f32(<2 x float>, i32)
				declare <3 x i1> @llvm.is.fpclass.v3f32(<3 x float>, i32)
				declare <4 x i1> @llvm.is.fpclass.v4f32(<4 x float>, i32)
				declare <5 x i1> @llvm.is.fpclass.v5f32(<5 x float>, i32)
				declare <6 x i1> @llvm.is.fpclass.v6f32(<6 x float>, i32)
				declare <7 x i1> @llvm.is.fpclass.v7f32(<7 x float>, i32)
				declare <8 x i1> @llvm.is.fpclass.v8f32(<8 x float>, i32)
				declare <16 x i1> @llvm.is.fpclass.v16f32(<16 x float>, i32)
				declare <2 x i1> @llvm.is.fpclass.v2f64(<2 x double>, i32)
				arsenm: v3f16 and v4f16 are also potentially interesting
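For illustration only, here is a minimal sketch of what the suggested half-precision vector cases might look like, written in the same style as the existing tests. The function names `isnan_v3f16` and `isnan_v4f16` are hypothetical and not taken from the patch. The CHECK lines are deliberately omitted, since the expected assembly for each subtarget is not known here and would presumably be regenerated by a script such as `utils/update_llc_test_checks.py` rather than written by hand.

```llvm
; Hypothetical test cases (not part of the reviewed diff): NaN check on
; half-precision vectors. A mask of 3 selects signaling NaN (bit 0) and
; quiet NaN (bit 1), matching the "nan" tests above.
define <3 x i1> @isnan_v3f16(<3 x half> %x) nounwind {
  %1 = call <3 x i1> @llvm.is.fpclass.v3f16(<3 x half> %x, i32 3) ; nan
  ret <3 x i1> %1
}

define <4 x i1> @isnan_v4f16(<4 x half> %x) nounwind {
  %1 = call <4 x i1> @llvm.is.fpclass.v4f16(<4 x half> %x, i32 3) ; nan
  ret <4 x i1> %1
}

declare <3 x i1> @llvm.is.fpclass.v3f16(<3 x half>, i32)
declare <4 x i1> @llvm.is.fpclass.v4f16(<4 x half>, i32)
```

The odd-sized v3f16 case is the more likely of the two to exercise vector widening or splitting during legalization, which may be why the reviewer singles it out.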