This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

(n/m after a path = inline comments marked done / total)

llvm/
  lib/Target/AMDGPU/
    AsmParser/
      AMDGPUAsmParser.cpp
    Disassembler/
      AMDGPUDisassembler.h
      AMDGPUDisassembler.cpp
    GCNDPPCombine.cpp (3/4)
    SIFoldOperands.cpp
    SIInstrInfo.cpp (3/4)
    SIInstrInfo.td
    SIInstructions.td (4/5)
    SIRegisterInfo.td (3/6)
    SIShrinkInstructions.cpp (3/3)
    Utils/
      AMDGPUBaseInfo.h
      AMDGPUBaseInfo.cpp
    VOP1Instructions.td
    VOP2Instructions.td
    VOP3Instructions.td (2/2)
    VOPCInstructions.td
    VOPInstructions.td
  test/
    CodeGen/AMDGPU/
      GlobalISel/
        fma.ll
        inst-select-ashr.s16.mir
        inst-select-fcanonicalize.mir
        inst-select-fcmp.s16.mir
        inst-select-fmaxnum-ieee.s16.mir
        inst-select-fmaxnum.s16.mir
        inst-select-fminnum-ieee.s16.mir
        inst-select-fminnum.s16.mir
        inst-select-fptosi.mir
        inst-select-fptoui.mir
        inst-select-icmp.s16.mir
        inst-select-lshr.s16.mir
        inst-select-pattern-smed3.s16.mir
        inst-select-pattern-umed3.s16.mir
        inst-select-shl.s16.mir
        inst-select-sitofp.mir (2/2)
        inst-select-uitofp.mir
        irtranslator-inline-asm.ll
      attr-amdgpu-flat-work-group-size-vgpr-limit.ll
      coalescer-early-clobber-subreg.mir
      gfx10-shrink-mad-fma.mir
      gfx10-twoaddr-fma.mir
      gfx11-twoaddr-fma.mir
      inline-asm.i128.ll
      partial-regcopy-and-spill-missed-at-regalloc.ll
      preserve-hi16.ll
      shrink-mad-fma.mir
      spill-vector-superclass.ll
      strict_fma.f16.ll
      true16-ra-f128-fail.mir
      true16-ra-pre-gfx11-regression-test.mir
      twoaddr-fma.mir
      vopc_dpp.mir
    MC/AMDGPU/
      gfx11_asm_vop1_t16_err.s
      gfx11_asm_vop1_t16_promote.s
      gfx11_asm_vop2_t16_err.s
      gfx11_asm_vop2_t16_promote.s
      gfx11_asm_vopc_t16_err.s
      gfx11_asm_vopc_t16_promote.s
      gfx11_asm_vopcx_t16_err.s
      gfx11_asm_vopcx_t16_promote.s

Differential D133723

[AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C
Closed · Public

Authored by Joe_Nash on Sep 12 2022, 1:11 PM.


Details

Reviewers

foad
arsenm
rampitec
nhaehnle
dp

Group Reviewers

Restricted Project

Commits

rGb982ba2a6e0f: [AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C

Summary

Due to the encoding changes in GFX11, we had a hack in place that
disables the use of VGPRs above 128. This patch removes the need for
that hack.

We introduce a new register class VGPR_32_Lo128 which is used for 16-bit
operands of VOP1, VOP2, and VOPC instructions. This register class only has the
low 128 VGPRs, but is otherwise identical to VGPR_32. Therefore, 16-bit VOP1,
VOP2, and VOPC instructions are correctly limited to use the first 128
VGPRs, while the other instructions can freely use all 256.

We introduce new pseudo-instructions used on GFX11 which have the suffix
t16 (True 16) to use the VGPR_32_Lo128 register class.
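
The effect of the new class can be pictured with a small standalone model. This is only a sketch, not LLVM code: `RegClass`, `Encoding`, and `allowedClass` are hypothetical names invented for illustration. The only facts taken from the summary are that VGPR_32 covers 256 registers, that VGPR_32_Lo128 covers the low 128, and that only 16-bit operands of VOP1, VOP2, and VOPC are restricted.

```cpp
#include <cassert>

// Hypothetical model of a VGPR register class: registers v0 .. v(NumRegs-1).
struct RegClass {
  unsigned NumRegs;
  constexpr bool contains(unsigned VgprIdx) const { return VgprIdx < NumRegs; }
};

constexpr RegClass VGPR_32{256};       // all 256 VGPRs
constexpr RegClass VGPR_32_Lo128{128}; // only the low 128 VGPRs

// Hypothetical encoding tag; VOP3 stands for the _e64 forms.
enum class Encoding { VOP1, VOP2, VOPC, VOP3 };

// Only 16-bit operands of VOP1/VOP2/VOPC get the restricted class.
// Every other operand keeps the full VGPR_32 class.
constexpr RegClass allowedClass(Encoding Enc, unsigned OperandBits) {
  return (OperandBits == 16 && Enc != Encoding::VOP3) ? VGPR_32_Lo128
                                                      : VGPR_32;
}
```

In this model a 32-bit VOP2 operand may use v200, while a 16-bit VOP2 operand may not. That is the improvement over the earlier hack, which limited every operand.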

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
60,070 ms	x64 debian > libFuzzer.libFuzzer::fuzzer-leak.test

Event Timeline

Joe_Nash created this revision. · Sep 12 2022, 1:11 PM

Herald added a project: Restricted Project. · Sep 12 2022, 1:11 PM

Herald added subscribers: kosarev, foad, mstorsjo and 14 others.

Joe_Nash requested review of this revision. · Sep 12 2022, 1:11 PM

Herald added a project: Restricted Project. · Sep 12 2022, 1:11 PM

Herald added subscribers: llvm-commits, wdng.

Joe_Nash added reviewers: foad, arsenm, rampitec, nhaehnle, dp, Restricted Project. · Sep 12 2022, 1:12 PM

Harbormaster completed remote builds in B186214: Diff 459540. · Sep 12 2022, 2:08 PM

Joe, is this class really needed? The patch overhauls all VOP 16 bit instructions with none of them turned into True16 [just yet?]. A true 16 bit instruction shall use a 16 bit register, not a 32 bit VGPR. I.e. operands shall belong to a class composed of (VGPR_LO16, VGPR_HI16) or (VGPR_LO16, VGPR_HI16, SGPR_LO16) if scalars are accepted. Therefore, I would expect special classes limiting those, not VGPR_32 itself. As is the patch limits the use of the 16 bit operations to half of the available registers. What's the plan here?

llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
155	No else after return.

rampitec added inline comments. · Sep 12 2022, 2:31 PM

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2880 ↗	(On Diff #459540)	You probably do not need PSet for it, it is not handled anywhere.

foad added inline comments. · Sep 13 2022, 2:14 AM

llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
126	Don't need the `llvm::` (there are a few occurrences in this file and in SIShrinkInstructions.cpp).
127	Maybe add a brief comment here? These instructions are "shrinkable" but we don't want to shrink them pre-RA because <reasons>.
608–610	This may be true but I don't see why you're asserting it here in particular. Couldn't this go into isShrinkable, near the other check that you added?

foad added inline comments. · Sep 13 2022, 3:09 AM

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sitofp.mir
92–93	This is undesirable. We don't want to see _f128 classes here. We should be selecting the _e64 form of the V_CVT instruction instead. If you rebase on 3743f9afeb51e0b7bdf2269583f32b7e35369168 you will see the same problem for fptosi as well as sitofp. The fix is to change the patterns in SIInstructions.td to always use the _e64 forms: https://reviews.llvm.org/differential/diff/459699/ Pre-GFX11 this should be pretty harmless even if selecting the _e64 forms doesn't actually give any benefit.

We introduce a new register class VGPR_32_F128 which is used for
encodings VOP1, VOP2, and VOPC. This register class only has the first
128 VGPRs, but is otherwise identical to VGPR_32. Therefore, VOP1, VOP2,
and VOPC instructions are correctly limited to use the first 128
VGPRs, while the other instructions can freely use all 256.

This paragraph needs to explain that the new register class is used for 16-bit operands only. The whole point of this patch is that 32-bit operands will no longer be restricted to using the first 128 VGPRs.

arsenm added inline comments. · Sep 13 2022, 5:01 AM

llvm/lib/Target/AMDGPU/SIRegisterInfo.td
557	I don't know what "_F128" is supposed to mean. I read this as a class for long double

foad added inline comments. · Sep 13 2022, 5:23 AM

llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
145	I would prefer to change SIShrinkInstructions in a more general way that does not specifically check for True16 instructions: D133769

foad mentioned this in D133769: [AMDGPU] Don't shrink VOP3 instructions pre-RA on GFX10+. · Sep 13 2022, 5:24 AM

foad added inline comments. · Sep 13 2022, 5:44 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
3279–3281	My understanding is that having a _T16_e32 instruction here should work correctly, but it is undesirable because it has more restrictive register classes than the _T16_e64 form. So perhaps we want a more general assertion in (or just before) the register allocator that there are no _T16_e32 instructions present? I'm not sure how or where to implement that.

In D133723#3784870, @rampitec wrote:

Joe, is this class really needed? The patch overhauls all VOP 16 bit instructions with none of them turned into True16 [just yet?]. A true 16 bit instruction shall use a 16 bit register, not a 32 bit VGPR. I.e. operands shall belong to a class composed of (VGPR_LO16, VGPR_HI16) or (VGPR_LO16, VGPR_HI16, SGPR_LO16) if scalars are accepted. Therefore, I would expect special classes limiting those, not VGPR_32 itself. As is the patch limits the use of the 16 bit operations to half of the available registers. What's the plan here?

This is not the intended final state of true 16 bit instructions. You are correct that a full solution would use VGPR_LO16, VGPR_HI16 or something like that. However, this patch is a step towards that. This patch is strictly better than what we had before (modulo some shrinking issues), because only 16-bit operands of VOP1, VOP2, and VOPC instructions are limited to half the registers, rather than all operands of all instructions. There is very little wasted code as well, because VGPR_32_F128 can be directly replaced with VGPR_16_F128 in time. I expect tracking down code quality issues from using VGPR_16 to take quite some time, hence why I have submitted the patch at this point.

In D133723#3786055, @foad wrote:

We introduce a new register class VGPR_32_F128 which is used for
encodings VOP1, VOP2, and VOPC. This register class only has the first
128 VGPRs, but is otherwise identical to VGPR_32. Therefore, VOP1, VOP2,
and VOPC instructions are correctly limited to use the first 128
VGPRs, while the other instructions can freely use all 256.

This paragraph needs to explain that the new register class is used for 16-bit operands only. The whole point of this patch is that 32-bit operands will no longer be restricted to using the first 128 VGPRs.

done

llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
608–610	isShrinkable guards against shrinking in this pass, but this assert was supposed to check if previous passes shrunk anything. In theory, it's not necessary, but I put the assert for future proofing.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
3279–3281	From the perspective of the RA, allocating a _T16_e32 will work correctly. In this particular function, there is a restriction. V_FMAC_F16_T16_e32 -> V_FMA_F16_gfx9_e64 should not be allowed, because the register class in the former is VGPR_32_F128 and in the latter is VGPR_32. So it would require a COPY or something, which we are not doing.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2880 ↗	(On Diff #459540)	I don't fully understand this, but MachineLICMBase calls getRegPressureSetLimit and will hit llvm_unreachable without the PSet for VGPR_32_F128
llvm/lib/Target/AMDGPU/SIRegisterInfo.td
557	It is short for First 128. It is a set with only the first 128 VGPRs. I will add a comment noting this.
llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
145	That seems fine to me. I will rebase on that when it has landed.
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sitofp.mir
92–93	I picked up those changes, thanks!

followed review suggestions and rebased

Harbormaster completed remote builds in B186428: Diff 459840. · Sep 13 2022, 1:23 PM

rampitec added inline comments. · Sep 13 2022, 1:29 PM

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2880 ↗	(On Diff #459540)	Add `let GeneratePressureSet = 0;` to the RC definition and remove this.

I'm happy with the general approach and the C++ parts. Can anyone take a closer look at the TableGen parts - maybe @dp or @rampitec? Thanks!

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
3279–3281	Is a COPY needed in that situation? I thought perhaps the register allocator would just handle it, if the register classes overlapped. But I'm not sure. Anyway I guess this assert is fine for now at least.

Joe_Nash marked an inline comment as done. · Sep 14 2022, 7:29 AM

Joe_Nash added inline comments.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
3279–3281	The MIR verifier will fail immediately after the twoaddressinstruction pass if you make an instruction like that.

removed pressure set. Rebased on 2e8863b6a11f12d31490bc054da4d47c6adc8143

Harbormaster completed remote builds in B186618: Diff 460081. · Sep 14 2022, 8:18 AM

arsenm added inline comments. · Sep 14 2022, 10:15 AM

llvm/lib/Target/AMDGPU/SIRegisterInfo.td
557	Lo128 would probably be more consistent terminology over "first"
llvm/lib/Target/AMDGPU/VOP3Instructions.td
928–931	Since the _T16 doesn't appear in the instruction mnemonic it should be lowercased

dp added inline comments. · Sep 14 2022, 11:20 AM

llvm/lib/Target/AMDGPU/SIInstructions.td
928	The names `cvt16to32_e64` and `cvt32to16_e64` are misleading. They should be the other way around.
987	Why are these patterns special? `_E64` variants have no VGPR limitations. Is this required for future changes?

rampitec added inline comments. · Sep 14 2022, 2:07 PM

llvm/lib/Target/AMDGPU/SIRegisterInfo.td
557	I probably agree with that. F128 also hints at a long double to me. With that, I still hope this class is transitional and will be replaced by a real 16-bit RC.

Joe_Nash marked 4 inline comments as done. · Sep 14 2022, 2:27 PM

Joe_Nash added inline comments.

llvm/lib/Target/AMDGPU/SIInstructions.td
928	Thanks, I have fixed the names.
987	This is simply to account for the fact that we have new pseudo instructions for each instruction with 16-bit operands on GFX11. Therefore, if a pattern directly refers to an instruction, we need to duplicate it, one for each pseudo.
llvm/lib/Target/AMDGPU/SIRegisterInfo.td
557	Ok, replaced _F128 with _Lo128. Yes, this is a transitional class.
llvm/lib/Target/AMDGPU/VOP3Instructions.td
928–931	replaced _T16 with _t16 everywhere

replaced _T16 with _t16 and _F128 with _Lo128. corrected name of CVT instruction in isel pat

LGTM

This revision is now accepted and ready to land. · Sep 14 2022, 2:39 PM

correct comment F128 -> Lo128

Update commit message for renaming and the separate patch disabling pre-RA shrinking.

Harbormaster completed remote builds in B186740: Diff 460250. · Sep 14 2022, 4:05 PM

In D133723#3791006, @Joe_Nash wrote:

Update commit message for renaming and the separate patch disabling pre-RA shrinking.

Can you copy the new commit message into the summary of this patch? You have to do that manually.

llvm/lib/Target/AMDGPU/SIInstructions.td
987	Is this required for future changes? Yes. In future the _t16_e64 version will use 16-bit register classes for 16-bit operands but the regular _e64 version will continue to use 32-bit classes for them.
llvm/lib/Target/AMDGPU/SIRegisterInfo.td
557	Just saying: Lo128 is not ideal either because it will be confused with the _LO16/HI16 classes which mean something completely different.

Joe_Nash retitled this revision from [AMDGPU][GFX11] Use VGPR_32_F128 for VOP1,2,C to [AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C. · Sep 15 2022, 7:02 AM

Joe_Nash edited the summary of this revision.

Updated some folding/shrinking cases that needed to have true16 pseudo instructions added.

Harbormaster completed remote builds in B187232: Diff 460878. · Sep 16 2022, 3:21 PM

Use the correct t16 instruction when setting mode reg to fix an issue found in testing. All known issues are resolved, please take a final look.

Still LGTM

Harbormaster completed remote builds in B187570: Diff 461328. · Sep 19 2022, 2:37 PM

foad accepted this revision. · Sep 20 2022, 2:22 AM

foad added inline comments.

llvm/lib/Target/AMDGPU/SIModeRegister.cpp
177 ↗	(On Diff #461328)	This is fine for now. As a follow up I'll try changing FPTRUNC_UPWARD_PSEUDO into an e64 instruction, so we can go back to using a simple setDesc call here.

This revision was landed with ongoing or failed builds. · Sep 20 2022, 7:28 AM

Closed by commit rGb982ba2a6e0f: [AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C (authored by Joe_Nash).

This revision was automatically updated to reflect the committed changes.

Joe_Nash added a commit: rGb982ba2a6e0f: [AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C.

foad mentioned this in D134723: [AMDGPU] Set memory bound occupancy based on addressable VGPRs. · Sep 28 2022, 5:28 AM

Joe_Nash mentioned this in D134897: [AMDGPU] Fix V_CMP_CLASS_F16_t16_e64 src1 type. · Sep 29 2022, 11:16 AM

Joe_Nash mentioned this in rG203d0b0ee136: [AMDGPU] Fix V_CMP_CLASS_F16_t16_e64 src1 type. · Oct 5 2022, 8:16 AM

Revision Contents

Path (size)

llvm/
  lib/Target/AMDGPU/
    AsmParser/
      AMDGPUAsmParser.cpp (32 lines)
    Disassembler/
      AMDGPUDisassembler.h (1 line)
      AMDGPUDisassembler.cpp (7 lines)
    GCNDPPCombine.cpp (6 lines)
    SIFoldOperands.cpp (10 lines)
    SIInstrInfo.cpp (21 lines)
    SIInstrInfo.td (86 lines)
    SIInstructions.td (158 lines)
    SIRegisterInfo.td (52 lines)
    SIShrinkInstructions.cpp (32 lines)
    Utils/
      AMDGPUBaseInfo.h (3 lines)
      AMDGPUBaseInfo.cpp (26 lines)
    VOP1Instructions.td (224 lines)
    VOP2Instructions.td (247 lines)
    VOP3Instructions.td (26 lines)
    VOPCInstructions.td (308 lines)
    VOPInstructions.td (25 lines)
  test/
    CodeGen/AMDGPU/
      GlobalISel/
        fma.ll (2 lines)
        inst-select-ashr.s16.mir (16 lines)
        inst-select-fcanonicalize.mir (4 lines)
        inst-select-fcmp.s16.mir (28 lines)
        inst-select-fmaxnum-ieee.s16.mir (4 lines)
        inst-select-fmaxnum.s16.mir (4 lines)
        inst-select-fminnum-ieee.s16.mir (4 lines)
        inst-select-fminnum.s16.mir (4 lines)
        inst-select-fptosi.mir (36 lines)
        inst-select-fptoui.mir (36 lines)
        inst-select-icmp.s16.mir (32 lines)
        inst-select-lshr.s16.mir (16 lines)
        inst-select-pattern-smed3.s16.mir (14 lines)
        inst-select-pattern-umed3.s16.mir (14 lines)
        inst-select-shl.s16.mir (16 lines)
        inst-select-sitofp.mir (12 lines)
        inst-select-uitofp.mir (12 lines)
        irtranslator-inline-asm.ll (32 lines)
      attr-amdgpu-flat-work-group-size-vgpr-limit.ll (12 lines)
      coalescer-early-clobber-subreg.mir (8 lines)
      gfx10-shrink-mad-fma.mir
      gfx10-twoaddr-fma.mir (84 lines)
      gfx11-twoaddr-fma.mir (101 lines)
      inline-asm.i128.ll (24 lines)
      partial-regcopy-and-spill-missed-at-regalloc.ll (24 lines)
      preserve-hi16.ll (2 lines)
      shrink-mad-fma.mir, gfx10-shrink-mad-fma.mir (81 lines)
      spill-vector-superclass.ll (4 lines)
      strict_fma.f16.ll (12 lines)
      true16-ra-f128-fail.mir (34 lines)
      true16-ra-pre-gfx11-regression-test.mir (55 lines)
      twoaddr-fma.mir (84 lines)
      vopc_dpp.mir (50 lines)
    MC/AMDGPU/
      gfx11_asm_vop1_t16_err.s (498 lines)
      gfx11_asm_vop1_t16_promote.s (1473 lines)
      gfx11_asm_vop2_t16_err.s (228 lines)
      gfx11_asm_vop2_t16_promote.s (192 lines)
      gfx11_asm_vopc_t16_err.s (1973 lines)
      gfx11_asm_vopc_t16_promote.s (1973 lines)
      gfx11_asm_vopcx_t16_err.s (542 lines)
      gfx11_asm_vopcx_t16_promote.s (542 lines)

Diff 460250

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp


@@ (337 lines hidden) @@ bool isVReg32OrOff() const {
     return isOff() || isVReg32();
   }

   bool isNull() const {
     return isRegKind() && getReg() == AMDGPU::SGPR_NULL;
   }

   bool isVRegWithInputMods() const;
+  bool isT16VRegWithInputMods() const;

   bool isSDWAOperand(MVT type) const;
   bool isSDWAFP16Operand() const;
   bool isSDWAFP32Operand() const;
   bool isSDWAInt16Operand() const;
   bool isSDWAInt32Operand() const;

   bool isImmTy(ImmTy ImmT) const {
@@ (163 lines hidden) @@ public:
   bool isVCSrcB32() const {
     return isRegOrInlineNoMods(AMDGPU::VS_32RegClassID, MVT::i32);
   }

   bool isVCSrcB64() const {
     return isRegOrInlineNoMods(AMDGPU::VS_64RegClassID, MVT::i64);
   }

+  bool isVCSrcTB16_Lo128() const {
+    return isRegOrInlineNoMods(AMDGPU::VS_32_Lo128RegClassID, MVT::i16);
+  }
+
   bool isVCSrcB16() const {
     return isRegOrInlineNoMods(AMDGPU::VS_32RegClassID, MVT::i16);
   }

   bool isVCSrcV2B16() const {
     return isVCSrcB16();
   }

   bool isVCSrcF32() const {
     return isRegOrInlineNoMods(AMDGPU::VS_32RegClassID, MVT::f32);
   }

   bool isVCSrcF64() const {
     return isRegOrInlineNoMods(AMDGPU::VS_64RegClassID, MVT::f64);
   }

+  bool isVCSrcTF16_Lo128() const {
+    return isRegOrInlineNoMods(AMDGPU::VS_32_Lo128RegClassID, MVT::f16);
+  }
+
   bool isVCSrcF16() const {
     return isRegOrInlineNoMods(AMDGPU::VS_32RegClassID, MVT::f16);
   }

   bool isVCSrcV2F16() const {
     return isVCSrcF16();
   }

   bool isVSrcB32() const {
     return isVCSrcF32() || isLiteralImm(MVT::i32) || isExpr();
   }

   bool isVSrcB64() const {
     return isVCSrcF64() || isLiteralImm(MVT::i64);
   }

+  bool isVSrcTB16_Lo128() const {
+    return isVCSrcTB16_Lo128() || isLiteralImm(MVT::i16);
+  }
+
   bool isVSrcB16() const {
     return isVCSrcB16() || isLiteralImm(MVT::i16);
   }

   bool isVSrcV2B16() const {
     return isVSrcB16() || isLiteralImm(MVT::v2i16);
   }

@@ (16 lines hidden) @@ public:
   bool isVSrcF32() const {
     return isVCSrcF32() || isLiteralImm(MVT::f32) || isExpr();
   }

   bool isVSrcF64() const {
     return isVCSrcF64() || isLiteralImm(MVT::f64);
   }

+  bool isVSrcTF16_Lo128() const {
+    return isVCSrcTF16_Lo128() || isLiteralImm(MVT::f16);
+  }
+
   bool isVSrcF16() const {
     return isVCSrcF16() || isLiteralImm(MVT::f16);
   }

   bool isVSrcV2F16() const {
     return isVSrcF16() || isLiteralImm(MVT::v2f16);
   }

@@ (1,446 lines hidden) @@

 bool AMDGPUOperand::isVRegWithInputMods() const {
   return isRegClass(AMDGPU::VGPR_32RegClassID) ||
          // GFX90A allows DPP on 64-bit operands.
          (isRegClass(AMDGPU::VReg_64RegClassID) &&
           AsmParser->getFeatureBits()[AMDGPU::Feature64BitDPP]);
 }

+bool AMDGPUOperand::isT16VRegWithInputMods() const {
+  return isRegClass(AMDGPU::VGPR_32_Lo128RegClassID);
+}
+
 bool AMDGPUOperand::isSDWAOperand(MVT type) const {
   if (AsmParser->isVI())
     return isVReg32();
   else if (AsmParser->isGFX9Plus())
     return isRegClass(AMDGPU::VS_32RegClassID) || isInlinableImm(type);
   else
     return false;
 }
@@ (6,215 lines hidden) @@ if (AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::omod) != -1) {
     addOptionalImmOperand(Inst, Operands, OptionalIdx, AMDGPUOperand::ImmTyOModSI);
   }

   // Special case v_mac_{f16, f32} and v_fmac_{f16, f32} (gfx906/gfx10+):
   // it has src2 register operand that is tied to dst operand
   // we don't allow modifiers for this operand in assembler so src2_modifiers
   // should be 0.
   if (Opc == AMDGPU::V_MAC_F32_e64_gfx6_gfx7 ||
-      Opc == AMDGPU::V_MAC_F32_e64_gfx10 ||
-      Opc == AMDGPU::V_MAC_F32_e64_vi ||
+      Opc == AMDGPU::V_MAC_F32_e64_gfx10 || Opc == AMDGPU::V_MAC_F32_e64_vi ||
       Opc == AMDGPU::V_MAC_LEGACY_F32_e64_gfx6_gfx7 ||
       Opc == AMDGPU::V_MAC_LEGACY_F32_e64_gfx10 ||
-      Opc == AMDGPU::V_MAC_F16_e64_vi ||
-      Opc == AMDGPU::V_FMAC_F64_e64_gfx90a ||
+      Opc == AMDGPU::V_MAC_F16_e64_vi || Opc == AMDGPU::V_FMAC_F64_e64_gfx90a ||
       Opc == AMDGPU::V_FMAC_F32_e64_gfx10 ||
-      Opc == AMDGPU::V_FMAC_F32_e64_gfx11 ||
-      Opc == AMDGPU::V_FMAC_F32_e64_vi ||
+      Opc == AMDGPU::V_FMAC_F32_e64_gfx11 || Opc == AMDGPU::V_FMAC_F32_e64_vi ||
       Opc == AMDGPU::V_FMAC_LEGACY_F32_e64_gfx10 ||
       Opc == AMDGPU::V_FMAC_DX9_ZERO_F32_e64_gfx11 ||
       Opc == AMDGPU::V_FMAC_F16_e64_gfx10 ||
-      Opc == AMDGPU::V_FMAC_F16_e64_gfx11) {
+      Opc == AMDGPU::V_FMAC_F16_t16_e64_gfx11) {
     auto it = Inst.begin();
     std::advance(it, AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::src2_modifiers));
     it = Inst.insert(it, MCOperand::createImm(0)); // no modifiers for src2
     ++it;
     // Copy the operand to ensure it's not invalidated when Inst grows.
     Inst.insert(it, MCOperand(Inst.getOperand(0))); // src2 = dst
   }
 }
@@ (996 lines hidden) @@
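
The insertion sequence in the special case above (advance to the `src2_modifiers` slot, insert a zero, step past it, insert a copy of the destination operand) can be tried in isolation. The following is a minimal sketch on a plain `std::vector<int>`. `tieSrc2ToDst` and the integer operand list are hypothetical stand-ins for `MCInst` and `MCOperand`, not the parser's real types.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for an MCInst operand list: element 0 is dst.
// Inserts a zero "src2_modifiers" at ModIdx, then a copy of dst right after it.
// Precondition: Ops is non-empty and ModIdx <= Ops.size().
void tieSrc2ToDst(std::vector<int> &Ops, std::size_t ModIdx) {
  assert(!Ops.empty() && ModIdx <= Ops.size());
  auto It = Ops.begin();
  std::advance(It, ModIdx);
  It = Ops.insert(It, 0); // no modifiers for src2; keep the returned iterator
  ++It;
  // Copy dst by value first, so the inserted value does not depend on
  // storage that may move when the container grows.
  const int Dst = Ops[0];
  Ops.insert(It, Dst); // src2 = dst
}
```

With operands `{10, 20, 30}` and `ModIdx == 3`, the list becomes `{10, 20, 30, 0, 10}`. Reassigning `It` from the first `insert` matters because the old iterator is no longer valid once the container has grown.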

llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h

@@ (161 lines hidden) @@ public:
   DecodeStatus convertSDWAInst(MCInst &MI) const;
   DecodeStatus convertDPP8Inst(MCInst &MI) const;
   DecodeStatus convertMIMGInst(MCInst &MI) const;
   DecodeStatus convertVOP3DPPInst(MCInst &MI) const;
   DecodeStatus convertVOP3PDPPInst(MCInst &MI) const;
   DecodeStatus convertVOPCDPPInst(MCInst &MI) const;

   MCOperand decodeOperand_VGPR_32(unsigned Val) const;
+  MCOperand decodeOperand_VGPR_32_Lo128(unsigned Val) const;
   MCOperand decodeOperand_VRegOrLds_32(unsigned Val) const;

   MCOperand decodeOperand_VS_32(unsigned Val) const;
   MCOperand decodeOperand_VS_64(unsigned Val) const;
   MCOperand decodeOperand_VS_128(unsigned Val) const;
   MCOperand decodeOperand_VSrc16(unsigned Val) const;
   MCOperand decodeOperand_VSrcV216(unsigned Val) const;
   MCOperand decodeOperand_VSrcV232(unsigned Val) const;
@@ (117 lines hidden) @@

llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp

@@ (113 lines hidden) @@ static DecodeStatus StaticDecoderName(MCInst &Inst, unsigned Imm, \
     auto DAsm = static_cast<const AMDGPUDisassembler *>(Decoder); \
     return addOperand(Inst, DAsm->DecoderName(Imm)); \
   }

 #define DECODE_OPERAND_REG(RegClass) \
   DECODE_OPERAND(Decode##RegClass##RegisterClass, decodeOperand_##RegClass)

 DECODE_OPERAND_REG(VGPR_32)
+DECODE_OPERAND_REG(VGPR_32_Lo128)
 DECODE_OPERAND_REG(VRegOrLds_32)
 DECODE_OPERAND_REG(VS_32)
 DECODE_OPERAND_REG(VS_64)
 DECODE_OPERAND_REG(VS_128)

 DECODE_OPERAND_REG(VReg_64)
 DECODE_OPERAND_REG(VReg_96)
 DECODE_OPERAND_REG(VReg_128)
@@ (469 lines hidden) @@ if (Res && (MI.getOpcode() == AMDGPU::V_MAC_F32_e64_vi ||
                 MI.getOpcode() == AMDGPU::V_MAC_F16_e64_vi ||
                 MI.getOpcode() == AMDGPU::V_FMAC_F64_e64_gfx90a ||
                 MI.getOpcode() == AMDGPU::V_FMAC_F32_e64_vi ||
                 MI.getOpcode() == AMDGPU::V_FMAC_F32_e64_gfx10 ||
                 MI.getOpcode() == AMDGPU::V_FMAC_F32_e64_gfx11 ||
                 MI.getOpcode() == AMDGPU::V_FMAC_LEGACY_F32_e64_gfx10 ||
                 MI.getOpcode() == AMDGPU::V_FMAC_DX9_ZERO_F32_e64_gfx11 ||
                 MI.getOpcode() == AMDGPU::V_FMAC_F16_e64_gfx10 ||
-                MI.getOpcode() == AMDGPU::V_FMAC_F16_e64_gfx11)) {
+                MI.getOpcode() == AMDGPU::V_FMAC_F16_t16_e64_gfx11)) {
       // Insert dummy unused src2_modifiers.
       insertNamedMCOperand(MI, MCOperand::createImm(0),
                            AMDGPU::OpName::src2_modifiers);
     }

     if (Res && (MCII->get(MI.getOpcode()).TSFlags &
                 (SIInstrFlags::MUBUF | SIInstrFlags::FLAT | SIInstrFlags::SMRD))) {
       int CPolPos = AMDGPU::getNamedOperandIdx(MI.getOpcode(),
@@ (518 lines hidden) @@
 MCOperand AMDGPUDisassembler::decodeOperand_VSrcV216(unsigned Val) const {
   return decodeSrcOp(OPWV216, Val);
 }

 MCOperand AMDGPUDisassembler::decodeOperand_VSrcV232(unsigned Val) const {
   return decodeSrcOp(OPWV232, Val);
 }

+MCOperand AMDGPUDisassembler::decodeOperand_VGPR_32_Lo128(unsigned Val) const {
+  return createRegOperand(AMDGPU::VGPR_32_Lo128RegClassID, Val);
+}
+
 MCOperand AMDGPUDisassembler::decodeOperand_VGPR_32(unsigned Val) const {
   // Some instructions have operand restrictions beyond what the encoding
   // allows. Some ordinarily VSrc_32 operands are VGPR_32, so clear the extra
   // high bit.
   Val &= 255;

   return createRegOperand(AMDGPU::VGPR_32RegClassID, Val);
 }
@@ (1,037 lines hidden) @@
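
The two decoders differ only in the class they look the value up in and in whether the extra high bit is masked off first. The sketch below models that difference with plain integers. `createVgprOperand` is a hypothetical stand-in for `createRegOperand`, and its behaviour for out-of-range values (returning -1) is an assumption made for illustration, not something shown in this diff.

```cpp
#include <cassert>

// Hypothetical stand-in for createRegOperand: returns the VGPR index when it
// fits in a class of ClassSize registers, or -1 when it does not (assumed).
int createVgprOperand(unsigned ClassSize, unsigned Val) {
  return Val < ClassSize ? static_cast<int>(Val) : -1;
}

// Follows decodeOperand_VGPR_32: clear the extra high bit, then look the
// register up in the 256-entry class.
int decodeVgpr32(unsigned Val) {
  Val &= 255;
  return createVgprOperand(256, Val);
}

// Follows decodeOperand_VGPR_32_Lo128: the raw value goes straight to the
// 128-entry class, so anything at or above 128 is rejected in this model.
int decodeVgpr32Lo128(unsigned Val) {
  return createVgprOperand(128, Val);
}
```

In this model `decodeVgpr32(256 + 5)` yields register 5, while `decodeVgpr32Lo128(128)` is rejected because v128 is outside the low 128 registers.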

llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp

@@ (117 lines hidden) @@ bool GCNDPPCombine::isShrinkable(MachineInstr &MI) const {
   unsigned Op = MI.getOpcode();
   if (!TII->isVOP3(Op)) {
     return false;
   }
   if (!TII->hasVALU32BitEncoding(Op)) {
     LLVM_DEBUG(dbgs() << " Inst hasn't e32 equivalent\n");
     return false;
   }
+  // Do not shrink True16 instructions pre-RA to avoid the restriction in
+  // register allocation from only being able to use 128 VGPRs
+  if (AMDGPU::isTrue16Inst(Op))
+    return false;
   if (const auto *SDst = TII->getNamedOperand(MI, AMDGPU::OpName::sdst)) {
     // Give up if there are any uses of the sdst in carry-out or VOPC.
     // The shrunken form of the instruction would write it to vcc instead of to
     // a virtual register. If we rewrote the uses the shrinking would be
     // possible.
     if (!MRI->use_nodbg_empty(SDst->getReg()))
       return false;
   }
[...]	bool GCNDPPCombine::combineDPPMov(MachineInstr &MovMI) const {
while (!Uses.empty()) {		while (!Uses.empty()) {
MachineOperand *Use = Uses.pop_back_val();		MachineOperand *Use = Uses.pop_back_val();
Rollback = true;		Rollback = true;

auto &OrigMI = *Use->getParent();		auto &OrigMI = *Use->getParent();
LLVM_DEBUG(dbgs() << " try: " << OrigMI);		LLVM_DEBUG(dbgs() << " try: " << OrigMI);

auto OrigOp = OrigMI.getOpcode();		auto OrigOp = OrigMI.getOpcode();
		assert((TII->get(OrigOp).Size != 4 || !AMDGPU::isTrue16Inst(OrigOp)) &&
		"There should not be e32 True16 instructions pre-RA");
if (OrigOp == AMDGPU::REG_SEQUENCE) {		if (OrigOp == AMDGPU::REG_SEQUENCE) {
		foad [Not Done]: This may be true but I don't see why you're asserting it here in particular. Couldn't this go into isShrinkable, near the other check that you added?
		Joe_Nash (Author) [Done]: isShrinkable guards against shrinking in this pass, but this assert was supposed to check if previous passes shrunk anything. In theory, it's not necessary, but I put the assert for future proofing.
Register FwdReg = OrigMI.getOperand(0).getReg();		Register FwdReg = OrigMI.getOperand(0).getReg();
unsigned FwdSubReg = 0;		unsigned FwdSubReg = 0;

if (execMayBeModifiedBeforeAnyUse(*MRI, FwdReg, OrigMI)) {		if (execMayBeModifiedBeforeAnyUse(*MRI, FwdReg, OrigMI)) {
LLVM_DEBUG(dbgs() << " failed: EXEC mask should remain the same"		LLVM_DEBUG(dbgs() << " failed: EXEC mask should remain the same"
" for all uses\n");		" for all uses\n");
break;		break;
}		}
[...]

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

[...]

// Clamp patterns are canonically selected to v_max_* instructions, so only		// Clamp patterns are canonically selected to v_max_* instructions, so only
// handle them.		// handle them.
const MachineOperand *SIFoldOperands::isClamp(const MachineInstr &MI) const {		const MachineOperand *SIFoldOperands::isClamp(const MachineInstr &MI) const {
unsigned Op = MI.getOpcode();		unsigned Op = MI.getOpcode();
switch (Op) {		switch (Op) {
case AMDGPU::V_MAX_F32_e64:		case AMDGPU::V_MAX_F32_e64:
case AMDGPU::V_MAX_F16_e64:		case AMDGPU::V_MAX_F16_e64:
		case AMDGPU::V_MAX_F16_t16_e64:
case AMDGPU::V_MAX_F64_e64:		case AMDGPU::V_MAX_F64_e64:
case AMDGPU::V_PK_MAX_F16: {		case AMDGPU::V_PK_MAX_F16: {
if (!TII->getNamedOperand(MI, AMDGPU::OpName::clamp)->getImm())		if (!TII->getNamedOperand(MI, AMDGPU::OpName::clamp)->getImm())
return nullptr;		return nullptr;

// Make sure sources are identical.		// Make sure sources are identical.
const MachineOperand *Src0 = TII->getNamedOperand(MI, AMDGPU::OpName::src0);		const MachineOperand *Src0 = TII->getNamedOperand(MI, AMDGPU::OpName::src0);
const MachineOperand *Src1 = TII->getNamedOperand(MI, AMDGPU::OpName::src1);		const MachineOperand *Src1 = TII->getNamedOperand(MI, AMDGPU::OpName::src1);
[...]	case AMDGPU::V_MUL_F32_e64: {
case 0x40000000: // 2.0		case 0x40000000: // 2.0
return SIOutMods::MUL2;		return SIOutMods::MUL2;
case 0x40800000: // 4.0		case 0x40800000: // 4.0
return SIOutMods::MUL4;		return SIOutMods::MUL4;
default:		default:
return SIOutMods::NONE;		return SIOutMods::NONE;
}		}
}		}
case AMDGPU::V_MUL_F16_e64: {		case AMDGPU::V_MUL_F16_e64:
		case AMDGPU::V_MUL_F16_t16_e64: {
switch (static_cast<uint16_t>(Val)) {		switch (static_cast<uint16_t>(Val)) {
case 0x3800: // 0.5		case 0x3800: // 0.5
return SIOutMods::DIV2;		return SIOutMods::DIV2;
case 0x4000: // 2.0		case 0x4000: // 2.0
return SIOutMods::MUL2;		return SIOutMods::MUL2;
case 0x4400: // 4.0		case 0x4400: // 4.0
return SIOutMods::MUL4;		return SIOutMods::MUL4;
default:		default:
[...]	if (OMod == SIOutMods::NONE ||
TII->hasModifiersSet(MI, AMDGPU::OpName::omod) ||		TII->hasModifiersSet(MI, AMDGPU::OpName::omod) ||
TII->hasModifiersSet(MI, AMDGPU::OpName::clamp))		TII->hasModifiersSet(MI, AMDGPU::OpName::clamp))
return std::make_pair(nullptr, SIOutMods::NONE);		return std::make_pair(nullptr, SIOutMods::NONE);

return std::make_pair(RegOp, OMod);		return std::make_pair(RegOp, OMod);
}		}
case AMDGPU::V_ADD_F64_e64:		case AMDGPU::V_ADD_F64_e64:
case AMDGPU::V_ADD_F32_e64:		case AMDGPU::V_ADD_F32_e64:
case AMDGPU::V_ADD_F16_e64: {		case AMDGPU::V_ADD_F16_e64:
		case AMDGPU::V_ADD_F16_t16_e64: {
// If output denormals are enabled, omod is ignored.		// If output denormals are enabled, omod is ignored.
if ((Op == AMDGPU::V_ADD_F32_e64 && MFI->getMode().FP32OutputDenormals) ||		if ((Op == AMDGPU::V_ADD_F32_e64 && MFI->getMode().FP32OutputDenormals) ||
((Op == AMDGPU::V_ADD_F64_e64 || Op == AMDGPU::V_ADD_F16_e64) &&		((Op == AMDGPU::V_ADD_F64_e64 || Op == AMDGPU::V_ADD_F16_e64 ||
		Op == AMDGPU::V_ADD_F16_t16_e64) &&
MFI->getMode().FP64FP16OutputDenormals))		MFI->getMode().FP64FP16OutputDenormals))
return std::make_pair(nullptr, SIOutMods::NONE);		return std::make_pair(nullptr, SIOutMods::NONE);

// Look through the DAGCombiner canonicalization fmul x, 2 -> fadd x, x		// Look through the DAGCombiner canonicalization fmul x, 2 -> fadd x, x
const MachineOperand *Src0 = TII->getNamedOperand(MI, AMDGPU::OpName::src0);		const MachineOperand *Src0 = TII->getNamedOperand(MI, AMDGPU::OpName::src0);
const MachineOperand *Src1 = TII->getNamedOperand(MI, AMDGPU::OpName::src1);		const MachineOperand *Src1 = TII->getNamedOperand(MI, AMDGPU::OpName::src1);

if (Src0->isReg() && Src1->isReg() && Src0->getReg() == Src1->getReg() &&		if (Src0->isReg() && Src1->isReg() && Src0->getReg() == Src1->getReg() &&
[...]
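The f16 branch of the omod matching in this file, which the patch extends to `V_MUL_F16_t16_e64`, maps the raw half-precision bit pattern of the multiplier to an output modifier. The following is a standalone C++ sketch of just that mapping. The enum `OutMod` and the function `omodForF16Bits` are hypothetical stand-ins for `SIOutMods` and the switch shown in the diff; the three constants are the ones that appear there.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for SIOutMods in the real code.
enum class OutMod { None, Mul2, Mul4, Div2 };

// Map the bit pattern of an f16 multiplier to an output modifier, following
// the case labels in the diff: 0.5 -> DIV2, 2.0 -> MUL2, 4.0 -> MUL4.
inline OutMod omodForF16Bits(std::uint16_t Bits) {
  switch (Bits) {
  case 0x3800: // 0.5
    return OutMod::Div2;
  case 0x4000: // 2.0
    return OutMod::Mul2;
  case 0x4400: // 4.0
    return OutMod::Mul4;
  default: // any other multiplier cannot be folded into omod
    return OutMod::None;
  }
}

// Small usage example.
inline void omodExample() {
  assert(omodForF16Bits(0x4000) == OutMod::Mul2);
  assert(omodForF16Bits(0x3c00) == OutMod::None); // 1.0 is not an omod value
}
```

Because the mapping depends only on the immediate's bits, the True16 opcode can share the existing case body, which is what the added `case AMDGPU::V_MUL_F16_t16_e64:` label does.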

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp


[...]	if (SIInstrInfo::isWMMA(MI)) {

updateLiveVariables(LV, MI, *MIB);		updateLiveVariables(LV, MI, *MIB);
if (LIS)		if (LIS)
LIS->ReplaceMachineInstrInMaps(MI, *MIB);		LIS->ReplaceMachineInstrInMaps(MI, *MIB);

return MIB;		return MIB;
}		}

		assert(Opc != AMDGPU::V_FMAC_F16_t16_e32 &&
		"V_FMAC_F16_t16_e32 is not supported and not expected to be present "
		"pre-RA");
		foad [Done]: My understanding is that having a _T16_e32 instruction here should work correctly, but it is undesirable because it has more restrictive register classes than the _T16_e64 form. So perhaps we want a more general assertion in (or just before) the register allocator that there are no _T16_e32 instructions present? I'm not sure how or where to implement that.
		Joe_Nash (Author) [Done]: From the perspective of the RA, allocating a _T16_e32 will work correctly. In this particular function, there is a restriction. V_FMAC_F16_T16_e32 -> V_FMA_F16_gfx9_e64 should not be allowed, because the register class in the former is VGPR_32_F128 and in the latter is VGPR_32. So it would require a COPY or something, which we are not doing.
		foad [Not Done]: Is a COPY needed in that situation? I thought perhaps the register allocator would just handle it, if the register classes overlapped. But I'm not sure. Anyway I guess this assert is fine for now at least.
		Joe_Nash (Author) [Done]: The MIR verifier will fail immediately after the twoaddressinstruction pass if you make an instruction like that.

// Handle MAC/FMAC.		// Handle MAC/FMAC.
bool IsF16 = Opc == AMDGPU::V_MAC_F16_e32 || Opc == AMDGPU::V_MAC_F16_e64 ||		bool IsF16 = Opc == AMDGPU::V_MAC_F16_e32 || Opc == AMDGPU::V_MAC_F16_e64 ||
Opc == AMDGPU::V_FMAC_F16_e32 || Opc == AMDGPU::V_FMAC_F16_e64;		Opc == AMDGPU::V_FMAC_F16_e32 || Opc == AMDGPU::V_FMAC_F16_e64 ||
		Opc == AMDGPU::V_FMAC_F16_t16_e64;
bool IsFMA = Opc == AMDGPU::V_FMAC_F32_e32 || Opc == AMDGPU::V_FMAC_F32_e64 ||		bool IsFMA = Opc == AMDGPU::V_FMAC_F32_e32 || Opc == AMDGPU::V_FMAC_F32_e64 ||
Opc == AMDGPU::V_FMAC_LEGACY_F32_e32 ||		Opc == AMDGPU::V_FMAC_LEGACY_F32_e32 ||
Opc == AMDGPU::V_FMAC_LEGACY_F32_e64 ||		Opc == AMDGPU::V_FMAC_LEGACY_F32_e64 ||
Opc == AMDGPU::V_FMAC_F16_e32 || Opc == AMDGPU::V_FMAC_F16_e64 ||		Opc == AMDGPU::V_FMAC_F16_e32 || Opc == AMDGPU::V_FMAC_F16_e64 ||
		Opc == AMDGPU::V_FMAC_F16_t16_e64 ||
Opc == AMDGPU::V_FMAC_F64_e32 || Opc == AMDGPU::V_FMAC_F64_e64;		Opc == AMDGPU::V_FMAC_F64_e32 || Opc == AMDGPU::V_FMAC_F64_e64;
bool IsF64 = Opc == AMDGPU::V_FMAC_F64_e32 || Opc == AMDGPU::V_FMAC_F64_e64;		bool IsF64 = Opc == AMDGPU::V_FMAC_F64_e32 || Opc == AMDGPU::V_FMAC_F64_e64;
bool IsLegacy = Opc == AMDGPU::V_MAC_LEGACY_F32_e32 ||		bool IsLegacy = Opc == AMDGPU::V_MAC_LEGACY_F32_e32 ||
Opc == AMDGPU::V_MAC_LEGACY_F32_e64 ||		Opc == AMDGPU::V_MAC_LEGACY_F32_e64 ||
Opc == AMDGPU::V_FMAC_LEGACY_F32_e32 ||		Opc == AMDGPU::V_FMAC_LEGACY_F32_e32 ||
Opc == AMDGPU::V_FMAC_LEGACY_F32_e64;		Opc == AMDGPU::V_FMAC_LEGACY_F32_e64;
bool Src0Literal = false;		bool Src0Literal = false;

switch (Opc) {		switch (Opc) {
default:		default:
return nullptr;		return nullptr;
case AMDGPU::V_MAC_F16_e64:		case AMDGPU::V_MAC_F16_e64:
case AMDGPU::V_FMAC_F16_e64:		case AMDGPU::V_FMAC_F16_e64:
		case AMDGPU::V_FMAC_F16_t16_e64:
case AMDGPU::V_MAC_F32_e64:		case AMDGPU::V_MAC_F32_e64:
case AMDGPU::V_MAC_LEGACY_F32_e64:		case AMDGPU::V_MAC_LEGACY_F32_e64:
case AMDGPU::V_FMAC_F32_e64:		case AMDGPU::V_FMAC_F32_e64:
case AMDGPU::V_FMAC_LEGACY_F32_e64:		case AMDGPU::V_FMAC_LEGACY_F32_e64:
case AMDGPU::V_FMAC_F64_e64:		case AMDGPU::V_FMAC_F64_e64:
break;		break;
case AMDGPU::V_MAC_F16_e32:		case AMDGPU::V_MAC_F16_e32:
case AMDGPU::V_FMAC_F16_e32:		case AMDGPU::V_FMAC_F16_e32:
[...]	const auto killDef = [&]() -> void {
DefMI->removeOperand(I);		DefMI->removeOperand(I);
if (LV)		if (LV)
LV->getVarInfo(DefReg).AliveBlocks.clear();		LV->getVarInfo(DefReg).AliveBlocks.clear();
};		};

int64_t Imm;		int64_t Imm;
if (!Src0Literal && getFoldableImm(Src2, Imm, &DefMI)) {		if (!Src0Literal && getFoldableImm(Src2, Imm, &DefMI)) {
unsigned NewOpc =		unsigned NewOpc =
IsFMA ? (IsF16 ? AMDGPU::V_FMAAK_F16 : AMDGPU::V_FMAAK_F32)		IsFMA ? (IsF16 ? (ST.hasTrue16BitInsts() ? AMDGPU::V_FMAAK_F16_t16
		: AMDGPU::V_FMAAK_F16)
		: AMDGPU::V_FMAAK_F32)
: (IsF16 ? AMDGPU::V_MADAK_F16 : AMDGPU::V_MADAK_F32);		: (IsF16 ? AMDGPU::V_MADAK_F16 : AMDGPU::V_MADAK_F32);
if (pseudoToMCOpcode(NewOpc) != -1) {		if (pseudoToMCOpcode(NewOpc) != -1) {
MIB = BuildMI(MBB, MI, MI.getDebugLoc(), get(NewOpc))		MIB = BuildMI(MBB, MI, MI.getDebugLoc(), get(NewOpc))
.add(*Dst)		.add(*Dst)
.add(*Src0)		.add(*Src0)
.add(*Src1)		.add(*Src1)
.addImm(Imm);		.addImm(Imm);
updateLiveVariables(LV, MI, *MIB);		updateLiveVariables(LV, MI, *MIB);
if (LIS)		if (LIS)
LIS->ReplaceMachineInstrInMaps(MI, *MIB);		LIS->ReplaceMachineInstrInMaps(MI, *MIB);
killDef();		killDef();
return MIB;		return MIB;
}		}
}		}
unsigned NewOpc = IsFMA		unsigned NewOpc =
? (IsF16 ? AMDGPU::V_FMAMK_F16 : AMDGPU::V_FMAMK_F32)		IsFMA ? (IsF16 ? (ST.hasTrue16BitInsts() ? AMDGPU::V_FMAMK_F16_t16
		: AMDGPU::V_FMAMK_F16)
		: AMDGPU::V_FMAMK_F32)
: (IsF16 ? AMDGPU::V_MADMK_F16 : AMDGPU::V_MADMK_F32);		: (IsF16 ? AMDGPU::V_MADMK_F16 : AMDGPU::V_MADMK_F32);
if (!Src0Literal && getFoldableImm(Src1, Imm, &DefMI)) {		if (!Src0Literal && getFoldableImm(Src1, Imm, &DefMI)) {
if (pseudoToMCOpcode(NewOpc) != -1) {		if (pseudoToMCOpcode(NewOpc) != -1) {
MIB = BuildMI(MBB, MI, MI.getDebugLoc(), get(NewOpc))		MIB = BuildMI(MBB, MI, MI.getDebugLoc(), get(NewOpc))
.add(*Dst)		.add(*Dst)
.add(*Src0)		.add(*Src0)
.addImm(Imm)		.addImm(Imm)
.add(*Src2);		.add(*Src2);
updateLiveVariables(LV, MI, *MIB);		updateLiveVariables(LV, MI, *MIB);
[...]
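The nested conditional that picks `NewOpc` in this hunk is easier to follow when flattened. The sketch below is standalone C++ and returns opcode names as strings instead of LLVM opcode enumerators. The function name `foldedImmOpcodeName` and its boolean parameters are hypothetical: `HasTrue16` stands in for `ST.hasTrue16BitInsts()`, and `ImmIsSrc2` distinguishes the `AK` form (immediate folded from `Src2`) from the `MK` form (immediate folded from `Src1`), matching the two `getFoldableImm` calls in the diff.

```cpp
#include <cassert>
#include <string>

// Hypothetical flattening of the NewOpc selection in the diff. Only the f16
// FMA forms gain a _t16 variant in this patch; MAD forms are unchanged.
inline std::string foldedImmOpcodeName(bool IsFMA, bool IsF16, bool HasTrue16,
                                       bool ImmIsSrc2) {
  const std::string Form = ImmIsSrc2 ? "AK" : "MK";
  if (!IsFMA)
    return "V_MAD" + Form + (IsF16 ? "_F16" : "_F32");
  if (!IsF16)
    return "V_FMA" + Form + "_F32";
  return "V_FMA" + Form + (HasTrue16 ? "_F16_t16" : "_F16");
}

// Small usage example.
inline void foldedImmOpcodeExample() {
  assert(foldedImmOpcodeName(true, true, true, true) == "V_FMAAK_F16_t16");
  assert(foldedImmOpcodeName(true, true, false, false) == "V_FMAMK_F16");
}
```

Note that the True16 check only changes the result on the `IsFMA && IsF16` path, so subtargets without True16 instructions keep selecting the same opcodes as before the patch.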

llvm/lib/Target/AMDGPU/SIInstrInfo.td

[...]
def FP32SDWAInputMods : FPSDWAInputMods<FP32SDWAInputModsMatchClass>;		def FP32SDWAInputMods : FPSDWAInputMods<FP32SDWAInputModsMatchClass>;

def FPVRegInputModsMatchClass : AsmOperandClass {		def FPVRegInputModsMatchClass : AsmOperandClass {
let Name = "VRegWithFPInputMods";		let Name = "VRegWithFPInputMods";
let ParserMethod = "parseRegWithFPInputMods";		let ParserMethod = "parseRegWithFPInputMods";
let PredicateMethod = "isVRegWithInputMods";		let PredicateMethod = "isVRegWithInputMods";
}		}

		def FPT16VRegInputModsMatchClass : AsmOperandClass {
		let Name = "T16VRegWithFPInputMods";
		let ParserMethod = "parseRegWithFPInputMods";
		let PredicateMethod = "isT16VRegWithInputMods";
		}

def FPVRegInputMods : InputMods <FPVRegInputModsMatchClass> {		def FPVRegInputMods : InputMods <FPVRegInputModsMatchClass> {
let PrintMethod = "printOperandAndFPInputMods";		let PrintMethod = "printOperandAndFPInputMods";
}		}

		def FPT16VRegInputMods : InputMods <FPT16VRegInputModsMatchClass> {
		let PrintMethod = "printOperandAndFPInputMods";
		}

class IntSDWAInputModsMatchClass <int opSize> : AsmOperandClass {		class IntSDWAInputModsMatchClass <int opSize> : AsmOperandClass {
let Name = "SDWAWithInt"#opSize#"InputMods";		let Name = "SDWAWithInt"#opSize#"InputMods";
let ParserMethod = "parseRegOrImmWithIntInputMods";		let ParserMethod = "parseRegOrImmWithIntInputMods";
let PredicateMethod = "isSDWAInt"#opSize#"Operand";		let PredicateMethod = "isSDWAInt"#opSize#"Operand";
}		}

def Int16SDWAInputModsMatchClass : IntSDWAInputModsMatchClass<16>;		def Int16SDWAInputModsMatchClass : IntSDWAInputModsMatchClass<16>;
def Int32SDWAInputModsMatchClass : IntSDWAInputModsMatchClass<32>;		def Int32SDWAInputModsMatchClass : IntSDWAInputModsMatchClass<32>;
[...]
def Bin32SDWAInputMods : IntSDWAInputMods<Bin32SDWAInputModsMatchClass>;		def Bin32SDWAInputMods : IntSDWAInputMods<Bin32SDWAInputModsMatchClass>;

def IntVRegInputModsMatchClass : AsmOperandClass {		def IntVRegInputModsMatchClass : AsmOperandClass {
let Name = "VRegWithIntInputMods";		let Name = "VRegWithIntInputMods";
let ParserMethod = "parseRegWithIntInputMods";		let ParserMethod = "parseRegWithIntInputMods";
let PredicateMethod = "isVRegWithInputMods";		let PredicateMethod = "isVRegWithInputMods";
}		}

		def IntT16VRegInputModsMatchClass : AsmOperandClass {
		let Name = "T16VRegWithIntInputMods";
		let ParserMethod = "parseRegWithIntInputMods";
		let PredicateMethod = "isT16VRegWithInputMods";
		}

		def IntT16VRegInputMods : InputMods <IntT16VRegInputModsMatchClass> {
		let PrintMethod = "printOperandAndIntInputMods";
		}

def IntVRegInputMods : InputMods <IntVRegInputModsMatchClass> {		def IntVRegInputMods : InputMods <IntVRegInputModsMatchClass> {
let PrintMethod = "printOperandAndIntInputMods";		let PrintMethod = "printOperandAndIntInputMods";
}		}

class PackedFPInputModsMatchClass <int opSize> : AsmOperandClass {		class PackedFPInputModsMatchClass <int opSize> : AsmOperandClass {
let Name = "PackedFP"#opSize#"InputMods";		let Name = "PackedFP"#opSize#"InputMods";
let ParserMethod = "parseRegOrImm";		let ParserMethod = "parseRegOrImm";
let PredicateMethod = "isRegOrImm";		let PredicateMethod = "isRegOrImm";
[...]
class getVALUDstForVT<ValueType VT> {		class getVALUDstForVT<ValueType VT> {
RegisterOperand ret = !if(!eq(VT.Size, 32), VOPDstOperand<VGPR_32>,		RegisterOperand ret = !if(!eq(VT.Size, 32), VOPDstOperand<VGPR_32>,
!if(!eq(VT.Size, 128), VOPDstOperand<VReg_128>,		!if(!eq(VT.Size, 128), VOPDstOperand<VReg_128>,
!if(!eq(VT.Size, 64), VOPDstOperand<VReg_64>,		!if(!eq(VT.Size, 64), VOPDstOperand<VReg_64>,
!if(!eq(VT.Size, 16), VOPDstOperand<VGPR_32>,		!if(!eq(VT.Size, 16), VOPDstOperand<VGPR_32>,
VOPDstS64orS32)))); // else VT == i1		VOPDstS64orS32)))); // else VT == i1
}		}

		class getVALUDstForVT_t16<ValueType VT> {
		RegisterOperand ret = !if(!eq(VT.Size, 32), VOPDstOperand<VGPR_32>,
		!if(!eq(VT.Size, 128), VOPDstOperand<VReg_128>,
		!if(!eq(VT.Size, 64), VOPDstOperand<VReg_64>,
		!if(!eq(VT.Size, 16), VOPDstOperand<VGPR_32_Lo128>,
		VOPDstS64orS32)))); // else VT == i1
		}

// Returns the register class to use for the destination of VOP[12C]		// Returns the register class to use for the destination of VOP[12C]
// instructions with SDWA extension		// instructions with SDWA extension
class getSDWADstForVT<ValueType VT> {		class getSDWADstForVT<ValueType VT> {
RegisterOperand ret = !if(!eq(VT.Size, 1),		RegisterOperand ret = !if(!eq(VT.Size, 1),
SDWAVopcDst, // VOPC		SDWAVopcDst, // VOPC
VOPDstOperand<VGPR_32>); // VOP1/2 32-bit dst		VOPDstOperand<VGPR_32>); // VOP1/2 32-bit dst
}		}

// Returns the register class to use for source 0 of VOP[12C]		// Returns the register class to use for source 0 of VOP[12C]
// instructions for the given VT.		// instructions for the given VT.
class getVOPSrc0ForVT<ValueType VT> {		class getVOPSrc0ForVT<ValueType VT, bit IsTrue16> {
bit isFP = isFloatType<VT>.ret;		bit isFP = isFloatType<VT>.ret;

RegisterOperand ret =		RegisterOperand ret =
!if(isFP,		!if(isFP,
!if(!eq(VT.Size, 64),		!if(!eq(VT.Size, 64),
VSrc_f64,		VSrc_f64,
!if(!eq(VT.Value, f16.Value),		!if(!eq(VT.Value, f16.Value),
VSrc_f16,		!if(IsTrue16,
		VSrcT_f16_Lo128,
		VSrc_f16
		),
!if(!eq(VT.Value, v2f16.Value),		!if(!eq(VT.Value, v2f16.Value),
VSrc_v2f16,		VSrc_v2f16,
!if(!eq(VT.Value, v4f16.Value),		!if(!eq(VT.Value, v4f16.Value),
AVSrc_64,		AVSrc_64,
VSrc_f32		VSrc_f32
)		)
)		)
)		)
),		),
!if(!eq(VT.Size, 64),		!if(!eq(VT.Size, 64),
VSrc_b64,		VSrc_b64,
!if(!eq(VT.Value, i16.Value),		!if(!eq(VT.Value, i16.Value),
VSrc_b16,		!if(IsTrue16,
		VSrcT_b16_Lo128,
		VSrc_b16
		),
!if(!eq(VT.Value, v2i16.Value),		!if(!eq(VT.Value, v2i16.Value),
VSrc_v2b16,		VSrc_v2b16,
VSrc_b32		VSrc_b32
)		)
)		)
)		)
);		);
}		}

class getSOPSrcForVT<ValueType VT> {		class getSOPSrcForVT<ValueType VT> {
RegisterOperand ret = !if(!eq(VT.Size, 64), SSrc_b64, SSrc_b32);		RegisterOperand ret = !if(!eq(VT.Size, 64), SSrc_b64, SSrc_b32);
}		}

// Returns the vreg register class to use for source operand given VT		// Returns the vreg register class to use for source operand given VT
class getVregSrcForVT<ValueType VT> {		class getVregSrcForVT<ValueType VT> {
RegisterClass ret = !if(!eq(VT.Size, 128), VReg_128,		RegisterClass ret = !if(!eq(VT.Size, 128), VReg_128,
!if(!eq(VT.Size, 96), VReg_96,		!if(!eq(VT.Size, 96), VReg_96,
!if(!eq(VT.Size, 64), VReg_64,		!if(!eq(VT.Size, 64), VReg_64,
!if(!eq(VT.Size, 48), VReg_64,		!if(!eq(VT.Size, 48), VReg_64,
VGPR_32))));		VGPR_32))));
}		}

		class getVregSrcForVT_t16<ValueType VT> {
		RegisterClass ret = !if(!eq(VT.Size, 128), VReg_128,
		!if(!eq(VT.Size, 96), VReg_96,
		!if(!eq(VT.Size, 64), VReg_64,
		!if(!eq(VT.Size, 48), VReg_64,
		!if(!eq(VT.Size, 16), VGPR_32_Lo128,
		VGPR_32)))));
		}

class getSDWASrcForVT <ValueType VT> {		class getSDWASrcForVT <ValueType VT> {
bit isFP = isFloatType<VT>.ret;		bit isFP = isFloatType<VT>.ret;
RegisterOperand retFlt = !if(!eq(VT.Size, 16), SDWASrc_f16, SDWASrc_f32);		RegisterOperand retFlt = !if(!eq(VT.Size, 16), SDWASrc_f16, SDWASrc_f32);
RegisterOperand retInt = !if(!eq(VT.Size, 16), SDWASrc_i16, SDWASrc_i32);		RegisterOperand retInt = !if(!eq(VT.Size, 16), SDWASrc_i16, SDWASrc_i32);
RegisterOperand ret = !if(isFP, retFlt, retInt);		RegisterOperand ret = !if(isFP, retFlt, retInt);
}		}

// Returns the register class to use for sources of VOP3 instructions for the		// Returns the register class to use for sources of VOP3 instructions for the
[...]
}		}

// Return type of input modifiers operand specified input operand for DPP		// Return type of input modifiers operand specified input operand for DPP
class getSrcModDPP <ValueType VT> {		class getSrcModDPP <ValueType VT> {
bit isFP = isFloatType<VT>.ret;		bit isFP = isFloatType<VT>.ret;
Operand ret = !if(isFP, FPVRegInputMods, IntVRegInputMods);		Operand ret = !if(isFP, FPVRegInputMods, IntVRegInputMods);
}		}

		class getSrcModDPP_t16 <ValueType VT> {
		bit isFP = isFloatType<VT>.ret;
		Operand ret =
		!if (isFP,
		!if (!eq(VT.Value, f16.Value), FPT16VRegInputMods,
		FPVRegInputMods),
		!if (!eq(VT.Value, i16.Value), IntT16VRegInputMods,
		IntVRegInputMods));
		}

// Return type of input modifiers operand for specified input operand for DPP		// Return type of input modifiers operand for specified input operand for DPP
class getSrcModVOP3DPP <ValueType VT, bit EnableF32SrcMods> {		class getSrcModVOP3DPP <ValueType VT, bit EnableF32SrcMods> {
bit isFP = isFloatType<VT>.ret;		bit isFP = isFloatType<VT>.ret;
bit isPacked = isPackedType<VT>.ret;		bit isPacked = isPackedType<VT>.ret;
Operand ret =		Operand ret =
!if (isFP,		!if (isFP,
!if (!eq(VT.Value, f16.Value), FP16VCSrcInputMods,		!if (!eq(VT.Value, f16.Value), FP16VCSrcInputMods,
FP32VCSrcInputMods),		FP32VCSrcInputMods),
[...]
}		}

class VOPProfile <list<ValueType> _ArgVT, bit _EnableF32SrcMods = 0,		class VOPProfile <list<ValueType> _ArgVT, bit _EnableF32SrcMods = 0,
bit _EnableClamp = 0> {		bit _EnableClamp = 0> {

field list<ValueType> ArgVT = _ArgVT;		field list<ValueType> ArgVT = _ArgVT;
field bit EnableF32SrcMods = _EnableF32SrcMods;		field bit EnableF32SrcMods = _EnableF32SrcMods;
field bit EnableClamp = _EnableClamp;		field bit EnableClamp = _EnableClamp;
		field bit IsTrue16 = 0;

field ValueType DstVT = ArgVT[0];		field ValueType DstVT = ArgVT[0];
field ValueType Src0VT = ArgVT[1];		field ValueType Src0VT = ArgVT[1];
field ValueType Src1VT = ArgVT[2];		field ValueType Src1VT = ArgVT[2];
field ValueType Src2VT = ArgVT[3];		field ValueType Src2VT = ArgVT[3];
field RegisterOperand DstRC = getVALUDstForVT<DstVT>.ret;		field RegisterOperand DstRC = getVALUDstForVT<DstVT>.ret;
field RegisterOperand DstRCDPP = DstRC;		field RegisterOperand DstRCDPP = DstRC;
field RegisterOperand DstRC64 = DstRC;		field RegisterOperand DstRC64 = DstRC;
field RegisterOperand DstRCVOP3DPP = DstRC64;		field RegisterOperand DstRCVOP3DPP = DstRC64;
field RegisterOperand DstRCSDWA = getSDWADstForVT<DstVT>.ret;		field RegisterOperand DstRCSDWA = getSDWADstForVT<DstVT>.ret;
field RegisterOperand Src0RC32 = getVOPSrc0ForVT<Src0VT>.ret;		field RegisterOperand Src0RC32 = getVOPSrc0ForVT<Src0VT, IsTrue16>.ret;
field RegisterOperand Src1RC32 = RegisterOperand<getVregSrcForVT<Src1VT>.ret>;		field RegisterOperand Src1RC32 = RegisterOperand<getVregSrcForVT<Src1VT>.ret>;
field RegisterOperand Src0RC64 = getVOP3SrcForVT<Src0VT>.ret;		field RegisterOperand Src0RC64 = getVOP3SrcForVT<Src0VT>.ret;
field RegisterOperand Src1RC64 = getVOP3SrcForVT<Src1VT>.ret;		field RegisterOperand Src1RC64 = getVOP3SrcForVT<Src1VT>.ret;
field RegisterOperand Src2RC64 = getVOP3SrcForVT<Src2VT>.ret;		field RegisterOperand Src2RC64 = getVOP3SrcForVT<Src2VT>.ret;
field RegisterClass Src0DPP = getVregSrcForVT<Src0VT>.ret;		field RegisterClass Src0DPP = getVregSrcForVT<Src0VT>.ret;
field RegisterClass Src1DPP = getVregSrcForVT<Src1VT>.ret;		field RegisterClass Src1DPP = getVregSrcForVT<Src1VT>.ret;
field RegisterClass Src2DPP = getVregSrcForVT<Src2VT>.ret;		field RegisterClass Src2DPP = getVregSrcForVT<Src2VT>.ret;
field RegisterOperand Src0VOP3DPP = VGPRSrc_32;		field RegisterOperand Src0VOP3DPP = VGPRSrc_32;
field RegisterOperand Src1VOP3DPP = VGPRSrc_32;		field RegisterOperand Src1VOP3DPP = VGPRSrc_32;
field RegisterOperand Src2VOP3DPP = getVOP3DPPSrcForVT<Src2VT>.ret;		field RegisterOperand Src2VOP3DPP = getVOP3DPPSrcForVT<Src2VT>.ret;
field RegisterOperand Src0SDWA = getSDWASrcForVT<Src0VT>.ret;		field RegisterOperand Src0SDWA = getSDWASrcForVT<Src0VT>.ret;
field RegisterOperand Src1SDWA = getSDWASrcForVT<Src0VT>.ret;		field RegisterOperand Src1SDWA = getSDWASrcForVT<Src0VT>.ret;
field Operand Src0Mod = getSrcMod<Src0VT, EnableF32SrcMods>.ret;		field Operand Src0Mod = getSrcMod<Src0VT, EnableF32SrcMods>.ret;
field Operand Src1Mod = getSrcMod<Src1VT, EnableF32SrcMods>.ret;		field Operand Src1Mod = getSrcMod<Src1VT, EnableF32SrcMods>.ret;
field Operand Src2Mod = getSrcMod<Src2VT, EnableF32SrcMods>.ret;		field Operand Src2Mod = getSrcMod<Src2VT, EnableF32SrcMods>.ret;
field Operand Src0ModDPP = getSrcModDPP<Src0VT>.ret;		field Operand Src0ModDPP = getSrcModDPP<Src0VT>.ret;
field Operand Src1ModDPP = getSrcModDPP<Src1VT>.ret;		field Operand Src1ModDPP = getSrcModDPP<Src1VT>.ret;
field Operand Src2ModDPP = getSrcModDPP<Src2VT>.ret;		field Operand Src2ModDPP = getSrcModDPP<Src2VT>.ret;
		field Operand Src0ModVOP3DPP = getSrcModDPP<Src0VT>.ret;
		field Operand Src1ModVOP3DPP = getSrcModDPP<Src1VT>.ret;
field Operand Src2ModVOP3DPP = getSrcModVOP3DPP<Src2VT, EnableF32SrcMods>.ret;		field Operand Src2ModVOP3DPP = getSrcModVOP3DPP<Src2VT, EnableF32SrcMods>.ret;
field Operand Src0ModSDWA = getSrcModSDWA<Src0VT>.ret;		field Operand Src0ModSDWA = getSrcModSDWA<Src0VT>.ret;
field Operand Src1ModSDWA = getSrcModSDWA<Src1VT>.ret;		field Operand Src1ModSDWA = getSrcModSDWA<Src1VT>.ret;


field bit HasDst = !ne(DstVT.Value, untyped.Value);		field bit HasDst = !ne(DstVT.Value, untyped.Value);
field bit HasDst32 = HasDst;		field bit HasDst32 = HasDst;
field bit EmitDst = HasDst; // force dst encoding, see v_movreld_b32 special case		field bit EmitDst = HasDst; // force dst encoding, see v_movreld_b32 special case
[...]	field dag InsDPP = !if(HasExtDPP,
(ins));		(ins));
field dag InsDPP16 = getInsDPP16<DstRCDPP, Src0DPP, Src1DPP, Src2DPP, NumSrcArgs,		field dag InsDPP16 = getInsDPP16<DstRCDPP, Src0DPP, Src1DPP, Src2DPP, NumSrcArgs,
HasModifiers, Src0ModDPP, Src1ModDPP, Src2ModDPP>.ret;		HasModifiers, Src0ModDPP, Src1ModDPP, Src2ModDPP>.ret;
field dag InsDPP8 = getInsDPP8<DstRCDPP, Src0DPP, Src1DPP, Src2DPP,		field dag InsDPP8 = getInsDPP8<DstRCDPP, Src0DPP, Src1DPP, Src2DPP,
NumSrcArgs, HasModifiers,		NumSrcArgs, HasModifiers,
Src0ModDPP, Src1ModDPP, Src2ModDPP>.ret;		Src0ModDPP, Src1ModDPP, Src2ModDPP>.ret;
field dag InsVOP3Base = getInsVOP3Base<Src0VOP3DPP, Src1VOP3DPP,		field dag InsVOP3Base = getInsVOP3Base<Src0VOP3DPP, Src1VOP3DPP,
Src2VOP3DPP, NumSrcArgs, HasClamp, HasModifiers, HasSrc2Mods, HasOMod,		Src2VOP3DPP, NumSrcArgs, HasClamp, HasModifiers, HasSrc2Mods, HasOMod,
Src0ModDPP, Src1ModDPP, Src2ModVOP3DPP, HasOpSel, IsVOP3P>.ret;		Src0ModVOP3DPP, Src1ModVOP3DPP, Src2ModVOP3DPP, HasOpSel, IsVOP3P>.ret;
field dag InsVOP3DPP = getInsVOP3DPP<InsVOP3Base, DstRCVOP3DPP, NumSrcArgs>.ret;		field dag InsVOP3DPP = getInsVOP3DPP<InsVOP3Base, DstRCVOP3DPP, NumSrcArgs>.ret;
field dag InsVOP3DPP16 = getInsVOP3DPP16<InsVOP3Base, DstRCVOP3DPP, NumSrcArgs>.ret;		field dag InsVOP3DPP16 = getInsVOP3DPP16<InsVOP3Base, DstRCVOP3DPP, NumSrcArgs>.ret;
field dag InsVOP3DPP8 = getInsVOP3DPP8<InsVOP3Base, DstRCVOP3DPP, NumSrcArgs>.ret;		field dag InsVOP3DPP8 = getInsVOP3DPP8<InsVOP3Base, DstRCVOP3DPP, NumSrcArgs>.ret;
field dag InsSDWA = getInsSDWA<Src0SDWA, Src1SDWA, NumSrcArgs,		field dag InsSDWA = getInsSDWA<Src0SDWA, Src1SDWA, NumSrcArgs,
HasSDWAOMod, Src0ModSDWA, Src1ModSDWA,		HasSDWAOMod, Src0ModSDWA, Src1ModSDWA,
DstVT>.ret;		DstVT>.ret;
field dag InsVOPDX = (ins Src0RC32:$src0X, Src1RC32:$vsrc1X);		field dag InsVOPDX = (ins Src0RC32:$src0X, Src1RC32:$vsrc1X);
// It is a slight misnomer to use the deferred f32 operand type for non-float		// It is a slight misnomer to use the deferred f32 operand type for non-float
[...]	class VOPProfile <list<ValueType> _ArgVT, bit _EnableF32SrcMods = 0,
let HasExt64BitDPP = 0;		let HasExt64BitDPP = 0;
let HasExtSDWA = 0;		let HasExtSDWA = 0;
let HasExtSDWA9 = 0;		let HasExtSDWA9 = 0;
}		}

class VOP_PAT_GEN <VOPProfile p, int mode=PatGenMode.NoPattern> : VOPProfile <p.ArgVT> {		class VOP_PAT_GEN <VOPProfile p, int mode=PatGenMode.NoPattern> : VOPProfile <p.ArgVT> {
let NeedPatGen = mode;		let NeedPatGen = mode;
}		}

		// VOPC_Profile_t16, VOPC_NoSdst_Profile_t16, VOPC_Class_Profile_t16,
		// VOPC_Class_NoSdst_Profile_t16, and VOP_MAC_F16_t16 do not inherit from this
		// class, so copy changes to this class in those profiles
		class VOPProfile_True16<VOPProfile P> : VOPProfile<P.ArgVT> {
		let IsTrue16 = 1;
		// Most DstVT are 16-bit, but not all
		let DstRC = getVALUDstForVT_t16<DstVT>.ret;
		let DstRC64 = getVALUDstForVT<DstVT>.ret;
		let Src1RC32 = RegisterOperand<getVregSrcForVT_t16<Src1VT>.ret>;
		let Src0DPP = getVregSrcForVT_t16<Src0VT>.ret;
		let Src1DPP = getVregSrcForVT_t16<Src1VT>.ret;
		let Src2DPP = getVregSrcForVT_t16<Src2VT>.ret;
		let Src0ModDPP = getSrcModDPP_t16<Src0VT>.ret;
		let Src1ModDPP = getSrcModDPP_t16<Src1VT>.ret;
		let Src2ModDPP = getSrcModDPP_t16<Src2VT>.ret;
		}

def VOP_F16_F16 : VOPProfile <[f16, f16, untyped, untyped]>;		def VOP_F16_F16 : VOPProfile<[f16, f16, untyped, untyped]>;
def VOP_F16_I16 : VOPProfile <[f16, i16, untyped, untyped]>;		def VOP_F16_I16 : VOPProfile <[f16, i16, untyped, untyped]>;
def VOP_I16_F16 : VOPProfile <[i16, f16, untyped, untyped]>;		def VOP_I16_F16 : VOPProfile <[i16, f16, untyped, untyped]>;
def VOP_I16_I16 : VOPProfile <[i16, i16, untyped, untyped]>;		def VOP_I16_I16 : VOPProfile <[i16, i16, untyped, untyped]>;

def VOP_F16_F16_F16 : VOPProfile <[f16, f16, f16, untyped]>;		def VOP_F16_F16_F16 : VOPProfile <[f16, f16, f16, untyped]>;
def VOP_F16_F16_I16 : VOPProfile <[f16, f16, i16, untyped]>;		def VOP_F16_F16_I16 : VOPProfile <[f16, f16, i16, untyped]>;
def VOP_F16_F16_I32 : VOPProfile <[f16, f16, i32, untyped]>;		def VOP_F16_F16_I32 : VOPProfile <[f16, f16, i32, untyped]>;
def VOP_I16_I16_I16 : VOPProfile <[i16, i16, i16, untyped]>;		def VOP_I16_I16_I16 : VOPProfile <[i16, i16, i16, untyped]>;
[...]
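The difference between `getVALUDstForVT` and the new `getVALUDstForVT_t16` is a single branch of the size-based `!if` chain. The sketch below restates that chain in standalone C++ for illustration. The function `valuDstForSize` is hypothetical, takes the value type's size in bits, and returns the operand or register class name as a string.

```cpp
#include <cassert>
#include <string>

// Hypothetical C++ restatement of the TableGen !if chain in
// getVALUDstForVT (IsTrue16 == false) and getVALUDstForVT_t16 (true).
inline std::string valuDstForSize(unsigned SizeInBits, bool IsTrue16) {
  switch (SizeInBits) {
  case 32:
    return "VGPR_32";
  case 128:
    return "VReg_128";
  case 64:
    return "VReg_64";
  case 16: // the only branch that differs between the two classes
    return IsTrue16 ? "VGPR_32_Lo128" : "VGPR_32";
  default: // else VT == i1
    return "VOPDstS64orS32";
  }
}

// Small usage example.
inline void valuDstExample() {
  assert(valuDstForSize(16, true) == "VGPR_32_Lo128");
  assert(valuDstForSize(16, false) == "VGPR_32");
}
```

In `VOPProfile_True16` only `DstRC` uses the `_t16` chain, while `DstRC64` is explicitly set back to `getVALUDstForVT<DstVT>.ret`, so in this sketch's terms the e64 form keeps `IsTrue16 == false`.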

llvm/lib/Target/AMDGPU/SIInstructions.td

[...]	def : GCNPat <
(f64 (fadd (f64 (VOP3Mods f64:$x, i32:$mods)),		(f64 (fadd (f64 (VOP3Mods f64:$x, i32:$mods)),
(f64 (fneg (f64 (ffloor (f64 (VOP3Mods f64:$x, i32:$mods)))))))),		(f64 (fneg (f64 (ffloor (f64 (VOP3Mods f64:$x, i32:$mods)))))))),
(V_FRACT_F64_e64 $mods, $x)		(V_FRACT_F64_e64 $mods, $x)
>;		>;

} // End OtherPredicates = [UnsafeFPMath]		} // End OtherPredicates = [UnsafeFPMath]


		multiclass f16_fp_Pats<Instruction cvt_f16_f32_inst_e64, Instruction cvt_f32_f16_inst_e64> {
// f16_to_fp patterns		// f16_to_fp patterns
def : GCNPat <		def : GCNPat <
(f32 (f16_to_fp i32:$src0)),		(f32 (f16_to_fp i32:$src0)),
(V_CVT_F32_F16_e64 SRCMODS.NONE, $src0)		(cvt_f32_f16_inst_e64 SRCMODS.NONE, $src0)
		dp: The names `cvt16to32_e64` and `cvt32to16_e64` are misleading. They should be the other way around.
		Joe_Nash (author): Thanks, I have fixed the names.
>;		>;

def : GCNPat <		def : GCNPat <
(f32 (f16_to_fp (and_oneuse i32:$src0, 0x7fff))),		(f32 (f16_to_fp (and_oneuse i32:$src0, 0x7fff))),
(V_CVT_F32_F16_e64 SRCMODS.ABS, $src0)		(cvt_f32_f16_inst_e64 SRCMODS.ABS, $src0)
>;		>;

def : GCNPat <		def : GCNPat <
(f32 (f16_to_fp (i32 (srl_oneuse (and_oneuse i32:$src0, 0x7fff0000), (i32 16))))),		(f32 (f16_to_fp (i32 (srl_oneuse (and_oneuse i32:$src0, 0x7fff0000), (i32 16))))),
(V_CVT_F32_F16_e64 SRCMODS.ABS, (i32 (V_LSHRREV_B32_e64 (i32 16), i32:$src0)))		(cvt_f32_f16_inst_e64 SRCMODS.ABS, (i32 (V_LSHRREV_B32_e64 (i32 16), i32:$src0)))
>;		>;

def : GCNPat <		def : GCNPat <
(f32 (f16_to_fp (or_oneuse i32:$src0, 0x8000))),		(f32 (f16_to_fp (or_oneuse i32:$src0, 0x8000))),
(V_CVT_F32_F16_e64 SRCMODS.NEG_ABS, $src0)		(cvt_f32_f16_inst_e64 SRCMODS.NEG_ABS, $src0)
>;		>;

def : GCNPat <		def : GCNPat <
(f32 (f16_to_fp (xor_oneuse i32:$src0, 0x8000))),		(f32 (f16_to_fp (xor_oneuse i32:$src0, 0x8000))),
(V_CVT_F32_F16_e64 SRCMODS.NEG, $src0)		(cvt_f32_f16_inst_e64 SRCMODS.NEG, $src0)
>;		>;

def : GCNPat <		def : GCNPat <
(f64 (fpextend f16:$src)),		(f64 (fpextend f16:$src)),
(V_CVT_F64_F32_e32 (V_CVT_F32_F16_e32 $src))		(V_CVT_F64_F32_e32 (cvt_f32_f16_inst_e64 SRCMODS.NONE, $src))
>;		>;

// fp_to_fp16 patterns		// fp_to_fp16 patterns
def : GCNPat <		def : GCNPat <
(i32 (AMDGPUfp_to_f16 (f32 (VOP3Mods f32:$src0, i32:$src0_modifiers)))),		(i32 (AMDGPUfp_to_f16 (f32 (VOP3Mods f32:$src0, i32:$src0_modifiers)))),
(V_CVT_F16_F32_e64 $src0_modifiers, f32:$src0)		(cvt_f16_f32_inst_e64 $src0_modifiers, f32:$src0)
>;		>;

def : GCNPat <		def : GCNPat <
(i32 (fp_to_sint f16:$src)),		(i32 (fp_to_sint f16:$src)),
(V_CVT_I32_F32_e32 (V_CVT_F32_F16_e32 VSrc_b32:$src))		(V_CVT_I32_F32_e32 (cvt_f32_f16_inst_e64 SRCMODS.NONE, VSrc_b32:$src))
>;		>;

def : GCNPat <		def : GCNPat <
(i32 (fp_to_uint f16:$src)),		(i32 (fp_to_uint f16:$src)),
(V_CVT_U32_F32_e32 (V_CVT_F32_F16_e32 VSrc_b32:$src))		(V_CVT_U32_F32_e32 (cvt_f32_f16_inst_e64 SRCMODS.NONE, VSrc_b32:$src))
>;		>;

def : GCNPat <		def : GCNPat <
(f16 (sint_to_fp i32:$src)),		(f16 (sint_to_fp i32:$src)),
(V_CVT_F16_F32_e32 (V_CVT_F32_I32_e32 VSrc_b32:$src))		(cvt_f16_f32_inst_e64 SRCMODS.NONE, (V_CVT_F32_I32_e32 VSrc_b32:$src))
>;		>;

def : GCNPat <		def : GCNPat <
(f16 (uint_to_fp i32:$src)),		(f16 (uint_to_fp i32:$src)),
(V_CVT_F16_F32_e32 (V_CVT_F32_U32_e32 VSrc_b32:$src))		(cvt_f16_f32_inst_e64 SRCMODS.NONE, (V_CVT_F32_U32_e32 VSrc_b32:$src))
>;		>;
		}

		let SubtargetPredicate = NotHasTrue16BitInsts in
		defm : f16_fp_Pats<V_CVT_F16_F32_e64, V_CVT_F32_F16_e64>;

		let SubtargetPredicate = HasTrue16BitInsts in
		defm : f16_fp_Pats<V_CVT_F16_F32_t16_e64, V_CVT_F32_F16_t16_e64>;
		dp: Why are these patterns special? `_E64` variants have no VGPR limitations. Is this required for future changes?
		Joe_Nash (author): This is simply to account for the fact that we have new pseudo instructions for each instruction with 16-bit operands on GFX11. Therefore, if a pattern refers directly to an instruction, we need to duplicate it, one for each pseudo.
		foad: > Is this required for future changes? Yes. In future the _t16_e64 version will use 16-bit register classes for 16-bit operands, but the regular _e64 version will continue to use 32-bit classes for them.
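		Editorial note on the f16_to_fp patterns above: they fold a sign-bit mask on the raw half-precision bits into a source modifier (and 0x7fff becomes ABS, or 0x8000 becomes NEG_ABS, xor 0x8000 becomes NEG). The following is a minimal standalone sketch of that equivalence, not code from the patch. The names `SrcMod` and `applyMod` are hypothetical and chosen only for illustration; the sketch works on a bare 16-bit value rather than on the i32 operand the patterns match.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the VOP3 source modifiers used by the patterns.
enum class SrcMod { None, Abs, Neg, NegAbs };

// Apply a source modifier to the raw bits of an IEEE half-precision value.
// Bit 15 (0x8000) is the sign bit, so each modifier is a single bit operation.
uint16_t applyMod(uint16_t Bits, SrcMod M) {
  switch (M) {
  case SrcMod::None:
    return Bits;
  case SrcMod::Abs:
    return static_cast<uint16_t>(Bits & 0x7fff); // clear the sign bit
  case SrcMod::Neg:
    return static_cast<uint16_t>(Bits ^ 0x8000); // flip the sign bit
  case SrcMod::NegAbs:
    return static_cast<uint16_t>(Bits | 0x8000); // force the sign bit on
  }
  return Bits; // unreachable for valid enum values
}
```

		For example, 0x3C00 encodes 1.0 and 0xBC00 encodes -1.0 in half precision, so applying `SrcMod::Abs` to 0xBC00 yields 0x3C00.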

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// VOP2 Patterns		// VOP2 Patterns
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

// NoMods pattern used for mac. If there are any source modifiers then it's		// NoMods pattern used for mac. If there are any source modifiers then it's
// better to select mad instead of mac.		// better to select mad instead of mac.
class FMADPat <ValueType vt, Instruction inst>		class FMADPat <ValueType vt, Instruction inst>
▲ Show 20 Lines • Show All 510 Lines • ▼ Show 20 Lines
class ClampPat<Instruction inst, ValueType vt> : GCNPat <		class ClampPat<Instruction inst, ValueType vt> : GCNPat <
(vt (AMDGPUclamp (VOP3Mods vt:$src0, i32:$src0_modifiers))),		(vt (AMDGPUclamp (VOP3Mods vt:$src0, i32:$src0_modifiers))),
(inst i32:$src0_modifiers, vt:$src0,		(inst i32:$src0_modifiers, vt:$src0,
i32:$src0_modifiers, vt:$src0, DSTCLAMP.ENABLE, DSTOMOD.NONE)		i32:$src0_modifiers, vt:$src0, DSTCLAMP.ENABLE, DSTOMOD.NONE)
>;		>;

def : ClampPat<V_MAX_F32_e64, f32>;		def : ClampPat<V_MAX_F32_e64, f32>;
def : ClampPat<V_MAX_F64_e64, f64>;		def : ClampPat<V_MAX_F64_e64, f64>;
		let SubtargetPredicate = NotHasTrue16BitInsts in
def : ClampPat<V_MAX_F16_e64, f16>;		def : ClampPat<V_MAX_F16_e64, f16>;
		let SubtargetPredicate = HasTrue16BitInsts in
		def : ClampPat<V_MAX_F16_t16_e64, f16>;

let SubtargetPredicate = HasVOP3PInsts in {		let SubtargetPredicate = HasVOP3PInsts in {
def : GCNPat <		def : GCNPat <
(v2f16 (AMDGPUclamp (VOP3PMods v2f16:$src0, i32:$src0_modifiers))),		(v2f16 (AMDGPUclamp (VOP3PMods v2f16:$src0, i32:$src0_modifiers))),
(V_PK_MAX_F16 $src0_modifiers, $src0,		(V_PK_MAX_F16 $src0_modifiers, $src0,
$src0_modifiers, $src0, DSTCLAMP.ENABLE)		$src0_modifiers, $src0, DSTCLAMP.ENABLE)
>;		>;
}		}
▲ Show 20 Lines • Show All 748 Lines • ▼ Show 20 Lines
def : GCNPat <		def : GCNPat <
(i64 (DivergentBinFrag<xor> i64:$src0, (i64 -1))),		(i64 (DivergentBinFrag<xor> i64:$src0, (i64 -1))),
(REG_SEQUENCE VReg_64,		(REG_SEQUENCE VReg_64,
(V_NOT_B32_e32 (i32 (EXTRACT_SUBREG i64:$src0, sub0))), sub0,		(V_NOT_B32_e32 (i32 (EXTRACT_SUBREG i64:$src0, sub0))), sub0,
(V_NOT_B32_e32 (i32 (EXTRACT_SUBREG i64:$src0, sub1))), sub1		(V_NOT_B32_e32 (i32 (EXTRACT_SUBREG i64:$src0, sub1))), sub1
)		)
>;		>;

		let SubtargetPredicate = NotHasTrue16BitInsts in
def : GCNPat <		def : GCNPat <
(f16 (sint_to_fp i1:$src)),		(f16 (sint_to_fp i1:$src)),
(V_CVT_F16_F32_e32 (		(V_CVT_F16_F32_e32 (
V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),		V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),
/src1mod/(i32 0), /src1/(i32 CONST.FP32_NEG_ONE),		/src1mod/(i32 0), /src1/(i32 CONST.FP32_NEG_ONE),
SSrc_i1:$src))		SSrc_i1:$src))
>;		>;

		let SubtargetPredicate = HasTrue16BitInsts in
		def : GCNPat <
		(f16 (sint_to_fp i1:$src)),
		(V_CVT_F16_F32_t16_e32 (
		V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),
		/src1mod/(i32 0), /src1/(i32 CONST.FP32_NEG_ONE),
		SSrc_i1:$src))
		>;

		let SubtargetPredicate = NotHasTrue16BitInsts in
def : GCNPat <		def : GCNPat <
(f16 (uint_to_fp i1:$src)),		(f16 (uint_to_fp i1:$src)),
(V_CVT_F16_F32_e32 (		(V_CVT_F16_F32_e32 (
V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),		V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),
/src1mod/(i32 0), /src1/(i32 CONST.FP32_ONE),		/src1mod/(i32 0), /src1/(i32 CONST.FP32_ONE),
SSrc_i1:$src))		SSrc_i1:$src))
>;		>;
		let SubtargetPredicate = HasTrue16BitInsts in
		def : GCNPat <
		(f16 (uint_to_fp i1:$src)),
		(V_CVT_F16_F32_t16_e32 (
		V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),
		/src1mod/(i32 0), /src1/(i32 CONST.FP32_ONE),
		SSrc_i1:$src))
		>;

def : GCNPat <		def : GCNPat <
(f32 (sint_to_fp i1:$src)),		(f32 (sint_to_fp i1:$src)),
(V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),		(V_CNDMASK_B32_e64 /src0mod/(i32 0), /src0/(i32 0),
/src1mod/(i32 0), /src1/(i32 CONST.FP32_NEG_ONE),		/src1mod/(i32 0), /src1/(i32 CONST.FP32_NEG_ONE),
SSrc_i1:$src)		SSrc_i1:$src)
>;		>;

▲ Show 20 Lines • Show All 202 Lines • ▼ Show 20 Lines
def : GCNPat<		def : GCNPat<
(i64 (DivergentUnaryFrag<bitreverse> i64:$a)),		(i64 (DivergentUnaryFrag<bitreverse> i64:$a)),
(REG_SEQUENCE VReg_64,		(REG_SEQUENCE VReg_64,
(V_BFREV_B32_e64 (i32 (EXTRACT_SUBREG VReg_64:$a, sub1))), sub0,		(V_BFREV_B32_e64 (i32 (EXTRACT_SUBREG VReg_64:$a, sub1))), sub0,
(V_BFREV_B32_e64 (i32 (EXTRACT_SUBREG VReg_64:$a, sub0))), sub1)>;		(V_BFREV_B32_e64 (i32 (EXTRACT_SUBREG VReg_64:$a, sub0))), sub1)>;

// Prefer selecting to max when legal, but using mul is always valid.		// Prefer selecting to max when legal, but using mul is always valid.
let AddedComplexity = -5 in {		let AddedComplexity = -5 in {

		let OtherPredicates = [NotHasTrue16BitInsts] in {
def : GCNPat<		def : GCNPat<
(fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),		(fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),
(V_MUL_F16_e64 0, (i32 CONST.FP16_ONE), $src_mods, $src)		(V_MUL_F16_e64 0, (i32 CONST.FP16_ONE), $src_mods, $src)
>;		>;

def : GCNPat<		def : GCNPat<
(fcanonicalize (f16 (fneg (VOP3Mods f16:$src, i32:$src_mods)))),		(fcanonicalize (f16 (fneg (VOP3Mods f16:$src, i32:$src_mods)))),
(V_MUL_F16_e64 0, (i32 CONST.FP16_NEG_ONE), $src_mods, $src)		(V_MUL_F16_e64 0, (i32 CONST.FP16_NEG_ONE), $src_mods, $src)
>;		>;
		} // End OtherPredicates

		let OtherPredicates = [HasTrue16BitInsts] in {
		def : GCNPat<
		(fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),
		(V_MUL_F16_t16_e64 0, (i32 CONST.FP16_ONE), $src_mods, $src)
		>;

		def : GCNPat<
		(fcanonicalize (f16 (fneg (VOP3Mods f16:$src, i32:$src_mods)))),
		(V_MUL_F16_t16_e64 0, (i32 CONST.FP16_NEG_ONE), $src_mods, $src)
		>;
		} // End OtherPredicates

def : GCNPat<		def : GCNPat<
(fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),		(fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),
(V_PK_MUL_F16 0, (i32 CONST.FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)		(V_PK_MUL_F16 0, (i32 CONST.FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
>;		>;

def : GCNPat<		def : GCNPat<
(fcanonicalize (f32 (VOP3Mods f32:$src, i32:$src_mods))),		(fcanonicalize (f32 (VOP3Mods f32:$src, i32:$src_mods))),
Show All 26 Lines	def : GCNPat<
(fcanonicalize (f64 (VOP3Mods f64:$src, i32:$src_mods))),		(fcanonicalize (f64 (VOP3Mods f64:$src, i32:$src_mods))),
(V_MAX_F64_e64 $src_mods, $src, $src_mods, $src)> {		(V_MAX_F64_e64 $src_mods, $src, $src_mods, $src)> {
let OtherPredicates = f64_preds;		let OtherPredicates = f64_preds;
}		}

def : GCNPat<		def : GCNPat<
(fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),		(fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),
(V_MAX_F16_e64 $src_mods, $src, $src_mods, $src, 0, 0)> {		(V_MAX_F16_e64 $src_mods, $src, $src_mods, $src, 0, 0)> {
// FIXME: Should have 16-bit inst subtarget predicate		let OtherPredicates = !listconcat(f16_preds, [Has16BitInsts, NotHasTrue16BitInsts]);
let OtherPredicates = f16_preds;		}

		def : GCNPat<
		(fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),
		(V_MAX_F16_t16_e64 $src_mods, $src, $src_mods, $src, 0, 0)> {
		let OtherPredicates = !listconcat(f16_preds, [Has16BitInsts, HasTrue16BitInsts]);
}		}

def : GCNPat<		def : GCNPat<
(fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),		(fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),
(V_PK_MAX_F16 $src_mods, $src, $src_mods, $src, DSTCLAMP.NONE)> {		(V_PK_MAX_F16 $src_mods, $src, $src_mods, $src, DSTCLAMP.NONE)> {
// FIXME: Should have VOP3P subtarget predicate		// FIXME: Should have VOP3P subtarget predicate
let OtherPredicates = f16_preds;		let OtherPredicates = f16_preds;
}		}
Show All 30 Lines	def : GCNPat <
(fma (f32 (VOP3NoMods f32:$src0)),		(fma (f32 (VOP3NoMods f32:$src0)),
(f32 (VOP3NoMods f32:$src1)),		(f32 (VOP3NoMods f32:$src1)),
(f32 (VOP3NoMods f32:$src2))),		(f32 (VOP3NoMods f32:$src2))),
(V_FMAC_F32_e64 SRCMODS.NONE, $src0, SRCMODS.NONE, $src1,		(V_FMAC_F32_e64 SRCMODS.NONE, $src0, SRCMODS.NONE, $src1,
SRCMODS.NONE, $src2)		SRCMODS.NONE, $src2)
>;		>;
} // End OtherPredicates = [HasDLInsts]		} // End OtherPredicates = [HasDLInsts]

let SubtargetPredicate = isGFX10Plus in		let SubtargetPredicate = isGFX10Plus in {
// Don't allow source modifiers. If there are any source modifiers then it's		// Don't allow source modifiers. If there are any source modifiers then it's
// better to select fma instead of fmac.		// better to select fma instead of fmac.
		let OtherPredicates = [NotHasTrue16BitInsts] in
def : GCNPat <		def : GCNPat <
(fma (f16 (VOP3NoMods f32:$src0)),		(fma (f16 (VOP3NoMods f32:$src0)),
(f16 (VOP3NoMods f32:$src1)),		(f16 (VOP3NoMods f32:$src1)),
(f16 (VOP3NoMods f32:$src2))),		(f16 (VOP3NoMods f32:$src2))),
(V_FMAC_F16_e64 SRCMODS.NONE, $src0, SRCMODS.NONE, $src1,		(V_FMAC_F16_e64 SRCMODS.NONE, $src0, SRCMODS.NONE, $src1,
SRCMODS.NONE, $src2)		SRCMODS.NONE, $src2)
>;		>;
		let OtherPredicates = [HasTrue16BitInsts] in
		def : GCNPat <
		(fma (f16 (VOP3NoMods f32:$src0)),
		(f16 (VOP3NoMods f32:$src1)),
		(f16 (VOP3NoMods f32:$src2))),
		(V_FMAC_F16_t16_e64 SRCMODS.NONE, $src0, SRCMODS.NONE, $src1,
		SRCMODS.NONE, $src2)
		>;
		}

let SubtargetPredicate = isGFX90APlus in		let SubtargetPredicate = isGFX90APlus in
// Don't allow source modifiers. If there are any source modifiers then it's		// Don't allow source modifiers. If there are any source modifiers then it's
// better to select fma instead of fmac.		// better to select fma instead of fmac.
def : GCNPat <		def : GCNPat <
(fma (f64 (VOP3NoMods f64:$src0)),		(fma (f64 (VOP3NoMods f64:$src0)),
(f64 (VOP3NoMods f64:$src1)),		(f64 (VOP3NoMods f64:$src1)),
(f64 (VOP3NoMods f64:$src2))),		(f64 (VOP3NoMods f64:$src2))),

llvm/lib/Target/AMDGPU/SIRegisterInfo.td

	// VGPR 32-bit registers			// VGPR 32-bit registers
	// i16/f16 only on VI+			// i16/f16 only on VI+
	def VGPR_32 : SIRegisterClass<"AMDGPU", !listconcat(Reg32Types.types, Reg16Types.types), 32,			def VGPR_32 : SIRegisterClass<"AMDGPU", !listconcat(Reg32Types.types, Reg16Types.types), 32,
	(add (sequence "VGPR%u", 0, 255))> {			(add (sequence "VGPR%u", 0, 255))> {
	let AllocationPriority = 0;			let AllocationPriority = 0;
	let Size = 32;			let Size = 32;
	let Weight = 1;			let Weight = 1;
	}			}

				// Identical to VGPR_32 except it only contains the low 128 (Lo128) registers.
				arsenm: I don't know what "_F128" is supposed to mean. I read this as a class for long double.
				Joe_Nash (author): It is short for First 128. It is a set with only the first 128 VGPRs. I will add a comment noting this.
				arsenm: Lo128 would probably be more consistent terminology than "first".
				rampitec: I probably agree with that. F128 also suggests long double to me. That said, I still hope this class is transitional and will be replaced by a real 16-bit RC.
				Joe_Nash (author): OK, replaced _F128 with _Lo128. Yes, this is a transitional class.
				foad: Just saying: Lo128 is not ideal either, because it will be confused with the _LO16/HI16 classes, which mean something completely different.
				def VGPR_32_Lo128 : SIRegisterClass<"AMDGPU", !listconcat(Reg32Types.types, Reg16Types.types), 32,
				(add (sequence "VGPR%u", 0, 127))> {
				let AllocationPriority = 0;
				let GeneratePressureSet = 0;
				let Size = 32;
				let Weight = 1;
				}
	} // End HasVGPR = 1			} // End HasVGPR = 1

	// VGPR 64-bit registers			// VGPR 64-bit registers
	def VGPR_64 : SIRegisterTuples<getSubRegs<2>.ret, VGPR_32, 255, 1, 2, "v">;			def VGPR_64 : SIRegisterTuples<getSubRegs<2>.ret, VGPR_32, 255, 1, 2, "v">;

	// VGPR 96-bit registers			// VGPR 96-bit registers
	def VGPR_96 : SIRegisterTuples<getSubRegs<3>.ret, VGPR_32, 255, 1, 3, "v">;			def VGPR_96 : SIRegisterTuples<getSubRegs<3>.ret, VGPR_32, 255, 1, 3, "v">;

	▲ Show 20 Lines • Show All 307 Lines • ▼ Show 20 Lines

	def VS_32 : SIRegisterClass<"AMDGPU", [i32, f32, i16, f16, v2i16, v2f16], 32,			def VS_32 : SIRegisterClass<"AMDGPU", [i32, f32, i16, f16, v2i16, v2f16], 32,
	(add VGPR_32, SReg_32, LDS_DIRECT_CLASS)> {			(add VGPR_32, SReg_32, LDS_DIRECT_CLASS)> {
	let isAllocatable = 0;			let isAllocatable = 0;
	let HasVGPR = 1;			let HasVGPR = 1;
	let HasSGPR = 1;			let HasSGPR = 1;
	}			}

				def VS_32_Lo128 : SIRegisterClass<"AMDGPU", [i32, f32, i16, f16, v2i16, v2f16], 32,
				(add VGPR_32_Lo128, SReg_32, LDS_DIRECT_CLASS)> {
				let isAllocatable = 0;
				let HasVGPR = 1;
				let HasSGPR = 1;
				}

	def VS_64 : SIRegisterClass<"AMDGPU", [i64, f64, v2f32], 32, (add VReg_64, SReg_64)> {			def VS_64 : SIRegisterClass<"AMDGPU", [i64, f64, v2f32], 32, (add VReg_64, SReg_64)> {
	let isAllocatable = 0;			let isAllocatable = 0;
	let HasVGPR = 1;			let HasVGPR = 1;
	let HasSGPR = 1;			let HasSGPR = 1;
	}			}

	def AV_32 : SIRegisterClass<"AMDGPU", VGPR_32.RegTypes, 32, (add VGPR_32, AGPR_32)> {			def AV_32 : SIRegisterClass<"AMDGPU", VGPR_32.RegTypes, 32, (add VGPR_32, AGPR_32)> {
	let HasVGPR = 1;			let HasVGPR = 1;
	Show All 30 Lines
	// Register operands			// Register operands
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	class RegImmMatcher<string name> : AsmOperandClass {			class RegImmMatcher<string name> : AsmOperandClass {
	let Name = name;			let Name = name;
	let RenderMethod = "addRegOrImmOperands";			let RenderMethod = "addRegOrImmOperands";
	}			}

				// For VOP1,2,C True16 instructions. Uses first 128 32-bit VGPRs only
				multiclass SIRegOperand16 <string rc, string MatchName, string opType,
				string rc_suffix = "_32"> {
				let OperandNamespace = "AMDGPU" in {
				def _b16_Lo128 : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix#"_Lo128")> {
				let OperandType = opType#"_INT16";
				let ParserMatchClass = RegImmMatcher<MatchName#"B16_Lo128">;
				let DecoderMethod = "decodeOperand_VSrc16";
				}

				def _f16_Lo128 : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix#"_Lo128")> {
				let OperandType = opType#"_FP16";
				let ParserMatchClass = RegImmMatcher<MatchName#"F16_Lo128">;
				let DecoderMethod = "decodeOperand_" # rc # "_16";
				}
				}
				}


	multiclass SIRegOperand32 <string rc, string MatchName, string opType,			multiclass SIRegOperand32 <string rc, string MatchName, string opType,
	string rc_suffix = "_32"> {			string rc_suffix = "_32"> {
	let OperandNamespace = "AMDGPU" in {			let OperandNamespace = "AMDGPU" in {
	def _b16 : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix)> {			def _b16 : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix)> {
	let OperandType = opType#"_INT16";			let OperandType = opType#"_INT16";
	let ParserMatchClass = RegImmMatcher<MatchName#"B16">;			let ParserMatchClass = RegImmMatcher<MatchName#"B16">;
	let DecoderMethod = "decodeOperand_VSrc16";			let DecoderMethod = "decodeOperand_VSrc16";
	}			}
	▲ Show 20 Lines • Show All 103 Lines • ▼ Show 20 Lines

	defm SCSrc : RegInlineOperand<"SReg", "SCSrc"> ;			defm SCSrc : RegInlineOperand<"SReg", "SCSrc"> ;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VSrc_* Operands with an SGPR, VGPR or a 32-bit immediate			// VSrc_* Operands with an SGPR, VGPR or a 32-bit immediate
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	defm VSrc : RegImmOperand<"VS", "VSrc">;			defm VSrc : RegImmOperand<"VS", "VSrc">;
				defm VSrcT : SIRegOperand16<"VS", "VSrcT", "OPERAND_REG_IMM">;

	def VSrc_128 : RegisterOperand<VReg_128> {			def VSrc_128 : RegisterOperand<VReg_128> {
	let DecoderMethod = "DecodeVS_128RegisterClass";			let DecoderMethod = "DecodeVS_128RegisterClass";
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VSrc_*_Deferred Operands with an SGPR, VGPR or a 32-bit immediate for use			// VSrc_*_Deferred Operands with an SGPR, VGPR or a 32-bit immediate for use
	// with FMAMK/FMAAK			// with FMAMK/FMAAK
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

				multiclass SIRegOperand16_Deferred <string rc, string MatchName, string opType,
				string rc_suffix = "_32"> {
				let OperandNamespace = "AMDGPU" in {
				def _f16_Lo128_Deferred : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix#"_Lo128")> {
				let OperandType = opType#"_FP16_DEFERRED";
				let ParserMatchClass = RegImmMatcher<MatchName#"F16_Lo128">;
				let DecoderMethod = "decodeOperand_" # rc # "_16_Deferred";
				}
				}
				}

	multiclass SIRegOperand32_Deferred <string rc, string MatchName, string opType,			multiclass SIRegOperand32_Deferred <string rc, string MatchName, string opType,
	string rc_suffix = "_32"> {			string rc_suffix = "_32"> {
	let OperandNamespace = "AMDGPU" in {			let OperandNamespace = "AMDGPU" in {
	def _f16_Deferred : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix)> {			def _f16_Deferred : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix)> {
	let OperandType = opType#"_FP16_DEFERRED";			let OperandType = opType#"_FP16_DEFERRED";
	let ParserMatchClass = RegImmMatcher<MatchName#"F16">;			let ParserMatchClass = RegImmMatcher<MatchName#"F16">;
	let DecoderMethod = "decodeOperand_" # rc # "_16_Deferred";			let DecoderMethod = "decodeOperand_" # rc # "_16_Deferred";
	}			}

	def _f32_Deferred : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix)> {			def _f32_Deferred : RegisterOperand<!cast<RegisterClass>(rc#rc_suffix)> {
	let OperandType = opType#"_FP32_DEFERRED";			let OperandType = opType#"_FP32_DEFERRED";
	let ParserMatchClass = RegImmMatcher<MatchName#"F32">;			let ParserMatchClass = RegImmMatcher<MatchName#"F32">;
	let DecoderMethod = "decodeOperand_" # rc # "_32_Deferred";			let DecoderMethod = "decodeOperand_" # rc # "_32_Deferred";
	}			}
	}			}
	}			}

	defm VSrc : SIRegOperand32_Deferred<"VS", "VSrc", "OPERAND_REG_IMM">;			defm VSrc : SIRegOperand32_Deferred<"VS", "VSrc", "OPERAND_REG_IMM">;
				defm VSrcT : SIRegOperand16_Deferred<"VS", "VSrcT", "OPERAND_REG_IMM">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VRegSrc_* Operands with a VGPR			// VRegSrc_* Operands with a VGPR
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// This is for operands with the enum(9), VSrc encoding restriction,			// This is for operands with the enum(9), VSrc encoding restriction,
	// but only allows VGPRs.			// but only allows VGPRs.
	def VRegSrc_32 : RegisterOperand<VGPR_32> {			def VRegSrc_32 : RegisterOperand<VGPR_32> {
	Show All 16 Lines
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VGPRSrc_*			// VGPRSrc_*
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// An 8-bit RegisterOperand wrapper for a VGPR			// An 8-bit RegisterOperand wrapper for a VGPR
	def VGPRSrc_32 : RegisterOperand<VGPR_32> {			def VGPRSrc_32 : RegisterOperand<VGPR_32> {
	let DecoderMethod = "DecodeVGPR_32RegisterClass";			let DecoderMethod = "DecodeVGPR_32RegisterClass";
	}			}
				def VGPRSrc_32_Lo128 : RegisterOperand<VGPR_32_Lo128> {
				let DecoderMethod = "DecodeVGPR_32RegisterClass";
				}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// ASrc_* Operands with an AccVGPR			// ASrc_* Operands with an AccVGPR
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def ARegSrc_32 : RegisterOperand<AGPR_32> {			def ARegSrc_32 : RegisterOperand<AGPR_32> {
	let DecoderMethod = "DecodeAGPR_32RegisterClass";			let DecoderMethod = "DecodeAGPR_32RegisterClass";
	let EncoderMethod = "getAVOperandEncoding";			let EncoderMethod = "getAVOperandEncoding";
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VCSrc_* Operands with an SGPR, VGPR or an inline constant			// VCSrc_* Operands with an SGPR, VGPR or an inline constant
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	defm VCSrc : RegInlineOperand<"VS", "VCSrc">;			defm VCSrc : RegInlineOperand<"VS", "VCSrc">;
				defm VCSrcT : SIRegOperand16<"VS", "VCSrcT", "OPERAND_REG_INLINE_C">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VISrc_* Operands with a VGPR or an inline constant			// VISrc_* Operands with a VGPR or an inline constant
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	defm VISrc : RegInlineOperand32<"VGPR", "VISrc">;			defm VISrc : RegInlineOperand32<"VGPR", "VISrc">;
	let DecoderMethod = "decodeOperand_VReg_64" in			let DecoderMethod = "decodeOperand_VReg_64" in
	defm VISrc_64 : RegInlineOperand64<"VReg", "VISrc_64", "_64">;			defm VISrc_64 : RegInlineOperand64<"VReg", "VISrc_64", "_64">;

llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp

//===-- SIShrinkInstructions.cpp - Shrink Instructions --------------------===//		//===-- SIShrinkInstructions.cpp - Shrink Instructions --------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
/// The pass tries to use the 32-bit encoding for instructions when possible.		/// The pass tries to use the 32-bit encoding for instructions when possible.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//

#include "AMDGPU.h"		#include "AMDGPU.h"
#include "GCNSubtarget.h"		#include "GCNSubtarget.h"
#include "MCTargetDesc/AMDGPUMCTargetDesc.h"		#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
		#include "Utils/AMDGPUBaseInfo.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"

#define DEBUG_TYPE "si-shrink-instructions"		#define DEBUG_TYPE "si-shrink-instructions"

STATISTIC(NumInstructionsShrunk,		STATISTIC(NumInstructionsShrunk,
"Number of 64-bit instruction reduced to 32-bit.");		"Number of 64-bit instruction reduced to 32-bit.");
STATISTIC(NumLiteralConstantsFolded,		STATISTIC(NumLiteralConstantsFolded,
Show All 13 Lines
public:		public:
static char ID;		static char ID;

public:		public:
SIShrinkInstructions() : MachineFunctionPass(ID) {		SIShrinkInstructions() : MachineFunctionPass(ID) {
}		}

bool foldImmediates(MachineInstr &MI, bool TryToCommute = true) const;		bool foldImmediates(MachineInstr &MI, bool TryToCommute = true) const;
		bool shouldShrinkTrue16(MachineInstr &MI) const;
bool isKImmOperand(const MachineOperand &Src) const;		bool isKImmOperand(const MachineOperand &Src) const;
bool isKUImmOperand(const MachineOperand &Src) const;		bool isKUImmOperand(const MachineOperand &Src) const;
bool isKImmOrKUImmOperand(const MachineOperand &Src, bool &IsUnsigned) const;		bool isKImmOrKUImmOperand(const MachineOperand &Src, bool &IsUnsigned) const;
bool isReverseInlineImm(const MachineOperand &Src, int32_t &ReverseImm) const;		bool isReverseInlineImm(const MachineOperand &Src, int32_t &ReverseImm) const;
void copyExtraImplicitOps(MachineInstr &NewMI, MachineInstr &MI) const;		void copyExtraImplicitOps(MachineInstr &NewMI, MachineInstr &MI) const;
void shrinkScalarCompare(MachineInstr &MI) const;		void shrinkScalarCompare(MachineInstr &MI) const;
void shrinkMIMG(MachineInstr &MI) const;		void shrinkMIMG(MachineInstr &MI) const;
void shrinkMadFma(MachineInstr &MI) const;		void shrinkMadFma(MachineInstr &MI) const;
▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	if (TII->commuteInstruction(MI)) {
// Commute back.		// Commute back.
TII->commuteInstruction(MI);		TII->commuteInstruction(MI);
}		}
}		}

return false;		return false;
}		}

		/// Do not shrink the instruction if its registers are not expressible in the
		foad: I would prefer to change SIShrinkInstructions in a more general way that does not specifically check for True16 instructions: D133769
		Joe_Nash (author): That seems fine to me. I will rebase on that when it has landed.
		/// shrunk encoding.
		bool SIShrinkInstructions::shouldShrinkTrue16(MachineInstr &MI) const {
		for (unsigned I = 0, E = MI.getNumExplicitOperands(); I != E; ++I) {
		const MachineOperand &MO = MI.getOperand(I);
		if (MO.isReg()) {
		Register Reg = MO.getReg();
		assert(!Reg.isVirtual() && "Prior checks should ensure we only shrink "
		"True16 Instructions post-RA");
		if (AMDGPU::VGPR_32RegClass.contains(Reg) &&
		!AMDGPU::VGPR_32_Lo128RegClass.contains(Reg))
		rampitec: No else after return.
		return false;
		}
		}
		return true;
		}
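
		Editorial note: the check in shouldShrinkTrue16 can be illustrated without LLVM. The following is a minimal sketch under stated assumptions, not the patch's implementation: `PhysOperand` and `canShrinkTrue16` are hypothetical stand-ins for MachineOperand and the member function above, and register-class membership is reduced to comparing a VGPR index against 128, matching the v0..v127 range of VGPR_32_Lo128 in this diff.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a physical register operand; not an LLVM type.
struct PhysOperand {
  bool IsVGPR;    // true for a 32-bit VGPR, false for any other register
  unsigned Index; // register number, e.g. 0..255 for VGPRs
};

// VGPR_32_Lo128 in this diff covers VGPR0..VGPR127.
constexpr unsigned Lo128Count = 128;

// Returns true only if every VGPR operand lies in the low 128 registers,
// i.e. the instruction is expressible in the shrunk True16 encoding.
bool canShrinkTrue16(const std::vector<PhysOperand> &Ops) {
  for (const PhysOperand &Op : Ops) {
    assert(Op.Index < 256 && "register index out of range");
    if (Op.IsVGPR && Op.Index >= Lo128Count)
      return false;
  }
  return true;
}
```

		In this model, an instruction using v5 and v127 may be shrunk, while one using v128 may not; non-VGPR operands never block shrinking.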

bool SIShrinkInstructions::isKImmOperand(const MachineOperand &Src) const {		bool SIShrinkInstructions::isKImmOperand(const MachineOperand &Src) const {
return isInt<16>(Src.getImm()) &&		return isInt<16>(Src.getImm()) &&
!TII->isInlineConstant(*Src.getParent(),		!TII->isInlineConstant(*Src.getParent(),
Src.getParent()->getOperandNo(&Src));		Src.getParent()->getOperandNo(&Src));
}		}

bool SIShrinkInstructions::isKUImmOperand(const MachineOperand &Src) const {		bool SIShrinkInstructions::isKUImmOperand(const MachineOperand &Src) const {
return isUInt<16>(Src.getImm()) &&		return isUInt<16>(Src.getImm()) &&
▲ Show 20 Lines • Show All 235 Lines • ▼ Show 20 Lines	if (Src2.isImm() && !TII->isInlineConstant(Src2)) {
case AMDGPU::V_FMA_F32_e64:		case AMDGPU::V_FMA_F32_e64:
NewOpcode = AMDGPU::V_FMAAK_F32;		NewOpcode = AMDGPU::V_FMAAK_F32;
break;		break;
case AMDGPU::V_MAD_F16_e64:		case AMDGPU::V_MAD_F16_e64:
NewOpcode = AMDGPU::V_MADAK_F16;		NewOpcode = AMDGPU::V_MADAK_F16;
break;		break;
case AMDGPU::V_FMA_F16_e64:		case AMDGPU::V_FMA_F16_e64:
case AMDGPU::V_FMA_F16_gfx9_e64:		case AMDGPU::V_FMA_F16_gfx9_e64:
NewOpcode = AMDGPU::V_FMAAK_F16;		NewOpcode = ST->hasTrue16BitInsts() ? AMDGPU::V_FMAAK_F16_t16
		: AMDGPU::V_FMAAK_F16;
break;		break;
}		}
}		}

// Detect "Dst = VSrc * Imm + VGPR" and convert to MK form.		// Detect "Dst = VSrc * Imm + VGPR" and convert to MK form.
if (Src2.isReg() && TRI->isVGPR(*MRI, Src2.getReg())) {		if (Src2.isReg() && TRI->isVGPR(*MRI, Src2.getReg())) {
if (Src1.isImm() && !TII->isInlineConstant(Src1))		if (Src1.isImm() && !TII->isInlineConstant(Src1))
Swap = false;		Swap = false;
Show All 11 Lines	if (Src2.isReg() && TRI->isVGPR(*MRI, Src2.getReg())) {
case AMDGPU::V_FMA_F32_e64:		case AMDGPU::V_FMA_F32_e64:
NewOpcode = AMDGPU::V_FMAMK_F32;		NewOpcode = AMDGPU::V_FMAMK_F32;
break;		break;
case AMDGPU::V_MAD_F16_e64:		case AMDGPU::V_MAD_F16_e64:
NewOpcode = AMDGPU::V_MADMK_F16;		NewOpcode = AMDGPU::V_MADMK_F16;
break;		break;
case AMDGPU::V_FMA_F16_e64:		case AMDGPU::V_FMA_F16_e64:
case AMDGPU::V_FMA_F16_gfx9_e64:		case AMDGPU::V_FMA_F16_gfx9_e64:
NewOpcode = AMDGPU::V_FMAMK_F16;		NewOpcode = ST->hasTrue16BitInsts() ? AMDGPU::V_FMAMK_F16_t16
		: AMDGPU::V_FMAMK_F16;
break;		break;
}		}
}		}

if (NewOpcode == AMDGPU::INSTRUCTION_LIST_END)		if (NewOpcode == AMDGPU::INSTRUCTION_LIST_END)
return;		return;

		if (AMDGPU::isTrue16Inst(NewOpcode) && !shouldShrinkTrue16(MI))
		return;

if (Swap) {		if (Swap) {
// Swap Src0 and Src1 by building a new instruction.		// Swap Src0 and Src1 by building a new instruction.
BuildMI(*MI.getParent(), MI, MI.getDebugLoc(), TII->get(NewOpcode),		BuildMI(*MI.getParent(), MI, MI.getDebugLoc(), TII->get(NewOpcode),
MI.getOperand(0).getReg())		MI.getOperand(0).getReg())
.add(Src1)		.add(Src1)
.add(Src0)		.add(Src0)
.add(Src2)		.add(Src2)
.setMIFlags(MI.getFlags());		.setMIFlags(MI.getFlags());
▲ Show 20 Lines • Show All 521 Lines • ▼ Show 20 Lines	for (I = MBB.begin(); I != MBB.end(); I = Next) {
// fold an immediate into the shrunk instruction as a literal operand. In		// fold an immediate into the shrunk instruction as a literal operand. In
// GFX10 VOP3 instructions can take a literal operand anyway, so there is		// GFX10 VOP3 instructions can take a literal operand anyway, so there is
// no advantage to doing this.		// no advantage to doing this.
if (ST->hasVOP3Literal() &&		if (ST->hasVOP3Literal() &&
!MF.getProperties().hasProperty(		!MF.getProperties().hasProperty(
MachineFunctionProperties::Property::NoVRegs))		MachineFunctionProperties::Property::NoVRegs))
continue;		continue;

		if (ST->hasTrue16BitInsts() && AMDGPU::isTrue16Inst(MI.getOpcode()) &&
		!shouldShrinkTrue16(MI))
		continue;

// We can shrink this instruction		// We can shrink this instruction
LLVM_DEBUG(dbgs() << "Shrinking " << MI);		LLVM_DEBUG(dbgs() << "Shrinking " << MI);

MachineInstr *Inst32 = TII->buildShrunkInst(MI, Op32);		MachineInstr *Inst32 = TII->buildShrunkInst(MI, Op32);
++NumInstructionsShrunk;		++NumInstructionsShrunk;

// Copy extra operands not present in the instruction definition.		// Copy extra operands not present in the instruction definition.
copyExtraImplicitOps(*Inst32, MI);		copyExtraImplicitOps(*Inst32, MI);
Show All 13 Lines

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h

	Show First 20 Lines • Show All 494 Lines • ▼ Show 20 Lines

	LLVM_READONLY			LLVM_READONLY
	int getVOPDFull(unsigned OpX, unsigned OpY);			int getVOPDFull(unsigned OpX, unsigned OpY);

	LLVM_READONLY			LLVM_READONLY
	bool isVOPD(unsigned Opc);			bool isVOPD(unsigned Opc);

	LLVM_READONLY			LLVM_READONLY
				bool isTrue16Inst(unsigned Opc);

				LLVM_READONLY
	unsigned mapWMMA2AddrTo3AddrOpcode(unsigned Opc);			unsigned mapWMMA2AddrTo3AddrOpcode(unsigned Opc);

	LLVM_READONLY			LLVM_READONLY
	unsigned mapWMMA3AddrTo2AddrOpcode(unsigned Opc);			unsigned mapWMMA3AddrTo2AddrOpcode(unsigned Opc);

	void initDefaultAMDKernelCodeT(amd_kernel_code_t &Header,			void initDefaultAMDKernelCodeT(amd_kernel_code_t &Header,
	const MCSubtargetInfo *STI);			const MCSubtargetInfo *STI);

	▲ Show 20 Lines • Show All 611 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp

Show All 27 Lines
#define GET_INSTRMAP_INFO		#define GET_INSTRMAP_INFO
#include "AMDGPUGenInstrInfo.inc"		#include "AMDGPUGenInstrInfo.inc"

static llvm::cl::opt<unsigned>		static llvm::cl::opt<unsigned>
AmdhsaCodeObjectVersion("amdhsa-code-object-version", llvm::cl::Hidden,		AmdhsaCodeObjectVersion("amdhsa-code-object-version", llvm::cl::Hidden,
llvm::cl::desc("AMDHSA Code Object Version"),		llvm::cl::desc("AMDHSA Code Object Version"),
llvm::cl::init(4));		llvm::cl::init(4));

// TODO-GFX11: Remove this when full 16-bit codegen is implemented.
static llvm::cl::opt<bool>
LimitTo128VGPRs("amdgpu-limit-to-128-vgprs", llvm::cl::Hidden,
llvm::cl::desc("Never use more than 128 VGPRs"));

namespace {		namespace {

/// \returns Bit mask for given bit \p Shift and bit \p Width.		/// \returns Bit mask for given bit \p Shift and bit \p Width.
unsigned getBitMask(unsigned Shift, unsigned Width) {		unsigned getBitMask(unsigned Shift, unsigned Width) {
return ((1 << Width) - 1) << Shift;		return ((1 << Width) - 1) << Shift;
}		}

/// Packs \p Src into \p Dst for given bit \p Shift and bit \p Width.		/// Packs \p Src into \p Dst for given bit \p Shift and bit \p Width.
▲ Show 20 Lines • Show All 235 Lines • ▼ Show 20 Lines
};		};

struct VOPDInfo {		struct VOPDInfo {
uint16_t Opcode;		uint16_t Opcode;
uint16_t OpX;		uint16_t OpX;
uint16_t OpY;		uint16_t OpY;
};		};

		struct VOPTrue16Info {
		uint16_t Opcode;
		bool IsTrue16;
		};

#define GET_MTBUFInfoTable_DECL		#define GET_MTBUFInfoTable_DECL
#define GET_MTBUFInfoTable_IMPL		#define GET_MTBUFInfoTable_IMPL
#define GET_MUBUFInfoTable_DECL		#define GET_MUBUFInfoTable_DECL
#define GET_MUBUFInfoTable_IMPL		#define GET_MUBUFInfoTable_IMPL
#define GET_SMInfoTable_DECL		#define GET_SMInfoTable_DECL
#define GET_SMInfoTable_IMPL		#define GET_SMInfoTable_IMPL
#define GET_VOP1InfoTable_DECL		#define GET_VOP1InfoTable_DECL
#define GET_VOP1InfoTable_IMPL		#define GET_VOP1InfoTable_IMPL
#define GET_VOP2InfoTable_DECL		#define GET_VOP2InfoTable_DECL
#define GET_VOP2InfoTable_IMPL		#define GET_VOP2InfoTable_IMPL
#define GET_VOP3InfoTable_DECL		#define GET_VOP3InfoTable_DECL
#define GET_VOP3InfoTable_IMPL		#define GET_VOP3InfoTable_IMPL
#define GET_VOPC64DPPTable_DECL		#define GET_VOPC64DPPTable_DECL
#define GET_VOPC64DPPTable_IMPL		#define GET_VOPC64DPPTable_IMPL
#define GET_VOPC64DPP8Table_DECL		#define GET_VOPC64DPP8Table_DECL
#define GET_VOPC64DPP8Table_IMPL		#define GET_VOPC64DPP8Table_IMPL
#define GET_VOPDComponentTable_DECL		#define GET_VOPDComponentTable_DECL
#define GET_VOPDComponentTable_IMPL		#define GET_VOPDComponentTable_IMPL
#define GET_VOPDPairs_DECL		#define GET_VOPDPairs_DECL
#define GET_VOPDPairs_IMPL		#define GET_VOPDPairs_IMPL
		#define GET_VOPTrue16Table_DECL
		#define GET_VOPTrue16Table_IMPL
#define GET_WMMAOpcode2AddrMappingTable_DECL		#define GET_WMMAOpcode2AddrMappingTable_DECL
#define GET_WMMAOpcode2AddrMappingTable_IMPL		#define GET_WMMAOpcode2AddrMappingTable_IMPL
#define GET_WMMAOpcode3AddrMappingTable_DECL		#define GET_WMMAOpcode3AddrMappingTable_DECL
#define GET_WMMAOpcode3AddrMappingTable_IMPL		#define GET_WMMAOpcode3AddrMappingTable_IMPL
#include "AMDGPUGenSearchableTables.inc"		#include "AMDGPUGenSearchableTables.inc"

int getMTBUFBaseOpcode(unsigned Opc) {		int getMTBUFBaseOpcode(unsigned Opc) {
const MTBUFInfo *Info = getMTBUFInfoFromOpcode(Opc);		const MTBUFInfo *Info = getMTBUFInfoFromOpcode(Opc);
▲ Show 20 Lines • Show All 104 Lines • ▼ Show 20 Lines

unsigned getVOPDOpcode(unsigned Opc) {		unsigned getVOPDOpcode(unsigned Opc) {
const VOPDComponentInfo *Info = getVOPDComponentHelper(Opc);		const VOPDComponentInfo *Info = getVOPDComponentHelper(Opc);
return Info ? Info->VOPDOp : ~0u;		return Info ? Info->VOPDOp : ~0u;
}		}

bool isVOPD(unsigned Opc) { return getVOPDOpcodeHelper(Opc); }		bool isVOPD(unsigned Opc) { return getVOPDOpcodeHelper(Opc); }

		bool isTrue16Inst(unsigned Opc) {
		const VOPTrue16Info *Info = getTrue16OpcodeHelper(Opc);
		return Info ? Info->IsTrue16 : false;
		}
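
isTrue16Inst relies on a lookup helper generated from the VOPTrue16Table definition. The sketch below shows one way such a lookup can behave, using a binary search over a table sorted by opcode. The table contents and the names `True16Entry`, `Table`, `lookupTrue16` and `isTrue16` are invented for illustration; the real helper is emitted by TableGen and may be implemented differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Same layout as VOPTrue16Info in the diff.
struct True16Entry {
  uint16_t Opcode;
  bool IsTrue16;
};

// Hypothetical table, sorted by Opcode. Real opcode values come from TableGen.
static const True16Entry Table[] = {{10, false}, {11, true}, {42, true}};

// Returns the entry for Opc, or nullptr if the opcode is not in the table.
const True16Entry *lookupTrue16(unsigned Opc) {
  auto It = std::lower_bound(
      std::begin(Table), std::end(Table), Opc,
      [](const True16Entry &E, unsigned Key) {
        return static_cast<unsigned>(E.Opcode) < Key;
      });
  if (It == std::end(Table) || It->Opcode != Opc)
    return nullptr;
  return It;
}

// Mirrors isTrue16Inst: opcodes missing from the table are not True16.
bool isTrue16(unsigned Opc) {
  const True16Entry *Info = lookupTrue16(Opc);
  return Info ? Info->IsTrue16 : false;
}
```

Returning false for a missing entry means callers such as the shrink pass can query any opcode without first checking whether it is a VOP instruction.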

unsigned mapWMMA2AddrTo3AddrOpcode(unsigned Opc) {		unsigned mapWMMA2AddrTo3AddrOpcode(unsigned Opc) {
const WMMAOpcodeMappingInfo *Info = getWMMAMappingInfoFrom2AddrOpcode(Opc);		const WMMAOpcodeMappingInfo *Info = getWMMAMappingInfoFrom2AddrOpcode(Opc);
return Info ? Info->Opcode3Addr : ~0u;		return Info ? Info->Opcode3Addr : ~0u;
}		}

unsigned mapWMMA3AddrTo2AddrOpcode(unsigned Opc) {		unsigned mapWMMA3AddrTo2AddrOpcode(unsigned Opc) {
const WMMAOpcodeMappingInfo *Info = getWMMAMappingInfoFrom3AddrOpcode(Opc);		const WMMAOpcodeMappingInfo *Info = getWMMAMappingInfoFrom3AddrOpcode(Opc);
return Info ? Info->Opcode2Addr : ~0u;		return Info ? Info->Opcode2Addr : ~0u;
▲ Show 20 Lines • Show All 417 Lines • ▼ Show 20 Lines	unsigned getTotalNumVGPRs(const MCSubtargetInfo *STI) {
if (STI->getFeatureBits().test(FeatureGFX90AInsts))		if (STI->getFeatureBits().test(FeatureGFX90AInsts))
return 512;		return 512;
if (!isGFX10Plus(*STI))		if (!isGFX10Plus(*STI))
return 256;		return 256;
return STI->getFeatureBits().test(FeatureWavefrontSize32) ? 1024 : 512;		return STI->getFeatureBits().test(FeatureWavefrontSize32) ? 1024 : 512;
}		}

unsigned getAddressableNumVGPRs(const MCSubtargetInfo *STI) {		unsigned getAddressableNumVGPRs(const MCSubtargetInfo *STI) {
if (LimitTo128VGPRs.getNumOccurrences() ? LimitTo128VGPRs
: isGFX11Plus(*STI)) {
// GFX11 changes the encoding of 16-bit operands in VOP1/2/C instructions
// such that values 128..255 no longer mean v128..v255, they mean
// v0.hi..v127.hi instead. Until the compiler understands this, it is not
// safe to use v128..v255.
// TODO-GFX11: Remove this when full 16-bit codegen is implemented.
return 128;
}
if (STI->getFeatureBits().test(FeatureGFX90AInsts))		if (STI->getFeatureBits().test(FeatureGFX90AInsts))
return 512;		return 512;
return 256;		return 256;
}		}
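
The comment in the removed lines of getAddressableNumVGPRs describes how GFX11 reinterprets 16-bit operand values in VOP1/2/C encodings. The sketch below illustrates that mapping only. The function names are hypothetical, the input is assumed to be a VGPR number in 0..255 rather than a raw instruction field, and the `.lo` suffix for values 0..127 is an assumption; the comment itself only states the `.hi` half.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// GFX11 reading for 16-bit operands: 128..255 select v0.hi..v127.hi.
// Treating 0..127 as the low halves of v0..v127 is an assumption.
std::string decode16BitVGPR(uint8_t Enc) {
  unsigned Reg = Enc & 0x7F;       // low 7 bits pick the 32-bit register
  bool IsHigh = (Enc & 0x80) != 0; // bit 7 picks the high half
  return "v" + std::to_string(Reg) + (IsHigh ? ".hi" : ".lo");
}

// Earlier reading: the value names a full 32-bit register v0..v255.
std::string decode32BitVGPR(uint8_t Enc) {
  return "v" + std::to_string(Enc);
}
```

The same value 200 therefore names v200 under the older reading and v72.hi under the GFX11 reading, which is why operands of these instructions are restricted to the low 128 VGPRs.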

unsigned getMinNumVGPRs(const MCSubtargetInfo *STI, unsigned WavesPerEU) {		unsigned getMinNumVGPRs(const MCSubtargetInfo *STI, unsigned WavesPerEU) {
assert(WavesPerEU != 0);		assert(WavesPerEU != 0);

▲ Show 20 Lines • Show All 1,553 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/VOP1Instructions.td

Show First 20 Lines • Show All 136 Lines • ▼ Show 20 Lines	multiclass VOP1Inst <string opName, VOPProfile P,

def : MnemonicAlias<opName#"_e32", opName>, LetDummies;		def : MnemonicAlias<opName#"_e32", opName>, LetDummies;
def : MnemonicAlias<opName#"_e64", opName>, LetDummies;		def : MnemonicAlias<opName#"_e64", opName>, LetDummies;

foreach _ = BoolToList<P.HasExtSDWA>.ret in		foreach _ = BoolToList<P.HasExtSDWA>.ret in
def : MnemonicAlias<opName#"_sdwa", opName>, LetDummies;		def : MnemonicAlias<opName#"_sdwa", opName>, LetDummies;

foreach _ = BoolToList<P.HasExtDPP>.ret in		foreach _ = BoolToList<P.HasExtDPP>.ret in
def : MnemonicAlias<opName#"_dpp", opName>, LetDummies;		def : MnemonicAlias<opName#"_dpp", opName, AMDGPUAsmVariants.DPP>, LetDummies;
		}

		multiclass VOP1Inst_t16<string opName,
		VOPProfile P,
		SDPatternOperator node = null_frag> {
		let OtherPredicates = [NotHasTrue16BitInsts, Has16BitInsts] in {
		defm NAME : VOP1Inst<opName, P, node>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm _t16 : VOP1Inst<opName#"_t16", VOPProfile_True16<P>, node>;
		}
}		}

// Special profile for instructions which have clamp		// Special profile for instructions which have clamp
// and output modifiers (but have no input modifiers)		// and output modifiers (but have no input modifiers)
class VOPProfileI2F<ValueType dstVt, ValueType srcVt> :		class VOPProfileI2F<ValueType dstVt, ValueType srcVt> :
VOPProfile<[dstVt, srcVt, untyped, untyped]> {		VOPProfile<[dstVt, srcVt, untyped, untyped]> {

let Ins64 = (ins Src0RC64:$src0, clampmod:$clamp, omod:$omod);		let Ins64 = (ins Src0RC64:$src0, clampmod:$clamp, omod:$omod);
let InsVOP3Base = (ins Src0DPP:$src0, clampmod:$clamp, omod:$omod);		let InsVOP3Base = (ins Src0VOP3DPP:$src0, clampmod:$clamp, omod:$omod);
		let Asm64 = "$vdst, $src0$clamp$omod";
		let AsmVOP3DPPBase = Asm64;

		let HasModifiers = 0;
		let HasClamp = 1;
		}

		class VOPProfileI2F_True16<ValueType dstVt, ValueType srcVt> :
		VOPProfile_True16<VOPProfile<[dstVt, srcVt, untyped, untyped]>> {

		let Ins64 = (ins Src0RC64:$src0, clampmod:$clamp, omod:$omod);
		let InsVOP3Base = (ins Src0VOP3DPP:$src0, clampmod:$clamp, omod:$omod);
let Asm64 = "$vdst, $src0$clamp$omod";		let Asm64 = "$vdst, $src0$clamp$omod";
let AsmVOP3DPPBase = Asm64;		let AsmVOP3DPPBase = Asm64;

let HasModifiers = 0;		let HasModifiers = 0;
let HasClamp = 1;		let HasClamp = 1;
}		}

def VOP1_F64_I32 : VOPProfileI2F <f64, i32>;		def VOP1_F64_I32 : VOPProfileI2F <f64, i32>;
def VOP1_F32_I32 : VOPProfileI2F <f32, i32>;		def VOP1_F32_I32 : VOPProfileI2F <f32, i32>;
def VOP1_F16_I16 : VOPProfileI2F <f16, i16>;		def VOP1_F16_I16 : VOPProfileI2F <f16, i16>;
		def VOP1_F16_I16_t16 : VOPProfileI2F_True16 <f16, i16>;

def VOP_NOP_PROFILE : VOPProfile <[untyped, untyped, untyped, untyped]>{		def VOP_NOP_PROFILE : VOPProfile <[untyped, untyped, untyped, untyped]>{
let HasExtVOP3DPP = 0;		let HasExtVOP3DPP = 0;
}		}

// OMod clears exceptions when set. OMod was always an operand, but its		// OMod clears exceptions when set. OMod was always an operand, but its
// now explicitly set.		// now explicitly set.
class VOP_SPECIAL_OMOD_PROF<ValueType dstVt, ValueType srcVt> :		class VOP_SPECIAL_OMOD_PROF<ValueType dstVt, ValueType srcVt> :
VOPProfile<[dstVt, srcVt, untyped, untyped]> {		VOPProfile<[dstVt, srcVt, untyped, untyped]> {

let HasOMod = 1;		let HasOMod = 1;
}		}
def VOP_I32_F32_SPECIAL_OMOD : VOP_SPECIAL_OMOD_PROF<i32, f32>;		def VOP_I32_F32_SPECIAL_OMOD : VOP_SPECIAL_OMOD_PROF<i32, f32>;
def VOP_I32_F64_SPECIAL_OMOD : VOP_SPECIAL_OMOD_PROF<i32, f64>;		def VOP_I32_F64_SPECIAL_OMOD : VOP_SPECIAL_OMOD_PROF<i32, f64>;
def VOP_I16_F16_SPECIAL_OMOD : VOP_SPECIAL_OMOD_PROF<i16, f16>;		def VOP_I16_F16_SPECIAL_OMOD : VOP_SPECIAL_OMOD_PROF<i16, f16>;
		def VOP_I16_F16_SPECIAL_OMOD_t16 : VOPProfile_True16<VOP_I16_F16> {
		let HasOMod = 1;
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// VOP1 Instructions		// VOP1 Instructions
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

let VOPAsmPrefer32Bit = 1 in {		let VOPAsmPrefer32Bit = 1 in {
defm V_NOP : VOP1Inst <"v_nop", VOP_NOP_PROFILE>;		defm V_NOP : VOP1Inst <"v_nop", VOP_NOP_PROFILE>;
}		}
▲ Show 20 Lines • Show All 71 Lines • ▼ Show 20 Lines
defm V_CVT_F32_I32 : VOP1Inst <"v_cvt_f32_i32", VOP1_F32_I32, sint_to_fp>;		defm V_CVT_F32_I32 : VOP1Inst <"v_cvt_f32_i32", VOP1_F32_I32, sint_to_fp>;
defm V_CVT_F32_U32 : VOP1Inst <"v_cvt_f32_u32", VOP1_F32_I32, uint_to_fp>;		defm V_CVT_F32_U32 : VOP1Inst <"v_cvt_f32_u32", VOP1_F32_I32, uint_to_fp>;
}		}

// OMod clears exceptions when set in these 2 instructions		// OMod clears exceptions when set in these 2 instructions
defm V_CVT_U32_F32 : VOP1Inst <"v_cvt_u32_f32", VOP_I32_F32_SPECIAL_OMOD, fp_to_uint>;		defm V_CVT_U32_F32 : VOP1Inst <"v_cvt_u32_f32", VOP_I32_F32_SPECIAL_OMOD, fp_to_uint>;
defm V_CVT_I32_F32 : VOP1Inst <"v_cvt_i32_f32", VOP_I32_F32_SPECIAL_OMOD, fp_to_sint>;		defm V_CVT_I32_F32 : VOP1Inst <"v_cvt_i32_f32", VOP_I32_F32_SPECIAL_OMOD, fp_to_sint>;
let FPDPRounding = 1, isReMaterializable = 0 in {		let FPDPRounding = 1, isReMaterializable = 0 in {
		let OtherPredicates = [NotHasTrue16BitInsts] in
defm V_CVT_F16_F32 : VOP1Inst <"v_cvt_f16_f32", VOP_F16_F32, fpround>;		defm V_CVT_F16_F32 : VOP1Inst <"v_cvt_f16_f32", VOP_F16_F32, fpround>;
		let OtherPredicates = [HasTrue16BitInsts] in
		defm V_CVT_F16_F32_t16 : VOP1Inst <"v_cvt_f16_f32_t16", VOPProfile_True16<VOP_F16_F32>, fpround>;
} // End FPDPRounding = 1, isReMaterializable = 0		} // End FPDPRounding = 1, isReMaterializable = 0

		let OtherPredicates = [NotHasTrue16BitInsts] in
defm V_CVT_F32_F16 : VOP1Inst <"v_cvt_f32_f16", VOP_F32_F16, fpextend>;		defm V_CVT_F32_F16 : VOP1Inst <"v_cvt_f32_f16", VOP_F32_F16, fpextend>;
		let OtherPredicates = [HasTrue16BitInsts] in
		defm V_CVT_F32_F16_t16 : VOP1Inst <"v_cvt_f32_f16_t16", VOPProfile_True16<VOP_F32_F16>, fpextend>;

let ReadsModeReg = 0, mayRaiseFPException = 0 in {		let ReadsModeReg = 0, mayRaiseFPException = 0 in {
defm V_CVT_RPI_I32_F32 : VOP1Inst <"v_cvt_rpi_i32_f32", VOP_I32_F32, cvt_rpi_i32_f32>;		defm V_CVT_RPI_I32_F32 : VOP1Inst <"v_cvt_rpi_i32_f32", VOP_I32_F32, cvt_rpi_i32_f32>;
defm V_CVT_FLR_I32_F32 : VOP1Inst <"v_cvt_flr_i32_f32", VOP_I32_F32, cvt_flr_i32_f32>;		defm V_CVT_FLR_I32_F32 : VOP1Inst <"v_cvt_flr_i32_f32", VOP_I32_F32, cvt_flr_i32_f32>;
defm V_CVT_OFF_F32_I4 : VOP1Inst <"v_cvt_off_f32_i4", VOP1_F32_I32>;		defm V_CVT_OFF_F32_I4 : VOP1Inst <"v_cvt_off_f32_i4", VOP1_F32_I32>;
} // End ReadsModeReg = 0, mayRaiseFPException = 0		} // End ReadsModeReg = 0, mayRaiseFPException = 0
} // End SchedRW = [WriteFloatCvt]		} // End SchedRW = [WriteFloatCvt]

▲ Show 20 Lines • Show All 141 Lines • ▼ Show 20 Lines	let SchedRW = [WriteDoubleAdd] in {
defm V_TRUNC_F64 : VOP1Inst<"v_trunc_f64", VOP_F64_F64, ftrunc>;		defm V_TRUNC_F64 : VOP1Inst<"v_trunc_f64", VOP_F64_F64, ftrunc>;
defm V_CEIL_F64 : VOP1Inst<"v_ceil_f64", VOP_F64_F64, fceil>;		defm V_CEIL_F64 : VOP1Inst<"v_ceil_f64", VOP_F64_F64, fceil>;
defm V_RNDNE_F64 : VOP1Inst<"v_rndne_f64", VOP_F64_F64, frint>;		defm V_RNDNE_F64 : VOP1Inst<"v_rndne_f64", VOP_F64_F64, frint>;
defm V_FLOOR_F64 : VOP1Inst<"v_floor_f64", VOP_F64_F64, ffloor>;		defm V_FLOOR_F64 : VOP1Inst<"v_floor_f64", VOP_F64_F64, ffloor>;
} // End SchedRW = [WriteDoubleAdd]		} // End SchedRW = [WriteDoubleAdd]
} // End SubtargetPredicate = isGFX7Plus		} // End SubtargetPredicate = isGFX7Plus
} // End isReMaterializable = 1		} // End isReMaterializable = 1

let SubtargetPredicate = Has16BitInsts in {

let FPDPRounding = 1 in {		let FPDPRounding = 1 in {
		let OtherPredicates = [Has16BitInsts, NotHasTrue16BitInsts] in {
defm V_CVT_F16_U16 : VOP1Inst <"v_cvt_f16_u16", VOP1_F16_I16, uint_to_fp>;		defm V_CVT_F16_U16 : VOP1Inst <"v_cvt_f16_u16", VOP1_F16_I16, uint_to_fp>;
defm V_CVT_F16_I16 : VOP1Inst <"v_cvt_f16_i16", VOP1_F16_I16, sint_to_fp>;		defm V_CVT_F16_I16 : VOP1Inst <"v_cvt_f16_i16", VOP1_F16_I16, sint_to_fp>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm V_CVT_F16_U16_t16 : VOP1Inst <"v_cvt_f16_u16_t16", VOP1_F16_I16_t16, uint_to_fp>;
		defm V_CVT_F16_I16_t16 : VOP1Inst <"v_cvt_f16_i16_t16", VOP1_F16_I16_t16, sint_to_fp>;
		}
} // End FPDPRounding = 1		} // End FPDPRounding = 1
// OMod clears exceptions when set in these two instructions		// OMod clears exceptions when set in these two instructions
		let OtherPredicates = [Has16BitInsts, NotHasTrue16BitInsts] in {
defm V_CVT_U16_F16 : VOP1Inst <"v_cvt_u16_f16", VOP_I16_F16_SPECIAL_OMOD, fp_to_uint>;		defm V_CVT_U16_F16 : VOP1Inst <"v_cvt_u16_f16", VOP_I16_F16_SPECIAL_OMOD, fp_to_uint>;
defm V_CVT_I16_F16 : VOP1Inst <"v_cvt_i16_f16", VOP_I16_F16_SPECIAL_OMOD, fp_to_sint>;		defm V_CVT_I16_F16 : VOP1Inst <"v_cvt_i16_f16", VOP_I16_F16_SPECIAL_OMOD, fp_to_sint>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm V_CVT_U16_F16_t16 : VOP1Inst <"v_cvt_u16_f16_t16", VOP_I16_F16_SPECIAL_OMOD_t16, fp_to_uint>;
		defm V_CVT_I16_F16_t16 : VOP1Inst <"v_cvt_i16_f16_t16", VOP_I16_F16_SPECIAL_OMOD_t16, fp_to_sint>;
		}
let TRANS = 1, SchedRW = [WriteTrans32] in {		let TRANS = 1, SchedRW = [WriteTrans32] in {
defm V_RCP_F16 : VOP1Inst <"v_rcp_f16", VOP_F16_F16, AMDGPUrcp>;		defm V_RCP_F16 : VOP1Inst_t16 <"v_rcp_f16", VOP_F16_F16, AMDGPUrcp>;
defm V_SQRT_F16 : VOP1Inst <"v_sqrt_f16", VOP_F16_F16, any_amdgcn_sqrt>;		defm V_SQRT_F16 : VOP1Inst_t16 <"v_sqrt_f16", VOP_F16_F16, any_amdgcn_sqrt>;
defm V_RSQ_F16 : VOP1Inst <"v_rsq_f16", VOP_F16_F16, AMDGPUrsq>;		defm V_RSQ_F16 : VOP1Inst_t16 <"v_rsq_f16", VOP_F16_F16, AMDGPUrsq>;
defm V_LOG_F16 : VOP1Inst <"v_log_f16", VOP_F16_F16, flog2>;		defm V_LOG_F16 : VOP1Inst_t16 <"v_log_f16", VOP_F16_F16, flog2>;
defm V_EXP_F16 : VOP1Inst <"v_exp_f16", VOP_F16_F16, fexp2>;		defm V_EXP_F16 : VOP1Inst_t16 <"v_exp_f16", VOP_F16_F16, fexp2>;
defm V_SIN_F16 : VOP1Inst <"v_sin_f16", VOP_F16_F16, AMDGPUsin>;		defm V_SIN_F16 : VOP1Inst_t16 <"v_sin_f16", VOP_F16_F16, AMDGPUsin>;
defm V_COS_F16 : VOP1Inst <"v_cos_f16", VOP_F16_F16, AMDGPUcos>;		defm V_COS_F16 : VOP1Inst_t16 <"v_cos_f16", VOP_F16_F16, AMDGPUcos>;
} // End TRANS = 1, SchedRW = [WriteTrans32]		} // End TRANS = 1, SchedRW = [WriteTrans32]
defm V_FREXP_MANT_F16 : VOP1Inst <"v_frexp_mant_f16", VOP_F16_F16, int_amdgcn_frexp_mant>;		defm V_FREXP_MANT_F16 : VOP1Inst_t16 <"v_frexp_mant_f16", VOP_F16_F16, int_amdgcn_frexp_mant>;
		let OtherPredicates = [Has16BitInsts, NotHasTrue16BitInsts] in {
defm V_FREXP_EXP_I16_F16 : VOP1Inst <"v_frexp_exp_i16_f16", VOP_I16_F16_SPECIAL_OMOD, int_amdgcn_frexp_exp>;		defm V_FREXP_EXP_I16_F16 : VOP1Inst <"v_frexp_exp_i16_f16", VOP_I16_F16_SPECIAL_OMOD, int_amdgcn_frexp_exp>;
defm V_FLOOR_F16 : VOP1Inst <"v_floor_f16", VOP_F16_F16, ffloor>;		}
defm V_CEIL_F16 : VOP1Inst <"v_ceil_f16", VOP_F16_F16, fceil>;		let OtherPredicates = [HasTrue16BitInsts] in {
defm V_TRUNC_F16 : VOP1Inst <"v_trunc_f16", VOP_F16_F16, ftrunc>;		defm V_FREXP_EXP_I16_F16_t16 : VOP1Inst <"v_frexp_exp_i16_f16_t16", VOP_I16_F16_SPECIAL_OMOD_t16, int_amdgcn_frexp_exp>;
defm V_RNDNE_F16 : VOP1Inst <"v_rndne_f16", VOP_F16_F16, frint>;		}
		defm V_FLOOR_F16 : VOP1Inst_t16 <"v_floor_f16", VOP_F16_F16, ffloor>;
		defm V_CEIL_F16 : VOP1Inst_t16 <"v_ceil_f16", VOP_F16_F16, fceil>;
		defm V_TRUNC_F16 : VOP1Inst_t16 <"v_trunc_f16", VOP_F16_F16, ftrunc>;
		defm V_RNDNE_F16 : VOP1Inst_t16 <"v_rndne_f16", VOP_F16_F16, frint>;
let FPDPRounding = 1 in {		let FPDPRounding = 1 in {
defm V_FRACT_F16 : VOP1Inst <"v_fract_f16", VOP_F16_F16, AMDGPUfract>;		defm V_FRACT_F16 : VOP1Inst_t16 <"v_fract_f16", VOP_F16_F16, AMDGPUfract>;
} // End FPDPRounding = 1		} // End FPDPRounding = 1

}		let OtherPredicates = [Has16BitInsts, NotHasTrue16BitInsts] in {

let OtherPredicates = [Has16BitInsts] in {

def : GCNPat<		def : GCNPat<
(f32 (f16_to_fp i16:$src)),		(f32 (f16_to_fp i16:$src)),
(V_CVT_F32_F16_e32 $src)		(V_CVT_F32_F16_e32 $src)
>;		>;

def : GCNPat<		def : GCNPat<
(i16 (AMDGPUfp_to_f16 f32:$src)),		(i16 (AMDGPUfp_to_f16 f32:$src)),
(V_CVT_F16_F32_e32 $src)		(V_CVT_F16_F32_e32 $src)
>;		>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		def : GCNPat<
		(f32 (f16_to_fp i16:$src)),
		(V_CVT_F32_F16_t16_e32 $src)
		>;
		def : GCNPat<
		(i16 (AMDGPUfp_to_f16 f32:$src)),
		(V_CVT_F16_F32_t16_e32 $src)
		>;
}		}

def VOP_SWAP_I32 : VOPProfile<[i32, i32, i32, untyped]> {		def VOP_SWAP_I32 : VOPProfile<[i32, i32, i32, untyped]> {
let Outs32 = (outs VGPR_32:$vdst, VGPR_32:$vdst1);		let Outs32 = (outs VGPR_32:$vdst, VGPR_32:$vdst1);
let Ins32 = (ins VGPR_32:$src0, VGPR_32:$src1);		let Ins32 = (ins VGPR_32:$src0, VGPR_32:$src1);
let Outs64 = Outs32;		let Outs64 = Outs32;
let Asm32 = " $vdst, $src0";		let Asm32 = " $vdst, $src0";
let Asm64 = "";		let Asm64 = "";
let Ins64 = (ins);		let Ins64 = (ins);
}		}

let SubtargetPredicate = isGFX9Plus in {		let SubtargetPredicate = isGFX9Plus in {
def V_SWAP_B32 : VOP1_Pseudo<"v_swap_b32", VOP_SWAP_I32, [], 1> {		def V_SWAP_B32 : VOP1_Pseudo<"v_swap_b32", VOP_SWAP_I32, [], 1> {
let Constraints = "$vdst = $src1, $vdst1 = $src0";		let Constraints = "$vdst = $src1, $vdst1 = $src0";
let DisableEncoding = "$vdst1,$src1";		let DisableEncoding = "$vdst1,$src1";
let SchedRW = [Write64Bit, Write64Bit];		let SchedRW = [Write64Bit, Write64Bit];
}		}

let isReMaterializable = 1 in		let isReMaterializable = 1 in
defm V_SAT_PK_U8_I16 : VOP1Inst<"v_sat_pk_u8_i16", VOP_I32_I32>;		defm V_SAT_PK_U8_I16 : VOP1Inst<"v_sat_pk_u8_i16", VOP_I32_I32>;

let mayRaiseFPException = 0 in {		let mayRaiseFPException = 0 in {
		let OtherPredicates = [Has16BitInsts, NotHasTrue16BitInsts] in {
defm V_CVT_NORM_I16_F16 : VOP1Inst<"v_cvt_norm_i16_f16", VOP_I16_F16_SPECIAL_OMOD>;		defm V_CVT_NORM_I16_F16 : VOP1Inst<"v_cvt_norm_i16_f16", VOP_I16_F16_SPECIAL_OMOD>;
defm V_CVT_NORM_U16_F16 : VOP1Inst<"v_cvt_norm_u16_f16", VOP_I16_F16_SPECIAL_OMOD>;		defm V_CVT_NORM_U16_F16 : VOP1Inst<"v_cvt_norm_u16_f16", VOP_I16_F16_SPECIAL_OMOD>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm V_CVT_NORM_I16_F16_t16 : VOP1Inst<"v_cvt_norm_i16_f16_t16", VOP_I16_F16_SPECIAL_OMOD_t16>;
		defm V_CVT_NORM_U16_F16_t16 : VOP1Inst<"v_cvt_norm_u16_f16_t16", VOP_I16_F16_SPECIAL_OMOD_t16>;
		}
} // End mayRaiseFPException = 0		} // End mayRaiseFPException = 0
} // End SubtargetPredicate = isGFX9Plus		} // End SubtargetPredicate = isGFX9Plus

let SubtargetPredicate = isGFX9Only in {		let SubtargetPredicate = isGFX9Only in {
defm V_SCREEN_PARTITION_4SE_B32 : VOP1Inst <"v_screen_partition_4se_b32", VOP_I32_I32>;		defm V_SCREEN_PARTITION_4SE_B32 : VOP1Inst <"v_screen_partition_4se_b32", VOP_I32_I32>;
} // End SubtargetPredicate = isGFX9Only		} // End SubtargetPredicate = isGFX9Only

class VOPProfile_Base_CVT_F32_F8<ValueType vt> : VOPProfileI2F <vt, i32> {		class VOPProfile_Base_CVT_F32_F8<ValueType vt> : VOPProfileI2F <vt, i32> {
▲ Show 20 Lines • Show All 77 Lines • ▼ Show 20 Lines
}		}

let SubtargetPredicate = isGFX11Plus in {		let SubtargetPredicate = isGFX11Plus in {
// Restrict src0 to be VGPR		// Restrict src0 to be VGPR
def V_PERMLANE64_B32 : VOP1_Pseudo<"v_permlane64_b32", VOP_MOVRELS,		def V_PERMLANE64_B32 : VOP1_Pseudo<"v_permlane64_b32", VOP_MOVRELS,
getVOP1Pat64<int_amdgcn_permlane64,		getVOP1Pat64<int_amdgcn_permlane64,
VOP_MOVRELS>.ret,		VOP_MOVRELS>.ret,
/*VOP1Only=*/ 1>;		/*VOP1Only=*/ 1>;
defm V_NOT_B16 : VOP1Inst<"v_not_b16", VOP_I16_I16>;		defm V_NOT_B16 : VOP1Inst_t16<"v_not_b16", VOP_I16_I16>;
defm V_CVT_I32_I16 : VOP1Inst<"v_cvt_i32_i16", VOP_I32_I16>;		defm V_CVT_I32_I16 : VOP1Inst_t16<"v_cvt_i32_i16", VOP_I32_I16>;
defm V_CVT_U32_U16 : VOP1Inst<"v_cvt_u32_u16", VOP_I32_I16>;		defm V_CVT_U32_U16 : VOP1Inst_t16<"v_cvt_u32_u16", VOP_I32_I16>;
} // End SubtargetPredicate = isGFX11Plus		} // End SubtargetPredicate = isGFX11Plus

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Target-specific instruction encodings.		// Target-specific instruction encodings.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class VOP1_DPP<bits<8> op, VOP1_DPP_Pseudo ps, VOPProfile p = ps.Pfl, bit isDPP16 = 0> :		class VOP1_DPP<bits<8> op, VOP1_DPP_Pseudo ps, VOPProfile p = ps.Pfl, bit isDPP16 = 0> :
VOP_DPP<ps.OpName, p, isDPP16> {		VOP_DPP<ps.OpName, p, isDPP16> {
▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines	multiclass VOP1_Real_e32_gfx11<bits<9> op, string opName = NAME> {
def _e32_gfx11 :		def _e32_gfx11 :
VOP1_Real<ps, SIEncodingFamily.GFX11>,		VOP1_Real<ps, SIEncodingFamily.GFX11>,
VOP1e<op{7-0}, ps.Pfl>;		VOP1e<op{7-0}, ps.Pfl>;
}		}
multiclass VOP1_Real_e32_with_name_gfx11<bits<9> op, string opName,		multiclass VOP1_Real_e32_with_name_gfx11<bits<9> op, string opName,
string asmName> {		string asmName> {
defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");		defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");
let AsmString = asmName # ps.AsmOperands in {		let AsmString = asmName # ps.AsmOperands in {
defm NAME : VOP1_Real_e32_gfx11<op, opName>,		defm NAME : VOP1_Real_e32_gfx11<op, opName>;
MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>;
}		}
}		}
multiclass VOP1_Real_e64_gfx11<bits<9> op> {		multiclass VOP1_Real_e64_gfx11<bits<9> op> {
def _e64_gfx11 :		def _e64_gfx11 :
VOP3_Real<!cast<VOP3_Pseudo>(NAME#"_e64"), SIEncodingFamily.GFX11>,		VOP3_Real<!cast<VOP3_Pseudo>(NAME#"_e64"), SIEncodingFamily.GFX11>,
VOP3e_gfx11<{0, 1, 1, op{6-0}}, !cast<VOP3_Pseudo>(NAME#"_e64").Pfl>;		VOP3e_gfx11<{0, 1, 1, op{6-0}}, !cast<VOP3_Pseudo>(NAME#"_e64").Pfl>;
}		}
multiclass VOP1_Real_dpp_gfx11<bits<9> op, string opName = NAME> {		multiclass VOP1_Real_dpp_gfx11<bits<9> op, string opName = NAME> {
defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");		defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");
def _dpp_gfx11 : VOP1_DPP16<op{7-0}, !cast<VOP1_DPP_Pseudo>(opName#"_dpp"), SIEncodingFamily.GFX11> {		def _dpp_gfx11 : VOP1_DPP16<op{7-0}, !cast<VOP1_DPP_Pseudo>(opName#"_dpp"), SIEncodingFamily.GFX11> {
let DecoderNamespace = "DPPGFX11";		let DecoderNamespace = "DPPGFX11";
}		}
}		}
multiclass VOP1_Real_dpp_with_name_gfx11<bits<9> op, string opName,		multiclass VOP1_Real_dpp_with_name_gfx11<bits<9> op, string opName,
string asmName> {		string asmName> {
defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");		defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");
let AsmString = asmName # ps.Pfl.AsmDPP16, DecoderNamespace = "DPPGFX11" in {		let AsmString = asmName # ps.Pfl.AsmDPP16, DecoderNamespace = "DPPGFX11" in {
defm NAME : VOP1_Real_dpp_gfx11<op, opName>,		defm NAME : VOP1_Real_dpp_gfx11<op, opName>;
MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>;
}		}
}		}
multiclass VOP1_Real_dpp8_gfx11<bits<9> op, string opName = NAME> {		multiclass VOP1_Real_dpp8_gfx11<bits<9> op, string opName = NAME> {
defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");		defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");
def _dpp8_gfx11 : VOP1_DPP8<op{7-0}, ps> {		def _dpp8_gfx11 : VOP1_DPP8<op{7-0}, ps> {
let DecoderNamespace = "DPP8GFX11";		let DecoderNamespace = "DPP8GFX11";
}		}
}		}
multiclass VOP1_Real_dpp8_with_name_gfx11<bits<9> op, string opName,		multiclass VOP1_Real_dpp8_with_name_gfx11<bits<9> op, string opName,
string asmName> {		string asmName> {
defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");		defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");
let AsmString = asmName # ps.Pfl.AsmDPP8, DecoderNamespace = "DPP8GFX11" in {		let AsmString = asmName # ps.Pfl.AsmDPP8, DecoderNamespace = "DPP8GFX11" in {
defm NAME : VOP1_Real_dpp8_gfx11<op, opName>,		defm NAME : VOP1_Real_dpp8_gfx11<op, opName>;
MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>;
}		}
}		}
} // End AssemblerPredicate = isGFX11Only, DecoderNamespace = "GFX11"		} // End AssemblerPredicate = isGFX11Only, DecoderNamespace = "GFX11"

multiclass VOP1_Realtriple_e64_gfx11<bits<9> op> {		multiclass VOP1_Realtriple_e64_gfx11<bits<9> op> {
defm NAME : VOP3_Realtriple_gfx11<{0, 1, 1, op{6-0}}, /*isSingle=*/ 0, NAME>;		defm NAME : VOP3_Realtriple_gfx11<{0, 1, 1, op{6-0}}, /*isSingle=*/ 0, NAME>;
}		}
multiclass VOP1_Realtriple_e64_with_name_gfx11<bits<9> op, string opName,		multiclass VOP1_Realtriple_e64_with_name_gfx11<bits<9> op, string opName,
string asmName> {		string asmName> {
defm NAME : VOP3_Realtriple_with_name_gfx11<{0, 1, 1, op{6-0}}, opName,		defm NAME : VOP3_Realtriple_with_name_gfx11<{0, 1, 1, op{6-0}}, opName,
asmName>;		asmName>;
}		}

multiclass VOP1_Real_FULL_gfx11<bits<9> op> :		multiclass VOP1_Real_FULL_gfx11<bits<9> op> :
VOP1_Real_e32_gfx11<op>, VOP1_Realtriple_e64_gfx11<op>,		VOP1_Real_e32_gfx11<op>, VOP1_Realtriple_e64_gfx11<op>,
VOP1_Real_dpp_gfx11<op>, VOP1_Real_dpp8_gfx11<op>;		VOP1_Real_dpp_gfx11<op>, VOP1_Real_dpp8_gfx11<op>;

multiclass VOP1_Real_NO_VOP3_with_name_gfx11<bits<9> op, string opName,		multiclass VOP1_Real_NO_VOP3_with_name_gfx11<bits<9> op, string opName,
string asmName> :		string asmName> {
VOP1_Real_e32_with_name_gfx11<op, opName, asmName>,		defm NAME : VOP1_Real_e32_with_name_gfx11<op, opName, asmName>,
VOP1_Real_dpp_with_name_gfx11<op, opName, asmName>,		VOP1_Real_dpp_with_name_gfx11<op, opName, asmName>,
VOP1_Real_dpp8_with_name_gfx11<op, opName, asmName>;		VOP1_Real_dpp8_with_name_gfx11<op, opName, asmName>;
		defvar ps = !cast<VOP1_Pseudo>(opName#"_e32");
		def gfx11_alias : MnemonicAlias<ps.Mnemonic, asmName>,
		Requires<[isGFX11Plus]>;
		}

multiclass VOP1_Real_FULL_with_name_gfx11<bits<9> op, string opName,		multiclass VOP1_Real_FULL_with_name_gfx11<bits<9> op, string opName,
string asmName> :		string asmName> :
VOP1_Real_NO_VOP3_with_name_gfx11<op, opName, asmName>,		VOP1_Real_NO_VOP3_with_name_gfx11<op, opName, asmName>,
VOP1_Realtriple_e64_with_name_gfx11<op, opName, asmName>;		VOP1_Realtriple_e64_with_name_gfx11<op, opName, asmName>;

		multiclass VOP1_Real_FULL_t16_gfx11<bits<9> op, string asmName,
		string opName = NAME> :
		VOP1_Real_FULL_with_name_gfx11<op, opName, asmName>;

multiclass VOP1_Real_NO_DPP_gfx11<bits<9> op> :		multiclass VOP1_Real_NO_DPP_gfx11<bits<9> op> :
VOP1_Real_e32_gfx11<op>, VOP1_Real_e64_gfx11<op>;		VOP1_Real_e32_gfx11<op>, VOP1_Real_e64_gfx11<op>;

defm V_CVT_NEAREST_I32_F32 : VOP1_Real_FULL_with_name_gfx11<0x00c,		defm V_CVT_NEAREST_I32_F32 : VOP1_Real_FULL_with_name_gfx11<0x00c,
"V_CVT_RPI_I32_F32", "v_cvt_nearest_i32_f32">;		"V_CVT_RPI_I32_F32", "v_cvt_nearest_i32_f32">;
defm V_CVT_FLOOR_I32_F32 : VOP1_Real_FULL_with_name_gfx11<0x00d,		defm V_CVT_FLOOR_I32_F32 : VOP1_Real_FULL_with_name_gfx11<0x00d,
"V_CVT_FLR_I32_F32", "v_cvt_floor_i32_f32">;		"V_CVT_FLR_I32_F32", "v_cvt_floor_i32_f32">;
defm V_CLZ_I32_U32 : VOP1_Real_FULL_with_name_gfx11<0x039,		defm V_CLZ_I32_U32 : VOP1_Real_FULL_with_name_gfx11<0x039,
"V_FFBH_U32", "v_clz_i32_u32">;		"V_FFBH_U32", "v_clz_i32_u32">;
defm V_CTZ_I32_B32 : VOP1_Real_FULL_with_name_gfx11<0x03a,		defm V_CTZ_I32_B32 : VOP1_Real_FULL_with_name_gfx11<0x03a,
"V_FFBL_B32", "v_ctz_i32_b32">;		"V_FFBL_B32", "v_ctz_i32_b32">;
defm V_CLS_I32 : VOP1_Real_FULL_with_name_gfx11<0x03b,		defm V_CLS_I32 : VOP1_Real_FULL_with_name_gfx11<0x03b,
"V_FFBH_I32", "v_cls_i32">;		"V_FFBH_I32", "v_cls_i32">;
defm V_PERMLANE64_B32 : VOP1Only_Real_gfx11<0x067>;		defm V_PERMLANE64_B32 : VOP1Only_Real_gfx11<0x067>;
defm V_NOT_B16 : VOP1_Real_FULL_gfx11<0x069>;		defm V_NOT_B16_t16 : VOP1_Real_FULL_t16_gfx11<0x069, "v_not_b16">;
defm V_CVT_I32_I16 : VOP1_Real_FULL_gfx11<0x06a>;		defm V_CVT_I32_I16_t16 : VOP1_Real_FULL_t16_gfx11<0x06a, "v_cvt_i32_i16">;
defm V_CVT_U32_U16 : VOP1_Real_FULL_gfx11<0x06b>;		defm V_CVT_U32_U16_t16 : VOP1_Real_FULL_t16_gfx11<0x06b, "v_cvt_u32_u16">;

		defm V_CVT_F16_U16_t16 : VOP1_Real_FULL_t16_gfx11<0x050, "v_cvt_f16_u16">;
		defm V_CVT_F16_I16_t16 : VOP1_Real_FULL_t16_gfx11<0x051, "v_cvt_f16_i16">;
		defm V_CVT_U16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x052, "v_cvt_u16_f16">;
		defm V_CVT_I16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x053, "v_cvt_i16_f16">;
		defm V_RCP_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x054, "v_rcp_f16">;
		defm V_SQRT_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x055, "v_sqrt_f16">;
		defm V_RSQ_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x056, "v_rsq_f16">;
		defm V_LOG_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x057, "v_log_f16">;
		defm V_EXP_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x058, "v_exp_f16">;
		defm V_FREXP_MANT_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x059, "v_frexp_mant_f16">;
		defm V_FREXP_EXP_I16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x05a, "v_frexp_exp_i16_f16">;
		defm V_FLOOR_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x05b, "v_floor_f16">;
		defm V_CEIL_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x05c, "v_ceil_f16">;
		defm V_TRUNC_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x05d, "v_trunc_f16">;
		defm V_RNDNE_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x05e, "v_rndne_f16">;
		defm V_FRACT_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x05f, "v_fract_f16">;
		defm V_SIN_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x060, "v_sin_f16">;
		defm V_COS_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x061, "v_cos_f16">;
		defm V_CVT_NORM_I16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x063, "v_cvt_norm_i16_f16">;
		defm V_CVT_NORM_U16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x064, "v_cvt_norm_u16_f16">;

		defm V_CVT_F16_F32_t16 : VOP1_Real_FULL_t16_gfx11<0x00a, "v_cvt_f16_f32">;
		defm V_CVT_F32_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x00b, "v_cvt_f32_f16">;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// GFX10.		// GFX10.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

let AssemblerPredicate = isGFX10Only, DecoderNamespace = "GFX10" in {		let AssemblerPredicate = isGFX10Only, DecoderNamespace = "GFX10" in {
multiclass VOP1Only_Real_gfx10<bits<9> op> {		multiclass VOP1Only_Real_gfx10<bits<9> op> {
def _gfx10 :		def _gfx10 :
[43 lines not shown]
multiclass VOP1_Real_gfx10_NO_DPP_gfx11<bits<9> op> :		multiclass VOP1_Real_gfx10_NO_DPP_gfx11<bits<9> op> :
VOP1_Real_gfx10<op>, VOP1_Real_NO_DPP_gfx11<op>;		VOP1_Real_gfx10<op>, VOP1_Real_NO_DPP_gfx11<op>;

multiclass VOP1Only_Real_gfx10_gfx11<bits<9> op> :		multiclass VOP1Only_Real_gfx10_gfx11<bits<9> op> :
VOP1Only_Real_gfx10<op>, VOP1Only_Real_gfx11<op>;		VOP1Only_Real_gfx10<op>, VOP1Only_Real_gfx11<op>;

defm V_PIPEFLUSH : VOP1_Real_gfx10_NO_DPP_gfx11<0x01b>;		defm V_PIPEFLUSH : VOP1_Real_gfx10_NO_DPP_gfx11<0x01b>;
defm V_MOVRELSD_2_B32 : VOP1_Real_gfx10_FULL_gfx11<0x048>;		defm V_MOVRELSD_2_B32 : VOP1_Real_gfx10_FULL_gfx11<0x048>;
defm V_CVT_F16_U16 : VOP1_Real_gfx10_FULL_gfx11<0x050>;		defm V_CVT_F16_U16 : VOP1_Real_gfx10<0x050>;
defm V_CVT_F16_I16 : VOP1_Real_gfx10_FULL_gfx11<0x051>;		defm V_CVT_F16_I16 : VOP1_Real_gfx10<0x051>;
defm V_CVT_U16_F16 : VOP1_Real_gfx10_FULL_gfx11<0x052>;		defm V_CVT_U16_F16 : VOP1_Real_gfx10<0x052>;
defm V_CVT_I16_F16 : VOP1_Real_gfx10_FULL_gfx11<0x053>;		defm V_CVT_I16_F16 : VOP1_Real_gfx10<0x053>;
defm V_RCP_F16 : VOP1_Real_gfx10_FULL_gfx11<0x054>;		defm V_RCP_F16 : VOP1_Real_gfx10<0x054>;
defm V_SQRT_F16 : VOP1_Real_gfx10_FULL_gfx11<0x055>;		defm V_SQRT_F16 : VOP1_Real_gfx10<0x055>;
defm V_RSQ_F16 : VOP1_Real_gfx10_FULL_gfx11<0x056>;		defm V_RSQ_F16 : VOP1_Real_gfx10<0x056>;
defm V_LOG_F16 : VOP1_Real_gfx10_FULL_gfx11<0x057>;		defm V_LOG_F16 : VOP1_Real_gfx10<0x057>;
defm V_EXP_F16 : VOP1_Real_gfx10_FULL_gfx11<0x058>;		defm V_EXP_F16 : VOP1_Real_gfx10<0x058>;
defm V_FREXP_MANT_F16 : VOP1_Real_gfx10_FULL_gfx11<0x059>;		defm V_FREXP_MANT_F16 : VOP1_Real_gfx10<0x059>;
defm V_FREXP_EXP_I16_F16 : VOP1_Real_gfx10_FULL_gfx11<0x05a>;		defm V_FREXP_EXP_I16_F16 : VOP1_Real_gfx10<0x05a>;
defm V_FLOOR_F16 : VOP1_Real_gfx10_FULL_gfx11<0x05b>;		defm V_FLOOR_F16 : VOP1_Real_gfx10<0x05b>;
defm V_CEIL_F16 : VOP1_Real_gfx10_FULL_gfx11<0x05c>;		defm V_CEIL_F16 : VOP1_Real_gfx10<0x05c>;
defm V_TRUNC_F16 : VOP1_Real_gfx10_FULL_gfx11<0x05d>;		defm V_TRUNC_F16 : VOP1_Real_gfx10<0x05d>;
defm V_RNDNE_F16 : VOP1_Real_gfx10_FULL_gfx11<0x05e>;		defm V_RNDNE_F16 : VOP1_Real_gfx10<0x05e>;
defm V_FRACT_F16 : VOP1_Real_gfx10_FULL_gfx11<0x05f>;		defm V_FRACT_F16 : VOP1_Real_gfx10<0x05f>;
defm V_SIN_F16 : VOP1_Real_gfx10_FULL_gfx11<0x060>;		defm V_SIN_F16 : VOP1_Real_gfx10<0x060>;
defm V_COS_F16 : VOP1_Real_gfx10_FULL_gfx11<0x061>;		defm V_COS_F16 : VOP1_Real_gfx10<0x061>;
defm V_SAT_PK_U8_I16 : VOP1_Real_gfx10_FULL_gfx11<0x062>;		defm V_SAT_PK_U8_I16 : VOP1_Real_gfx10_FULL_gfx11<0x062>;
defm V_CVT_NORM_I16_F16 : VOP1_Real_gfx10_FULL_gfx11<0x063>;		defm V_CVT_NORM_I16_F16 : VOP1_Real_gfx10<0x063>;
defm V_CVT_NORM_U16_F16 : VOP1_Real_gfx10_FULL_gfx11<0x064>;		defm V_CVT_NORM_U16_F16 : VOP1_Real_gfx10<0x064>;

defm V_SWAP_B32 : VOP1Only_Real_gfx10_gfx11<0x065>;		defm V_SWAP_B32 : VOP1Only_Real_gfx10_gfx11<0x065>;
defm V_SWAPREL_B32 : VOP1Only_Real_gfx10_gfx11<0x068>;		defm V_SWAPREL_B32 : VOP1Only_Real_gfx10_gfx11<0x068>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// GFX7, GFX10.		// GFX7, GFX10.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

[67 lines not shown]
defm V_NOP : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x000>;		defm V_NOP : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x000>;
defm V_MOV_B32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x001>;		defm V_MOV_B32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x001>;
defm V_CVT_I32_F64 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x003>;		defm V_CVT_I32_F64 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x003>;
defm V_CVT_F64_I32 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x004>;		defm V_CVT_F64_I32 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x004>;
defm V_CVT_F32_I32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x005>;		defm V_CVT_F32_I32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x005>;
defm V_CVT_F32_U32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x006>;		defm V_CVT_F32_U32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x006>;
defm V_CVT_U32_F32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x007>;		defm V_CVT_U32_F32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x007>;
defm V_CVT_I32_F32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x008>;		defm V_CVT_I32_F32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x008>;
defm V_CVT_F16_F32 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x00a>;		defm V_CVT_F16_F32 : VOP1_Real_gfx6_gfx7_gfx10<0x00a>;
defm V_CVT_F32_F16 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x00b>;		defm V_CVT_F32_F16 : VOP1_Real_gfx6_gfx7_gfx10<0x00b>;
defm V_CVT_RPI_I32_F32 : VOP1_Real_gfx6_gfx7_gfx10<0x00c>;		defm V_CVT_RPI_I32_F32 : VOP1_Real_gfx6_gfx7_gfx10<0x00c>;
defm V_CVT_FLR_I32_F32 : VOP1_Real_gfx6_gfx7_gfx10<0x00d>;		defm V_CVT_FLR_I32_F32 : VOP1_Real_gfx6_gfx7_gfx10<0x00d>;
defm V_CVT_OFF_F32_I4 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x00e>;		defm V_CVT_OFF_F32_I4 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x00e>;
defm V_CVT_F32_F64 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x00f>;		defm V_CVT_F32_F64 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x00f>;
defm V_CVT_F64_F32 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x010>;		defm V_CVT_F64_F32 : VOP1_Real_gfx6_gfx7_gfx10_NO_DPP_gfx11<0x010>;
defm V_CVT_F32_UBYTE0 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x011>;		defm V_CVT_F32_UBYTE0 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x011>;
defm V_CVT_F32_UBYTE1 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x012>;		defm V_CVT_F32_UBYTE1 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x012>;
defm V_CVT_F32_UBYTE2 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x013>;		defm V_CVT_F32_UBYTE2 : VOP1_Real_gfx6_gfx7_gfx10_FULL_gfx11<0x013>;
[166 lines not shown]
defm V_ACCVGPR_MOV_B32 : VOP1Only_Real_vi<0x52>;		defm V_ACCVGPR_MOV_B32 : VOP1Only_Real_vi<0x52>;

let VOP1 = 1, SubtargetPredicate = isGFX8GFX9, Uses = [EXEC, M0] in {		let VOP1 = 1, SubtargetPredicate = isGFX8GFX9, Uses = [EXEC, M0] in {

// Copy of v_mov_b32 with $vdst as a use operand for use with VGPR		// Copy of v_mov_b32 with $vdst as a use operand for use with VGPR
// indexing mode. vdst can't be treated as a def for codegen purposes,		// indexing mode. vdst can't be treated as a def for codegen purposes,
// and an implicit use and def of the super register should be added.		// and an implicit use and def of the super register should be added.
def V_MOV_B32_indirect_write : VPseudoInstSI<(outs),		def V_MOV_B32_indirect_write : VPseudoInstSI<(outs),
(ins getVALUDstForVT<i32>.ret:$vdst, getVOPSrc0ForVT<i32>.ret:$src0)>,		(ins getVALUDstForVT<i32>.ret:$vdst, getVOPSrc0ForVT<i32, 0>.ret:$src0)>,
PseudoInstExpansion<(V_MOV_B32_e32_vi getVALUDstForVT<i32>.ret:$vdst,		PseudoInstExpansion<(V_MOV_B32_e32_vi getVALUDstForVT<i32>.ret:$vdst,
getVOPSrc0ForVT<i32>.ret:$src0)>;		getVOPSrc0ForVT<i32, 0>.ret:$src0)>;

// Copy of v_mov_b32 for use with VGPR indexing mode. An implicit use of the		// Copy of v_mov_b32 for use with VGPR indexing mode. An implicit use of the
// super register should be added.		// super register should be added.
def V_MOV_B32_indirect_read : VPseudoInstSI<		def V_MOV_B32_indirect_read : VPseudoInstSI<
(outs getVALUDstForVT<i32>.ret:$vdst),		(outs getVALUDstForVT<i32>.ret:$vdst),
(ins getVOPSrc0ForVT<i32>.ret:$src0)>,		(ins getVOPSrc0ForVT<i32, 0>.ret:$src0)>,
PseudoInstExpansion<(V_MOV_B32_e32_vi getVALUDstForVT<i32>.ret:$vdst,		PseudoInstExpansion<(V_MOV_B32_e32_vi getVALUDstForVT<i32>.ret:$vdst,
getVOPSrc0ForVT<i32>.ret:$src0)>;		getVOPSrc0ForVT<i32, 0>.ret:$src0)>;

} // End VOP1 = 1, SubtargetPredicate = isGFX8GFX9, Uses = [M0]		} // End VOP1 = 1, SubtargetPredicate = isGFX8GFX9, Uses = [M0]

let OtherPredicates = [isGFX8Plus] in {		let OtherPredicates = [isGFX8Plus] in {

def : GCNPat <		def : GCNPat <
(i32 (int_amdgcn_mov_dpp i32:$src, timm:$dpp_ctrl, timm:$row_mask,		(i32 (int_amdgcn_mov_dpp i32:$src, timm:$dpp_ctrl, timm:$row_mask,
timm:$bank_mask, timm:$bound_ctrl)),		timm:$bank_mask, timm:$bound_ctrl)),
[115 lines not shown]

llvm/lib/Target/AMDGPU/VOP2Instructions.td

[180 lines not shown]	multiclass VOP2Inst<string opName,
VOP2Inst_e64<opName, P, node, revOp, GFX9Renamed>,		VOP2Inst_e64<opName, P, node, revOp, GFX9Renamed>,
VOP2Inst_sdwa<opName, P, GFX9Renamed> {		VOP2Inst_sdwa<opName, P, GFX9Renamed> {
let renamedInGFX9 = GFX9Renamed in {		let renamedInGFX9 = GFX9Renamed in {
foreach _ = BoolToList<P.HasExtDPP>.ret in		foreach _ = BoolToList<P.HasExtDPP>.ret in
def _dpp : VOP2_DPP_Pseudo <opName, P>;		def _dpp : VOP2_DPP_Pseudo <opName, P>;
}		}
}		}

		multiclass VOP2Inst_t16<string opName,
		VOPProfile P,
		SDPatternOperator node = null_frag,
		string revOp = opName,
		bit GFX9Renamed = 0> {
		let SubtargetPredicate = NotHasTrue16BitInsts, OtherPredicates = [Has16BitInsts] in {
		defm NAME : VOP2Inst<opName, P, node, revOp, GFX9Renamed>;
		}
		let SubtargetPredicate = HasTrue16BitInsts in {
		defm _t16 : VOP2Inst<opName#"_t16", VOPProfile_True16<P>, node, revOp#"_t16", GFX9Renamed>;
		}
		}
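The `VOP2Inst_t16` multiclass above emits two pseudo families guarded by mutually exclusive predicates. A minimal Python sketch of that selection follows, as a toy model rather than LLVM code; the function name `select_vop2_pseudo` and its boolean parameters are hypothetical stand-ins for the `Has16BitInsts` and `HasTrue16BitInsts` subtarget predicates.

```python
def select_vop2_pseudo(name, has_16bit_insts, has_true16_insts):
    """Toy model of the predicates in VOP2Inst_t16 (hypothetical helper).

    - HasTrue16BitInsts                      -> the NAME_t16 family
    - NotHasTrue16BitInsts and Has16BitInsts -> the plain NAME family
    - neither                                -> no 16-bit pseudo is available
    """
    if has_true16_insts:
        return name + "_t16"
    if has_16bit_insts:
        return name
    return None
```

For example, `select_vop2_pseudo("V_ADD_F16", True, True)` yields `"V_ADD_F16_t16"`, while the same call with `has_true16_insts=False` yields `"V_ADD_F16"`.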

		// Creating a _t16_e32 pseudo when there is no corresponding real instruction on
		// any subtarget is a problem. It makes getMCOpcodeGen return -1, which we
		// assume means the instruction is already a real. The fix is to not create that
		// _t16_e32 pseudo
		multiclass VOP2Inst_e64_t16<string opName,
		VOPProfile P,
		SDPatternOperator node = null_frag,
		string revOp = opName,
		bit GFX9Renamed = 0> {
		let SubtargetPredicate = NotHasTrue16BitInsts, OtherPredicates = [Has16BitInsts] in {
		defm NAME : VOP2Inst<opName, P, node, revOp, GFX9Renamed>;
		}
		let SubtargetPredicate = HasTrue16BitInsts in {
		defm _t16 : VOP2Inst_e64<opName#"_t16", VOPProfile_True16<P>, node, revOp#"_t16", GFX9Renamed>;
		}
		}
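The comment on `VOP2Inst_e64_t16` describes an ambiguity: a lookup result of -1 is read as "already a real instruction", so a pseudo with no real counterpart would slip through unnoticed. The Python sketch below models that failure mode under stated assumptions. It is not LLVM code; the table contents and the names `get_mc_opcode` and `lower` are illustrative only.

```python
NO_REAL = -1  # sentinel modelled on the "-1" mentioned in the comment

# Hypothetical pseudo -> real table; the entries are illustrative, not
# generated from the .td files.
PSEUDO_TO_REAL = {
    ("V_ADD_F16_t16_e32", "GFX11"): "V_ADD_F16_t16_e32_gfx11",
    ("V_LSHLREV_B16_t16_e64", "GFX11"): "V_LSHLREV_B16_t16_e64_gfx11",
}

def get_mc_opcode(opcode, family):
    """Return the real opcode, or NO_REAL when the table has no entry."""
    return PSEUDO_TO_REAL.get((opcode, family), NO_REAL)

def lower(opcode, family):
    """Mimic a caller that treats NO_REAL as 'opcode is already a real'."""
    real = get_mc_opcode(opcode, family)
    return opcode if real == NO_REAL else real
```

In this model, `lower("V_LSHLREV_B16_t16_e32", "GFX11")` returns the pseudo name unchanged, because no `_e32` real exists for it. Not defining the `_t16_e32` pseudo at all, as the multiclass does, removes the case entirely.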

multiclass VOP2Inst_VOPD<string opName,		multiclass VOP2Inst_VOPD<string opName,
VOPProfile P,		VOPProfile P,
bits<5> VOPDOp,		bits<5> VOPDOp,
string VOPDName,		string VOPDName,
SDPatternOperator node = null_frag,		SDPatternOperator node = null_frag,
string revOp = opName,		string revOp = opName,
bit GFX9Renamed = 0> :		bit GFX9Renamed = 0> :
VOP2Inst_e32_VOPD<opName, P, VOPDOp, VOPDName, node, revOp, GFX9Renamed>,		VOP2Inst_e32_VOPD<opName, P, VOPDOp, VOPDName, node, revOp, GFX9Renamed>,
[139 lines not shown]	class VOP_MADAK <ValueType vt> : VOP_MADK_Base<vt> {
field string AsmVOPDX = "$vdstX, $src0X, $vsrc1X, $imm";		field string AsmVOPDX = "$vdstX, $src0X, $vsrc1X, $imm";
let AsmVOPDXDeferred = "$vdstX, $src0X, $vsrc1X, $immDeferred";		let AsmVOPDXDeferred = "$vdstX, $src0X, $vsrc1X, $immDeferred";
field string AsmVOPDY = "$vdstY, $src0Y, $vsrc1Y, $imm";		field string AsmVOPDY = "$vdstY, $src0Y, $vsrc1Y, $imm";
field bit HasExt = 0;		field bit HasExt = 0;
let IsSingle = 1;		let IsSingle = 1;
}		}

def VOP_MADAK_F16 : VOP_MADAK <f16>;		def VOP_MADAK_F16 : VOP_MADAK <f16>;
		def VOP_MADAK_F16_t16 : VOP_MADAK <f16> {
		let IsTrue16 = 1;
		let DstRC = VOPDstOperand<VGPR_32_Lo128>;
		let Ins32 = (ins VSrcT_f16_Lo128_Deferred:$src0, VGPR_32_Lo128:$src1, ImmOpType:$imm);
		}
def VOP_MADAK_F32 : VOP_MADAK <f32>;		def VOP_MADAK_F32 : VOP_MADAK <f32>;

class VOP_MADMK <ValueType vt> : VOP_MADK_Base<vt> {		class VOP_MADMK <ValueType vt> : VOP_MADK_Base<vt> {
field Operand ImmOpType = !if(!eq(vt.Size, 32), f32kimm, f16kimm);		field Operand ImmOpType = !if(!eq(vt.Size, 32), f32kimm, f16kimm);
field dag Ins32 = (ins VSrc_f32_Deferred:$src0, ImmOpType:$imm, VGPR_32:$src1);		field dag Ins32 = !if(!eq(vt.Size, 32),
		(ins VSrc_f32_Deferred:$src0, ImmOpType:$imm, VGPR_32:$src1),
		(ins VSrc_f16_Deferred:$src0, ImmOpType:$imm, VGPR_32:$src1));
field dag InsVOPDX = (ins VSrc_f32_Deferred:$src0X, ImmOpType:$imm, VGPR_32:$vsrc1X);		field dag InsVOPDX = (ins VSrc_f32_Deferred:$src0X, ImmOpType:$imm, VGPR_32:$vsrc1X);
let InsVOPDXDeferred = (ins VSrc_f32_Deferred:$src0X, ImmOpType:$immDeferred, VGPR_32:$vsrc1X);		let InsVOPDXDeferred = (ins VSrc_f32_Deferred:$src0X, ImmOpType:$immDeferred, VGPR_32:$vsrc1X);
field dag InsVOPDY = (ins VSrc_f32_Deferred:$src0Y, ImmOpType:$imm, VGPR_32:$vsrc1Y);		field dag InsVOPDY = (ins VSrc_f32_Deferred:$src0Y, ImmOpType:$imm, VGPR_32:$vsrc1Y);

field string Asm32 = "$vdst, $src0, $imm, $src1";		field string Asm32 = "$vdst, $src0, $imm, $src1";
field string AsmVOPDX = "$vdstX, $src0X, $imm, $vsrc1X";		field string AsmVOPDX = "$vdstX, $src0X, $imm, $vsrc1X";
let AsmVOPDXDeferred = "$vdstX, $src0X, $immDeferred, $vsrc1X";		let AsmVOPDXDeferred = "$vdstX, $src0X, $immDeferred, $vsrc1X";
field string AsmVOPDY = "$vdstY, $src0Y, $imm, $vsrc1Y";		field string AsmVOPDY = "$vdstY, $src0Y, $imm, $vsrc1Y";
field bit HasExt = 0;		field bit HasExt = 0;
let IsSingle = 1;		let IsSingle = 1;
}		}

def VOP_MADMK_F16 : VOP_MADMK <f16>;		def VOP_MADMK_F16 : VOP_MADMK <f16>;
		def VOP_MADMK_F16_t16 : VOP_MADMK <f16> {
		let IsTrue16 = 1;
		let DstRC = VOPDstOperand<VGPR_32_Lo128>;
		let Ins32 = (ins VSrcT_f16_Lo128_Deferred:$src0, ImmOpType:$imm, VGPR_32_Lo128:$src1);
		}
def VOP_MADMK_F32 : VOP_MADMK <f32>;		def VOP_MADMK_F32 : VOP_MADMK <f32>;
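The MADAK and MADMK profiles differ only in where the literal constant sits in the operand list (`$src0, $src1, $imm` versus `$src0, $imm, $src1`). A minimal Python sketch of the arithmetic this implies is given below. It assumes the usual ISA reading (AK: constant is the addend, MK: constant is the multiplier), which the diff itself does not spell out, and it ignores f16 rounding and fused-versus-unfused behaviour. The function names are hypothetical.

```python
def fmaak(src0, src1, k):
    """'AK' form: the literal K is the addend, d = src0 * src1 + K."""
    return src0 * src1 + k

def fmamk(src0, k, src1):
    """'MK' form: the literal K is the multiplier, d = src0 * K + src1."""
    return src0 * k + src1
```

The parameter order of each function matches the assembly operand order in the corresponding `Asm32` string.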

class getRegisterOperandForVT<ValueType VT> {		class getRegisterOperandForVT<ValueType VT> {
RegisterOperand ret = RegisterOperand<getVregSrcForVT<VT>.ret>;		RegisterOperand ret = RegisterOperand<getVregSrcForVT<VT>.ret>;
}		}

// FIXME: Remove src2_modifiers. It isn't used, so is wasting memory		// FIXME: Remove src2_modifiers. It isn't used, so is wasting memory
// and processing time but it makes it easier to convert to mad.		// and processing time but it makes it easier to convert to mad.
[36 lines not shown]	class VOP_MAC <ValueType vt0, ValueType vt1=vt0> : VOPProfile <[vt0, vt1, vt1, vt0]> {
let HasExtDPP = 1;		let HasExtDPP = 1;
let HasExt32BitDPP = 1;		let HasExt32BitDPP = 1;
let HasExtSDWA = 1;		let HasExtSDWA = 1;
let HasExtSDWA9 = 0;		let HasExtSDWA9 = 0;
let TieRegDPP = "$src2";		let TieRegDPP = "$src2";
}		}

def VOP_MAC_F16 : VOP_MAC <f16>;		def VOP_MAC_F16 : VOP_MAC <f16>;
		def VOP_MAC_F16_t16 : VOP_MAC <f16> {
		let IsTrue16 = 1;
		let DstRC = VOPDstOperand<VGPR_32_Lo128>;
		let DstRC64 = VOPDstOperand<VGPR_32>;
		let Src1RC32 = VGPRSrc_32_Lo128;
		let Ins32 = (ins Src0RC32:$src0, Src1RC32:$src1, getVregSrcForVT_t16<Src2VT>.ret:$src2);
		let Src0DPP = getVregSrcForVT_t16<Src0VT>.ret;
		let Src1DPP = getVregSrcForVT_t16<Src1VT>.ret;
		let Src2DPP = getVregSrcForVT_t16<Src2VT>.ret;
		let Src0ModDPP = getSrcModDPP_t16<Src0VT>.ret;
		let Src1ModDPP = getSrcModDPP_t16<Src1VT>.ret;
		let Src2ModDPP = getSrcModDPP_t16<Src2VT>.ret;
		let InsDPP = (ins Src0ModDPP:$src0_modifiers, Src0DPP:$src0,
		Src1ModDPP:$src1_modifiers, Src1DPP:$src1,
		getVregSrcForVT_t16<Src2VT>.ret:$src2, // stub argument
		dpp_ctrl:$dpp_ctrl, row_mask:$row_mask,
		bank_mask:$bank_mask, bound_ctrl:$bound_ctrl);
		let InsDPP8 = (ins Src0ModDPP:$src0_modifiers, Src0DPP:$src0,
		Src1ModDPP:$src1_modifiers, Src1DPP:$src1,
		getVregSrcForVT_t16<Src2VT>.ret:$src2, // stub argument
		dpp8:$dpp8, FI:$fi);
		}
def VOP_MAC_F32 : VOP_MAC <f32>;		def VOP_MAC_F32 : VOP_MAC <f32>;
let HasExtDPP = 0, HasExt32BitDPP = 0 in		let HasExtDPP = 0, HasExt32BitDPP = 0 in
def VOP_MAC_LEGACY_F32 : VOP_MAC <f32>;		def VOP_MAC_LEGACY_F32 : VOP_MAC <f32>;
let HasExtSDWA = 0, HasExt32BitDPP = 0, HasExt64BitDPP = 1 in		let HasExtSDWA = 0, HasExt32BitDPP = 0, HasExt64BitDPP = 1 in
def VOP_MAC_F64 : VOP_MAC <f64>;		def VOP_MAC_F64 : VOP_MAC <f64>;

class VOP_DOT_ACC<ValueType vt0, ValueType vt1> : VOP_MAC<vt0, vt1> {		class VOP_DOT_ACC<ValueType vt0, ValueType vt1> : VOP_MAC<vt0, vt1> {
let HasClamp = 0;		let HasClamp = 0;
[325 lines not shown]	GCNPat<
), sub1		), sub1
)		)
>;		>;

def : divergent_i64_BinOp <and, V_AND_B32_e64>;		def : divergent_i64_BinOp <and, V_AND_B32_e64>;
def : divergent_i64_BinOp <or, V_OR_B32_e64>;		def : divergent_i64_BinOp <or, V_OR_B32_e64>;
def : divergent_i64_BinOp <xor, V_XOR_B32_e64>;		def : divergent_i64_BinOp <xor, V_XOR_B32_e64>;

		//===----------------------------------------------------------------------===//
		// 16-Bit Operand Instructions
		//===----------------------------------------------------------------------===//

let SubtargetPredicate = Has16BitInsts in {
let isReMaterializable = 1 in {		let isReMaterializable = 1 in {
let FPDPRounding = 1 in {		let FPDPRounding = 1 in {
def V_MADMK_F16 : VOP2_Pseudo <"v_madmk_f16", VOP_MADMK_F16, [], "">;		defm V_LDEXP_F16 : VOP2Inst_t16 <"v_ldexp_f16", VOP_F16_F16_I32, AMDGPUldexp>;
defm V_LDEXP_F16 : VOP2Inst <"v_ldexp_f16", VOP_F16_F16_I32, AMDGPUldexp>;
} // End FPDPRounding = 1		} // End FPDPRounding = 1
		// FIXME VOP3 Only instructions. NFC using VOPProfile_True16 for these until a planned change to use a new register class for VOP3 encoded True16 instructions
defm V_LSHLREV_B16 : VOP2Inst <"v_lshlrev_b16", VOP_I16_I16_I16, clshl_rev_16>;		defm V_LSHLREV_B16 : VOP2Inst_e64_t16 <"v_lshlrev_b16", VOP_I16_I16_I16, clshl_rev_16>;
defm V_LSHRREV_B16 : VOP2Inst <"v_lshrrev_b16", VOP_I16_I16_I16, clshr_rev_16>;		defm V_LSHRREV_B16 : VOP2Inst_e64_t16 <"v_lshrrev_b16", VOP_I16_I16_I16, clshr_rev_16>;
defm V_ASHRREV_I16 : VOP2Inst <"v_ashrrev_i16", VOP_I16_I16_I16, cashr_rev_16>;		defm V_ASHRREV_I16 : VOP2Inst_e64_t16 <"v_ashrrev_i16", VOP_I16_I16_I16, cashr_rev_16>;

let isCommutable = 1 in {		let isCommutable = 1 in {
let FPDPRounding = 1 in {		let FPDPRounding = 1 in {
defm V_ADD_F16 : VOP2Inst <"v_add_f16", VOP_F16_F16_F16, any_fadd>;		defm V_ADD_F16 : VOP2Inst_t16 <"v_add_f16", VOP_F16_F16_F16, any_fadd>;
defm V_SUB_F16 : VOP2Inst <"v_sub_f16", VOP_F16_F16_F16, any_fsub>;		defm V_SUB_F16 : VOP2Inst_t16 <"v_sub_f16", VOP_F16_F16_F16, any_fsub>;
defm V_SUBREV_F16 : VOP2Inst <"v_subrev_f16", VOP_F16_F16_F16, null_frag, "v_sub_f16">;		defm V_SUBREV_F16 : VOP2Inst_t16 <"v_subrev_f16", VOP_F16_F16_F16, null_frag, "v_sub_f16">;
defm V_MUL_F16 : VOP2Inst <"v_mul_f16", VOP_F16_F16_F16, any_fmul>;		defm V_MUL_F16 : VOP2Inst_t16 <"v_mul_f16", VOP_F16_F16_F16, any_fmul>;
		} // End FPDPRounding = 1
		defm V_MUL_LO_U16 : VOP2Inst_e64_t16 <"v_mul_lo_u16", VOP_I16_I16_I16, mul>;
		defm V_MAX_F16 : VOP2Inst_t16 <"v_max_f16", VOP_F16_F16_F16, fmaxnum_like>;
		defm V_MIN_F16 : VOP2Inst_t16 <"v_min_f16", VOP_F16_F16_F16, fminnum_like>;
		defm V_MAX_U16 : VOP2Inst_e64_t16 <"v_max_u16", VOP_I16_I16_I16, umax>;
		defm V_MAX_I16 : VOP2Inst_e64_t16 <"v_max_i16", VOP_I16_I16_I16, smax>;
		defm V_MIN_U16 : VOP2Inst_e64_t16 <"v_min_u16", VOP_I16_I16_I16, umin>;
		defm V_MIN_I16 : VOP2Inst_e64_t16 <"v_min_i16", VOP_I16_I16_I16, smin>;
		} // End isCommutable = 1
		} // End isReMaterializable = 1

let mayRaiseFPException = 0 in {		let SubtargetPredicate = isGFX11Plus in {
def V_MADAK_F16 : VOP2_Pseudo <"v_madak_f16", VOP_MADAK_F16, [], "">;		let isCommutable = 1 in {
		defm V_AND_B16_t16 : VOP2Inst_e64 <"v_and_b16_t16", VOPProfile_True16<VOP_I16_I16_I16>, and>;
		defm V_OR_B16_t16 : VOP2Inst_e64 <"v_or_b16_t16", VOPProfile_True16<VOP_I16_I16_I16>, or>;
		defm V_XOR_B16_t16 : VOP2Inst_e64 <"v_xor_b16_t16", VOPProfile_True16<VOP_I16_I16_I16>, xor>;
		} // End isCommutable = 1
		} // End SubtargetPredicate = isGFX11Plus

		let FPDPRounding = 1, isReMaterializable = 1 in {
		let SubtargetPredicate = isGFX10Plus, OtherPredicates = [NotHasTrue16BitInsts] in {
		def V_FMAMK_F16 : VOP2_Pseudo <"v_fmamk_f16", VOP_MADMK_F16, [], "">;
		}
		let SubtargetPredicate = HasTrue16BitInsts in {
		def V_FMAMK_F16_t16 : VOP2_Pseudo <"v_fmamk_f16_t16", VOP_MADMK_F16_t16, [], "">;
}		}

} // End FPDPRounding = 1		let isCommutable = 1 in {
		let SubtargetPredicate = isGFX10Plus, OtherPredicates = [NotHasTrue16BitInsts] in {
		def V_FMAAK_F16 : VOP2_Pseudo <"v_fmaak_f16", VOP_MADAK_F16, [], "">;
		}
		let SubtargetPredicate = HasTrue16BitInsts in {
		def V_FMAAK_F16_t16 : VOP2_Pseudo <"v_fmaak_f16_t16", VOP_MADAK_F16_t16, [], "">;
		}
		} // End isCommutable = 1
		} // End FPDPRounding = 1, isReMaterializable = 1

defm V_MUL_LO_U16 : VOP2Inst <"v_mul_lo_u16", VOP_I16_I16_I16, mul>;		let Constraints = "$vdst = $src2",
defm V_MAX_F16 : VOP2Inst <"v_max_f16", VOP_F16_F16_F16, fmaxnum_like>;		DisableEncoding="$src2",
defm V_MIN_F16 : VOP2Inst <"v_min_f16", VOP_F16_F16_F16, fminnum_like>;		isConvertibleToThreeAddress = 1,
defm V_MAX_U16 : VOP2Inst <"v_max_u16", VOP_I16_I16_I16, umax>;		isCommutable = 1 in {
defm V_MAX_I16 : VOP2Inst <"v_max_i16", VOP_I16_I16_I16, smax>;		let SubtargetPredicate = isGFX10Plus, OtherPredicates = [NotHasTrue16BitInsts] in {
defm V_MIN_U16 : VOP2Inst <"v_min_u16", VOP_I16_I16_I16, umin>;		defm V_FMAC_F16 : VOP2Inst <"v_fmac_f16", VOP_MAC_F16>;
defm V_MIN_I16 : VOP2Inst <"v_min_i16", VOP_I16_I16_I16, smin>;		}
		let SubtargetPredicate = HasTrue16BitInsts in {
		defm V_FMAC_F16_t16 : VOP2Inst <"v_fmac_f16_t16", VOP_MAC_F16_t16>;
		}
		} // End FMAC Constraints

		let SubtargetPredicate = Has16BitInsts in {
		let isReMaterializable = 1 in {
		let FPDPRounding = 1 in {
		def V_MADMK_F16 : VOP2_Pseudo <"v_madmk_f16", VOP_MADMK_F16, [], "">;
		} // End FPDPRounding = 1
		let isCommutable = 1 in {
		let mayRaiseFPException = 0 in {
		def V_MADAK_F16 : VOP2_Pseudo <"v_madak_f16", VOP_MADAK_F16, [], "">;
		}
let SubtargetPredicate = isGFX8GFX9 in {		let SubtargetPredicate = isGFX8GFX9 in {
defm V_ADD_U16 : VOP2Inst <"v_add_u16", VOP_I16_I16_I16_ARITH, add>;		defm V_ADD_U16 : VOP2Inst <"v_add_u16", VOP_I16_I16_I16_ARITH, add>;
defm V_SUB_U16 : VOP2Inst <"v_sub_u16" , VOP_I16_I16_I16_ARITH, sub>;		defm V_SUB_U16 : VOP2Inst <"v_sub_u16" , VOP_I16_I16_I16_ARITH, sub>;
defm V_SUBREV_U16 : VOP2Inst <"v_subrev_u16", VOP_I16_I16_I16_ARITH, null_frag, "v_sub_u16">;		defm V_SUBREV_U16 : VOP2Inst <"v_subrev_u16", VOP_I16_I16_I16_ARITH, null_frag, "v_sub_u16">;
}		}
} // End isCommutable = 1		} // End isCommutable = 1
} // End isReMaterializable = 1		} // End isReMaterializable = 1

// FIXME: Missing FPDPRounding		// FIXME: Missing FPDPRounding
let Constraints = "$vdst = $src2", DisableEncoding="$src2",		let Constraints = "$vdst = $src2", DisableEncoding="$src2",
isConvertibleToThreeAddress = 1, isCommutable = 1 in {		isConvertibleToThreeAddress = 1, isCommutable = 1 in {
defm V_MAC_F16 : VOP2Inst <"v_mac_f16", VOP_MAC_F16>;		defm V_MAC_F16 : VOP2Inst <"v_mac_f16", VOP_MAC_F16>;
}		}
} // End SubtargetPredicate = Has16BitInsts		} // End SubtargetPredicate = Has16BitInsts


let SubtargetPredicate = HasDLInsts in {		let SubtargetPredicate = HasDLInsts in {

let isReMaterializable = 1 in		let isReMaterializable = 1 in
defm V_XNOR_B32 : VOP2Inst <"v_xnor_b32", VOP_I32_I32_I32, xnor>;		defm V_XNOR_B32 : VOP2Inst <"v_xnor_b32", VOP_I32_I32_I32, xnor>;

def : GCNPat<		def : GCNPat<
(i32 (DivergentUnaryFrag<not> (xor_oneuse i32:$src0, i32:$src1))),		(i32 (DivergentUnaryFrag<not> (xor_oneuse i32:$src0, i32:$src1))),
(i32 (V_XNOR_B32_e64 $src0, $src1))		(i32 (V_XNOR_B32_e64 $src0, $src1))
[24 lines not shown]	(REG_SEQUENCE VReg_64, (i32 (V_XNOR_B32_e64
(i32 (EXTRACT_SUBREG $src1, sub1)))), sub1)		(i32 (EXTRACT_SUBREG $src1, sub1)))), sub1)
>;		>;

let Constraints = "$vdst = $src2",		let Constraints = "$vdst = $src2",
DisableEncoding = "$src2",		DisableEncoding = "$src2",
isConvertibleToThreeAddress = 1,		isConvertibleToThreeAddress = 1,
isCommutable = 1 in		isCommutable = 1 in
defm V_FMAC_F32 : VOP2Inst_VOPD <"v_fmac_f32", VOP_MAC_F32, 0x0, "v_fmac_f32">;		defm V_FMAC_F32 : VOP2Inst_VOPD <"v_fmac_f32", VOP_MAC_F32, 0x0, "v_fmac_f32">;

} // End SubtargetPredicate = HasDLInsts		} // End SubtargetPredicate = HasDLInsts

let SubtargetPredicate = HasFmaLegacy32 in {		let SubtargetPredicate = HasFmaLegacy32 in {

let Constraints = "$vdst = $src2",		let Constraints = "$vdst = $src2",
DisableEncoding = "$src2",		DisableEncoding = "$src2",
isConvertibleToThreeAddress = 1,		isConvertibleToThreeAddress = 1,
isCommutable = 1 in		isCommutable = 1 in
[54 lines not shown]

let SubtargetPredicate = HasFmaakFmamkF32Insts, isReMaterializable = 1 in {		let SubtargetPredicate = HasFmaakFmamkF32Insts, isReMaterializable = 1 in {
def V_FMAMK_F32 : VOP2_Pseudo<"v_fmamk_f32", VOP_MADMK_F32, [], "">, VOPD_Component<0x2, "v_fmamk_f32">;		def V_FMAMK_F32 : VOP2_Pseudo<"v_fmamk_f32", VOP_MADMK_F32, [], "">, VOPD_Component<0x2, "v_fmamk_f32">;

let isCommutable = 1 in		let isCommutable = 1 in
def V_FMAAK_F32 : VOP2_Pseudo<"v_fmaak_f32", VOP_MADAK_F32, [], "">, VOPD_Component<0x1, "v_fmaak_f32">;		def V_FMAAK_F32 : VOP2_Pseudo<"v_fmaak_f32", VOP_MADAK_F32, [], "">, VOPD_Component<0x1, "v_fmaak_f32">;
}		}

let SubtargetPredicate = isGFX10Plus in {

let FPDPRounding = 1, isReMaterializable = 1 in {
def V_FMAMK_F16 : VOP2_Pseudo <"v_fmamk_f16", VOP_MADMK_F16, [], "">;

let isCommutable = 1 in
def V_FMAAK_F16 : VOP2_Pseudo <"v_fmaak_f16", VOP_MADAK_F16, [], "">;
} // End FPDPRounding = 1, isReMaterializable = 1

let Constraints = "$vdst = $src2",
DisableEncoding="$src2",
isConvertibleToThreeAddress = 1,
isCommutable = 1 in {
defm V_FMAC_F16 : VOP2Inst <"v_fmac_f16", VOP_MAC_F16>;
}

} // End SubtargetPredicate = isGFX10Plus

let SubtargetPredicate = HasPkFmacF16Inst in {		let SubtargetPredicate = HasPkFmacF16Inst in {
defm V_PK_FMAC_F16 : VOP2Inst<"v_pk_fmac_f16", VOP_V2F16_V2F16_V2F16>;		defm V_PK_FMAC_F16 : VOP2Inst<"v_pk_fmac_f16", VOP_V2F16_V2F16_V2F16>;
} // End SubtargetPredicate = HasPkFmacF16Inst		} // End SubtargetPredicate = HasPkFmacF16Inst

// Note: 16-bit instructions produce a 0 result in the high 16-bits		// Note: 16-bit instructions produce a 0 result in the high 16-bits
// on GFX8 and GFX9 and preserve high 16 bits on GFX10+		// on GFX8 and GFX9 and preserve high 16 bits on GFX10+
multiclass Arithmetic_i16_0Hi_Pats <SDPatternOperator op, Instruction inst> {		multiclass Arithmetic_i16_0Hi_Pats <SDPatternOperator op, Instruction inst> {

[89 lines not shown]
}		}
}		}

let SubtargetPredicate = Has16BitInsts, OtherPredicates = [HasIntClamp] in {		let SubtargetPredicate = Has16BitInsts, OtherPredicates = [HasIntClamp] in {
def : VOPBinOpClampPat<uaddsat, V_ADD_U16_e64, i16>;		def : VOPBinOpClampPat<uaddsat, V_ADD_U16_e64, i16>;
def : VOPBinOpClampPat<usubsat, V_SUB_U16_e64, i16>;		def : VOPBinOpClampPat<usubsat, V_SUB_U16_e64, i16>;
}		}

let SubtargetPredicate = isGFX11Plus in {
let isCommutable = 1 in {
defm V_AND_B16 : VOP2Inst <"v_and_b16", VOP_I16_I16_I16, and>;
defm V_OR_B16 : VOP2Inst <"v_or_b16", VOP_I16_I16_I16, or>;
defm V_XOR_B16 : VOP2Inst <"v_xor_b16", VOP_I16_I16_I16, xor>;
} // End isCommutable = 1
} // End SubtargetPredicate = isGFX11Plus

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// DPP Encodings		// DPP Encodings
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class VOP2_DPP<bits<6> op, VOP2_DPP_Pseudo ps,		class VOP2_DPP<bits<6> op, VOP2_DPP_Pseudo ps,
string opName = ps.OpName, VOPProfile p = ps.Pfl,		string opName = ps.OpName, VOPProfile p = ps.Pfl,
bit IsDPP16 = 0> :		bit IsDPP16 = 0> :
VOP_DPP<opName, p, IsDPP16> {		VOP_DPP<opName, p, IsDPP16> {
[50 lines not shown]

let AssemblerPredicate = isGFX11Only, DecoderNamespace = "GFX11" in {		let AssemblerPredicate = isGFX11Only, DecoderNamespace = "GFX11" in {
//===------------------------------- VOP2 -------------------------------===//		//===------------------------------- VOP2 -------------------------------===//
multiclass VOP2Only_Real_MADK_gfx11<bits<6> op> {		multiclass VOP2Only_Real_MADK_gfx11<bits<6> op> {
def _gfx11 :		def _gfx11 :
VOP2_Real<!cast<VOP2_Pseudo>(NAME), SIEncodingFamily.GFX11>,		VOP2_Real<!cast<VOP2_Pseudo>(NAME), SIEncodingFamily.GFX11>,
VOP2_MADKe<op{5-0}, !cast<VOP2_Pseudo>(NAME).Pfl>;		VOP2_MADKe<op{5-0}, !cast<VOP2_Pseudo>(NAME).Pfl>;
}		}
		multiclass VOP2Only_Real_MADK_gfx11_with_name<bits<6> op, string asmName,
		string opName = NAME> {
		def _gfx11 :
		VOP2_Real<!cast<VOP2_Pseudo>(opName), SIEncodingFamily.GFX11>,
		VOP2_MADKe<op{5-0}, !cast<VOP2_Pseudo>(opName).Pfl> {
		VOP2_Pseudo ps = !cast<VOP2_Pseudo>(opName);
		let AsmString = asmName # ps.AsmOperands;
		}
		}
multiclass VOP2_Real_e32_gfx11<bits<6> op> {		multiclass VOP2_Real_e32_gfx11<bits<6> op> {
def _e32_gfx11 :		def _e32_gfx11 :
VOP2_Real<!cast<VOP2_Pseudo>(NAME#"_e32"), SIEncodingFamily.GFX11>,		VOP2_Real<!cast<VOP2_Pseudo>(NAME#"_e32"), SIEncodingFamily.GFX11>,
VOP2e<op{5-0}, !cast<VOP2_Pseudo>(NAME#"_e32").Pfl>;		VOP2e<op{5-0}, !cast<VOP2_Pseudo>(NAME#"_e32").Pfl>;
}		}
multiclass VOP2Only_Real_e32_gfx11<bits<6> op> {		multiclass VOP2Only_Real_e32_gfx11<bits<6> op> {
let IsSingle = 1 in		let IsSingle = 1 in
defm NAME: VOP2_Real_e32_gfx11<op>;		defm NAME: VOP2_Real_e32_gfx11<op>;
[17 lines not shown; context: let AssemblerPredicate = isGFX11Only, DecoderNamespace = "GFX11" in {]
}		}

//===------------------------- VOP2 (with name) -------------------------===//		//===------------------------- VOP2 (with name) -------------------------===//
multiclass VOP2_Real_e32_with_name_gfx11<bits<6> op, string opName,		multiclass VOP2_Real_e32_with_name_gfx11<bits<6> op, string opName,
string asmName, bit single = 0> {		string asmName, bit single = 0> {
defvar ps = !cast<VOP2_Pseudo>(opName#"_e32");		defvar ps = !cast<VOP2_Pseudo>(opName#"_e32");
def _e32_gfx11 :		def _e32_gfx11 :
VOP2_Real<ps, SIEncodingFamily.GFX11, asmName>,		VOP2_Real<ps, SIEncodingFamily.GFX11, asmName>,
VOP2e<op{5-0}, ps.Pfl>,		VOP2e<op{5-0}, ps.Pfl> {
MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]> {
let AsmString = asmName # ps.AsmOperands;		let AsmString = asmName # ps.AsmOperands;
let IsSingle = single;		let IsSingle = single;
}		}
}		}
multiclass VOP2_Real_e64_with_name_gfx11<bits<6> op, string opName,		multiclass VOP2_Real_e64_with_name_gfx11<bits<6> op, string opName,
string asmName> {		string asmName> {
defvar ps = !cast<VOP3_Pseudo>(opName#"_e64");		defvar ps = !cast<VOP3_Pseudo>(opName#"_e64");
def _e64_gfx11 :		def _e64_gfx11 :
VOP3_Real<ps, SIEncodingFamily.GFX11>,		VOP3_Real<ps, SIEncodingFamily.GFX11>,
VOP3e_gfx11<{0, 1, 0, 0, op{5-0}}, ps.Pfl>,		VOP3e_gfx11<{0, 1, 0, 0, op{5-0}}, ps.Pfl> {
MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]> {
let AsmString = asmName # ps.AsmOperands;		let AsmString = asmName # ps.AsmOperands;
}		}
}		}

multiclass VOP2_Real_dpp_with_name_gfx11<bits<6> op, string opName,		multiclass VOP2_Real_dpp_with_name_gfx11<bits<6> op, string opName,
string asmName> {		string asmName> {
defvar ps = !cast<VOP2_Pseudo>(opName#"_e32");		defvar ps = !cast<VOP2_Pseudo>(opName#"_e32");
foreach _ = BoolToList<ps.Pfl.HasExtDPP>.ret in		foreach _ = BoolToList<ps.Pfl.HasExtDPP>.ret in
[104 lines not shown]

multiclass VOP2_Real_NO_VOP3_gfx11<bits<6> op> :		multiclass VOP2_Real_NO_VOP3_gfx11<bits<6> op> :
VOP2_Real_e32_gfx11<op>, VOP2_Real_dpp_gfx11<op>, VOP2_Real_dpp8_gfx11<op>;		VOP2_Real_e32_gfx11<op>, VOP2_Real_dpp_gfx11<op>, VOP2_Real_dpp8_gfx11<op>;

multiclass VOP2_Real_FULL_gfx11<bits<6> op> :		multiclass VOP2_Real_FULL_gfx11<bits<6> op> :
VOP2_Realtriple_e64_gfx11<op>, VOP2_Real_NO_VOP3_gfx11<op>;		VOP2_Realtriple_e64_gfx11<op>, VOP2_Real_NO_VOP3_gfx11<op>;

multiclass VOP2_Real_NO_VOP3_with_name_gfx11<bits<6> op, string opName,		multiclass VOP2_Real_NO_VOP3_with_name_gfx11<bits<6> op, string opName,
string asmName, bit isSingle = 0> :		string asmName, bit isSingle = 0> {
VOP2_Real_e32_with_name_gfx11<op, opName, asmName, isSingle>,
		defm NAME : VOP2_Real_e32_with_name_gfx11<op, opName, asmName, isSingle>,
VOP2_Real_dpp_with_name_gfx11<op, opName, asmName>,		VOP2_Real_dpp_with_name_gfx11<op, opName, asmName>,
VOP2_Real_dpp8_with_name_gfx11<op, opName, asmName>;		VOP2_Real_dpp8_with_name_gfx11<op, opName, asmName>;
		defvar ps = !cast<VOP2_Pseudo>(opName#"_e32");
		def _gfx11_alias : MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>;
		}

multiclass VOP2_Real_FULL_with_name_gfx11<bits<6> op, string opName,		multiclass VOP2_Real_FULL_with_name_gfx11<bits<6> op, string opName,
string asmName> :		string asmName> :
VOP2_Realtriple_e64_with_name_gfx11<op, opName, asmName>,		VOP2_Realtriple_e64_with_name_gfx11<op, opName, asmName>,
VOP2_Real_NO_VOP3_with_name_gfx11<op, opName, asmName>;		VOP2_Real_NO_VOP3_with_name_gfx11<op, opName, asmName>;

		multiclass VOP2_Real_FULL_t16_gfx11<bits<6> op, string asmName, string opName = NAME>
		: VOP2_Real_FULL_with_name_gfx11<op, opName, asmName>;

multiclass VOP2_Real_NO_DPP_gfx11<bits<6> op> :		multiclass VOP2_Real_NO_DPP_gfx11<bits<6> op> :
VOP2_Real_e32_gfx11<op>, VOP2_Real_e64_gfx11<op>;		VOP2_Real_e32_gfx11<op>, VOP2_Real_e64_gfx11<op>;

multiclass VOP2_Real_NO_DPP_with_name_gfx11<bits<6> op, string opName,		multiclass VOP2_Real_NO_DPP_with_name_gfx11<bits<6> op, string opName,
string asmName> :		string asmName> {
VOP2_Real_e32_with_name_gfx11<op, opName, asmName>,		defm NAME : VOP2_Real_e32_with_name_gfx11<op, opName, asmName>,
VOP2_Real_e64_with_name_gfx11<op, opName, asmName>;		VOP2_Real_e64_with_name_gfx11<op, opName, asmName>;
		defvar ps = !cast<VOP2_Pseudo>(opName#"_e32");
		def _gfx11_alias : MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>;
		}

defm V_CNDMASK_B32 : VOP2e_Real_gfx11<0x001, "V_CNDMASK_B32",		defm V_CNDMASK_B32 : VOP2e_Real_gfx11<0x001, "V_CNDMASK_B32",
"v_cndmask_b32">;		"v_cndmask_b32">;
defm V_DOT2ACC_F32_F16 : VOP2_Real_NO_VOP3_with_name_gfx11<0x002,		defm V_DOT2ACC_F32_F16 : VOP2_Real_NO_VOP3_with_name_gfx11<0x002,
"V_DOT2C_F32_F16", "v_dot2acc_f32_f16", 1>;		"V_DOT2C_F32_F16", "v_dot2acc_f32_f16", 1>;
defm V_FMAC_DX9_ZERO_F32 : VOP2_Real_NO_DPP_with_name_gfx11<0x006,		defm V_FMAC_DX9_ZERO_F32 : VOP2_Real_NO_DPP_with_name_gfx11<0x006,
"V_FMAC_LEGACY_F32", "v_fmac_dx9_zero_f32">;		"V_FMAC_LEGACY_F32", "v_fmac_dx9_zero_f32">;
defm V_MUL_DX9_ZERO_F32 : VOP2_Real_FULL_with_name_gfx11<0x007,		defm V_MUL_DX9_ZERO_F32 : VOP2_Real_FULL_with_name_gfx11<0x007,
"V_MUL_LEGACY_F32", "v_mul_dx9_zero_f32">;		"V_MUL_LEGACY_F32", "v_mul_dx9_zero_f32">;
defm V_LSHLREV_B32 : VOP2_Real_FULL_gfx11<0x018>;		defm V_LSHLREV_B32 : VOP2_Real_FULL_gfx11<0x018>;
defm V_LSHRREV_B32 : VOP2_Real_FULL_gfx11<0x019>;		defm V_LSHRREV_B32 : VOP2_Real_FULL_gfx11<0x019>;
defm V_ASHRREV_I32 : VOP2_Real_FULL_gfx11<0x01a>;		defm V_ASHRREV_I32 : VOP2_Real_FULL_gfx11<0x01a>;
defm V_ADD_CO_CI_U32 :		defm V_ADD_CO_CI_U32 :
VOP2be_Real_gfx11<0x020, "V_ADDC_U32", "v_add_co_ci_u32">;		VOP2be_Real_gfx11<0x020, "V_ADDC_U32", "v_add_co_ci_u32">;
defm V_SUB_CO_CI_U32 :		defm V_SUB_CO_CI_U32 :
VOP2be_Real_gfx11<0x021, "V_SUBB_U32", "v_sub_co_ci_u32">;		VOP2be_Real_gfx11<0x021, "V_SUBB_U32", "v_sub_co_ci_u32">;
defm V_SUBREV_CO_CI_U32 :		defm V_SUBREV_CO_CI_U32 :
VOP2be_Real_gfx11<0x022, "V_SUBBREV_U32", "v_subrev_co_ci_u32">;		VOP2be_Real_gfx11<0x022, "V_SUBBREV_U32", "v_subrev_co_ci_u32">;

defm V_CVT_PK_RTZ_F16_F32 : VOP2_Real_FULL_with_name_gfx11<0x02f,		defm V_CVT_PK_RTZ_F16_F32 : VOP2_Real_FULL_with_name_gfx11<0x02f,
"V_CVT_PKRTZ_F16_F32", "v_cvt_pk_rtz_f16_f32">;		"V_CVT_PKRTZ_F16_F32", "v_cvt_pk_rtz_f16_f32">;
defm V_PK_FMAC_F16 : VOP2Only_Real_gfx11<0x03c>;		defm V_PK_FMAC_F16 : VOP2Only_Real_gfx11<0x03c>;

		defm V_ADD_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x032, "v_add_f16">;
		defm V_SUB_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x033, "v_sub_f16">;
		defm V_SUBREV_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x034, "v_subrev_f16">;
		defm V_MUL_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x035, "v_mul_f16">;
		defm V_FMAC_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x036, "v_fmac_f16">;
		defm V_LDEXP_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x03b, "v_ldexp_f16">;
		defm V_MAX_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x039, "v_max_f16">;
		defm V_MIN_F16_t16 : VOP2_Real_FULL_t16_gfx11<0x03a, "v_min_f16">;
		defm V_FMAMK_F16_t16 : VOP2Only_Real_MADK_gfx11_with_name<0x037, "v_fmamk_f16">;
		defm V_FMAAK_F16_t16 : VOP2Only_Real_MADK_gfx11_with_name<0x038, "v_fmaak_f16">;

// VOP3 only.		// VOP3 only.
defm V_CNDMASK_B16 : VOP3Only_Realtriple_gfx11<0x25d>;		defm V_CNDMASK_B16 : VOP3Only_Realtriple_gfx11<0x25d>;
defm V_LDEXP_F32 : VOP3Only_Realtriple_gfx11<0x31c>;		defm V_LDEXP_F32 : VOP3Only_Realtriple_gfx11<0x31c>;
defm V_BFM_B32 : VOP3Only_Realtriple_gfx11<0x31d>;		defm V_BFM_B32 : VOP3Only_Realtriple_gfx11<0x31d>;
defm V_BCNT_U32_B32 : VOP3Only_Realtriple_gfx11<0x31e>;		defm V_BCNT_U32_B32 : VOP3Only_Realtriple_gfx11<0x31e>;
defm V_MBCNT_LO_U32_B32 : VOP3Only_Realtriple_gfx11<0x31f>;		defm V_MBCNT_LO_U32_B32 : VOP3Only_Realtriple_gfx11<0x31f>;
defm V_MBCNT_HI_U32_B32 : VOP3Only_Realtriple_gfx11<0x320>;		defm V_MBCNT_HI_U32_B32 : VOP3Only_Realtriple_gfx11<0x320>;
defm V_CVT_PKNORM_I16_F32 : VOP3Only_Realtriple_gfx11<0x321>;		defm V_CVT_PKNORM_I16_F32 : VOP3Only_Realtriple_gfx11<0x321>;
[274 lines not shown]
// NB: Same opcode as v_mac_legacy_f32		// NB: Same opcode as v_mac_legacy_f32
let DecoderNamespace = "GFX10_B" in		let DecoderNamespace = "GFX10_B" in
defm V_FMAC_LEGACY_F32 : VOP2_Real_gfx10<0x006>;		defm V_FMAC_LEGACY_F32 : VOP2_Real_gfx10<0x006>;

defm V_XNOR_B32 : VOP2_Real_gfx10_gfx11<0x01e>;		defm V_XNOR_B32 : VOP2_Real_gfx10_gfx11<0x01e>;
defm V_FMAC_F32 : VOP2_Real_gfx10_gfx11<0x02b>;		defm V_FMAC_F32 : VOP2_Real_gfx10_gfx11<0x02b>;
defm V_FMAMK_F32 : VOP2Only_Real_MADK_gfx10_gfx11<0x02c>;		defm V_FMAMK_F32 : VOP2Only_Real_MADK_gfx10_gfx11<0x02c>;
defm V_FMAAK_F32 : VOP2Only_Real_MADK_gfx10_gfx11<0x02d>;		defm V_FMAAK_F32 : VOP2Only_Real_MADK_gfx10_gfx11<0x02d>;
defm V_ADD_F16 : VOP2_Real_gfx10_gfx11<0x032>;		defm V_ADD_F16 : VOP2_Real_gfx10<0x032>;
defm V_SUB_F16 : VOP2_Real_gfx10_gfx11<0x033>;		defm V_SUB_F16 : VOP2_Real_gfx10<0x033>;
defm V_SUBREV_F16 : VOP2_Real_gfx10_gfx11<0x034>;		defm V_SUBREV_F16 : VOP2_Real_gfx10<0x034>;
defm V_MUL_F16 : VOP2_Real_gfx10_gfx11<0x035>;		defm V_MUL_F16 : VOP2_Real_gfx10<0x035>;
defm V_FMAC_F16 : VOP2_Real_gfx10_gfx11<0x036>;		defm V_FMAC_F16 : VOP2_Real_gfx10<0x036>;
defm V_FMAMK_F16 : VOP2Only_Real_MADK_gfx10_gfx11<0x037>;		defm V_FMAMK_F16 : VOP2Only_Real_MADK_gfx10<0x037>;
defm V_FMAAK_F16 : VOP2Only_Real_MADK_gfx10_gfx11<0x038>;		defm V_FMAAK_F16 : VOP2Only_Real_MADK_gfx10<0x038>;
defm V_MAX_F16 : VOP2_Real_gfx10_gfx11<0x039>;		defm V_MAX_F16 : VOP2_Real_gfx10<0x039>;
defm V_MIN_F16 : VOP2_Real_gfx10_gfx11<0x03a>;		defm V_MIN_F16 : VOP2_Real_gfx10<0x03a>;
defm V_LDEXP_F16 : VOP2_Real_gfx10_gfx11<0x03b>;		defm V_LDEXP_F16 : VOP2_Real_gfx10<0x03b>;

let IsSingle = 1 in {		let IsSingle = 1 in {
defm V_PK_FMAC_F16 : VOP2_Real_e32_gfx10<0x03c>;		defm V_PK_FMAC_F16 : VOP2_Real_e32_gfx10<0x03c>;
}		}

// VOP2 no carry-in, carry-out.		// VOP2 no carry-in, carry-out.
defm V_ADD_NC_U32 :		defm V_ADD_NC_U32 :
VOP2_Real_with_name_gfx10_gfx11<0x025, "V_ADD_U32", "v_add_nc_u32">;		VOP2_Real_with_name_gfx10_gfx11<0x025, "V_ADD_U32", "v_add_nc_u32">;
[545 lines not shown]

llvm/lib/Target/AMDGPU/VOP3Instructions.td

	[First 761 lines not shown]

	class VOP3_DOT_Profile<VOPProfile P, VOP3Features Features = VOP3_REGULAR> : VOP3_Profile<P, Features> {			class VOP3_DOT_Profile<VOPProfile P, VOP3Features Features = VOP3_REGULAR> : VOP3_Profile<P, Features> {
	let HasClamp = 0;			let HasClamp = 0;
	let HasOMod = 0;			let HasOMod = 0;
	// Override modifiers for bf16(i16) (same as float modifiers).			// Override modifiers for bf16(i16) (same as float modifiers).
	let HasSrc0Mods = 1;			let HasSrc0Mods = 1;
	let HasSrc1Mods = 1;			let HasSrc1Mods = 1;
	let HasSrc2Mods = 1;			let HasSrc2Mods = 1;
	let Src0ModDPP = FPVRegInputMods;			let Src0ModVOP3DPP = FPVRegInputMods;
	let Src1ModDPP = FPVRegInputMods;			let Src1ModVOP3DPP = FPVRegInputMods;
	let Src2ModVOP3DPP = FP16InputMods;			let Src2ModVOP3DPP = FP16InputMods;
	let InsVOP3OpSel = getInsVOP3OpSel<Src0RC64, Src1RC64, Src2RC64, NumSrcArgs,			let InsVOP3OpSel = getInsVOP3OpSel<Src0RC64, Src1RC64, Src2RC64, NumSrcArgs,
	HasClamp, HasOMod, FP16InputMods,			HasClamp, HasOMod, FP16InputMods,
	FP16InputMods, FP16InputMods>.ret;			FP16InputMods, FP16InputMods>.ret;
	let AsmVOP3OpSel = getAsmVOP3OpSel<NumSrcArgs, HasClamp, 1, 1, 1>.ret;			let AsmVOP3OpSel = getAsmVOP3OpSel<NumSrcArgs, HasClamp, 1, 1, 1>.ret;
	}			}

	let SubtargetPredicate = isGFX11Plus in {			let SubtargetPredicate = isGFX11Plus in {
	[137 lines not shown]
	defm V_DOT2_F16_F16 : VOP3Dot_Realtriple_gfx11<0x266>;			defm V_DOT2_F16_F16 : VOP3Dot_Realtriple_gfx11<0x266>;
	defm V_DOT2_BF16_BF16 : VOP3Dot_Realtriple_gfx11<0x267>;			defm V_DOT2_BF16_BF16 : VOP3Dot_Realtriple_gfx11<0x267>;
	defm V_DIV_SCALE_F32 : VOP3be_Real_gfx11<0x2fc, "V_DIV_SCALE_F32", "v_div_scale_f32">;			defm V_DIV_SCALE_F32 : VOP3be_Real_gfx11<0x2fc, "V_DIV_SCALE_F32", "v_div_scale_f32">;
	defm V_DIV_SCALE_F64 : VOP3be_Real_gfx11<0x2fd, "V_DIV_SCALE_F64", "v_div_scale_f64">;			defm V_DIV_SCALE_F64 : VOP3be_Real_gfx11<0x2fd, "V_DIV_SCALE_F64", "v_div_scale_f64">;
	defm V_MAD_U64_U32_gfx11 : VOP3be_Real_gfx11<0x2fe, "V_MAD_U64_U32_gfx11", "v_mad_u64_u32">;			defm V_MAD_U64_U32_gfx11 : VOP3be_Real_gfx11<0x2fe, "V_MAD_U64_U32_gfx11", "v_mad_u64_u32">;
	defm V_MAD_I64_I32_gfx11 : VOP3be_Real_gfx11<0x2ff, "V_MAD_I64_I32_gfx11", "v_mad_i64_i32">;			defm V_MAD_I64_I32_gfx11 : VOP3be_Real_gfx11<0x2ff, "V_MAD_I64_I32_gfx11", "v_mad_i64_i32">;
	defm V_ADD_NC_U16 : VOP3Only_Realtriple_gfx11<0x303>;			defm V_ADD_NC_U16 : VOP3Only_Realtriple_gfx11<0x303>;
	defm V_SUB_NC_U16 : VOP3Only_Realtriple_gfx11<0x304>;			defm V_SUB_NC_U16 : VOP3Only_Realtriple_gfx11<0x304>;
	defm V_MUL_LO_U16 : VOP3Only_Realtriple_gfx11<0x305>;			defm V_MUL_LO_U16_t16 : VOP3Only_Realtriple_t16_gfx11<0x305, "v_mul_lo_u16">;
	defm V_CVT_PK_I16_F32 : VOP3_Realtriple_gfx11<0x306>;			defm V_CVT_PK_I16_F32 : VOP3_Realtriple_gfx11<0x306>;
	defm V_CVT_PK_U16_F32 : VOP3_Realtriple_gfx11<0x307>;			defm V_CVT_PK_U16_F32 : VOP3_Realtriple_gfx11<0x307>;
	defm V_MAX_U16 : VOP3Only_Realtriple_gfx11<0x309>;			defm V_MAX_U16_t16 : VOP3Only_Realtriple_t16_gfx11<0x309, "v_max_u16">;
	defm V_MAX_I16 : VOP3Only_Realtriple_gfx11<0x30a>;			defm V_MAX_I16_t16 : VOP3Only_Realtriple_t16_gfx11<0x30a, "v_max_i16">;
	defm V_MIN_U16 : VOP3Only_Realtriple_gfx11<0x30b>;			defm V_MIN_U16_t16 : VOP3Only_Realtriple_t16_gfx11<0x30b, "v_min_u16">;
	defm V_MIN_I16 : VOP3Only_Realtriple_gfx11<0x30c>;			defm V_MIN_I16_t16 : VOP3Only_Realtriple_t16_gfx11<0x30c, "v_min_i16">;
				arsenm (Done): Since the _T16 doesn't appear in the instruction mnemonic, it should be lowercased.
				Joe_Nash (Author, Done): Replaced _T16 with _t16 everywhere.
	defm V_ADD_NC_I16 : VOP3_Realtriple_with_name_gfx11<0x30d, "V_ADD_I16", "v_add_nc_i16">;			defm V_ADD_NC_I16 : VOP3_Realtriple_with_name_gfx11<0x30d, "V_ADD_I16", "v_add_nc_i16">;
	defm V_SUB_NC_I16 : VOP3_Realtriple_with_name_gfx11<0x30e, "V_SUB_I16", "v_sub_nc_i16">;			defm V_SUB_NC_I16 : VOP3_Realtriple_with_name_gfx11<0x30e, "V_SUB_I16", "v_sub_nc_i16">;
	defm V_PACK_B32_F16 : VOP3_Realtriple_gfx11<0x311>;			defm V_PACK_B32_F16 : VOP3_Realtriple_gfx11<0x311>;
	defm V_CVT_PK_NORM_I16_F16 : VOP3_Realtriple_with_name_gfx11<0x312, "V_CVT_PKNORM_I16_F16" , "v_cvt_pk_norm_i16_f16" >;			defm V_CVT_PK_NORM_I16_F16 : VOP3_Realtriple_with_name_gfx11<0x312, "V_CVT_PKNORM_I16_F16" , "v_cvt_pk_norm_i16_f16" >;
	defm V_CVT_PK_NORM_U16_F16 : VOP3_Realtriple_with_name_gfx11<0x313, "V_CVT_PKNORM_U16_F16" , "v_cvt_pk_norm_u16_f16" >;			defm V_CVT_PK_NORM_U16_F16 : VOP3_Realtriple_with_name_gfx11<0x313, "V_CVT_PKNORM_U16_F16" , "v_cvt_pk_norm_u16_f16" >;
	defm V_SUB_NC_I32 : VOP3_Realtriple_with_name_gfx11<0x325, "V_SUB_I32", "v_sub_nc_i32">;			defm V_SUB_NC_I32 : VOP3_Realtriple_with_name_gfx11<0x325, "V_SUB_I32", "v_sub_nc_i32">;
	defm V_ADD_NC_I32 : VOP3_Realtriple_with_name_gfx11<0x326, "V_ADD_I32", "v_add_nc_i32">;			defm V_ADD_NC_I32 : VOP3_Realtriple_with_name_gfx11<0x326, "V_ADD_I32", "v_add_nc_i32">;
	defm V_ADD_F64 : VOP3_Real_Base_gfx11<0x327>;			defm V_ADD_F64 : VOP3_Real_Base_gfx11<0x327>;
	defm V_MUL_F64 : VOP3_Real_Base_gfx11<0x328>;			defm V_MUL_F64 : VOP3_Real_Base_gfx11<0x328>;
	defm V_MIN_F64 : VOP3_Real_Base_gfx11<0x329>;			defm V_MIN_F64 : VOP3_Real_Base_gfx11<0x329>;
	defm V_MAX_F64 : VOP3_Real_Base_gfx11<0x32a>;			defm V_MAX_F64 : VOP3_Real_Base_gfx11<0x32a>;
	defm V_LDEXP_F64 : VOP3_Real_Base_gfx11<0x32b>;			defm V_LDEXP_F64 : VOP3_Real_Base_gfx11<0x32b>;
	defm V_MUL_LO_U32 : VOP3_Real_Base_gfx11<0x32c>;			defm V_MUL_LO_U32 : VOP3_Real_Base_gfx11<0x32c>;
	defm V_MUL_HI_U32 : VOP3_Real_Base_gfx11<0x32d>;			defm V_MUL_HI_U32 : VOP3_Real_Base_gfx11<0x32d>;
	defm V_MUL_HI_I32 : VOP3_Real_Base_gfx11<0x32e>;			defm V_MUL_HI_I32 : VOP3_Real_Base_gfx11<0x32e>;
	defm V_TRIG_PREOP_F64 : VOP3_Real_Base_gfx11<0x32f>;			defm V_TRIG_PREOP_F64 : VOP3_Real_Base_gfx11<0x32f>;
	defm V_LSHLREV_B16 : VOP3Only_Realtriple_gfx11<0x338>;			defm V_LSHLREV_B16_t16 : VOP3Only_Realtriple_t16_gfx11<0x338, "v_lshlrev_b16">;
	defm V_LSHRREV_B16 : VOP3Only_Realtriple_gfx11<0x339>;			defm V_LSHRREV_B16_t16 : VOP3Only_Realtriple_t16_gfx11<0x339, "v_lshrrev_b16">;
	defm V_ASHRREV_I16 : VOP3Only_Realtriple_gfx11<0x33a>;			defm V_ASHRREV_I16_t16 : VOP3Only_Realtriple_t16_gfx11<0x33a, "v_ashrrev_i16">;
	defm V_LSHLREV_B64 : VOP3_Real_Base_gfx11<0x33c>;			defm V_LSHLREV_B64 : VOP3_Real_Base_gfx11<0x33c>;
	defm V_LSHRREV_B64 : VOP3_Real_Base_gfx11<0x33d>;			defm V_LSHRREV_B64 : VOP3_Real_Base_gfx11<0x33d>;
	defm V_ASHRREV_I64 : VOP3_Real_Base_gfx11<0x33e>;			defm V_ASHRREV_I64 : VOP3_Real_Base_gfx11<0x33e>;
	defm V_READLANE_B32 : VOP3_Real_No_Suffix_gfx11<0x360>; // Pseudo in VOP2			defm V_READLANE_B32 : VOP3_Real_No_Suffix_gfx11<0x360>; // Pseudo in VOP2
	let InOperandList = (ins SSrcOrLds_b32:$src0, SCSrc_b32:$src1, VGPR_32:$vdst_in) in {			let InOperandList = (ins SSrcOrLds_b32:$src0, SCSrc_b32:$src1, VGPR_32:$vdst_in) in {
	defm V_WRITELANE_B32 : VOP3_Real_No_Suffix_gfx11<0x361>; // Pseudo in VOP2			defm V_WRITELANE_B32 : VOP3_Real_No_Suffix_gfx11<0x361>; // Pseudo in VOP2
	} // End InOperandList = (ins SSrcOrLds_b32:$src0, SCSrc_b32:$src1, VGPR_32:$vdst_in)			} // End InOperandList = (ins SSrcOrLds_b32:$src0, SCSrc_b32:$src1, VGPR_32:$vdst_in)
	defm V_AND_B16 : VOP3Only_Realtriple_gfx11<0x362>;			defm V_AND_B16_t16 : VOP3Only_Realtriple_t16_gfx11<0x362, "v_and_b16">;
	defm V_OR_B16 : VOP3Only_Realtriple_gfx11<0x363>;			defm V_OR_B16_t16 : VOP3Only_Realtriple_t16_gfx11<0x363, "v_or_b16">;
	defm V_XOR_B16 : VOP3Only_Realtriple_gfx11<0x364>;			defm V_XOR_B16_t16 : VOP3Only_Realtriple_t16_gfx11<0x364, "v_xor_b16">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// GFX10.			// GFX10.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	let AssemblerPredicate = isGFX10Only, DecoderNamespace = "GFX10" in {			let AssemblerPredicate = isGFX10Only, DecoderNamespace = "GFX10" in {
	multiclass VOP3_Real_gfx10<bits<10> op> {			multiclass VOP3_Real_gfx10<bits<10> op> {
	def _gfx10 :			def _gfx10 :
	[465 lines not shown]

llvm/lib/Target/AMDGPU/VOPCInstructions.td

[First 78 lines not shown; context: class VOPC_Profile<list<SchedReadWrite> sched, ValueType vt0, ValueType vt1 = vt0> :]
let OutsVOP3DPP = Outs64;		let OutsVOP3DPP = Outs64;
let OutsVOP3DPP8 = Outs64;		let OutsVOP3DPP8 = Outs64;
let InsVOP3DPP = getInsVOP3DPP<InsVOP3Base, Src0VOP3DPP, NumSrcArgs, 0/*HasOld*/>.ret;		let InsVOP3DPP = getInsVOP3DPP<InsVOP3Base, Src0VOP3DPP, NumSrcArgs, 0/*HasOld*/>.ret;
let InsVOP3DPP16 = getInsVOP3DPP16<InsVOP3Base, Src0VOP3DPP, NumSrcArgs, 0/*HasOld*/>.ret;		let InsVOP3DPP16 = getInsVOP3DPP16<InsVOP3Base, Src0VOP3DPP, NumSrcArgs, 0/*HasOld*/>.ret;
let InsVOP3DPP8 = getInsVOP3DPP8<InsVOP3Base, Src0VOP3DPP, NumSrcArgs, 0/*HasOld*/>.ret;		let InsVOP3DPP8 = getInsVOP3DPP8<InsVOP3Base, Src0VOP3DPP, NumSrcArgs, 0/*HasOld*/>.ret;
list<SchedReadWrite> Schedule = sched;		list<SchedReadWrite> Schedule = sched;
}		}

		multiclass VOPC_Profile_t16<list<SchedReadWrite> sched, ValueType vt0, ValueType vt1 = vt0> {
		def NAME : VOPC_Profile<sched, vt0, vt1>;
		def _t16 : VOPC_Profile<sched, vt0, vt1> {
		let IsTrue16 = 1;
		let Src1RC32 = RegisterOperand<getVregSrcForVT_t16<Src1VT>.ret>;
		let Src0DPP = getVregSrcForVT_t16<Src0VT>.ret;
		let Src1DPP = getVregSrcForVT_t16<Src1VT>.ret;
		let Src2DPP = getVregSrcForVT_t16<Src2VT>.ret;
		let Src0ModDPP = getSrcModDPP_t16<Src0VT>.ret;
		let Src1ModDPP = getSrcModDPP_t16<Src1VT>.ret;
		let Src2ModDPP = getSrcModDPP_t16<Src2VT>.ret;
		}
		}

class VOPC_NoSdst_Profile<list<SchedReadWrite> sched, ValueType vt0,		class VOPC_NoSdst_Profile<list<SchedReadWrite> sched, ValueType vt0,
ValueType vt1 = vt0> :		ValueType vt1 = vt0> :
VOPC_Profile<sched, vt0, vt1> {		VOPC_Profile<sched, vt0, vt1> {
let Outs64 = (outs );		let Outs64 = (outs );
let OutsVOP3DPP = Outs64;		let OutsVOP3DPP = Outs64;
let OutsVOP3DPP8 = Outs64;		let OutsVOP3DPP8 = Outs64;
let OutsSDWA = (outs );		let OutsSDWA = (outs );
let InsSDWA = (ins Src0ModSDWA:$src0_modifiers, Src0SDWA:$src0,		let InsSDWA = (ins Src0ModSDWA:$src0_modifiers, Src0SDWA:$src0,
Src1ModSDWA:$src1_modifiers, Src1SDWA:$src1,		Src1ModSDWA:$src1_modifiers, Src1SDWA:$src1,
src0_sel:$src0_sel, src1_sel:$src1_sel);		src0_sel:$src0_sel, src1_sel:$src1_sel);
let Asm64 = !if(isFloatType<Src0VT>.ret, "$src0_modifiers, $src1_modifiers$clamp",		let Asm64 = !if(isFloatType<Src0VT>.ret, "$src0_modifiers, $src1_modifiers$clamp",
"$src0, $src1");		"$src0, $src1");
let AsmVOP3DPPBase = Asm64;		let AsmVOP3DPPBase = Asm64;
let AsmSDWA9 = "$src0_modifiers, $src1_modifiers $src0_sel $src1_sel";		let AsmSDWA9 = "$src0_modifiers, $src1_modifiers $src0_sel $src1_sel";
let EmitDst = 0;		let EmitDst = 0;
}		}

		multiclass VOPC_NoSdst_Profile_t16<list<SchedReadWrite> sched, ValueType vt0, ValueType vt1 = vt0> {
		def NAME : VOPC_NoSdst_Profile<sched, vt0, vt1>;
		def _t16 : VOPC_NoSdst_Profile<sched, vt0, vt1> {
		let IsTrue16 = 1;
		let Src1RC32 = RegisterOperand<getVregSrcForVT_t16<Src1VT>.ret>;
		let Src0DPP = getVregSrcForVT_t16<Src0VT>.ret;
		let Src1DPP = getVregSrcForVT_t16<Src1VT>.ret;
		let Src2DPP = getVregSrcForVT_t16<Src2VT>.ret;
		let Src0ModDPP = getSrcModDPP_t16<Src0VT>.ret;
		let Src1ModDPP = getSrcModDPP_t16<Src1VT>.ret;
		let Src2ModDPP = getSrcModDPP_t16<Src2VT>.ret;
		}
		}

class VOPC_Pseudo <string opName, VOPC_Profile P, list<dag> pattern=[],		class VOPC_Pseudo <string opName, VOPC_Profile P, list<dag> pattern=[],
bit DefVcc = 1> :		bit DefVcc = 1> :
InstSI<(outs), P.Ins32, "", pattern>,		InstSI<(outs), P.Ins32, "", pattern>,
VOP <opName>,		VOP <opName>,
SIMCInstr<opName#"_e32", SIEncodingFamily.NONE> {		SIMCInstr<opName#"_e32", SIEncodingFamily.NONE> {

let isPseudo = 1;		let isPseudo = 1;
let isCodeGenOnly = 1;		let isCodeGenOnly = 1;
[80 lines not shown]
// else		// else
// 0 dst, 0 src		// 0 dst, 0 src
(inst))));		(inst))));

let AsmVariantName = AMDGPUAsmVariants.Default;		let AsmVariantName = AMDGPUAsmVariants.Default;
let SubtargetPredicate = AssemblerPredicate;		let SubtargetPredicate = AssemblerPredicate;
}		}

multiclass VOPCInstAliases <string old_name, string Arch, string real_name = old_name> {		multiclass VOPCInstAliases <string old_name, string Arch, string real_name = old_name, string mnemonic_from = real_name> {
def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),		def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),
!cast<Instruction>(real_name#"_e32_"#Arch),		!cast<Instruction>(real_name#"_e32_"#Arch),
!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,		!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,
real_name>;		mnemonic_from>;
let WaveSizePredicate = isWave32 in {		let WaveSizePredicate = isWave32 in {
def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),		def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),
!cast<Instruction>(real_name#"_e32_"#Arch),		!cast<Instruction>(real_name#"_e32_"#Arch),
"vcc_lo, "#!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,		"vcc_lo, "#!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,
real_name>;		mnemonic_from>;
}		}
let WaveSizePredicate = isWave64 in {		let WaveSizePredicate = isWave64 in {
def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),		def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),
!cast<Instruction>(real_name#"_e32_"#Arch),		!cast<Instruction>(real_name#"_e32_"#Arch),
"vcc, "#!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,		"vcc, "#!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,
real_name>;		mnemonic_from>;
}		}
}		}

multiclass VOPCXInstAliases <string old_name, string Arch, string real_name = old_name> {		multiclass VOPCXInstAliases <string old_name, string Arch, string real_name = old_name, string mnemonic_from = real_name> {
def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),		def : VOPCInstAlias <!cast<VOP3_Pseudo>(old_name#"_e64"),
!cast<Instruction>(real_name#"_e32_"#Arch),		!cast<Instruction>(real_name#"_e32_"#Arch),
!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,		!cast<VOP3_Pseudo>(old_name#"_e64").Pfl.Asm32,
real_name>;		mnemonic_from>;
}		}

class getVOPCPat64 <SDPatternOperator cond, VOPProfile P> : LetDummies {		class getVOPCPat64 <SDPatternOperator cond, VOPProfile P> : LetDummies {
list<dag> ret = !if(P.HasModifiers,		list<dag> ret = !if(P.HasModifiers,
[(set i1:$sdst,		[(set i1:$sdst,
(setcc (P.Src0VT		(setcc (P.Src0VT
!if(P.HasOMod,		!if(P.HasOMod,
(VOP3Mods0 P.Src0VT:$src0, i32:$src0_modifiers, i1:$clamp, i32:$omod),		(VOP3Mods0 P.Src0VT:$src0, i32:$src0_modifiers, i1:$clamp, i32:$omod),
[126 lines not shown; context: if P.HasExtVOP3DPP then]
let SchedRW = P_NoSDst.Schedule;		let SchedRW = P_NoSDst.Schedule;
let isCompare = 1;		let isCompare = 1;
let Constraints = "";		let Constraints = "";
}		}
} // end SubtargetPredicate = isGFX11Plus		} // end SubtargetPredicate = isGFX11Plus
}		}
} // End SubtargetPredicate = HasSdstCMPX		} // End SubtargetPredicate = HasSdstCMPX

def VOPC_I1_F16_F16 : VOPC_Profile<[Write32Bit], f16>;		defm VOPC_I1_F16_F16 : VOPC_Profile_t16<[Write32Bit], f16>;
def VOPC_I1_F32_F32 : VOPC_Profile<[Write32Bit], f32>;		def VOPC_I1_F32_F32 : VOPC_Profile<[Write32Bit], f32>;
def VOPC_I1_F64_F64 : VOPC_Profile<[WriteDoubleAdd], f64>;		def VOPC_I1_F64_F64 : VOPC_Profile<[WriteDoubleAdd], f64>;
def VOPC_I1_I16_I16 : VOPC_Profile<[Write32Bit], i16>;		defm VOPC_I1_I16_I16 : VOPC_Profile_t16<[Write32Bit], i16>;
def VOPC_I1_I32_I32 : VOPC_Profile<[Write32Bit], i32>;		def VOPC_I1_I32_I32 : VOPC_Profile<[Write32Bit], i32>;
def VOPC_I1_I64_I64 : VOPC_Profile<[Write64Bit], i64>;		def VOPC_I1_I64_I64 : VOPC_Profile<[Write64Bit], i64>;

def VOPC_F16_F16 : VOPC_NoSdst_Profile<[Write32Bit], f16>;		defm VOPC_F16_F16 : VOPC_NoSdst_Profile_t16<[Write32Bit], f16>;
def VOPC_F32_F32 : VOPC_NoSdst_Profile<[Write32Bit], f32>;		def VOPC_F32_F32 : VOPC_NoSdst_Profile<[Write32Bit], f32>;
def VOPC_F64_F64 : VOPC_NoSdst_Profile<[Write64Bit], f64>;		def VOPC_F64_F64 : VOPC_NoSdst_Profile<[Write64Bit], f64>;
def VOPC_I16_I16 : VOPC_NoSdst_Profile<[Write32Bit], i16>;		defm VOPC_I16_I16 : VOPC_NoSdst_Profile_t16<[Write32Bit], i16>;
def VOPC_I32_I32 : VOPC_NoSdst_Profile<[Write32Bit], i32>;		def VOPC_I32_I32 : VOPC_NoSdst_Profile<[Write32Bit], i32>;
def VOPC_I64_I64 : VOPC_NoSdst_Profile<[Write64Bit], i64>;		def VOPC_I64_I64 : VOPC_NoSdst_Profile<[Write64Bit], i64>;

multiclass VOPC_F16 <string opName, SDPatternOperator cond = COND_NULL,		multiclass VOPC_F16 <string opName, SDPatternOperator cond = COND_NULL,
string revOp = opName> :		string revOp = opName> {
VOPC_Pseudos <opName, VOPC_I1_F16_F16, cond, revOp, 0>;		let OtherPredicates = [NotHasTrue16BitInsts, Has16BitInsts] in {
		defm NAME : VOPC_Pseudos <opName, VOPC_I1_F16_F16, cond, revOp, 0>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm _t16 : VOPC_Pseudos <opName#"_t16", VOPC_I1_F16_F16_t16, cond, revOp#"_t16", 0>;
		}
		}

multiclass VOPC_F32 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :		multiclass VOPC_F32 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :
VOPC_Pseudos <opName, VOPC_I1_F32_F32, cond, revOp, 0>;		VOPC_Pseudos <opName, VOPC_I1_F32_F32, cond, revOp, 0>;

multiclass VOPC_F64 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :		multiclass VOPC_F64 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :
VOPC_Pseudos <opName, VOPC_I1_F64_F64, cond, revOp, 0>;		VOPC_Pseudos <opName, VOPC_I1_F64_F64, cond, revOp, 0>;

multiclass VOPC_I16 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :		multiclass VOPC_I16 <string opName, SDPatternOperator cond = COND_NULL,
VOPC_Pseudos <opName, VOPC_I1_I16_I16, cond, revOp, 0>;		string revOp = opName> {
		let OtherPredicates = [NotHasTrue16BitInsts, Has16BitInsts] in {
		defm NAME : VOPC_Pseudos <opName, VOPC_I1_I16_I16, cond, revOp, 0>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm _t16 : VOPC_Pseudos <opName#"_t16", VOPC_I1_I16_I16_t16, cond, revOp#"_t16", 0>;
		}
		}

multiclass VOPC_I32 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :		multiclass VOPC_I32 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :
VOPC_Pseudos <opName, VOPC_I1_I32_I32, cond, revOp, 0>;		VOPC_Pseudos <opName, VOPC_I1_I32_I32, cond, revOp, 0>;

multiclass VOPC_I64 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :		multiclass VOPC_I64 <string opName, SDPatternOperator cond = COND_NULL, string revOp = opName> :
VOPC_Pseudos <opName, VOPC_I1_I64_I64, cond, revOp, 0>;		VOPC_Pseudos <opName, VOPC_I1_I64_I64, cond, revOp, 0>;

multiclass VOPCX_F16 <string opName, string revOp = opName> :		multiclass VOPCX_F16<string opName, string revOp = opName> {
VOPCX_Pseudos <opName, VOPC_I1_F16_F16, VOPC_F16_F16, COND_NULL, revOp>;		let OtherPredicates = [NotHasTrue16BitInsts, Has16BitInsts] in {
		defm NAME : VOPCX_Pseudos <opName, VOPC_I1_F16_F16, VOPC_F16_F16, COND_NULL, revOp>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm _t16 : VOPCX_Pseudos <opName#"_t16", VOPC_I1_F16_F16_t16, VOPC_F16_F16_t16, COND_NULL, revOp#"_t16">;
		}
		}

multiclass VOPCX_F32 <string opName, string revOp = opName> :		multiclass VOPCX_F32 <string opName, string revOp = opName> :
VOPCX_Pseudos <opName, VOPC_I1_F32_F32, VOPC_F32_F32, COND_NULL, revOp>;		VOPCX_Pseudos <opName, VOPC_I1_F32_F32, VOPC_F32_F32, COND_NULL, revOp>;

multiclass VOPCX_F64 <string opName, string revOp = opName> :		multiclass VOPCX_F64 <string opName, string revOp = opName> :
VOPCX_Pseudos <opName, VOPC_I1_F64_F64, VOPC_F64_F64, COND_NULL, revOp>;		VOPCX_Pseudos <opName, VOPC_I1_F64_F64, VOPC_F64_F64, COND_NULL, revOp>;

multiclass VOPCX_I16 <string opName, string revOp = opName> :		multiclass VOPCX_I16<string opName, string revOp = opName> {
VOPCX_Pseudos <opName, VOPC_I1_I16_I16, VOPC_I16_I16, COND_NULL, revOp>;		let OtherPredicates = [NotHasTrue16BitInsts, Has16BitInsts] in {
		defm NAME : VOPCX_Pseudos <opName, VOPC_I1_I16_I16, VOPC_I16_I16, COND_NULL, revOp>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm _t16 : VOPCX_Pseudos <opName#"_t16", VOPC_I1_I16_I16_t16, VOPC_I16_I16_t16, COND_NULL, revOp#"_t16">;
		}
		}

multiclass VOPCX_I32 <string opName, string revOp = opName> :		multiclass VOPCX_I32 <string opName, string revOp = opName> :
VOPCX_Pseudos <opName, VOPC_I1_I32_I32, VOPC_I32_I32, COND_NULL, revOp>;		VOPCX_Pseudos <opName, VOPC_I1_I32_I32, VOPC_I32_I32, COND_NULL, revOp>;

multiclass VOPCX_I64 <string opName, string revOp = opName> :		multiclass VOPCX_I64 <string opName, string revOp = opName> :
VOPCX_Pseudos <opName, VOPC_I1_I64_I64, VOPC_I64_I64, COND_NULL, revOp>;		VOPCX_Pseudos <opName, VOPC_I1_I64_I64, VOPC_I64_I64, COND_NULL, revOp>;


[... 286 lines not shown ...]
defm V_CMPX_NE_U64 : VOPCX_I64 <"v_cmpx_ne_u64">;		defm V_CMPX_NE_U64 : VOPCX_I64 <"v_cmpx_ne_u64">;
defm V_CMPX_GE_U64 : VOPCX_I64 <"v_cmpx_ge_u64">;		defm V_CMPX_GE_U64 : VOPCX_I64 <"v_cmpx_ge_u64">;
defm V_CMPX_T_U64 : VOPCX_I64 <"v_cmpx_t_u64">;		defm V_CMPX_T_U64 : VOPCX_I64 <"v_cmpx_t_u64">;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Class instructions		// Class instructions
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class VOPC_Class_Profile<list<SchedReadWrite> sched, ValueType vt> :		class VOPC_Class_Profile<list<SchedReadWrite> sched, ValueType src0VT, ValueType src1VT = i32> :
VOPC_Profile<sched, vt, i32> {		VOPC_Profile<sched, src0VT, src1VT> {
let AsmDPP = "$src0_modifiers, $src1 $dpp_ctrl$row_mask$bank_mask$bound_ctrl";		let AsmDPP = "$src0_modifiers, $src1 $dpp_ctrl$row_mask$bank_mask$bound_ctrl";
let AsmDPP16 = AsmDPP#"$fi";		let AsmDPP16 = AsmDPP#"$fi";
let InsDPP = (ins FPVRegInputMods:$src0_modifiers, VGPR_32:$src0, VGPR_32:$src1, dpp_ctrl:$dpp_ctrl, row_mask:$row_mask, bank_mask:$bank_mask, bound_ctrl:$bound_ctrl);		let InsDPP = (ins Src0ModDPP:$src0_modifiers, Src0DPP:$src0, Src1DPP:$src1, dpp_ctrl:$dpp_ctrl, row_mask:$row_mask, bank_mask:$bank_mask, bound_ctrl:$bound_ctrl);
let InsDPP16 = !con(InsDPP, (ins FI:$fi));		let InsDPP16 = !con(InsDPP, (ins FI:$fi));
// DPP8 forbids modifiers and can inherit from VOPC_Profile		// DPP8 forbids modifiers and can inherit from VOPC_Profile

let Ins64 = (ins Src0Mod:$src0_modifiers, Src0RC64:$src0, Src1RC64:$src1);		let Ins64 = (ins Src0Mod:$src0_modifiers, Src0RC64:$src0, Src1RC64:$src1);
dag InsPartVOP3DPP = (ins FPVRegInputMods:$src0_modifiers, VGPRSrc_32:$src0, VGPRSrc_32:$src1);		dag InsPartVOP3DPP = (ins FPVRegInputMods:$src0_modifiers, VGPRSrc_32:$src0, VGPRSrc_32:$src1);
let InsVOP3Base = !con(InsPartVOP3DPP, !if(HasOpSel, (ins op_sel0:$op_sel),		let InsVOP3Base = !con(InsPartVOP3DPP, !if(HasOpSel, (ins op_sel0:$op_sel),
(ins)));		(ins)));
let Asm64 = "$sdst, $src0_modifiers, $src1";		let Asm64 = "$sdst, $src0_modifiers, $src1";
let AsmVOP3DPPBase = Asm64;		let AsmVOP3DPPBase = Asm64;

let InsSDWA = (ins Src0ModSDWA:$src0_modifiers, Src0SDWA:$src0,		let InsSDWA = (ins Src0ModSDWA:$src0_modifiers, Src0SDWA:$src0,
Src1ModSDWA:$src1_modifiers, Src1SDWA:$src1,		Src1ModSDWA:$src1_modifiers, Src1SDWA:$src1,
clampmod:$clamp, src0_sel:$src0_sel, src1_sel:$src1_sel);		clampmod:$clamp, src0_sel:$src0_sel, src1_sel:$src1_sel);

let AsmSDWA = " vcc, $src0_modifiers, $src1_modifiers$clamp $src0_sel $src1_sel";		let AsmSDWA = " vcc, $src0_modifiers, $src1_modifiers$clamp $src0_sel $src1_sel";
let HasSrc1Mods = 0;		let HasSrc1Mods = 0;
let HasClamp = 0;		let HasClamp = 0;
let HasOMod = 0;		let HasOMod = 0;
}		}

class VOPC_Class_NoSdst_Profile<list<SchedReadWrite> sched, ValueType vt> :		multiclass VOPC_Class_Profile_t16<list<SchedReadWrite> sched> {
VOPC_Class_Profile<sched, vt> {		def NAME : VOPC_Class_Profile<sched, f16>;
		def _t16 : VOPC_Class_Profile<sched, f16, i16> {
		let IsTrue16 = 1;
		let Src1RC32 = RegisterOperand<getVregSrcForVT_t16<Src1VT>.ret>;
		let Src0DPP = getVregSrcForVT_t16<Src0VT>.ret;
		let Src1DPP = getVregSrcForVT_t16<Src1VT>.ret;
		let Src2DPP = getVregSrcForVT_t16<Src2VT>.ret;
		let Src0ModDPP = getSrcModDPP_t16<Src0VT>.ret;
		let Src1ModDPP = getSrcModDPP_t16<Src1VT>.ret;
		let Src2ModDPP = getSrcModDPP_t16<Src2VT>.ret;
		}
		}

		class VOPC_Class_NoSdst_Profile<list<SchedReadWrite> sched, ValueType src0VT, ValueType src1VT = i32> :
		VOPC_Class_Profile<sched, src0VT, src1VT> {
let Outs64 = (outs );		let Outs64 = (outs );
let OutsSDWA = (outs );		let OutsSDWA = (outs );
let InsSDWA = (ins Src0ModSDWA:$src0_modifiers, Src0SDWA:$src0,		let InsSDWA = (ins Src0ModSDWA:$src0_modifiers, Src0SDWA:$src0,
Src1ModSDWA:$src1_modifiers, Src1SDWA:$src1,		Src1ModSDWA:$src1_modifiers, Src1SDWA:$src1,
src0_sel:$src0_sel, src1_sel:$src1_sel);		src0_sel:$src0_sel, src1_sel:$src1_sel);
let Asm64 = "$src0_modifiers, $src1";		let Asm64 = "$src0_modifiers, $src1";
let AsmVOP3DPPBase = Asm64;		let AsmVOP3DPPBase = Asm64;
let AsmSDWA9 = "$src0_modifiers, $src1_modifiers $src0_sel $src1_sel";		let AsmSDWA9 = "$src0_modifiers, $src1_modifiers $src0_sel $src1_sel";
let EmitDst = 0;		let EmitDst = 0;
}		}

		multiclass VOPC_Class_NoSdst_Profile_t16<list<SchedReadWrite> sched> {
		def NAME : VOPC_Class_NoSdst_Profile<sched, f16>;
		def _t16 : VOPC_Class_NoSdst_Profile<sched, f16, i16> {
		let IsTrue16 = 1;
		let Src1RC32 = RegisterOperand<getVregSrcForVT_t16<Src1VT>.ret>;
		let Src0DPP = getVregSrcForVT_t16<Src0VT>.ret;
		let Src1DPP = getVregSrcForVT_t16<Src1VT>.ret;
		let Src2DPP = getVregSrcForVT_t16<Src2VT>.ret;
		let Src0ModDPP = getSrcModDPP_t16<Src0VT>.ret;
		let Src1ModDPP = getSrcModDPP_t16<Src1VT>.ret;
		let Src2ModDPP = getSrcModDPP_t16<Src2VT>.ret;
		}
		}

class getVOPCClassPat64 <VOPProfile P> {		class getVOPCClassPat64 <VOPProfile P> {
list<dag> ret =		list<dag> ret =
[(set i1:$sdst,		[(set i1:$sdst,
(AMDGPUfp_class		(AMDGPUfp_class
(P.Src0VT (VOP3Mods P.Src0VT:$src0, i32:$src0_modifiers)),		(P.Src0VT (VOP3Mods P.Src0VT:$src0, i32:$src0_modifiers)),
P.Src1VT:$src1))];		i32:$src1))];
}		}


// Special case for class instructions which only have modifiers on		// Special case for class instructions which only have modifiers on
// the 1st source operand.		// the 1st source operand.
multiclass VOPC_Class_Pseudos <string opName, VOPC_Profile p, bit DefExec,		multiclass VOPC_Class_Pseudos <string opName, VOPC_Profile p, bit DefExec,
bit DefVcc = 1> {		bit DefVcc = 1> {
def _e32 : VOPC_Pseudo <opName, p>,		def _e32 : VOPC_Pseudo <opName, p>,
VCMPXNoSDstTable<1, opName#"_e32"> {		VCMPXNoSDstTable<1, opName#"_e32"> {
let Defs = !if(DefExec, !if(DefVcc, [VCC, EXEC], [EXEC]),		let Defs = !if(DefExec, !if(DefVcc, [VCC, EXEC], [EXEC]),
!if(DefVcc, [VCC], []));		!if(DefVcc, [VCC], []));
[... 77 lines not shown ...]	if P.HasExtVOP3DPP then
let Defs = [EXEC];		let Defs = [EXEC];
let SchedRW = P_NoSDst.Schedule;		let SchedRW = P_NoSDst.Schedule;
let Constraints = "";		let Constraints = "";
}		}
} // end SubtargetPredicate = isGFX11Plus		} // end SubtargetPredicate = isGFX11Plus
}		}
} // End SubtargetPredicate = HasSdstCMPX		} // End SubtargetPredicate = HasSdstCMPX

def VOPC_I1_F16_I32 : VOPC_Class_Profile<[Write32Bit], f16>;		defm VOPC_I1_F16_I16 : VOPC_Class_Profile_t16<[Write32Bit]>;
def VOPC_I1_F32_I32 : VOPC_Class_Profile<[Write32Bit], f32>;		def VOPC_I1_F32_I32 : VOPC_Class_Profile<[Write32Bit], f32>;
def VOPC_I1_F64_I32 : VOPC_Class_Profile<[WriteDoubleAdd], f64>;		def VOPC_I1_F64_I32 : VOPC_Class_Profile<[WriteDoubleAdd], f64>;

def VOPC_F16_I32 : VOPC_Class_NoSdst_Profile<[Write32Bit], f16>;		defm VOPC_F16_I16 : VOPC_Class_NoSdst_Profile_t16<[Write32Bit]>;
def VOPC_F32_I32 : VOPC_Class_NoSdst_Profile<[Write32Bit], f32>;		def VOPC_F32_I32 : VOPC_Class_NoSdst_Profile<[Write32Bit], f32>;
def VOPC_F64_I32 : VOPC_Class_NoSdst_Profile<[Write64Bit], f64>;		def VOPC_F64_I32 : VOPC_Class_NoSdst_Profile<[Write64Bit], f64>;

multiclass VOPC_CLASS_F16 <string opName> :		multiclass VOPC_CLASS_F16 <string opName> {
VOPC_Class_Pseudos <opName, VOPC_I1_F16_I32, 0>;		let OtherPredicates = [NotHasTrue16BitInsts, Has16BitInsts] in {
		defm NAME : VOPC_Class_Pseudos <opName, VOPC_I1_F16_I16, 0>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm _t16 : VOPC_Class_Pseudos <opName#"_t16", VOPC_I1_F16_I16_t16, 0>;
		}
		}

multiclass VOPCX_CLASS_F16 <string opName> :		multiclass VOPCX_CLASS_F16 <string opName> {
VOPCX_Class_Pseudos <opName, VOPC_I1_F16_I32, VOPC_F16_I32>;		let OtherPredicates = [NotHasTrue16BitInsts, Has16BitInsts] in {
		defm NAME : VOPCX_Class_Pseudos <opName, VOPC_I1_F16_I16, VOPC_F16_I16>;
		}
		let OtherPredicates = [HasTrue16BitInsts] in {
		defm _t16 : VOPCX_Class_Pseudos <opName#"_t16", VOPC_I1_F16_I16_t16, VOPC_F16_I16_t16>;
		}
		}

multiclass VOPC_CLASS_F32 <string opName> :		multiclass VOPC_CLASS_F32 <string opName> :
VOPC_Class_Pseudos <opName, VOPC_I1_F32_I32, 0>;		VOPC_Class_Pseudos <opName, VOPC_I1_F32_I32, 0>;

multiclass VOPCX_CLASS_F32 <string opName> :		multiclass VOPCX_CLASS_F32 <string opName> :
VOPCX_Class_Pseudos <opName, VOPC_I1_F32_I32, VOPC_F32_I32>;		VOPCX_Class_Pseudos <opName, VOPC_I1_F32_I32, VOPC_F32_I32>;

multiclass VOPC_CLASS_F64 <string opName> :		multiclass VOPC_CLASS_F64 <string opName> :
VOPC_Class_Pseudos <opName, VOPC_I1_F64_I32, 0>;		VOPC_Class_Pseudos <opName, VOPC_I1_F64_I32, 0>;

multiclass VOPCX_CLASS_F64 <string opName> :		multiclass VOPCX_CLASS_F64 <string opName> :
VOPCX_Class_Pseudos <opName, VOPC_I1_F64_I32, VOPC_F64_I32>;		VOPCX_Class_Pseudos <opName, VOPC_I1_F64_I32, VOPC_F64_I32>;

// cmp_class ignores the FP mode and faithfully reports the unmodified		// cmp_class ignores the FP mode and faithfully reports the unmodified
// source value.		// source value.
let ReadsModeReg = 0, mayRaiseFPException = 0 in {		let ReadsModeReg = 0, mayRaiseFPException = 0 in {
defm V_CMP_CLASS_F32 : VOPC_CLASS_F32 <"v_cmp_class_f32">;		defm V_CMP_CLASS_F32 : VOPC_CLASS_F32 <"v_cmp_class_f32">;
defm V_CMPX_CLASS_F32 : VOPCX_CLASS_F32 <"v_cmpx_class_f32">;		defm V_CMPX_CLASS_F32 : VOPCX_CLASS_F32 <"v_cmpx_class_f32">;
defm V_CMP_CLASS_F64 : VOPC_CLASS_F64 <"v_cmp_class_f64">;		defm V_CMP_CLASS_F64 : VOPC_CLASS_F64 <"v_cmp_class_f64">;
defm V_CMPX_CLASS_F64 : VOPCX_CLASS_F64 <"v_cmpx_class_f64">;		defm V_CMPX_CLASS_F64 : VOPCX_CLASS_F64 <"v_cmpx_class_f64">;

let SubtargetPredicate = Has16BitInsts in {
defm V_CMP_CLASS_F16 : VOPC_CLASS_F16 <"v_cmp_class_f16">;		defm V_CMP_CLASS_F16 : VOPC_CLASS_F16 <"v_cmp_class_f16">;
defm V_CMPX_CLASS_F16 : VOPCX_CLASS_F16 <"v_cmpx_class_f16">;		defm V_CMPX_CLASS_F16 : VOPCX_CLASS_F16 <"v_cmpx_class_f16">;
}
} // End ReadsModeReg = 0, mayRaiseFPException = 0		} // End ReadsModeReg = 0, mayRaiseFPException = 0

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// V_ICMPIntrinsic Pattern.		// V_ICMPIntrinsic Pattern.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

// We need to use COPY_TO_REGCLASS to w/a the problem when ReplaceAllUsesWith()		// We need to use COPY_TO_REGCLASS to w/a the problem when ReplaceAllUsesWith()
// complaints it cannot replace i1 <-> i64/i32 if node was not morphed in place.		// complaints it cannot replace i1 <-> i64/i32 if node was not morphed in place.
[... 363 lines not shown ...]	foreach _ = BoolToList<ps64.Pfl.HasExtVOP3DPP>.ret in {
let WaveSizePredicate = isWave64;		let WaveSizePredicate = isWave64;
}		}
}		}
}		}

}		}

multiclass VOPC_Real_with_name_gfx11<bits<9> op, string OpName,		multiclass VOPC_Real_with_name_gfx11<bits<9> op, string OpName,
string asm_name> {		string asm_name, string pseudo_mnemonic = ""> {
defvar ps32 = !cast<VOPC_Pseudo>(OpName#"_e32");		defvar ps32 = !cast<VOPC_Pseudo>(OpName#"_e32");
defvar ps64 = !cast<VOP3_Pseudo>(OpName#"_e64");		defvar ps64 = !cast<VOP3_Pseudo>(OpName#"_e64");
let DecoderNamespace = "GFX11" in {		let DecoderNamespace = "GFX11" in {
def _e32_gfx11 :		def _e32_gfx11 :
// 32 and 64 bit forms of the instruction have _e32 and _e64		// 32 and 64 bit forms of the instruction have _e32 and _e64
// respectively appended to their assembly mnemonic.		// respectively appended to their assembly mnemonic.
// _e64 is printed as part of the VOPDstS64orS32 operand, whereas		// _e64 is printed as part of the VOPDstS64orS32 operand, whereas
// the destination-less 32bit forms add it to the asmString here.		// the destination-less 32bit forms add it to the asmString here.
VOPC_Real<ps32, SIEncodingFamily.GFX11, asm_name#"_e32">,		VOPC_Real<ps32, SIEncodingFamily.GFX11, asm_name#"_e32">,
VOPCe<op{7-0}>,		VOPCe<op{7-0}>,
MnemonicAlias<ps32.Mnemonic, asm_name>, Requires<[isGFX11Plus]>;		MnemonicAlias<!if(!empty(pseudo_mnemonic), ps32.Mnemonic,
		pseudo_mnemonic),
		asm_name>,
		Requires<[isGFX11Plus]>;
def _e64_gfx11 :		def _e64_gfx11 :
VOP3_Real<ps64, SIEncodingFamily.GFX11, asm_name>,		VOP3_Real<ps64, SIEncodingFamily.GFX11, asm_name>,
VOP3a_gfx11<{0, op}, ps64.Pfl>,		VOP3a_gfx11<{0, op}, ps64.Pfl>,
MnemonicAlias<ps64.Mnemonic, asm_name>, Requires<[isGFX11Plus]> {		MnemonicAlias<!if(!empty(pseudo_mnemonic), ps64.Mnemonic,
		pseudo_mnemonic),
		asm_name>,
		Requires<[isGFX11Plus]> {
// Encoding used for VOPC instructions encoded as VOP3 differs from		// Encoding used for VOPC instructions encoded as VOP3 differs from
// VOP3e by destination name (sdst) as VOPC doesn't have vector dst.		// VOP3e by destination name (sdst) as VOPC doesn't have vector dst.
bits<8> sdst;		bits<8> sdst;
let Inst{7-0} = sdst;		let Inst{7-0} = sdst;
}		}
} // End DecoderNamespace = "GFX11"		} // End DecoderNamespace = "GFX11"

defm : VOPCInstAliases<OpName, "gfx11", NAME>;		defm : VOPCInstAliases<OpName, "gfx11", NAME, asm_name>;

foreach _ = BoolToList<ps32.Pfl.HasExtDPP>.ret in {		foreach _ = BoolToList<ps32.Pfl.HasExtDPP>.ret in {
defvar psDPP = !cast<VOP_DPP_Pseudo>(OpName #"_e32" #"_dpp");		defvar psDPP = !cast<VOP_DPP_Pseudo>(OpName #"_e32" #"_dpp");
defvar AsmDPP = ps32.Pfl.AsmDPP16;		defvar AsmDPP = ps32.Pfl.AsmDPP16;
let DecoderNamespace = "DPPGFX11" in {		let DecoderNamespace = "DPPGFX11" in {
def _e32_dpp_gfx11 : VOPC_DPP16_SIMC<op{7-0}, psDPP,		def _e32_dpp_gfx11 : VOPC_DPP16_SIMC<op{7-0}, psDPP,
SIEncodingFamily.GFX11, asm_name>;		SIEncodingFamily.GFX11, asm_name>;
def _e32_dpp_w32_gfx11		def _e32_dpp_w32_gfx11
[... 58 lines not shown ...]	foreach _ = BoolToList<ps64.Pfl.HasExtVOP3DPP>.ret in {
def _e64_dpp8_w64_gfx11		def _e64_dpp8_w64_gfx11
: VOPC64_DPP8_Dst<{0, op}, ps64, asm_name> {		: VOPC64_DPP8_Dst<{0, op}, ps64, asm_name> {
let AsmString = asm_name # " vcc, " # AsmDPP8;		let AsmString = asm_name # " vcc, " # AsmDPP8;
let isAsmParserOnly = 1;		let isAsmParserOnly = 1;
let WaveSizePredicate = isWave64;		let WaveSizePredicate = isWave64;
}		}
}		}
}		}

}		}

		multiclass VOPC_Real_t16_gfx11<bits<9> op, string asm_name,
		string OpName = NAME> : VOPC_Real_with_name_gfx11<op, OpName, asm_name>;

multiclass VOPCX_Real_gfx11<bits<9> op> {		multiclass VOPCX_Real_gfx11<bits<9> op> {
defvar ps32 = !cast<VOPC_Pseudo>(NAME#"_nosdst_e32");		defvar ps32 = !cast<VOPC_Pseudo>(NAME#"_nosdst_e32");
defvar ps64 = !cast<VOP3_Pseudo>(NAME#"_nosdst_e64");		defvar ps64 = !cast<VOP3_Pseudo>(NAME#"_nosdst_e64");
let DecoderNamespace = "GFX11" in {		let DecoderNamespace = "GFX11" in {
def _e32_gfx11 :		def _e32_gfx11 :
VOPC_Real<ps32, SIEncodingFamily.GFX11>,		VOPC_Real<ps32, SIEncodingFamily.GFX11>,
VOPCe<op{7-0}> {		VOPCe<op{7-0}> {
let AsmString = !subst("_nosdst", "", ps32.PseudoInstr)		let AsmString = !subst("_nosdst", "", ps32.PseudoInstr)
[... 44 lines not shown ...]	foreach _ = BoolToList<ps64.Pfl.HasExtVOP3DPP>.ret in {
let AsmString = !subst("_nosdst", "", ps64.OpName)		let AsmString = !subst("_nosdst", "", ps64.OpName)
# "{_e64_dpp} " # AsmDPP8;		# "{_e64_dpp} " # AsmDPP8;
}		}
}		}
}		}
}		}

multiclass VOPCX_Real_with_name_gfx11<bits<9> op, string OpName,		multiclass VOPCX_Real_with_name_gfx11<bits<9> op, string OpName,
string asm_name> {		string asm_name, string pseudo_mnemonic = ""> {
defvar ps32 = !cast<VOPC_Pseudo>(OpName#"_nosdst_e32");		defvar ps32 = !cast<VOPC_Pseudo>(OpName#"_nosdst_e32");
defvar ps64 = !cast<VOP3_Pseudo>(OpName#"_nosdst_e64");		defvar ps64 = !cast<VOP3_Pseudo>(OpName#"_nosdst_e64");
let DecoderNamespace = "GFX11" in {		let DecoderNamespace = "GFX11" in {
def _e32_gfx11		def _e32_gfx11
: VOPC_Real<ps32, SIEncodingFamily.GFX11, asm_name>,		: VOPC_Real<ps32, SIEncodingFamily.GFX11, asm_name>,
MnemonicAlias<!subst("_nosdst", "", ps32.Mnemonic), asm_name>,		MnemonicAlias<!if(!empty(pseudo_mnemonic), !subst("_nosdst", "", ps32.Mnemonic),
		pseudo_mnemonic),
		asm_name>,
Requires<[isGFX11Plus]>,		Requires<[isGFX11Plus]>,
VOPCe<op{7-0}> {		VOPCe<op{7-0}> {
let AsmString = asm_name # "{_e32} " # ps32.AsmOperands;		let AsmString = asm_name # "{_e32} " # ps32.AsmOperands;
}		}
def _e64_gfx11		def _e64_gfx11
: VOP3_Real<ps64, SIEncodingFamily.GFX11, asm_name>,		: VOP3_Real<ps64, SIEncodingFamily.GFX11, asm_name>,
MnemonicAlias<!subst("_nosdst", "", ps64.Mnemonic), asm_name>,		MnemonicAlias<!if(!empty(pseudo_mnemonic), !subst("_nosdst", "", ps64.Mnemonic),
		pseudo_mnemonic),
		asm_name>,
Requires<[isGFX11Plus]>,		Requires<[isGFX11Plus]>,
VOP3a_gfx11<{0, op}, ps64.Pfl> {		VOP3a_gfx11<{0, op}, ps64.Pfl> {
let Inst{7-0} = ? ; // sdst		let Inst{7-0} = ? ; // sdst
let AsmString = asm_name # "{_e64} " # ps64.AsmOperands;		let AsmString = asm_name # "{_e64} " # ps64.AsmOperands;
}		}
} // End DecoderNamespace = "GFX11"		} // End DecoderNamespace = "GFX11"

defm : VOPCXInstAliases<OpName, "gfx11", NAME>;		defm : VOPCXInstAliases<OpName, "gfx11", NAME, asm_name>;

foreach _ = BoolToList<ps32.Pfl.HasExtDPP>.ret in {		foreach _ = BoolToList<ps32.Pfl.HasExtDPP>.ret in {
defvar psDPP = !cast<VOP_DPP_Pseudo>(OpName#"_nosdst_e32"#"_dpp");		defvar psDPP = !cast<VOP_DPP_Pseudo>(OpName#"_nosdst_e32"#"_dpp");
let DecoderNamespace = "DPPGFX11" in {		let DecoderNamespace = "DPPGFX11" in {
def _e32_dpp_gfx11 : VOPC_DPP16_SIMC<op{7-0}, psDPP,		def _e32_dpp_gfx11 : VOPC_DPP16_SIMC<op{7-0}, psDPP,
SIEncodingFamily.GFX11, asm_name>;		SIEncodingFamily.GFX11, asm_name>;
}		}
let DecoderNamespace = "DPP8GFX11" in {		let DecoderNamespace = "DPP8GFX11" in {
[... 12 lines not shown ...]	foreach _ = BoolToList<ps64.Pfl.HasExtVOP3DPP>.ret in {
}		}
defvar AsmDPP8 = ps64.Pfl.AsmVOP3DPP8;		defvar AsmDPP8 = ps64.Pfl.AsmVOP3DPP8;
let DecoderNamespace = "DPP8GFX11" in {		let DecoderNamespace = "DPP8GFX11" in {
def _e64_dpp8_gfx11 : VOPC64_DPP8_NoDst<{0, op}, ps64, asm_name> {		def _e64_dpp8_gfx11 : VOPC64_DPP8_NoDst<{0, op}, ps64, asm_name> {
let AsmString = asm_name # "{_e64_dpp} " # AsmDPP8;		let AsmString = asm_name # "{_e64_dpp} " # AsmDPP8;
}		}
}		}
}		}

}		}

		multiclass VOPCX_Real_t16_gfx11<bits<9> op, string asm_name,
		string OpName = NAME> : VOPCX_Real_with_name_gfx11<op, OpName, asm_name>;


} // End AssemblerPredicate = isGFX11Only		} // End AssemblerPredicate = isGFX11Only

defm V_CMP_F_F16 : VOPC_Real_gfx11<0x000>;		defm V_CMP_F_F16_t16 : VOPC_Real_t16_gfx11<0x000, "v_cmp_f_f16">;
defm V_CMP_LT_F16 : VOPC_Real_gfx11<0x001>;		defm V_CMP_LT_F16_t16 : VOPC_Real_t16_gfx11<0x001, "v_cmp_lt_f16">;
defm V_CMP_EQ_F16 : VOPC_Real_gfx11<0x002>;		defm V_CMP_EQ_F16_t16 : VOPC_Real_t16_gfx11<0x002, "v_cmp_eq_f16">;
defm V_CMP_LE_F16 : VOPC_Real_gfx11<0x003>;		defm V_CMP_LE_F16_t16 : VOPC_Real_t16_gfx11<0x003, "v_cmp_le_f16">;
defm V_CMP_GT_F16 : VOPC_Real_gfx11<0x004>;		defm V_CMP_GT_F16_t16 : VOPC_Real_t16_gfx11<0x004, "v_cmp_gt_f16">;
defm V_CMP_LG_F16 : VOPC_Real_gfx11<0x005>;		defm V_CMP_LG_F16_t16 : VOPC_Real_t16_gfx11<0x005, "v_cmp_lg_f16">;
defm V_CMP_GE_F16 : VOPC_Real_gfx11<0x006>;		defm V_CMP_GE_F16_t16 : VOPC_Real_t16_gfx11<0x006, "v_cmp_ge_f16">;
defm V_CMP_O_F16 : VOPC_Real_gfx11<0x007>;		defm V_CMP_O_F16_t16 : VOPC_Real_t16_gfx11<0x007, "v_cmp_o_f16">;
defm V_CMP_U_F16 : VOPC_Real_gfx11<0x008>;		defm V_CMP_U_F16_t16 : VOPC_Real_t16_gfx11<0x008, "v_cmp_u_f16">;
defm V_CMP_NGE_F16 : VOPC_Real_gfx11<0x009>;		defm V_CMP_NGE_F16_t16 : VOPC_Real_t16_gfx11<0x009, "v_cmp_nge_f16">;
defm V_CMP_NLG_F16 : VOPC_Real_gfx11<0x00a>;		defm V_CMP_NLG_F16_t16 : VOPC_Real_t16_gfx11<0x00a, "v_cmp_nlg_f16">;
defm V_CMP_NGT_F16 : VOPC_Real_gfx11<0x00b>;		defm V_CMP_NGT_F16_t16 : VOPC_Real_t16_gfx11<0x00b, "v_cmp_ngt_f16">;
defm V_CMP_NLE_F16 : VOPC_Real_gfx11<0x00c>;		defm V_CMP_NLE_F16_t16 : VOPC_Real_t16_gfx11<0x00c, "v_cmp_nle_f16">;
defm V_CMP_NEQ_F16 : VOPC_Real_gfx11<0x00d>;		defm V_CMP_NEQ_F16_t16 : VOPC_Real_t16_gfx11<0x00d, "v_cmp_neq_f16">;
defm V_CMP_NLT_F16 : VOPC_Real_gfx11<0x00e>;		defm V_CMP_NLT_F16_t16 : VOPC_Real_t16_gfx11<0x00e, "v_cmp_nlt_f16">;
defm V_CMP_T_F16 : VOPC_Real_with_name_gfx11<0x00f, "V_CMP_TRU_F16", "v_cmp_t_f16">;		defm V_CMP_T_F16_t16 : VOPC_Real_with_name_gfx11<0x00f, "V_CMP_TRU_F16_t16", "v_cmp_t_f16", "v_cmp_tru_f16">;
defm V_CMP_F_F32 : VOPC_Real_gfx11<0x010>;		defm V_CMP_F_F32 : VOPC_Real_gfx11<0x010>;
defm V_CMP_LT_F32 : VOPC_Real_gfx11<0x011>;		defm V_CMP_LT_F32 : VOPC_Real_gfx11<0x011>;
defm V_CMP_EQ_F32 : VOPC_Real_gfx11<0x012>;		defm V_CMP_EQ_F32 : VOPC_Real_gfx11<0x012>;
defm V_CMP_LE_F32 : VOPC_Real_gfx11<0x013>;		defm V_CMP_LE_F32 : VOPC_Real_gfx11<0x013>;
defm V_CMP_GT_F32 : VOPC_Real_gfx11<0x014>;		defm V_CMP_GT_F32 : VOPC_Real_gfx11<0x014>;
defm V_CMP_LG_F32 : VOPC_Real_gfx11<0x015>;		defm V_CMP_LG_F32 : VOPC_Real_gfx11<0x015>;
defm V_CMP_GE_F32 : VOPC_Real_gfx11<0x016>;		defm V_CMP_GE_F32 : VOPC_Real_gfx11<0x016>;
defm V_CMP_O_F32 : VOPC_Real_gfx11<0x017>;		defm V_CMP_O_F32 : VOPC_Real_gfx11<0x017>;
defm V_CMP_U_F32 : VOPC_Real_gfx11<0x018>;		defm V_CMP_U_F32 : VOPC_Real_gfx11<0x018>;
defm V_CMP_NGE_F32 : VOPC_Real_gfx11<0x019>;		defm V_CMP_NGE_F32 : VOPC_Real_gfx11<0x019>;
defm V_CMP_NLG_F32 : VOPC_Real_gfx11<0x01a>;		defm V_CMP_NLG_F32 : VOPC_Real_gfx11<0x01a>;
defm V_CMP_NGT_F32 : VOPC_Real_gfx11<0x01b>;		defm V_CMP_NGT_F32 : VOPC_Real_gfx11<0x01b>;
defm V_CMP_NLE_F32 : VOPC_Real_gfx11<0x01c>;		defm V_CMP_NLE_F32 : VOPC_Real_gfx11<0x01c>;
defm V_CMP_NEQ_F32 : VOPC_Real_gfx11<0x01d>;		defm V_CMP_NEQ_F32 : VOPC_Real_gfx11<0x01d>;
defm V_CMP_NLT_F32 : VOPC_Real_gfx11<0x01e>;		defm V_CMP_NLT_F32 : VOPC_Real_gfx11<0x01e>;
defm V_CMP_T_F32 : VOPC_Real_with_name_gfx11<0x01f, "V_CMP_TRU_F32", "v_cmp_t_f32">;		defm V_CMP_T_F32 : VOPC_Real_with_name_gfx11<0x01f, "V_CMP_TRU_F32", "v_cmp_t_f32">;
defm V_CMP_T_F64 : VOPC_Real_with_name_gfx11<0x02f, "V_CMP_TRU_F64", "v_cmp_t_f64">;		defm V_CMP_T_F64 : VOPC_Real_with_name_gfx11<0x02f, "V_CMP_TRU_F64", "v_cmp_t_f64">;
defm V_CMP_LT_I16 : VOPC_Real_gfx11<0x031>;		defm V_CMP_LT_I16_t16 : VOPC_Real_t16_gfx11<0x031, "v_cmp_lt_i16">;
defm V_CMP_EQ_I16 : VOPC_Real_gfx11<0x032>;		defm V_CMP_EQ_I16_t16 : VOPC_Real_t16_gfx11<0x032, "v_cmp_eq_i16">;
defm V_CMP_LE_I16 : VOPC_Real_gfx11<0x033>;		defm V_CMP_LE_I16_t16 : VOPC_Real_t16_gfx11<0x033, "v_cmp_le_i16">;
defm V_CMP_GT_I16 : VOPC_Real_gfx11<0x034>;		defm V_CMP_GT_I16_t16 : VOPC_Real_t16_gfx11<0x034, "v_cmp_gt_i16">;
defm V_CMP_NE_I16 : VOPC_Real_gfx11<0x035>;		defm V_CMP_NE_I16_t16 : VOPC_Real_t16_gfx11<0x035, "v_cmp_ne_i16">;
defm V_CMP_GE_I16 : VOPC_Real_gfx11<0x036>;		defm V_CMP_GE_I16_t16 : VOPC_Real_t16_gfx11<0x036, "v_cmp_ge_i16">;
defm V_CMP_LT_U16 : VOPC_Real_gfx11<0x039>;		defm V_CMP_LT_U16_t16 : VOPC_Real_t16_gfx11<0x039, "v_cmp_lt_u16">;
defm V_CMP_EQ_U16 : VOPC_Real_gfx11<0x03a>;		defm V_CMP_EQ_U16_t16 : VOPC_Real_t16_gfx11<0x03a, "v_cmp_eq_u16">;
defm V_CMP_LE_U16 : VOPC_Real_gfx11<0x03b>;		defm V_CMP_LE_U16_t16 : VOPC_Real_t16_gfx11<0x03b, "v_cmp_le_u16">;
defm V_CMP_GT_U16 : VOPC_Real_gfx11<0x03c>;		defm V_CMP_GT_U16_t16 : VOPC_Real_t16_gfx11<0x03c, "v_cmp_gt_u16">;
defm V_CMP_NE_U16 : VOPC_Real_gfx11<0x03d>;		defm V_CMP_NE_U16_t16 : VOPC_Real_t16_gfx11<0x03d, "v_cmp_ne_u16">;
defm V_CMP_GE_U16 : VOPC_Real_gfx11<0x03e>;		defm V_CMP_GE_U16_t16 : VOPC_Real_t16_gfx11<0x03e, "v_cmp_ge_u16">;
defm V_CMP_F_I32 : VOPC_Real_gfx11<0x040>;		defm V_CMP_F_I32 : VOPC_Real_gfx11<0x040>;
defm V_CMP_LT_I32 : VOPC_Real_gfx11<0x041>;		defm V_CMP_LT_I32 : VOPC_Real_gfx11<0x041>;
defm V_CMP_EQ_I32 : VOPC_Real_gfx11<0x042>;		defm V_CMP_EQ_I32 : VOPC_Real_gfx11<0x042>;
defm V_CMP_LE_I32 : VOPC_Real_gfx11<0x043>;		defm V_CMP_LE_I32 : VOPC_Real_gfx11<0x043>;
defm V_CMP_GT_I32 : VOPC_Real_gfx11<0x044>;		defm V_CMP_GT_I32 : VOPC_Real_gfx11<0x044>;
defm V_CMP_NE_I32 : VOPC_Real_gfx11<0x045>;		defm V_CMP_NE_I32 : VOPC_Real_gfx11<0x045>;
defm V_CMP_GE_I32 : VOPC_Real_gfx11<0x046>;		defm V_CMP_GE_I32 : VOPC_Real_gfx11<0x046>;
defm V_CMP_T_I32 : VOPC_Real_gfx11<0x047>;		defm V_CMP_T_I32 : VOPC_Real_gfx11<0x047>;
[... 18 lines not shown ...]
defm V_CMP_LT_U64 : VOPC_Real_gfx11<0x059>;		defm V_CMP_LT_U64 : VOPC_Real_gfx11<0x059>;
defm V_CMP_EQ_U64 : VOPC_Real_gfx11<0x05a>;		defm V_CMP_EQ_U64 : VOPC_Real_gfx11<0x05a>;
defm V_CMP_LE_U64 : VOPC_Real_gfx11<0x05b>;		defm V_CMP_LE_U64 : VOPC_Real_gfx11<0x05b>;
defm V_CMP_GT_U64 : VOPC_Real_gfx11<0x05c>;		defm V_CMP_GT_U64 : VOPC_Real_gfx11<0x05c>;
defm V_CMP_NE_U64 : VOPC_Real_gfx11<0x05d>;		defm V_CMP_NE_U64 : VOPC_Real_gfx11<0x05d>;
defm V_CMP_GE_U64 : VOPC_Real_gfx11<0x05e>;		defm V_CMP_GE_U64 : VOPC_Real_gfx11<0x05e>;
defm V_CMP_T_U64 : VOPC_Real_gfx11<0x05f>;		defm V_CMP_T_U64 : VOPC_Real_gfx11<0x05f>;

defm V_CMP_CLASS_F16 : VOPC_Real_gfx11<0x07d>;		defm V_CMP_CLASS_F16_t16 : VOPC_Real_t16_gfx11<0x07d, "v_cmp_class_f16">;
defm V_CMP_CLASS_F32 : VOPC_Real_gfx11<0x07e>;		defm V_CMP_CLASS_F32 : VOPC_Real_gfx11<0x07e>;
defm V_CMP_CLASS_F64 : VOPC_Real_gfx11<0x07f>;		defm V_CMP_CLASS_F64 : VOPC_Real_gfx11<0x07f>;

defm V_CMPX_F_F16 : VOPCX_Real_gfx11<0x080>;		defm V_CMPX_F_F16_t16 : VOPCX_Real_t16_gfx11<0x080, "v_cmpx_f_f16">;
defm V_CMPX_LT_F16 : VOPCX_Real_gfx11<0x081>;		defm V_CMPX_LT_F16_t16 : VOPCX_Real_t16_gfx11<0x081, "v_cmpx_lt_f16">;
defm V_CMPX_EQ_F16 : VOPCX_Real_gfx11<0x082>;		defm V_CMPX_EQ_F16_t16 : VOPCX_Real_t16_gfx11<0x082, "v_cmpx_eq_f16">;
defm V_CMPX_LE_F16 : VOPCX_Real_gfx11<0x083>;		defm V_CMPX_LE_F16_t16 : VOPCX_Real_t16_gfx11<0x083, "v_cmpx_le_f16">;
defm V_CMPX_GT_F16 : VOPCX_Real_gfx11<0x084>;		defm V_CMPX_GT_F16_t16 : VOPCX_Real_t16_gfx11<0x084, "v_cmpx_gt_f16">;
defm V_CMPX_LG_F16 : VOPCX_Real_gfx11<0x085>;		defm V_CMPX_LG_F16_t16 : VOPCX_Real_t16_gfx11<0x085, "v_cmpx_lg_f16">;
defm V_CMPX_GE_F16 : VOPCX_Real_gfx11<0x086>;		defm V_CMPX_GE_F16_t16 : VOPCX_Real_t16_gfx11<0x086, "v_cmpx_ge_f16">;
defm V_CMPX_O_F16 : VOPCX_Real_gfx11<0x087>;		defm V_CMPX_O_F16_t16 : VOPCX_Real_t16_gfx11<0x087, "v_cmpx_o_f16">;
defm V_CMPX_U_F16 : VOPCX_Real_gfx11<0x088>;		defm V_CMPX_U_F16_t16 : VOPCX_Real_t16_gfx11<0x088, "v_cmpx_u_f16">;
defm V_CMPX_NGE_F16 : VOPCX_Real_gfx11<0x089>;		defm V_CMPX_NGE_F16_t16 : VOPCX_Real_t16_gfx11<0x089, "v_cmpx_nge_f16">;
defm V_CMPX_NLG_F16 : VOPCX_Real_gfx11<0x08a>;		defm V_CMPX_NLG_F16_t16 : VOPCX_Real_t16_gfx11<0x08a, "v_cmpx_nlg_f16">;
defm V_CMPX_NGT_F16 : VOPCX_Real_gfx11<0x08b>;		defm V_CMPX_NGT_F16_t16 : VOPCX_Real_t16_gfx11<0x08b, "v_cmpx_ngt_f16">;
defm V_CMPX_NLE_F16 : VOPCX_Real_gfx11<0x08c>;		defm V_CMPX_NLE_F16_t16 : VOPCX_Real_t16_gfx11<0x08c, "v_cmpx_nle_f16">;
defm V_CMPX_NEQ_F16 : VOPCX_Real_gfx11<0x08d>;		defm V_CMPX_NEQ_F16_t16 : VOPCX_Real_t16_gfx11<0x08d, "v_cmpx_neq_f16">;
defm V_CMPX_NLT_F16 : VOPCX_Real_gfx11<0x08e>;		defm V_CMPX_NLT_F16_t16 : VOPCX_Real_t16_gfx11<0x08e, "v_cmpx_nlt_f16">;
defm V_CMPX_T_F16 : VOPCX_Real_with_name_gfx11<0x08f, "V_CMPX_TRU_F16", "v_cmpx_t_f16">;		defm V_CMPX_T_F16_t16 : VOPCX_Real_with_name_gfx11<0x08f, "V_CMPX_TRU_F16_t16", "v_cmpx_t_f16", "v_cmpx_tru_f16">;
defm V_CMPX_F_F32 : VOPCX_Real_gfx11<0x090>;		defm V_CMPX_F_F32 : VOPCX_Real_gfx11<0x090>;
defm V_CMPX_LT_F32 : VOPCX_Real_gfx11<0x091>;		defm V_CMPX_LT_F32 : VOPCX_Real_gfx11<0x091>;
defm V_CMPX_EQ_F32 : VOPCX_Real_gfx11<0x092>;		defm V_CMPX_EQ_F32 : VOPCX_Real_gfx11<0x092>;
defm V_CMPX_LE_F32 : VOPCX_Real_gfx11<0x093>;		defm V_CMPX_LE_F32 : VOPCX_Real_gfx11<0x093>;
defm V_CMPX_GT_F32 : VOPCX_Real_gfx11<0x094>;		defm V_CMPX_GT_F32 : VOPCX_Real_gfx11<0x094>;
defm V_CMPX_LG_F32 : VOPCX_Real_gfx11<0x095>;		defm V_CMPX_LG_F32 : VOPCX_Real_gfx11<0x095>;
defm V_CMPX_GE_F32 : VOPCX_Real_gfx11<0x096>;		defm V_CMPX_GE_F32 : VOPCX_Real_gfx11<0x096>;
defm V_CMPX_O_F32 : VOPCX_Real_gfx11<0x097>;		defm V_CMPX_O_F32 : VOPCX_Real_gfx11<0x097>;
[... 18 lines not shown ...]
defm V_CMPX_NGE_F64 : VOPCX_Real_gfx11<0x0a9>;		defm V_CMPX_NGE_F64 : VOPCX_Real_gfx11<0x0a9>;
defm V_CMPX_NLG_F64 : VOPCX_Real_gfx11<0x0aa>;		defm V_CMPX_NLG_F64 : VOPCX_Real_gfx11<0x0aa>;
defm V_CMPX_NGT_F64 : VOPCX_Real_gfx11<0x0ab>;		defm V_CMPX_NGT_F64 : VOPCX_Real_gfx11<0x0ab>;
defm V_CMPX_NLE_F64 : VOPCX_Real_gfx11<0x0ac>;		defm V_CMPX_NLE_F64 : VOPCX_Real_gfx11<0x0ac>;
defm V_CMPX_NEQ_F64 : VOPCX_Real_gfx11<0x0ad>;		defm V_CMPX_NEQ_F64 : VOPCX_Real_gfx11<0x0ad>;
defm V_CMPX_NLT_F64 : VOPCX_Real_gfx11<0x0ae>;		defm V_CMPX_NLT_F64 : VOPCX_Real_gfx11<0x0ae>;
defm V_CMPX_T_F64 : VOPCX_Real_with_name_gfx11<0x0af, "V_CMPX_TRU_F64", "v_cmpx_t_f64">;		defm V_CMPX_T_F64 : VOPCX_Real_with_name_gfx11<0x0af, "V_CMPX_TRU_F64", "v_cmpx_t_f64">;

defm V_CMPX_LT_I16 : VOPCX_Real_gfx11<0x0b1>;		defm V_CMPX_LT_I16_t16 : VOPCX_Real_t16_gfx11<0x0b1, "v_cmpx_lt_i16">;
defm V_CMPX_EQ_I16 : VOPCX_Real_gfx11<0x0b2>;		defm V_CMPX_EQ_I16_t16 : VOPCX_Real_t16_gfx11<0x0b2, "v_cmpx_eq_i16">;
defm V_CMPX_LE_I16 : VOPCX_Real_gfx11<0x0b3>;		defm V_CMPX_LE_I16_t16 : VOPCX_Real_t16_gfx11<0x0b3, "v_cmpx_le_i16">;
defm V_CMPX_GT_I16 : VOPCX_Real_gfx11<0x0b4>;		defm V_CMPX_GT_I16_t16 : VOPCX_Real_t16_gfx11<0x0b4, "v_cmpx_gt_i16">;
defm V_CMPX_NE_I16 : VOPCX_Real_gfx11<0x0b5>;		defm V_CMPX_NE_I16_t16 : VOPCX_Real_t16_gfx11<0x0b5, "v_cmpx_ne_i16">;
defm V_CMPX_GE_I16 : VOPCX_Real_gfx11<0x0b6>;		defm V_CMPX_GE_I16_t16 : VOPCX_Real_t16_gfx11<0x0b6, "v_cmpx_ge_i16">;
defm V_CMPX_LT_U16 : VOPCX_Real_gfx11<0x0b9>;		defm V_CMPX_LT_U16_t16 : VOPCX_Real_t16_gfx11<0x0b9, "v_cmpx_lt_u16">;
defm V_CMPX_EQ_U16 : VOPCX_Real_gfx11<0x0ba>;		defm V_CMPX_EQ_U16_t16 : VOPCX_Real_t16_gfx11<0x0ba, "v_cmpx_eq_u16">;
defm V_CMPX_LE_U16 : VOPCX_Real_gfx11<0x0bb>;		defm V_CMPX_LE_U16_t16 : VOPCX_Real_t16_gfx11<0x0bb, "v_cmpx_le_u16">;
defm V_CMPX_GT_U16 : VOPCX_Real_gfx11<0x0bc>;		defm V_CMPX_GT_U16_t16 : VOPCX_Real_t16_gfx11<0x0bc, "v_cmpx_gt_u16">;
defm V_CMPX_NE_U16 : VOPCX_Real_gfx11<0x0bd>;		defm V_CMPX_NE_U16_t16 : VOPCX_Real_t16_gfx11<0x0bd, "v_cmpx_ne_u16">;
defm V_CMPX_GE_U16 : VOPCX_Real_gfx11<0x0be>;		defm V_CMPX_GE_U16_t16 : VOPCX_Real_t16_gfx11<0x0be, "v_cmpx_ge_u16">;
defm V_CMPX_F_I32 : VOPCX_Real_gfx11<0x0c0>;		defm V_CMPX_F_I32 : VOPCX_Real_gfx11<0x0c0>;
defm V_CMPX_LT_I32 : VOPCX_Real_gfx11<0x0c1>;		defm V_CMPX_LT_I32 : VOPCX_Real_gfx11<0x0c1>;
defm V_CMPX_EQ_I32 : VOPCX_Real_gfx11<0x0c2>;		defm V_CMPX_EQ_I32 : VOPCX_Real_gfx11<0x0c2>;
defm V_CMPX_LE_I32 : VOPCX_Real_gfx11<0x0c3>;		defm V_CMPX_LE_I32 : VOPCX_Real_gfx11<0x0c3>;
defm V_CMPX_GT_I32 : VOPCX_Real_gfx11<0x0c4>;		defm V_CMPX_GT_I32 : VOPCX_Real_gfx11<0x0c4>;
defm V_CMPX_NE_I32 : VOPCX_Real_gfx11<0x0c5>;		defm V_CMPX_NE_I32 : VOPCX_Real_gfx11<0x0c5>;
defm V_CMPX_GE_I32 : VOPCX_Real_gfx11<0x0c6>;		defm V_CMPX_GE_I32 : VOPCX_Real_gfx11<0x0c6>;
defm V_CMPX_T_I32 : VOPCX_Real_gfx11<0x0c7>;		defm V_CMPX_T_I32 : VOPCX_Real_gfx11<0x0c7>;
Show All 17 Lines
defm V_CMPX_F_U64 : VOPCX_Real_gfx11<0x0d8>;		defm V_CMPX_F_U64 : VOPCX_Real_gfx11<0x0d8>;
defm V_CMPX_LT_U64 : VOPCX_Real_gfx11<0x0d9>;		defm V_CMPX_LT_U64 : VOPCX_Real_gfx11<0x0d9>;
defm V_CMPX_EQ_U64 : VOPCX_Real_gfx11<0x0da>;		defm V_CMPX_EQ_U64 : VOPCX_Real_gfx11<0x0da>;
defm V_CMPX_LE_U64 : VOPCX_Real_gfx11<0x0db>;		defm V_CMPX_LE_U64 : VOPCX_Real_gfx11<0x0db>;
defm V_CMPX_GT_U64 : VOPCX_Real_gfx11<0x0dc>;		defm V_CMPX_GT_U64 : VOPCX_Real_gfx11<0x0dc>;
defm V_CMPX_NE_U64 : VOPCX_Real_gfx11<0x0dd>;		defm V_CMPX_NE_U64 : VOPCX_Real_gfx11<0x0dd>;
defm V_CMPX_GE_U64 : VOPCX_Real_gfx11<0x0de>;		defm V_CMPX_GE_U64 : VOPCX_Real_gfx11<0x0de>;
defm V_CMPX_T_U64 : VOPCX_Real_gfx11<0x0df>;		defm V_CMPX_T_U64 : VOPCX_Real_gfx11<0x0df>;
defm V_CMPX_CLASS_F16 : VOPCX_Real_gfx11<0x0fd>;		defm V_CMPX_CLASS_F16_t16 : VOPCX_Real_t16_gfx11<0x0fd, "v_cmpx_class_f16">;
defm V_CMPX_CLASS_F32 : VOPCX_Real_gfx11<0x0fe>;		defm V_CMPX_CLASS_F32 : VOPCX_Real_gfx11<0x0fe>;
defm V_CMPX_CLASS_F64 : VOPCX_Real_gfx11<0x0ff>;		defm V_CMPX_CLASS_F64 : VOPCX_Real_gfx11<0x0ff>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// GFX10.		// GFX10.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

let AssemblerPredicate = isGFX10Only in {		let AssemblerPredicate = isGFX10Only in {

llvm/lib/Target/AMDGPU/VOPInstructions.td

Show All 18 Lines	class LetDummies {
bit VOPAsmPrefer32Bit;		bit VOPAsmPrefer32Bit;
bit FPDPRounding;		bit FPDPRounding;
Predicate SubtargetPredicate;		Predicate SubtargetPredicate;
string Constraints;		string Constraints;
string DisableEncoding;		string DisableEncoding;
list<SchedReadWrite> SchedRW;		list<SchedReadWrite> SchedRW;
list<Register> Uses;		list<Register> Uses;
list<Register> Defs;		list<Register> Defs;
		list<Predicate> OtherPredicates;
		Predicate AssemblerPredicate;
		string DecoderNamespace;
}		}

class VOP <string opName> {		class VOP <string opName> {
string OpName = opName;		string OpName = opName;
}		}

// First 13 insts from VOPDY are also VOPDX. DOT2ACC_F32_BF16 is omitted		// First 13 insts from VOPDY are also VOPDX. DOT2ACC_F32_BF16 is omitted
defvar VOPDX_Max_Index = 12;		defvar VOPDX_Max_Index = 12;
Show All 21 Lines	class VOP_Pseudo <string opName, string suffix, VOPProfile P, dag outs, dag ins,
InstSI <outs, ins, asm, pattern>,		InstSI <outs, ins, asm, pattern>,
VOP <opName>,		VOP <opName>,
SIMCInstr <opName#suffix, SIEncodingFamily.NONE> {		SIMCInstr <opName#suffix, SIEncodingFamily.NONE> {
let isPseudo = 1;		let isPseudo = 1;
let isCodeGenOnly = 1;		let isCodeGenOnly = 1;
let UseNamedOperandTable = 1;		let UseNamedOperandTable = 1;

string Mnemonic = opName;		string Mnemonic = opName;
		Instruction Opcode = !cast<Instruction>(NAME);
		bit IsTrue16 = P.IsTrue16;
VOPProfile Pfl = P;		VOPProfile Pfl = P;

string AsmOperands;		string AsmOperands;
}		}

class VOP3Common <dag outs, dag ins, string asm = "",		class VOP3Common <dag outs, dag ins, string asm = "",
list<dag> pattern = [], bit HasMods = 0> :		list<dag> pattern = [], bit HasMods = 0> :
VOPAnyCommon <outs, ins, asm, pattern> {		VOPAnyCommon <outs, ins, asm, pattern> {
▲ Show 20 Lines • Show All 1,274 Lines • ▼ Show 20 Lines	let AssemblerPredicate = isGFX11Only,
multiclass VOP3_Real_with_name_gfx11<bits<10> op, string opName,		multiclass VOP3_Real_with_name_gfx11<bits<10> op, string opName,
string asmName, bit isSingle = 0> {		string asmName, bit isSingle = 0> {
defvar ps = !cast<VOP_Pseudo>(opName#"_e64");		defvar ps = !cast<VOP_Pseudo>(opName#"_e64");
let AsmString = asmName # ps.AsmOperands,		let AsmString = asmName # ps.AsmOperands,
IsSingle = !or(isSingle, ps.Pfl.IsSingle) in {		IsSingle = !or(isSingle, ps.Pfl.IsSingle) in {
foreach _ = BoolToList<ps.Pfl.HasOpSel>.ret in		foreach _ = BoolToList<ps.Pfl.HasOpSel>.ret in
def _e64_gfx11 :		def _e64_gfx11 :
VOP3_Real<ps, SIEncodingFamily.GFX11>,		VOP3_Real<ps, SIEncodingFamily.GFX11>,
VOP3OpSel_gfx11<op, ps.Pfl>,		VOP3OpSel_gfx11<op, ps.Pfl>;
MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>;
foreach _ = BoolToList<!not(ps.Pfl.HasOpSel)>.ret in		foreach _ = BoolToList<!not(ps.Pfl.HasOpSel)>.ret in
def _e64_gfx11 :		def _e64_gfx11 :
VOP3_Real<ps, SIEncodingFamily.GFX11>,		VOP3_Real<ps, SIEncodingFamily.GFX11>,
VOP3e_gfx11<op, ps.Pfl>,		VOP3e_gfx11<op, ps.Pfl>;
MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>;
}		}
		def _gfx11_VOP3_alias : MnemonicAlias<ps.Mnemonic, asmName>, Requires<[isGFX11Plus]>, LetDummies;
}		}
// for READLANE/WRITELANE		// for READLANE/WRITELANE
multiclass VOP3_Real_No_Suffix_gfx11<bits<10> op, string opName = NAME> {		multiclass VOP3_Real_No_Suffix_gfx11<bits<10> op, string opName = NAME> {
defvar ps = !cast<VOP_Pseudo>(opName);		defvar ps = !cast<VOP_Pseudo>(opName);
def _e64_gfx11 :		def _e64_gfx11 :
VOP3_Real<ps, SIEncodingFamily.GFX11>,		VOP3_Real<ps, SIEncodingFamily.GFX11>,
VOP3e_gfx11<op, ps.Pfl>;		VOP3e_gfx11<op, ps.Pfl>;
}		}
▲ Show 20 Lines • Show All 86 Lines • ▼ Show 20 Lines	multiclass VOP3_Realtriple_with_name_gfx11<bits<10> op, string opName,
VOP3_Real_with_name_gfx11<op, opName, asmName, isSingle>,		VOP3_Real_with_name_gfx11<op, opName, asmName, isSingle>,
VOP3_Real_dpp_with_name_gfx11<op, opName, asmName>,		VOP3_Real_dpp_with_name_gfx11<op, opName, asmName>,
VOP3_Real_dpp8_with_name_gfx11<op, opName, asmName>;		VOP3_Real_dpp8_with_name_gfx11<op, opName, asmName>;

multiclass VOP3Only_Realtriple_with_name_gfx11<bits<10> op, string opName,		multiclass VOP3Only_Realtriple_with_name_gfx11<bits<10> op, string opName,
string asmName> :		string asmName> :
VOP3_Realtriple_with_name_gfx11<op, opName, asmName, 1>;		VOP3_Realtriple_with_name_gfx11<op, opName, asmName, 1>;

		multiclass VOP3Only_Realtriple_t16_gfx11<bits<10> op, string asmName,
		string opName = NAME>
		: VOP3Only_Realtriple_with_name_gfx11<op, opName, asmName>;

multiclass VOP3be_Realtriple_gfx11<		multiclass VOP3be_Realtriple_gfx11<
bits<10> op, bit isSingle = 0, string opName = NAME,		bits<10> op, bit isSingle = 0, string opName = NAME,
string asmName = !cast<VOP_Pseudo>(opName#"_e64").Mnemonic> :		string asmName = !cast<VOP_Pseudo>(opName#"_e64").Mnemonic> :
VOP3be_Real_gfx11<op, opName, asmName, isSingle>,		VOP3be_Real_gfx11<op, opName, asmName, isSingle>,
VOP3be_Real_dpp_gfx11<op, opName, asmName>,		VOP3be_Real_dpp_gfx11<op, opName, asmName>,
VOP3be_Real_dpp8_gfx11<op, opName, asmName>;		VOP3be_Real_dpp8_gfx11<op, opName, asmName>;

multiclass VOP3beOnly_Realtriple_gfx11<bits<10> op> :		multiclass VOP3beOnly_Realtriple_gfx11<bits<10> op> :
Show All 26 Lines	class VOPC64Table <string Format> : GenericTable {
let Fields = ["Opcode"];		let Fields = ["Opcode"];

let PrimaryKey = ["Opcode"];		let PrimaryKey = ["Opcode"];
let PrimaryKeyName = "isVOPC64" # Format # "OpcodeHelper";		let PrimaryKeyName = "isVOPC64" # Format # "OpcodeHelper";
}		}

def VOPC64DPPTable : VOPC64Table<"DPP">;		def VOPC64DPPTable : VOPC64Table<"DPP">;
def VOPC64DPP8Table : VOPC64Table<"DPP8">;		def VOPC64DPP8Table : VOPC64Table<"DPP8">;

		def VOPTrue16Table : GenericTable {
		let FilterClass = "VOP_Pseudo";
		let CppTypeName = "VOPTrue16Info";
		let Fields = ["Opcode", "IsTrue16"];

		let PrimaryKey = ["Opcode"];
		let PrimaryKeyName = "getTrue16OpcodeHelper";
		}
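
Note on the `VOPTrue16Table` definition above: a `GenericTable` with `PrimaryKey = ["Opcode"]` and `PrimaryKeyName = "getTrue16OpcodeHelper"` asks TableGen's searchable-tables backend to emit a C++ lookup function keyed on the opcode. The following is only a hand-written sketch of roughly what such a lookup does, not the generated code. The struct layout, the opcode numbers, and the `isTrue16Inst` wrapper are illustrative assumptions that do not appear in this part of the diff.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Hypothetical stand-in for the record type named by CppTypeName.
struct VOPTrue16Info {
  uint16_t Opcode;
  bool IsTrue16;
};

// Made-up opcode numbers for illustration, sorted by the primary key.
static const VOPTrue16Info VOPTrue16Table[] = {
    {10, false},
    {11, true},
    {42, true},
};

// Binary search on the primary key; returns nullptr when the opcode
// has no entry in the table.
const VOPTrue16Info *getTrue16OpcodeHelper(unsigned Opcode) {
  const VOPTrue16Info *Begin = std::begin(VOPTrue16Table);
  const VOPTrue16Info *End = std::end(VOPTrue16Table);
  const VOPTrue16Info *It = std::lower_bound(
      Begin, End, Opcode,
      [](const VOPTrue16Info &Entry, unsigned Key) {
        return Entry.Opcode < Key;
      });
  return (It != End && It->Opcode == Opcode) ? It : nullptr;
}

// Hypothetical convenience wrapper: opcodes missing from the table are
// treated as not True16.
bool isTrue16Inst(unsigned Opcode) {
  const VOPTrue16Info *Info = getTrue16OpcodeHelper(Opcode);
  return Info != nullptr && Info->IsTrue16;
}
```

In this sketch a caller would test `isTrue16Inst(Opc)` before applying any True16-specific handling; an opcode absent from the table simply yields `false`.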

llvm/test/CodeGen/AMDGPU/GlobalISel/fma.ll

	; GFX10-NEXT: s_waitcnt_vscnt null, 0x0			; GFX10-NEXT: s_waitcnt_vscnt null, 0x0
	; GFX10-NEXT: v_fma_f16 v0, v0, v1, v2			; GFX10-NEXT: v_fma_f16 v0, v0, v1, v2
	; GFX10-NEXT: s_setpc_b64 s[30:31]			; GFX10-NEXT: s_setpc_b64 s[30:31]
	;			;
	; GFX11-LABEL: v_fma_f16:			; GFX11-LABEL: v_fma_f16:
	; GFX11: ; %bb.0:			; GFX11: ; %bb.0:
	; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)			; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	; GFX11-NEXT: s_waitcnt_vscnt null, 0x0			; GFX11-NEXT: s_waitcnt_vscnt null, 0x0
	; GFX11-NEXT: v_fma_f16 v0, v0, v1, v2			; GFX11-NEXT: v_fma_f16 v0, v1, v0, v2
	; GFX11-NEXT: s_setpc_b64 s[30:31]			; GFX11-NEXT: s_setpc_b64 s[30:31]
	%fma = call half @llvm.fma.f16(half %x, half %y, half %z)			%fma = call half @llvm.fma.f16(half %x, half %y, half %z)
	ret half %fma			ret half %fma
	}			}

	define <2 x half> @v_fma_v2f16(<2 x half> %x, <2 x half> %y, <2 x half> %z) {			define <2 x half> @v_fma_v2f16(<2 x half> %x, <2 x half> %y, <2 x half> %z) {
	; GFX6-LABEL: v_fma_v2f16:			; GFX6-LABEL: v_fma_v2f16:
	; GFX6: ; %bb.0:			; GFX6: ; %bb.0:

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir

Show First 20 Lines • Show All 102 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]
; GFX11-LABEL: name: ashr_s16_s16_vs		; GFX11-LABEL: name: ashr_s16_s16_vs
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_ASHRREV_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:sgpr(s32) = COPY $sgpr0		%1:sgpr(s32) = COPY $sgpr0
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:sgpr(s16) = G_TRUNC %1		%3:sgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_ASHR %2, %3		%4:vgpr(s16) = G_ASHR %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

▲ Show 20 Lines • Show All 75 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]
; GFX11-LABEL: name: ashr_s16_s16_vv		; GFX11-LABEL: name: ashr_s16_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_ASHRREV_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_ASHR %2, %3		%4:vgpr(s16) = G_ASHR %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

Show All 28 Lines	bb.0:
; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_ASHRREV_I16_e64_]], 0, 16, implicit $exec		; GFX10-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_ASHRREV_I16_e64_]], 0, 16, implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]
; GFX11-LABEL: name: ashr_s16_s16_vv_zext_to_s32		; GFX11-LABEL: name: ashr_s16_s16_vv_zext_to_s32
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_ASHRREV_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_ASHRREV_I16_e64_]], 0, 16, implicit $exec		; GFX11-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_ASHRREV_I16_t16_e64_]], 0, 16, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_ASHR %2, %3		%4:vgpr(s16) = G_ASHR %2, %3
%5:vgpr(s32) = G_ZEXT %4		%5:vgpr(s32) = G_ZEXT %4
S_ENDPGM 0, implicit %5		S_ENDPGM 0, implicit %5
▲ Show 20 Lines • Show All 181 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]
; GFX11-LABEL: name: ashr_s16_s16_sv		; GFX11-LABEL: name: ashr_s16_s16_sv
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_ASHRREV_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_ASHRREV_I16_t16_e64_]]
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s32) = COPY $vgpr0		%1:vgpr(s32) = COPY $vgpr0
%2:sgpr(s16) = G_TRUNC %0		%2:sgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_ASHR %2, %3		%4:vgpr(s16) = G_ASHR %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...


llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fcanonicalize.mir

Show All 33 Lines	bb.0:
; GFX10-NEXT: {{ $}}		; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec		; GFX10-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit %2		; GFX10-NEXT: S_ENDPGM 0, implicit %2
; GFX11-LABEL: name: fcanonicalize_f16_denorm		; GFX11-LABEL: name: fcanonicalize_f16_denorm
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_t16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %2		; GFX11-NEXT: S_ENDPGM 0, implicit %2
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s16) = G_FCANONICALIZE %1		%2:vgpr(s16) = G_FCANONICALIZE %1
S_ENDPGM 0, implicit %2		S_ENDPGM 0, implicit %2
...		...

---		---
Show All 26 Lines	bb.0:
; GFX10-NEXT: {{ $}}		; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec		; GFX10-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit %2		; GFX10-NEXT: S_ENDPGM 0, implicit %2
; GFX11-LABEL: name: fcanonicalize_f16_flush		; GFX11-LABEL: name: fcanonicalize_f16_flush
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_MAX_F16_t16_e64 0, [[COPY]], 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %2		; GFX11-NEXT: S_ENDPGM 0, implicit %2
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s16) = G_FCANONICALIZE %1		%2:vgpr(s16) = G_FCANONICALIZE %1
S_ENDPGM 0, implicit %2		S_ENDPGM 0, implicit %2
...		...

---		---

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fcmp.s16.mir

Show First 20 Lines • Show All 67 Lines • ▼ Show 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_EQ_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_EQ_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_oeq_s16_vv		; GFX11-LABEL: name: fcmp_oeq_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_EQ_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_EQ_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(oeq), %2, %3		%4:vcc(s1) = G_FCMP floatpred(oeq), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_ogt_s16_vv		; GFX11-LABEL: name: fcmp_ogt_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GT_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(ogt), %2, %3		%4:vcc(s1) = G_FCMP floatpred(ogt), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_oge_s16_vv		; GFX11-LABEL: name: fcmp_oge_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_GE_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(oge), %2, %3		%4:vcc(s1) = G_FCMP floatpred(oge), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_olt_s16_vv		; GFX11-LABEL: name: fcmp_olt_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LT_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(olt), %2, %3		%4:vcc(s1) = G_FCMP floatpred(olt), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_ole_s16_vv		; GFX11-LABEL: name: fcmp_ole_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LE_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(ole), %2, %3		%4:vcc(s1) = G_FCMP floatpred(ole), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 19 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_one_s16_vv		; GFX11-LABEL: name: fcmp_one_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(one), %2, %3		%4:vcc(s1) = G_FCMP floatpred(one), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_ord_s16_vv		; GFX11-LABEL: name: fcmp_ord_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_LG_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(one), %2, %3		%4:vcc(s1) = G_FCMP floatpred(one), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_U_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_U_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_uno_s16_vv		; GFX11-LABEL: name: fcmp_uno_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_U_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_U_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(uno), %2, %3		%4:vcc(s1) = G_FCMP floatpred(uno), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_ueq_s16_vv		; GFX11-LABEL: name: fcmp_ueq_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLG_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLG_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(ueq), %2, %3		%4:vcc(s1) = G_FCMP floatpred(ueq), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_ugt_s16_vv		; GFX11-LABEL: name: fcmp_ugt_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLE_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(ugt), %2, %3		%4:vcc(s1) = G_FCMP floatpred(ugt), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_uge_s16_vv		; GFX11-LABEL: name: fcmp_uge_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NLT_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(uge), %2, %3		%4:vcc(s1) = G_FCMP floatpred(uge), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_ult_s16_vv		; GFX11-LABEL: name: fcmp_ult_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGE_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGE_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(ult), %2, %3		%4:vcc(s1) = G_FCMP floatpred(ult), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_ule_s16_vv		; GFX11-LABEL: name: fcmp_ule_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGT_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NGT_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(ule), %2, %3		%4:vcc(s1) = G_FCMP floatpred(ule), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 20 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NEQ_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; WAVE32-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NEQ_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit %4		; WAVE32-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fcmp_une_s16_vv		; GFX11-LABEL: name: fcmp_une_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NEQ_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:sreg_32_xm0_xexec = nofpexcept V_CMP_NEQ_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_FCMP floatpred(une), %2, %3		%4:vcc(s1) = G_FCMP floatpred(une), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fmaxnum-ieee.s16.mir

Show All 19 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %4		; CHECK-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fmaxnum_ieee_f16_vv		; GFX11-LABEL: name: fmaxnum_ieee_f16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FMAXNUM_IEEE %2, %3		%4:vgpr(s16) = G_FMAXNUM_IEEE %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 14 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %5		; CHECK-NEXT: S_ENDPGM 0, implicit %5
; GFX11-LABEL: name: fmaxnum_ieee_f16_v_fneg_v		; GFX11-LABEL: name: fmaxnum_ieee_f16_v_fneg_v
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_t16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %5		; GFX11-NEXT: S_ENDPGM 0, implicit %5
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FNEG %3		%4:vgpr(s16) = G_FNEG %3
%5:vgpr(s16) = G_FMAXNUM_IEEE %2, %4		%5:vgpr(s16) = G_FMAXNUM_IEEE %2, %4
S_ENDPGM 0, implicit %5		S_ENDPGM 0, implicit %5
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fmaxnum.s16.mir

Show All 19 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %4		; CHECK-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fmaxnum_f16_vv		; GFX11-LABEL: name: fmaxnum_f16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MAX_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FMAXNUM %2, %3		%4:vgpr(s16) = G_FMAXNUM %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 14 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %5		; CHECK-NEXT: S_ENDPGM 0, implicit %5
; GFX11-LABEL: name: fmaxnum_f16_v_fneg_v		; GFX11-LABEL: name: fmaxnum_f16_v_fneg_v
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MAX_F16_t16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %5		; GFX11-NEXT: S_ENDPGM 0, implicit %5
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FNEG %3		%4:vgpr(s16) = G_FNEG %3
%5:vgpr(s16) = G_FMAXNUM %2, %4		%5:vgpr(s16) = G_FMAXNUM %2, %4
S_ENDPGM 0, implicit %5		S_ENDPGM 0, implicit %5
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fminnum-ieee.s16.mir

Show All 19 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %4		; CHECK-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fminnum_ieee_f16_vv		; GFX11-LABEL: name: fminnum_ieee_f16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FMINNUM_IEEE %2, %3		%4:vgpr(s16) = G_FMINNUM_IEEE %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 14 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %5		; CHECK-NEXT: S_ENDPGM 0, implicit %5
; GFX11-LABEL: name: fminnum_ieee_f16_v_fneg_v		; GFX11-LABEL: name: fminnum_ieee_f16_v_fneg_v
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_t16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %5		; GFX11-NEXT: S_ENDPGM 0, implicit %5
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FNEG %3		%4:vgpr(s16) = G_FNEG %3
%5:vgpr(s16) = G_FMINNUM_IEEE %2, %4		%5:vgpr(s16) = G_FMINNUM_IEEE %2, %4
S_ENDPGM 0, implicit %5		S_ENDPGM 0, implicit %5
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fminnum.s16.mir

Show All 19 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %4		; CHECK-NEXT: S_ENDPGM 0, implicit %4
; GFX11-LABEL: name: fminnum_f16_vv		; GFX11-LABEL: name: fminnum_f16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_MIN_F16_t16_e64 0, [[COPY]], 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %4		; GFX11-NEXT: S_ENDPGM 0, implicit %4
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FMINNUM %2, %3		%4:vgpr(s16) = G_FMINNUM %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...
Show All 14 Lines	bb.0:
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; CHECK-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; CHECK-NEXT: S_ENDPGM 0, implicit %5		; CHECK-NEXT: S_ENDPGM 0, implicit %5
; GFX11-LABEL: name: fminnum_f16_v_fneg_v		; GFX11-LABEL: name: fminnum_f16_v_fneg_v
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec		; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_MIN_F16_t16_e64 0, [[COPY]], 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %5		; GFX11-NEXT: S_ENDPGM 0, implicit %5
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_FNEG %3		%4:vgpr(s16) = G_FNEG %3
%5:vgpr(s16) = G_FMINNUM %2, %4		%5:vgpr(s16) = G_FMINNUM %2, %4
S_ENDPGM 0, implicit %5		S_ENDPGM 0, implicit %5
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fptosi.mir

body: |		body: |
bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptosi_s16_to_s32_vv		; GCN-LABEL: name: fptosi_s16_to_s32_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec
; GCN-NEXT: $vgpr0 = COPY %2		; GCN-NEXT: $vgpr0 = COPY %2
; VI-LABEL: name: fptosi_s16_to_s32_vv		; VI-LABEL: name: fptosi_s16_to_s32_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec
; VI-NEXT: $vgpr0 = COPY %2		; VI-NEXT: $vgpr0 = COPY %2
; GFX11-LABEL: name: fptosi_s16_to_s32_vv		; GFX11-LABEL: name: fptosi_s16_to_s32_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %2		; GFX11-NEXT: $vgpr0 = COPY %2
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOSI %1		%2:vgpr(s32) = G_FPTOSI %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

---		---
name: fptosi_s16_to_s32_vs		name: fptosi_s16_to_s32_vs
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $sgpr0		liveins: $sgpr0

; GCN-LABEL: name: fptosi_s16_to_s32_vs		; GCN-LABEL: name: fptosi_s16_to_s32_vs
; GCN: liveins: $sgpr0		; GCN: liveins: $sgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec
; GCN-NEXT: $vgpr0 = COPY %2		; GCN-NEXT: $vgpr0 = COPY %2
; VI-LABEL: name: fptosi_s16_to_s32_vs		; VI-LABEL: name: fptosi_s16_to_s32_vs
; VI: liveins: $sgpr0		; VI: liveins: $sgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec
; VI-NEXT: $vgpr0 = COPY %2		; VI-NEXT: $vgpr0 = COPY %2
; GFX11-LABEL: name: fptosi_s16_to_s32_vs		; GFX11-LABEL: name: fptosi_s16_to_s32_vs
; GFX11: liveins: $sgpr0		; GFX11: liveins: $sgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %3, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %2		; GFX11-NEXT: $vgpr0 = COPY %2
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:sgpr(s16) = G_TRUNC %0		%1:sgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOSI %1		%2:vgpr(s32) = G_FPTOSI %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

---		---
name: fptosi_s16_to_s32_fneg_vv		name: fptosi_s16_to_s32_fneg_vv
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptosi_s16_to_s32_fneg_vv		; GCN-LABEL: name: fptosi_s16_to_s32_fneg_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; GCN-NEXT: $vgpr0 = COPY %3		; GCN-NEXT: $vgpr0 = COPY %3
; VI-LABEL: name: fptosi_s16_to_s32_fneg_vv		; VI-LABEL: name: fptosi_s16_to_s32_fneg_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; VI-NEXT: $vgpr0 = COPY %3		; VI-NEXT: $vgpr0 = COPY %3
; GFX11-LABEL: name: fptosi_s16_to_s32_fneg_vv		; GFX11-LABEL: name: fptosi_s16_to_s32_fneg_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %3		; GFX11-NEXT: $vgpr0 = COPY %3
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s16) = G_FNEG %1		%2:vgpr(s16) = G_FNEG %1
%3:vgpr(s32) = G_FPTOSI %2		%3:vgpr(s32) = G_FPTOSI %2
$vgpr0 = COPY %3		$vgpr0 = COPY %3
...		...

---		---
name: fptosi_s16_to_s1_vv		name: fptosi_s16_to_s1_vv
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptosi_s16_to_s1_vv		; GCN-LABEL: name: fptosi_s16_to_s1_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; GCN-NEXT: S_ENDPGM 0, implicit %2		; GCN-NEXT: S_ENDPGM 0, implicit %2
; VI-LABEL: name: fptosi_s16_to_s1_vv		; VI-LABEL: name: fptosi_s16_to_s1_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit %2		; VI-NEXT: S_ENDPGM 0, implicit %2
; GFX11-LABEL: name: fptosi_s16_to_s1_vv		; GFX11-LABEL: name: fptosi_s16_to_s1_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %2		; GFX11-NEXT: S_ENDPGM 0, implicit %2
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOSI %1		%2:vgpr(s32) = G_FPTOSI %1
%3:vgpr(s1) = G_TRUNC %2		%3:vgpr(s1) = G_TRUNC %2
S_ENDPGM 0, implicit %3		S_ENDPGM 0, implicit %3
...		...

---		---
name: fptosi_s16_to_s1_vs		name: fptosi_s16_to_s1_vs
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $sgpr0		liveins: $sgpr0

; GCN-LABEL: name: fptosi_s16_to_s1_vs		; GCN-LABEL: name: fptosi_s16_to_s1_vs
; GCN: liveins: $sgpr0		; GCN: liveins: $sgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; GCN-NEXT: S_ENDPGM 0, implicit %2		; GCN-NEXT: S_ENDPGM 0, implicit %2
; VI-LABEL: name: fptosi_s16_to_s1_vs		; VI-LABEL: name: fptosi_s16_to_s1_vs
; VI: liveins: $sgpr0		; VI: liveins: $sgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit %2		; VI-NEXT: S_ENDPGM 0, implicit %2
; GFX11-LABEL: name: fptosi_s16_to_s1_vs		; GFX11-LABEL: name: fptosi_s16_to_s1_vs
; GFX11: liveins: $sgpr0		; GFX11: liveins: $sgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %4, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %2		; GFX11-NEXT: S_ENDPGM 0, implicit %2
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:sgpr(s16) = G_TRUNC %0		%1:sgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOSI %1		%2:vgpr(s32) = G_FPTOSI %1
%3:vgpr(s1) = G_TRUNC %2		%3:vgpr(s1) = G_TRUNC %2
S_ENDPGM 0, implicit %3		S_ENDPGM 0, implicit %3
...		...
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptosi_s16_to_s1_fneg_vv		; GCN-LABEL: name: fptosi_s16_to_s1_fneg_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GCN-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GCN-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %5, implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %5, implicit $mode, implicit $exec
; GCN-NEXT: S_ENDPGM 0, implicit %3		; GCN-NEXT: S_ENDPGM 0, implicit %3
; VI-LABEL: name: fptosi_s16_to_s1_fneg_vv		; VI-LABEL: name: fptosi_s16_to_s1_fneg_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; VI-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; VI-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %5, implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %5, implicit $mode, implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit %3		; VI-NEXT: S_ENDPGM 0, implicit %3
; GFX11-LABEL: name: fptosi_s16_to_s1_fneg_vv		; GFX11-LABEL: name: fptosi_s16_to_s1_fneg_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %5, implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_I32_F32_e32 %5, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %3		; GFX11-NEXT: S_ENDPGM 0, implicit %3
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s16) = G_FNEG %1		%2:vgpr(s16) = G_FNEG %1
%3:vgpr(s32) = G_FPTOSI %2		%3:vgpr(s32) = G_FPTOSI %2
%4:vgpr(s1) = G_TRUNC %3		%4:vgpr(s1) = G_TRUNC %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fptoui.mir

body: |		body: |
bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptoui_s16_to_s32_vv		; GCN-LABEL: name: fptoui_s16_to_s32_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec
; GCN-NEXT: $vgpr0 = COPY %2		; GCN-NEXT: $vgpr0 = COPY %2
; VI-LABEL: name: fptoui_s16_to_s32_vv		; VI-LABEL: name: fptoui_s16_to_s32_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec
; VI-NEXT: $vgpr0 = COPY %2		; VI-NEXT: $vgpr0 = COPY %2
; GFX11-LABEL: name: fptoui_s16_to_s32_vv		; GFX11-LABEL: name: fptoui_s16_to_s32_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %2		; GFX11-NEXT: $vgpr0 = COPY %2
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOUI %1		%2:vgpr(s32) = G_FPTOUI %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

---		---
name: fptoui_s16_to_s32_vs		name: fptoui_s16_to_s32_vs
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $sgpr0		liveins: $sgpr0

; GCN-LABEL: name: fptoui_s16_to_s32_vs		; GCN-LABEL: name: fptoui_s16_to_s32_vs
; GCN: liveins: $sgpr0		; GCN: liveins: $sgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec
; GCN-NEXT: $vgpr0 = COPY %2		; GCN-NEXT: $vgpr0 = COPY %2
; VI-LABEL: name: fptoui_s16_to_s32_vs		; VI-LABEL: name: fptoui_s16_to_s32_vs
; VI: liveins: $sgpr0		; VI: liveins: $sgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec
; VI-NEXT: $vgpr0 = COPY %2		; VI-NEXT: $vgpr0 = COPY %2
; GFX11-LABEL: name: fptoui_s16_to_s32_vs		; GFX11-LABEL: name: fptoui_s16_to_s32_vs
; GFX11: liveins: $sgpr0		; GFX11: liveins: $sgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %3, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %2		; GFX11-NEXT: $vgpr0 = COPY %2
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:sgpr(s16) = G_TRUNC %0		%1:sgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOUI %1		%2:vgpr(s32) = G_FPTOUI %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

---		---
name: fptoui_s16_to_s32_fneg_vv		name: fptoui_s16_to_s32_fneg_vv
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptoui_s16_to_s32_fneg_vv		; GCN-LABEL: name: fptoui_s16_to_s32_fneg_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; GCN-NEXT: $vgpr0 = COPY %3		; GCN-NEXT: $vgpr0 = COPY %3
; VI-LABEL: name: fptoui_s16_to_s32_fneg_vv		; VI-LABEL: name: fptoui_s16_to_s32_fneg_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; VI-NEXT: $vgpr0 = COPY %3		; VI-NEXT: $vgpr0 = COPY %3
; GFX11-LABEL: name: fptoui_s16_to_s32_fneg_vv		; GFX11-LABEL: name: fptoui_s16_to_s32_fneg_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %3		; GFX11-NEXT: $vgpr0 = COPY %3
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s16) = G_FNEG %1		%2:vgpr(s16) = G_FNEG %1
%3:vgpr(s32) = G_FPTOUI %2		%3:vgpr(s32) = G_FPTOUI %2
$vgpr0 = COPY %3		$vgpr0 = COPY %3
...		...

---		---
name: fptoui_s16_to_s1_vv		name: fptoui_s16_to_s1_vv
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptoui_s16_to_s1_vv		; GCN-LABEL: name: fptoui_s16_to_s1_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; GCN-NEXT: S_ENDPGM 0, implicit %2		; GCN-NEXT: S_ENDPGM 0, implicit %2
; VI-LABEL: name: fptoui_s16_to_s1_vv		; VI-LABEL: name: fptoui_s16_to_s1_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit %2		; VI-NEXT: S_ENDPGM 0, implicit %2
; GFX11-LABEL: name: fptoui_s16_to_s1_vv		; GFX11-LABEL: name: fptoui_s16_to_s1_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %2		; GFX11-NEXT: S_ENDPGM 0, implicit %2
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOUI %1		%2:vgpr(s32) = G_FPTOUI %1
%3:vgpr(s1) = G_TRUNC %2		%3:vgpr(s1) = G_TRUNC %2
S_ENDPGM 0, implicit %3		S_ENDPGM 0, implicit %3
...		...

---		---
name: fptoui_s16_to_s1_vs		name: fptoui_s16_to_s1_vs
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $sgpr0		liveins: $sgpr0

; GCN-LABEL: name: fptoui_s16_to_s1_vs		; GCN-LABEL: name: fptoui_s16_to_s1_vs
; GCN: liveins: $sgpr0		; GCN: liveins: $sgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GCN-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; GCN-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; GCN-NEXT: S_ENDPGM 0, implicit %2		; GCN-NEXT: S_ENDPGM 0, implicit %2
; VI-LABEL: name: fptoui_s16_to_s1_vs		; VI-LABEL: name: fptoui_s16_to_s1_vs
; VI: liveins: $sgpr0		; VI: liveins: $sgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; VI-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; VI-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit %2		; VI-NEXT: S_ENDPGM 0, implicit %2
; GFX11-LABEL: name: fptoui_s16_to_s1_vs		; GFX11-LABEL: name: fptoui_s16_to_s1_vs
; GFX11: liveins: $sgpr0		; GFX11: liveins: $sgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: %4:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec		; GFX11-NEXT: %2:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %4, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %2		; GFX11-NEXT: S_ENDPGM 0, implicit %2
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:sgpr(s16) = G_TRUNC %0		%1:sgpr(s16) = G_TRUNC %0
%2:vgpr(s32) = G_FPTOUI %1		%2:vgpr(s32) = G_FPTOUI %1
%3:vgpr(s1) = G_TRUNC %2		%3:vgpr(s1) = G_TRUNC %2
S_ENDPGM 0, implicit %3		S_ENDPGM 0, implicit %3
...		...
liveins: $vgpr0		liveins: $vgpr0

; GCN-LABEL: name: fptoui_s16_to_s1_fneg_vv		; GCN-LABEL: name: fptoui_s16_to_s1_fneg_vv
; GCN: liveins: $vgpr0		; GCN: liveins: $vgpr0
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GCN-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GCN-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GCN-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GCN-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %5, implicit $mode, implicit $exec		; GCN-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %5, implicit $mode, implicit $exec
; GCN-NEXT: S_ENDPGM 0, implicit %3		; GCN-NEXT: S_ENDPGM 0, implicit %3
; VI-LABEL: name: fptoui_s16_to_s1_fneg_vv		; VI-LABEL: name: fptoui_s16_to_s1_fneg_vv
; VI: liveins: $vgpr0		; VI: liveins: $vgpr0
; VI-NEXT: {{ $}}		; VI-NEXT: {{ $}}
; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; VI-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; VI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; VI-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; VI-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; VI-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %5, implicit $mode, implicit $exec		; VI-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %5, implicit $mode, implicit $exec
; VI-NEXT: S_ENDPGM 0, implicit %3		; VI-NEXT: S_ENDPGM 0, implicit %3
; GFX11-LABEL: name: fptoui_s16_to_s1_fneg_vv		; GFX11-LABEL: name: fptoui_s16_to_s1_fneg_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768		; GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 32768
; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_XOR_B32_e64_:%[0-9]+]]:vgpr_32 = V_XOR_B32_e64 [[S_MOV_B32_]], [[COPY]], implicit $exec
; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_e32 [[V_XOR_B32_e64_]], implicit $mode, implicit $exec		; GFX11-NEXT: %5:vgpr_32 = nofpexcept V_CVT_F32_F16_t16_e64 0, [[V_XOR_B32_e64_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %5, implicit $mode, implicit $exec		; GFX11-NEXT: %3:vgpr_32 = nofpexcept V_CVT_U32_F32_e32 %5, implicit $mode, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit %3		; GFX11-NEXT: S_ENDPGM 0, implicit %3
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_TRUNC %0		%1:vgpr(s16) = G_TRUNC %0
%2:vgpr(s16) = G_FNEG %1		%2:vgpr(s16) = G_FNEG %1
%3:vgpr(s32) = G_FPTOUI %2		%3:vgpr(s32) = G_FPTOUI %2
%4:vgpr(s1) = G_TRUNC %3		%4:vgpr(s1) = G_TRUNC %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-icmp.s16.mir

; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; WAVE32-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]
; GFX11-LABEL: name: icmp_eq_s16_sv		; GFX11-LABEL: name: icmp_eq_s16_sv
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_EQ_U16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_t16_e64_]]
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s32) = COPY $vgpr0		%1:vgpr(s32) = COPY $vgpr0
%2:sgpr(s16) = G_TRUNC %0		%2:sgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(eq), %2, %3		%4:vcc(s1) = G_ICMP intpred(eq), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

; WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; WAVE32-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]
; GFX11-LABEL: name: icmp_eq_s16_vs		; GFX11-LABEL: name: icmp_eq_s16_vs
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_EQ_U16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:sgpr(s32) = COPY $sgpr0		%1:sgpr(s32) = COPY $sgpr0
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:sgpr(s16) = G_TRUNC %1		%3:sgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(eq), %2, %3		%4:vcc(s1) = G_ICMP intpred(eq), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]
; GFX11-LABEL: name: icmp_eq_s16_vv		; GFX11-LABEL: name: icmp_eq_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_CMP_EQ_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_EQ_U16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_EQ_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(eq), %2, %3		%4:vcc(s1) = G_ICMP intpred(eq), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: [[V_CMP_NE_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_NE_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_NE_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_NE_U16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_NE_U16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_NE_U16_e64_]]
; GFX11-LABEL: name: icmp_ne_s16_vv		; GFX11-LABEL: name: icmp_ne_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_CMP_NE_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_NE_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_NE_U16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_NE_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_NE_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_NE_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(ne), %2, %3		%4:vcc(s1) = G_ICMP intpred(ne), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: [[V_CMP_LT_I16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_I16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_LT_I16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_I16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_I16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_I16_e64_]]
; GFX11-LABEL: name: icmp_slt_s16_vv		; GFX11-LABEL: name: icmp_slt_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_CMP_LT_I16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_I16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_LT_I16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_I16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_I16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(slt), %2, %3		%4:vcc(s1) = G_ICMP intpred(slt), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

Show All 21 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: [[V_CMP_LE_I16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_I16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_LE_I16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_I16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_I16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_I16_e64_]]
; GFX11-LABEL: name: icmp_sle_s16_vv		; GFX11-LABEL: name: icmp_sle_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_CMP_LE_I16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_I16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_LE_I16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_I16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_I16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(sle), %2, %3		%4:vcc(s1) = G_ICMP intpred(sle), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

Show All 21 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: [[V_CMP_LT_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_LT_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_U16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_U16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_U16_e64_]]
; GFX11-LABEL: name: icmp_ult_s16_vv		; GFX11-LABEL: name: icmp_ult_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_CMP_LT_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_LT_U16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LT_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LT_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(ult), %2, %3		%4:vcc(s1) = G_ICMP intpred(ult), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

Show All 21 Lines	bb.0:
; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; WAVE32-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; WAVE32-NEXT: [[V_CMP_LE_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; WAVE32-NEXT: [[V_CMP_LE_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_U16_e64 [[COPY]], [[COPY1]], implicit $exec
; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_U16_e64_]]		; WAVE32-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_U16_e64_]]
; GFX11-LABEL: name: icmp_ule_s16_vv		; GFX11-LABEL: name: icmp_ule_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_CMP_LE_U16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_CMP_LE_U16_t16_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_LE_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_CMP_LE_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vcc(s1) = G_ICMP intpred(ule), %2, %3		%4:vcc(s1) = G_ICMP intpred(ule), %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir

Show First 20 Lines • Show All 100 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]
; GFX11-LABEL: name: lshr_s16_s16_vs		; GFX11-LABEL: name: lshr_s16_s16_vs
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHRREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:sgpr(s32) = COPY $sgpr0		%1:sgpr(s32) = COPY $sgpr0
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:sgpr(s16) = G_TRUNC %1		%3:sgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_LSHR %2, %3		%4:vgpr(s16) = G_LSHR %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

▲ Show 20 Lines • Show All 75 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]
; GFX11-LABEL: name: lshr_s16_s16_vv		; GFX11-LABEL: name: lshr_s16_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHRREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_LSHR %2, %3		%4:vgpr(s16) = G_LSHR %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

Show All 28 Lines	bb.0:
; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHRREV_B16_e64_]], 0, 16, implicit $exec		; GFX10-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHRREV_B16_e64_]], 0, 16, implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]
; GFX11-LABEL: name: lshr_s16_s16_vv_zext_to_s32		; GFX11-LABEL: name: lshr_s16_s16_vv_zext_to_s32
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHRREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHRREV_B16_e64_]], 0, 16, implicit $exec		; GFX11-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHRREV_B16_t16_e64_]], 0, 16, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_LSHR %2, %3		%4:vgpr(s16) = G_LSHR %2, %3
%5:vgpr(s32) = G_ZEXT %4		%5:vgpr(s32) = G_ZEXT %4
S_ENDPGM 0, implicit %5		S_ENDPGM 0, implicit %5
▲ Show 20 Lines • Show All 181 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]
; GFX11-LABEL: name: lshr_s16_s16_sv		; GFX11-LABEL: name: lshr_s16_s16_sv
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHRREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHRREV_B16_t16_e64_]]
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s32) = COPY $vgpr0		%1:vgpr(s32) = COPY $vgpr0
%2:sgpr(s16) = G_TRUNC %0		%2:sgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_LSHR %2, %3		%4:vgpr(s16) = G_LSHR %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-smed3.s16.mir

Show First 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	bb.0:
; GFX9-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX9-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_e64_]]		; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_e64_]]
; GFX11-LABEL: name: smed3_s16_vvv_multiuse0		; GFX11-LABEL: name: smed3_s16_vvv_multiuse0
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2		; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2		; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MAX_I16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_I16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_MAX_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_I16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX11-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s32) = COPY $vgpr2		%2:vgpr(s32) = COPY $vgpr2
%3:vgpr(s16) = G_TRUNC %0		%3:vgpr(s16) = G_TRUNC %0
%4:vgpr(s16) = G_TRUNC %1		%4:vgpr(s16) = G_TRUNC %1
%5:vgpr(s16) = G_TRUNC %2		%5:vgpr(s16) = G_TRUNC %2

%6:vgpr(s16) = G_SMAX %3, %4		%6:vgpr(s16) = G_SMAX %3, %4
Show All 34 Lines	bb.0:
; GFX9-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX9-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MIN_I16_e64_]]		; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MIN_I16_e64_]]
; GFX11-LABEL: name: smed3_s16_vvv_multiuse1		; GFX11-LABEL: name: smed3_s16_vvv_multiuse1
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2		; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2		; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MIN_I16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_I16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_MIN_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_I16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX11-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MIN_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MIN_I16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s32) = COPY $vgpr2		%2:vgpr(s32) = COPY $vgpr2
%3:vgpr(s16) = G_TRUNC %0		%3:vgpr(s16) = G_TRUNC %0
%4:vgpr(s16) = G_TRUNC %1		%4:vgpr(s16) = G_TRUNC %1
%5:vgpr(s16) = G_TRUNC %2		%5:vgpr(s16) = G_TRUNC %2

%6:vgpr(s16) = G_SMAX %3, %4		%6:vgpr(s16) = G_SMAX %3, %4
Show All 35 Lines	bb.0:
; GFX9-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX9-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_e64_]]		; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_e64_]]
; GFX11-LABEL: name: smed3_s16_vvv_multiuse2		; GFX11-LABEL: name: smed3_s16_vvv_multiuse2
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2		; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2		; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MIN_I16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_I16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_MIN_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_I16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: [[V_MAX_I16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_I16_e64 [[V_MIN_I16_e64_]], [[COPY2]], implicit $exec		; GFX11-NEXT: [[V_MAX_I16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_I16_t16_e64 [[V_MIN_I16_t16_e64_]], [[COPY2]], implicit $exec
; GFX11-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX11-NEXT: [[V_MED3_I16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_I16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_I16_e64_]], implicit [[V_MAX_I16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s32) = COPY $vgpr2		%2:vgpr(s32) = COPY $vgpr2
%3:vgpr(s16) = G_TRUNC %0		%3:vgpr(s16) = G_TRUNC %0
%4:vgpr(s16) = G_TRUNC %1		%4:vgpr(s16) = G_TRUNC %1
%5:vgpr(s16) = G_TRUNC %2		%5:vgpr(s16) = G_TRUNC %2

%6:vgpr(s16) = G_SMAX %3, %4		%6:vgpr(s16) = G_SMAX %3, %4
%7:vgpr(s16) = G_SMIN %3, %4		%7:vgpr(s16) = G_SMIN %3, %4
%8:vgpr(s16) = G_SMAX %7, %5		%8:vgpr(s16) = G_SMAX %7, %5
%9:vgpr(s16) = G_SMIN %6, %8		%9:vgpr(s16) = G_SMIN %6, %8

S_ENDPGM 0, implicit %9, implicit %8		S_ENDPGM 0, implicit %9, implicit %8
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-umed3.s16.mir

Show First 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	bb.0:
; GFX9-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX9-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_e64_]]		; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_e64_]]
; GFX11-LABEL: name: umed3_s16_vvv_multiuse0		; GFX11-LABEL: name: umed3_s16_vvv_multiuse0
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2		; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2		; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MAX_U16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_MAX_U16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX11-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s32) = COPY $vgpr2		%2:vgpr(s32) = COPY $vgpr2
%3:vgpr(s16) = G_TRUNC %0		%3:vgpr(s16) = G_TRUNC %0
%4:vgpr(s16) = G_TRUNC %1		%4:vgpr(s16) = G_TRUNC %1
%5:vgpr(s16) = G_TRUNC %2		%5:vgpr(s16) = G_TRUNC %2

%6:vgpr(s16) = G_UMAX %3, %4		%6:vgpr(s16) = G_UMAX %3, %4
Show All 34 Lines	bb.0:
; GFX9-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX9-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MIN_U16_e64_]]		; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MIN_U16_e64_]]
; GFX11-LABEL: name: umed3_s16_vvv_multiuse1		; GFX11-LABEL: name: umed3_s16_vvv_multiuse1
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2		; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2		; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MIN_U16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_MIN_U16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX11-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MIN_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MIN_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s32) = COPY $vgpr2		%2:vgpr(s32) = COPY $vgpr2
%3:vgpr(s16) = G_TRUNC %0		%3:vgpr(s16) = G_TRUNC %0
%4:vgpr(s16) = G_TRUNC %1		%4:vgpr(s16) = G_TRUNC %1
%5:vgpr(s16) = G_TRUNC %2		%5:vgpr(s16) = G_TRUNC %2

%6:vgpr(s16) = G_UMAX %3, %4		%6:vgpr(s16) = G_UMAX %3, %4
Show All 35 Lines	bb.0:
; GFX9-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX9-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_e64_]]		; GFX9-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_e64_]]
; GFX11-LABEL: name: umed3_s16_vvv_multiuse2		; GFX11-LABEL: name: umed3_s16_vvv_multiuse2
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2		; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2		; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MIN_U16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_U16_e64 [[COPY]], [[COPY1]], implicit $exec		; GFX11-NEXT: [[V_MIN_U16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MIN_U16_t16_e64 [[COPY]], [[COPY1]], implicit $exec
; GFX11-NEXT: [[V_MAX_U16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_U16_e64 [[V_MIN_U16_e64_]], [[COPY2]], implicit $exec		; GFX11-NEXT: [[V_MAX_U16_t16_e64_:%[0-9]+]]:vgpr_32 = V_MAX_U16_t16_e64 [[V_MIN_U16_t16_e64_]], [[COPY2]], implicit $exec
; GFX11-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec		; GFX11-NEXT: [[V_MED3_U16_e64_:%[0-9]+]]:vgpr_32 = V_MED3_U16_e64 0, [[COPY]], 0, [[COPY1]], 0, [[COPY2]], 0, 0, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_MED3_U16_e64_]], implicit [[V_MAX_U16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s32) = COPY $vgpr2		%2:vgpr(s32) = COPY $vgpr2
%3:vgpr(s16) = G_TRUNC %0		%3:vgpr(s16) = G_TRUNC %0
%4:vgpr(s16) = G_TRUNC %1		%4:vgpr(s16) = G_TRUNC %1
%5:vgpr(s16) = G_TRUNC %2		%5:vgpr(s16) = G_TRUNC %2

%6:vgpr(s16) = G_UMAX %3, %4		%6:vgpr(s16) = G_UMAX %3, %4
%7:vgpr(s16) = G_UMIN %3, %4		%7:vgpr(s16) = G_UMIN %3, %4
%8:vgpr(s16) = G_UMAX %7, %5		%8:vgpr(s16) = G_UMAX %7, %5
%9:vgpr(s16) = G_UMIN %6, %8		%9:vgpr(s16) = G_UMIN %6, %8

S_ENDPGM 0, implicit %9, implicit %8		S_ENDPGM 0, implicit %9, implicit %8
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir

Show First 20 Lines • Show All 100 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]
; GFX11-LABEL: name: shl_s16_s16_vs		; GFX11-LABEL: name: shl_s16_s16_vs
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHLREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:sgpr(s32) = COPY $sgpr0		%1:sgpr(s32) = COPY $sgpr0
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:sgpr(s16) = G_TRUNC %1		%3:sgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_SHL %2, %3		%4:vgpr(s16) = G_SHL %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

▲ Show 20 Lines • Show All 75 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]
; GFX11-LABEL: name: shl_s16_s16_vv		; GFX11-LABEL: name: shl_s16_s16_vv
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHLREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_t16_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_SHL %2, %3		%4:vgpr(s16) = G_SHL %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...

Show All 28 Lines	bb.0:
; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHLREV_B16_e64_]], 0, 16, implicit $exec		; GFX10-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHLREV_B16_e64_]], 0, 16, implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]
; GFX11-LABEL: name: shl_s16_s16_vv_zext_to_s32		; GFX11-LABEL: name: shl_s16_s16_vv_zext_to_s32
; GFX11: liveins: $vgpr0, $vgpr1		; GFX11: liveins: $vgpr0, $vgpr1
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX11-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHLREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHLREV_B16_e64_]], 0, 16, implicit $exec		; GFX11-NEXT: [[V_BFE_U32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_U32_e64 [[V_LSHLREV_B16_t16_e64_]], 0, 16, implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_BFE_U32_e64_]]
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s32) = COPY $vgpr1		%1:vgpr(s32) = COPY $vgpr1
%2:vgpr(s16) = G_TRUNC %0		%2:vgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_SHL %2, %3		%4:vgpr(s16) = G_SHL %2, %3
%5:vgpr(s32) = G_ZEXT %4		%5:vgpr(s32) = G_ZEXT %4
S_ENDPGM 0, implicit %5		S_ENDPGM 0, implicit %5
▲ Show 20 Lines • Show All 181 Lines • ▼ Show 20 Lines	bb.0:
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX10-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]		; GFX10-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]
; GFX11-LABEL: name: shl_s16_s16_sv		; GFX11-LABEL: name: shl_s16_s16_sv
; GFX11: liveins: $sgpr0, $vgpr0		; GFX11: liveins: $sgpr0, $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec		; GFX11-NEXT: [[V_LSHLREV_B16_t16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_t16_e64 [[COPY1]], [[COPY]], implicit $exec
; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]]		; GFX11-NEXT: S_ENDPGM 0, implicit [[V_LSHLREV_B16_t16_e64_]]
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s32) = COPY $vgpr0		%1:vgpr(s32) = COPY $vgpr0
%2:sgpr(s16) = G_TRUNC %0		%2:sgpr(s16) = G_TRUNC %0
%3:vgpr(s16) = G_TRUNC %1		%3:vgpr(s16) = G_TRUNC %1
%4:vgpr(s16) = G_SHL %2, %3		%4:vgpr(s16) = G_SHL %2, %3
S_ENDPGM 0, implicit %4		S_ENDPGM 0, implicit %4
...		...


llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sitofp.mir

bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; WAVE64-LABEL: name: sitofp_s32_to_s16_vv		; WAVE64-LABEL: name: sitofp_s32_to_s16_vv
; WAVE64: liveins: $vgpr0		; WAVE64: liveins: $vgpr0
; WAVE64-NEXT: {{ $}}		; WAVE64-NEXT: {{ $}}
; WAVE64-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; WAVE64-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; WAVE64-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE64-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_I32_e32_]], implicit $mode, implicit $exec		; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_I32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE64-NEXT: $vgpr0 = COPY %1		; WAVE64-NEXT: $vgpr0 = COPY %1
; WAVE32-LABEL: name: sitofp_s32_to_s16_vv		; WAVE32-LABEL: name: sitofp_s32_to_s16_vv
; WAVE32: liveins: $vgpr0		; WAVE32: liveins: $vgpr0
; WAVE32-NEXT: {{ $}}		; WAVE32-NEXT: {{ $}}
; WAVE32-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; WAVE32-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; WAVE32-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE32-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_I32_e32_]], implicit $mode, implicit $exec		; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_I32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE32-NEXT: $vgpr0 = COPY %1		; WAVE32-NEXT: $vgpr0 = COPY %1
; GFX11-LABEL: name: sitofp_s32_to_s16_vv		; GFX11-LABEL: name: sitofp_s32_to_s16_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec
; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_I32_e32_]], implicit $mode, implicit $exec		; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_t16_e64 0, [[V_CVT_F32_I32_e32_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %1		; GFX11-NEXT: $vgpr0 = COPY %1
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
		foad: This is undesirable. We don't want to see _f128 classes here. We should be selecting the _e64 form of the V_CVT instruction instead. If you rebase on 3743f9afeb51e0b7bdf2269583f32b7e35369168 you will see the same problem for fptosi as well as sitofp. The fix is to change the patterns in SIInstructions.td to always use the _e64 forms: https://reviews.llvm.org/differential/diff/459699/ Pre-GFX11 this should be pretty harmless even if selecting the _e64 forms doesn't actually give any benefit.
		Joe_Nash (Author): I picked up those changes, thanks!
%1:vgpr(s16) = G_SITOFP %0		%1:vgpr(s16) = G_SITOFP %0
%2:vgpr(s32) = G_ANYEXT %1		%2:vgpr(s32) = G_ANYEXT %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

---		---
name: sitofp_s32_to_s16_vs		name: sitofp_s32_to_s16_vs
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $sgpr0		liveins: $sgpr0

; WAVE64-LABEL: name: sitofp_s32_to_s16_vs		; WAVE64-LABEL: name: sitofp_s32_to_s16_vs
; WAVE64: liveins: $sgpr0		; WAVE64: liveins: $sgpr0
; WAVE64-NEXT: {{ $}}		; WAVE64-NEXT: {{ $}}
; WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; WAVE64-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE64-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_I32_e32_]], implicit $mode, implicit $exec		; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_I32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE64-NEXT: $vgpr0 = COPY %1		; WAVE64-NEXT: $vgpr0 = COPY %1
; WAVE32-LABEL: name: sitofp_s32_to_s16_vs		; WAVE32-LABEL: name: sitofp_s32_to_s16_vs
; WAVE32: liveins: $sgpr0		; WAVE32: liveins: $sgpr0
; WAVE32-NEXT: {{ $}}		; WAVE32-NEXT: {{ $}}
; WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; WAVE32-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE32-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_I32_e32_]], implicit $mode, implicit $exec		; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_I32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE32-NEXT: $vgpr0 = COPY %1		; WAVE32-NEXT: $vgpr0 = COPY %1
; GFX11-LABEL: name: sitofp_s32_to_s16_vs		; GFX11-LABEL: name: sitofp_s32_to_s16_vs
; GFX11: liveins: $sgpr0		; GFX11: liveins: $sgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: [[V_CVT_F32_I32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_I32_e32 [[COPY]], implicit $mode, implicit $exec
; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_I32_e32_]], implicit $mode, implicit $exec		; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_t16_e64 0, [[V_CVT_F32_I32_e32_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %1		; GFX11-NEXT: $vgpr0 = COPY %1
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s16) = G_SITOFP %0		%1:vgpr(s16) = G_SITOFP %0
%2:vgpr(s32) = G_ANYEXT %1		%2:vgpr(s32) = G_ANYEXT %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-uitofp.mir

bb.0:		bb.0:
liveins: $vgpr0		liveins: $vgpr0

; WAVE64-LABEL: name: uitofp_s32_to_s16_vv		; WAVE64-LABEL: name: uitofp_s32_to_s16_vv
; WAVE64: liveins: $vgpr0		; WAVE64: liveins: $vgpr0
; WAVE64-NEXT: {{ $}}		; WAVE64-NEXT: {{ $}}
; WAVE64-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; WAVE64-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; WAVE64-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE64-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_U32_e32_]], implicit $mode, implicit $exec		; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_U32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE64-NEXT: $vgpr0 = COPY %1		; WAVE64-NEXT: $vgpr0 = COPY %1
; WAVE32-LABEL: name: uitofp_s32_to_s16_vv		; WAVE32-LABEL: name: uitofp_s32_to_s16_vv
; WAVE32: liveins: $vgpr0		; WAVE32: liveins: $vgpr0
; WAVE32-NEXT: {{ $}}		; WAVE32-NEXT: {{ $}}
; WAVE32-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; WAVE32-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; WAVE32-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE32-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_U32_e32_]], implicit $mode, implicit $exec		; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_U32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE32-NEXT: $vgpr0 = COPY %1		; WAVE32-NEXT: $vgpr0 = COPY %1
; GFX11-LABEL: name: uitofp_s32_to_s16_vv		; GFX11-LABEL: name: uitofp_s32_to_s16_vv
; GFX11: liveins: $vgpr0		; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX11-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec
; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_U32_e32_]], implicit $mode, implicit $exec		; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_t16_e64 0, [[V_CVT_F32_U32_e32_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %1		; GFX11-NEXT: $vgpr0 = COPY %1
%0:vgpr(s32) = COPY $vgpr0		%0:vgpr(s32) = COPY $vgpr0
%1:vgpr(s16) = G_UITOFP %0		%1:vgpr(s16) = G_UITOFP %0
%2:vgpr(s32) = G_ANYEXT %1		%2:vgpr(s32) = G_ANYEXT %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

---		---
name: uitofp_s32_to_s16_vs		name: uitofp_s32_to_s16_vs
legalized: true		legalized: true
regBankSelected: true		regBankSelected: true
tracksRegLiveness: true		tracksRegLiveness: true

body: |		body: |
bb.0:		bb.0:
liveins: $sgpr0		liveins: $sgpr0

; WAVE64-LABEL: name: uitofp_s32_to_s16_vs		; WAVE64-LABEL: name: uitofp_s32_to_s16_vs
; WAVE64: liveins: $sgpr0		; WAVE64: liveins: $sgpr0
; WAVE64-NEXT: {{ $}}		; WAVE64-NEXT: {{ $}}
; WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; WAVE64-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE64-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_U32_e32_]], implicit $mode, implicit $exec		; WAVE64-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_U32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE64-NEXT: $vgpr0 = COPY %1		; WAVE64-NEXT: $vgpr0 = COPY %1
; WAVE32-LABEL: name: uitofp_s32_to_s16_vs		; WAVE32-LABEL: name: uitofp_s32_to_s16_vs
; WAVE32: liveins: $sgpr0		; WAVE32: liveins: $sgpr0
; WAVE32-NEXT: {{ $}}		; WAVE32-NEXT: {{ $}}
; WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; WAVE32-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec		; WAVE32-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec
; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_U32_e32_]], implicit $mode, implicit $exec		; WAVE32-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e64 0, [[V_CVT_F32_U32_e32_]], 0, 0, implicit $mode, implicit $exec
; WAVE32-NEXT: $vgpr0 = COPY %1		; WAVE32-NEXT: $vgpr0 = COPY %1
; GFX11-LABEL: name: uitofp_s32_to_s16_vs		; GFX11-LABEL: name: uitofp_s32_to_s16_vs
; GFX11: liveins: $sgpr0		; GFX11: liveins: $sgpr0
; GFX11-NEXT: {{ $}}		; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0		; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX11-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec		; GFX11-NEXT: [[V_CVT_F32_U32_e32_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e32 [[COPY]], implicit $mode, implicit $exec
; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_e32 [[V_CVT_F32_U32_e32_]], implicit $mode, implicit $exec		; GFX11-NEXT: %1:vgpr_32 = nofpexcept V_CVT_F16_F32_t16_e64 0, [[V_CVT_F32_U32_e32_]], 0, 0, implicit $mode, implicit $exec
; GFX11-NEXT: $vgpr0 = COPY %1		; GFX11-NEXT: $vgpr0 = COPY %1
%0:sgpr(s32) = COPY $sgpr0		%0:sgpr(s32) = COPY $sgpr0
%1:vgpr(s16) = G_UITOFP %0		%1:vgpr(s16) = G_UITOFP %0
%2:vgpr(s32) = G_ANYEXT %1		%2:vgpr(s32) = G_ANYEXT %1
$vgpr0 = COPY %2		$vgpr0 = COPY %2
...		...

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll

; CHECK-NEXT: S_ENDPGM 0		; CHECK-NEXT: S_ENDPGM 0
call void asm sideeffect "; def a0", "~{a0}"(), !srcloc !0		call void asm sideeffect "; def a0", "~{a0}"(), !srcloc !0
ret void		ret void
}		}

define i32 @asm_vgpr_early_clobber() {		define i32 @asm_vgpr_early_clobber() {
; CHECK-LABEL: name: asm_vgpr_early_clobber		; CHECK-LABEL: name: asm_vgpr_early_clobber
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7; v_mov_b32 $1, 7", 1 /* sideeffect attdialect */, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %0, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %1, !0		; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7; v_mov_b32 $1, 7", 1 /* sideeffect attdialect */, 1966091 /* regdef-ec:VGPR_32 */, def early-clobber %0, 1966091 /* regdef-ec:VGPR_32 */, def early-clobber %1, !0
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %1
; CHECK-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY]], [[COPY1]]		; CHECK-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY]], [[COPY1]]
; CHECK-NEXT: $vgpr0 = COPY [[ADD]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[ADD]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
call { i32, i32 } asm sideeffect "v_mov_b32 $0, 7; v_mov_b32 $1, 7", "=&v,=&v"(), !srcloc !0		call { i32, i32 } asm sideeffect "v_mov_b32 $0, 7; v_mov_b32 $1, 7", "=&v,=&v"(), !srcloc !0
%asmresult = extractvalue { i32, i32 } %1, 0		%asmresult = extractvalue { i32, i32 } %1, 0
%asmresult1 = extractvalue { i32, i32 } %1, 1		%asmresult1 = extractvalue { i32, i32 } %1, 1
entry:		entry:
%0 = tail call i32 asm "v_mov_b32 v1, 7", "={v1}"() nounwind		%0 = tail call i32 asm "v_mov_b32 v1, 7", "={v1}"() nounwind
ret i32 %0		ret i32 %0
}		}

define i32 @test_single_vgpr_output() nounwind {		define i32 @test_single_vgpr_output() nounwind {
; CHECK-LABEL: name: test_single_vgpr_output		; CHECK-LABEL: name: test_single_vgpr_output
; CHECK: bb.1.entry:		; CHECK: bb.1.entry:
; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %0		; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7", 0 /* attdialect */, 1966090 /* regdef:VGPR_32 */, def %0
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0
; CHECK-NEXT: $vgpr0 = COPY [[COPY]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[COPY]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
entry:		entry:
%0 = tail call i32 asm "v_mov_b32 $0, 7", "=v"() nounwind		%0 = tail call i32 asm "v_mov_b32 $0, 7", "=v"() nounwind
ret i32 %0		ret i32 %0
}		}

define i32 @test_single_sgpr_output_s32() nounwind {		define i32 @test_single_sgpr_output_s32() nounwind {
; CHECK-LABEL: name: test_single_sgpr_output_s32		; CHECK-LABEL: name: test_single_sgpr_output_s32
; CHECK: bb.1.entry:		; CHECK: bb.1.entry:
; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %0		; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 2097162 /* regdef:SReg_32 */, def %0
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0
; CHECK-NEXT: $vgpr0 = COPY [[COPY]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[COPY]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
entry:		entry:
%0 = tail call i32 asm "s_mov_b32 $0, 7", "=s"() nounwind		%0 = tail call i32 asm "s_mov_b32 $0, 7", "=s"() nounwind
ret i32 %0		ret i32 %0
}		}

; Check support for returning several floats		; Check support for returning several floats
define float @test_multiple_register_outputs_same() #0 {		define float @test_multiple_register_outputs_same() #0 {
; CHECK-LABEL: name: test_multiple_register_outputs_same		; CHECK-LABEL: name: test_multiple_register_outputs_same
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_mov_b32 $1, 1", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %0, 1835018 /* regdef:VGPR_32 */, def %1		; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_mov_b32 $1, 1", 0 /* attdialect */, 1966090 /* regdef:VGPR_32 */, def %0, 1966090 /* regdef:VGPR_32 */, def %1
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %1
; CHECK-NEXT: [[FADD:%[0-9]+]]:_(s32) = G_FADD [[COPY]], [[COPY1]]		; CHECK-NEXT: [[FADD:%[0-9]+]]:_(s32) = G_FADD [[COPY]], [[COPY1]]
; CHECK-NEXT: $vgpr0 = COPY [[FADD]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[FADD]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
%1 = call { float, float } asm "v_mov_b32 $0, 0; v_mov_b32 $1, 1", "=v,=v"()		%1 = call { float, float } asm "v_mov_b32 $0, 0; v_mov_b32 $1, 1", "=v,=v"()
%asmresult = extractvalue { float, float } %1, 0		%asmresult = extractvalue { float, float } %1, 0
%asmresult1 = extractvalue { float, float } %1, 1		%asmresult1 = extractvalue { float, float } %1, 1
%add = fadd float %asmresult, %asmresult1		%add = fadd float %asmresult, %asmresult1
ret float %add		ret float %add
}		}

; Check support for returning several floats		; Check support for returning several floats
define double @test_multiple_register_outputs_mixed() #0 {		define double @test_multiple_register_outputs_mixed() #0 {
; CHECK-LABEL: name: test_multiple_register_outputs_mixed		; CHECK-LABEL: name: test_multiple_register_outputs_mixed
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_add_f64 $1, 0, 0", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %0, 2949130 /* regdef:VReg_64 */, def %1		; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_add_f64 $1, 0, 0", 0 /* attdialect */, 1966090 /* regdef:VGPR_32 */, def %0, 3211274 /* regdef:VReg_64 */, def %1
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY %1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY %1
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY1]](s64)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY1]](s64)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1
%1 = call { float, double } asm "v_mov_b32 $0, 0; v_add_f64 $1, 0, 0", "=v,=v"()		%1 = call { float, double } asm "v_mov_b32 $0, 0; v_add_f64 $1, 0, 0", "=v,=v"()
%asmresult = extractvalue { float, double } %1, 1		%asmresult = extractvalue { float, double } %1, 1
ret float %2		ret float %2
}		}

define amdgpu_kernel void @test_input_vgpr_imm() {		define amdgpu_kernel void @test_input_vgpr_imm() {
; CHECK-LABEL: name: test_input_vgpr_imm		; CHECK-LABEL: name: test_input_vgpr_imm
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 42		; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 42
; CHECK-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY [[C]](s32)		; CHECK-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY [[C]](s32)
; CHECK-NEXT: INLINEASM &"v_mov_b32 v0, $0", 1 /* sideeffect attdialect */, 1835017 /* reguse:VGPR_32 */, [[COPY]]		; CHECK-NEXT: INLINEASM &"v_mov_b32 v0, $0", 1 /* sideeffect attdialect */, 1966089 /* reguse:VGPR_32 */, [[COPY]]
; CHECK-NEXT: S_ENDPGM 0		; CHECK-NEXT: S_ENDPGM 0
call void asm sideeffect "v_mov_b32 v0, $0", "v"(i32 42)		call void asm sideeffect "v_mov_b32 v0, $0", "v"(i32 42)
ret void		ret void
}		}

define amdgpu_kernel void @test_input_sgpr_imm() {		define amdgpu_kernel void @test_input_sgpr_imm() {
; CHECK-LABEL: name: test_input_sgpr_imm		; CHECK-LABEL: name: test_input_sgpr_imm
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 42		; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 42
; CHECK-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY [[C]](s32)		; CHECK-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY [[C]](s32)
; CHECK-NEXT: INLINEASM &"s_mov_b32 s0, $0", 1 /* sideeffect attdialect */, 1966089 /* reguse:SReg_32 */, [[COPY]]		; CHECK-NEXT: INLINEASM &"s_mov_b32 s0, $0", 1 /* sideeffect attdialect */, 2097161 /* reguse:SReg_32 */, [[COPY]]
; CHECK-NEXT: S_ENDPGM 0		; CHECK-NEXT: S_ENDPGM 0
call void asm sideeffect "s_mov_b32 s0, $0", "s"(i32 42)		call void asm sideeffect "s_mov_b32 s0, $0", "s"(i32 42)
ret void		ret void
}		}

define amdgpu_kernel void @test_input_imm() {		define amdgpu_kernel void @test_input_imm() {
; CHECK-LABEL: name: test_input_imm		; CHECK-LABEL: name: test_input_imm
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: INLINEASM &"s_mov_b32 s0, $0", 9 /* sideeffect mayload attdialect */, 13 /* imm */, 42		; CHECK-NEXT: INLINEASM &"s_mov_b32 s0, $0", 9 /* sideeffect mayload attdialect */, 13 /* imm */, 42
; CHECK-NEXT: INLINEASM &"s_mov_b64 s[0:1], $0", 9 /* sideeffect mayload attdialect */, 13 /* imm */, 42		; CHECK-NEXT: INLINEASM &"s_mov_b64 s[0:1], $0", 9 /* sideeffect mayload attdialect */, 13 /* imm */, 42
; CHECK-NEXT: S_ENDPGM 0		; CHECK-NEXT: S_ENDPGM 0
call void asm sideeffect "s_mov_b32 s0, $0", "i"(i32 42)		call void asm sideeffect "s_mov_b32 s0, $0", "i"(i32 42)
call void asm sideeffect "s_mov_b64 s[0:1], $0", "i"(i64 42)		call void asm sideeffect "s_mov_b64 s[0:1], $0", "i"(i64 42)
ret void		ret void
}		}

define float @test_input_vgpr(i32 %src) nounwind {		define float @test_input_vgpr(i32 %src) nounwind {
; CHECK-LABEL: name: test_input_vgpr		; CHECK-LABEL: name: test_input_vgpr
; CHECK: bb.1.entry:		; CHECK: bb.1.entry:
; CHECK-NEXT: liveins: $vgpr0		; CHECK-NEXT: liveins: $vgpr0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)
; CHECK-NEXT: INLINEASM &"v_add_f32 $0, 1.0, $1", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %1, 1835017 /* reguse:VGPR_32 */, [[COPY1]]		; CHECK-NEXT: INLINEASM &"v_add_f32 $0, 1.0, $1", 0 /* attdialect */, 1966090 /* regdef:VGPR_32 */, def %1, 1966089 /* reguse:VGPR_32 */, [[COPY1]]
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %1		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %1
; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
entry:		entry:
%0 = tail call float asm "v_add_f32 $0, 1.0, $1", "=v,v"(i32 %src) nounwind		%0 = tail call float asm "v_add_f32 $0, 1.0, $1", "=v,v"(i32 %src) nounwind
ret float %0		ret float %0
}		}

define i32 @test_memory_constraint(i32 addrspace(3)* %a) nounwind {		define i32 @test_memory_constraint(i32 addrspace(3)* %a) nounwind {
; CHECK-LABEL: name: test_memory_constraint		; CHECK-LABEL: name: test_memory_constraint
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: liveins: $vgpr0		; CHECK-NEXT: liveins: $vgpr0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; CHECK-NEXT: INLINEASM &"ds_read_b32 $0, $1", 8 /* mayload attdialect */, 1835018 /* regdef:VGPR_32 */, def %1, 196622 /* mem:m */, [[COPY]](p3)		; CHECK-NEXT: INLINEASM &"ds_read_b32 $0, $1", 8 /* mayload attdialect */, 1966090 /* regdef:VGPR_32 */, def %1, 196622 /* mem:m */, [[COPY]](p3)
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %1
; CHECK-NEXT: $vgpr0 = COPY [[COPY1]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[COPY1]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
%1 = tail call i32 asm "ds_read_b32 $0, $1", "=v,m"(i32 addrspace(3) elementtype(i32) %a)		%1 = tail call i32 asm "ds_read_b32 $0, $1", "=v,m"(i32 addrspace(3) elementtype(i32) %a)
ret i32 %1		ret i32 %1
}		}

define i32 @test_vgpr_matching_constraint(i32 %a) nounwind {		define i32 @test_vgpr_matching_constraint(i32 %a) nounwind {
; CHECK-LABEL: name: test_vgpr_matching_constraint		; CHECK-LABEL: name: test_vgpr_matching_constraint
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: liveins: $vgpr0		; CHECK-NEXT: liveins: $vgpr0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1		; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]		; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[AND]](s32)		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[AND]](s32)
; CHECK-NEXT: INLINEASM &";", 1 /* sideeffect attdialect */, 1835018 /* regdef:VGPR_32 */, def %3, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)		; CHECK-NEXT: INLINEASM &";", 1 /* sideeffect attdialect */, 1966090 /* regdef:VGPR_32 */, def %3, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %3		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %3
; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
%and = and i32 %a, 1		%and = and i32 %a, 1
%asm = call i32 asm sideeffect ";", "=v,0"(i32 %and)		%asm = call i32 asm sideeffect ";", "=v,0"(i32 %and)
ret i32 %asm		ret i32 %asm
}		}

define i32 @test_sgpr_matching_constraint() nounwind {		define i32 @test_sgpr_matching_constraint() nounwind {
; CHECK-LABEL: name: test_sgpr_matching_constraint		; CHECK-LABEL: name: test_sgpr_matching_constraint
; CHECK: bb.1.entry:		; CHECK: bb.1.entry:
; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %0		; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 2097162 /* regdef:SReg_32 */, def %0
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0
; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 8", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %2		; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 8", 0 /* attdialect */, 2097162 /* regdef:SReg_32 */, def %2
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %2		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %2
; CHECK-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]](s32)		; CHECK-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]](s32)
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]](s32)		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]](s32)
; CHECK-NEXT: INLINEASM &"s_add_u32 $0, $1, $2", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %4, 1966089 /* reguse:SReg_32 */, [[COPY2]], 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3)		; CHECK-NEXT: INLINEASM &"s_add_u32 $0, $1, $2", 0 /* attdialect */, 2097162 /* regdef:SReg_32 */, def %4, 2097161 /* reguse:SReg_32 */, [[COPY2]], 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3)
; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY %4		; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY %4
; CHECK-NEXT: $vgpr0 = COPY [[COPY4]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[COPY4]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
entry:		entry:
%asm0 = tail call i32 asm "s_mov_b32 $0, 7", "=s"() nounwind		%asm0 = tail call i32 asm "s_mov_b32 $0, 7", "=s"() nounwind
%asm1 = tail call i32 asm "s_mov_b32 $0, 8", "=s"() nounwind		%asm1 = tail call i32 asm "s_mov_b32 $0, 8", "=s"() nounwind
%asm2 = tail call i32 asm "s_add_u32 $0, $1, $2", "=s,s,0"(i32 %asm0, i32 %asm1) nounwind		%asm2 = tail call i32 asm "s_add_u32 $0, $1, $2", "=s,s,0"(i32 %asm0, i32 %asm1) nounwind
ret i32 %asm2		ret i32 %asm2
}		}

define void @test_many_matching_constraints(i32 %a, i32 %b, i32 %c) nounwind {		define void @test_many_matching_constraints(i32 %a, i32 %b, i32 %c) nounwind {
; CHECK-LABEL: name: test_many_matching_constraints		; CHECK-LABEL: name: test_many_matching_constraints
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: liveins: $vgpr0, $vgpr1, $vgpr2		; CHECK-NEXT: liveins: $vgpr0, $vgpr1, $vgpr2
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF
; CHECK-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]](s32)		; CHECK-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]](s32)
; CHECK-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)		; CHECK-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)
; CHECK-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]](s32)		; CHECK-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]](s32)
; CHECK-NEXT: INLINEASM &"; ", 1 /* sideeffect attdialect */, 1835018 /* regdef:VGPR_32 */, def %3, 1835018 /* regdef:VGPR_32 */, def %4, 1835018 /* regdef:VGPR_32 */, def %5, 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3), 2147614729 /* reguse tiedto:$2 */, [[COPY4]](tied-def 7), 2147549193 /* reguse tiedto:$1 */, [[COPY5]](tied-def 5)		; CHECK-NEXT: INLINEASM &"; ", 1 /* sideeffect attdialect */, 1966090 /* regdef:VGPR_32 */, def %3, 1966090 /* regdef:VGPR_32 */, def %4, 1966090 /* regdef:VGPR_32 */, def %5, 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3), 2147614729 /* reguse tiedto:$2 */, [[COPY4]](tied-def 7), 2147549193 /* reguse tiedto:$1 */, [[COPY5]](tied-def 5)
; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY %3		; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY %3
; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY %4		; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY %4
; CHECK-NEXT: [[COPY8:%[0-9]+]]:_(s32) = COPY %5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:_(s32) = COPY %5
; CHECK-NEXT: G_STORE [[COPY6]](s32), [[DEF]](p1) :: (store (s32) into `i32 addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: G_STORE [[COPY6]](s32), [[DEF]](p1) :: (store (s32) into `i32 addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: G_STORE [[COPY7]](s32), [[DEF]](p1) :: (store (s32) into `i32 addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: G_STORE [[COPY7]](s32), [[DEF]](p1) :: (store (s32) into `i32 addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: G_STORE [[COPY8]](s32), [[DEF]](p1) :: (store (s32) into `i32 addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: G_STORE [[COPY8]](s32), [[DEF]](p1) :: (store (s32) into `i32 addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: SI_RETURN		; CHECK-NEXT: SI_RETURN
%asm = call {i32, i32, i32} asm sideeffect "; ", "=v,=v,=v,0,2,1"(i32 %c, i32 %a, i32 %b)		%asm = call {i32, i32, i32} asm sideeffect "; ", "=v,=v,=v,0,2,1"(i32 %c, i32 %a, i32 %b)
%asmresult0 = extractvalue {i32, i32, i32} %asm, 0		%asmresult0 = extractvalue {i32, i32, i32} %asm, 0
store i32 %asmresult0, i32 addrspace(1)* undef		store i32 %asmresult0, i32 addrspace(1)* undef
%asmresult1 = extractvalue {i32, i32, i32} %asm, 1		%asmresult1 = extractvalue {i32, i32, i32} %asm, 1
store i32 %asmresult1, i32 addrspace(1)* undef		store i32 %asmresult1, i32 addrspace(1)* undef
%asmresult2 = extractvalue {i32, i32, i32} %asm, 2		%asmresult2 = extractvalue {i32, i32, i32} %asm, 2
store i32 %asmresult2, i32 addrspace(1)* undef		store i32 %asmresult2, i32 addrspace(1)* undef
ret void		ret void
}		}

define i32 @test_sgpr_to_vgpr_move_matching_constraint() nounwind {		define i32 @test_sgpr_to_vgpr_move_matching_constraint() nounwind {
; CHECK-LABEL: name: test_sgpr_to_vgpr_move_matching_constraint		; CHECK-LABEL: name: test_sgpr_to_vgpr_move_matching_constraint
; CHECK: bb.1.entry:		; CHECK: bb.1.entry:
; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %0		; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 2097162 /* regdef:SReg_32 */, def %0
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)
; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, $1", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %2, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)		; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, $1", 0 /* attdialect */, 1966090 /* regdef:VGPR_32 */, def %2, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %2		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %2
; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0		; CHECK-NEXT: SI_RETURN implicit $vgpr0
entry:		entry:
%asm0 = tail call i32 asm "s_mov_b32 $0, 7", "=s"() nounwind		%asm0 = tail call i32 asm "s_mov_b32 $0, 7", "=s"() nounwind
%asm1 = tail call i32 asm "v_mov_b32 $0, $1", "=v,0"(i32 %asm0) nounwind		%asm1 = tail call i32 asm "v_mov_b32 $0, $1", "=v,0"(i32 %asm0) nounwind
ret i32 %asm1		ret i32 %asm1
}		}
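
The large integers that change in the INLINEASM CHECK lines above are inline-asm flag words, not arbitrary constants. The following is a minimal sketch for decoding them. It assumes the flag-word layout LLVM's `InlineAsm` used around the time of this diff: operand kind in the low 3 bits, operand count in bits 3-15, and in bits 16-30 either the register class ID plus one or, when bit 31 is set, the index of the tied operand. The helper name `decode_flag_word` and the kind labels are illustrative, not part of LLVM.

```python
# Hypothetical helper; the bit layout is an assumption based on LLVM's
# InlineAsm flag-word encoding at the time of this review.
FLAG_MATCHING_OPERAND = 0x80000000
KIND_NAMES = {
    1: "reguse",
    2: "regdef",
    3: "regdef-ec",
    4: "clobber",
    5: "imm",
    6: "mem",
}


def decode_flag_word(flag):
    """Split an inline-asm flag word into its fields."""
    kind = flag & 0x7                 # bits 0-2: operand kind
    num_ops = (flag >> 3) & 0x1FFF    # bits 3-15: number of operands
    high = (flag >> 16) & 0x7FFF      # bits 16-30: class ID + 1, or tied index
    info = {"kind": KIND_NAMES.get(kind, "unknown"), "num_operands": num_ops}
    if flag & FLAG_MATCHING_OPERAND:
        info["tied_to"] = high
    elif high:
        info["reg_class_id"] = high - 1
    return info


if __name__ == "__main__":
    for word in (1835018, 1966090, 1835019, 2147614729):
        print(word, decode_flag_word(word))
```

Under that assumption, 1835018 and 1966090 differ only in the register class field (27 versus 29), which matches what one would expect when new register classes are defined ahead of VGPR_32 and shift its numeric ID.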

llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-vgpr-limit.ll

	; GFX9: NumVgprs: 256			; GFX9: NumVgprs: 256
	; GFX90A: NumVgprs: 256			; GFX90A: NumVgprs: 256
	; GFX90A: NumAgprs: 0			; GFX90A: NumAgprs: 0
	; GFX90A: TotalNumVgprs: 256			; GFX90A: TotalNumVgprs: 256
	; GFX10WGP-WAVE32: NumVgprs: 256			; GFX10WGP-WAVE32: NumVgprs: 256
	; GFX10WGP-WAVE64: NumVgprs: 256			; GFX10WGP-WAVE64: NumVgprs: 256
	; GFX10CU-WAVE32: NumVgprs: 256			; GFX10CU-WAVE32: NumVgprs: 256
	; GFX10CU-WAVE64: NumVgprs: 256			; GFX10CU-WAVE64: NumVgprs: 256
	; GFX11WGP-WAVE32: NumVgprs: 128			; GFX11WGP-WAVE32: NumVgprs: 256
	; GFX11WGP-WAVE64: NumVgprs: 128			; GFX11WGP-WAVE64: NumVgprs: 256
	; GFX11CU-WAVE32: NumVgprs: 128			; GFX11CU-WAVE32: NumVgprs: 256
	; GFX11CU-WAVE64: NumVgprs: 128			; GFX11CU-WAVE64: NumVgprs: 256
	define amdgpu_kernel void @f256() #256 {			define amdgpu_kernel void @f256() #256 {
	call void @use256vgprs()			call void @use256vgprs()
	ret void			ret void
	}			}
	attributes #256 = { nounwind "amdgpu-flat-work-group-size"="256,256" }			attributes #256 = { nounwind "amdgpu-flat-work-group-size"="256,256" }

	; GCN-LABEL: {{^}}f512:			; GCN-LABEL: {{^}}f512:
	; GFX9: NumVgprs: 128			; GFX9: NumVgprs: 128
	; GFX90A: NumVgprs: 128			; GFX90A: NumVgprs: 128
	; GFX90A: NumAgprs: 128			; GFX90A: NumAgprs: 128
	; GFX90A: TotalNumVgprs: 256			; GFX90A: TotalNumVgprs: 256
	; GFX10WGP-WAVE32: NumVgprs: 256			; GFX10WGP-WAVE32: NumVgprs: 256
	; GFX10WGP-WAVE64: NumVgprs: 256			; GFX10WGP-WAVE64: NumVgprs: 256
	; GFX10CU-WAVE32: NumVgprs: 128			; GFX10CU-WAVE32: NumVgprs: 128
	; GFX10CU-WAVE64: NumVgprs: 128			; GFX10CU-WAVE64: NumVgprs: 128
	; GFX11WGP-WAVE32: NumVgprs: 128			; GFX11WGP-WAVE32: NumVgprs: 256
	; GFX11WGP-WAVE64: NumVgprs: 128			; GFX11WGP-WAVE64: NumVgprs: 256
	; GFX11CU-WAVE32: NumVgprs: 128			; GFX11CU-WAVE32: NumVgprs: 128
	; GFX11CU-WAVE64: NumVgprs: 128			; GFX11CU-WAVE64: NumVgprs: 128
	define amdgpu_kernel void @f512() #512 {			define amdgpu_kernel void @f512() #512 {
	call void @foo()			call void @foo()
	call void @use256vgprs()			call void @use256vgprs()
	ret void			ret void
	}			}
	attributes #512 = { nounwind "amdgpu-flat-work-group-size"="512,512" }			attributes #512 = { nounwind "amdgpu-flat-work-group-size"="512,512" }

llvm/test/CodeGen/AMDGPU/coalescer-early-clobber-subreg.mir

	tracksRegLiveness: true			tracksRegLiveness: true
	body: |			body: |
	bb.0:			bb.0:
	liveins: $vgpr0_vgpr1			liveins: $vgpr0_vgpr1

	; CHECK-LABEL: name: foo1			; CHECK-LABEL: name: foo1
	; CHECK: liveins: $vgpr0_vgpr1			; CHECK: liveins: $vgpr0_vgpr1
	; CHECK-NEXT: {{ $}}			; CHECK-NEXT: {{ $}}
	; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def undef %2.sub0, 1835019 /* regdef-ec:VGPR_32 */, def undef early-clobber %2.sub1			; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VRegOrLds_32 */, def undef %2.sub0, 1835019 /* regdef-ec:VRegOrLds_32 */, def undef early-clobber %2.sub1
	; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	; CHECK-NEXT: S_ENDPGM 0			; CHECK-NEXT: S_ENDPGM 0
	INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %0:vgpr_32, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %1:vgpr_32			INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %0:vgpr_32, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %1:vgpr_32
	undef %2.sub0:vreg_64 = COPY killed %0			undef %2.sub0:vreg_64 = COPY killed %0
	%2.sub1:vreg_64 = COPY killed %1			%2.sub1:vreg_64 = COPY killed %1
	FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	S_ENDPGM 0			S_ENDPGM 0

	...			...

	---			---
	name: foo2			name: foo2
	tracksRegLiveness: true			tracksRegLiveness: true
	body: |			body: |
	bb.0:			bb.0:
	liveins: $vgpr0_vgpr1			liveins: $vgpr0_vgpr1

	; CHECK-LABEL: name: foo2			; CHECK-LABEL: name: foo2
	; CHECK: liveins: $vgpr0_vgpr1			; CHECK: liveins: $vgpr0_vgpr1
	; CHECK-NEXT: {{ $}}			; CHECK-NEXT: {{ $}}
	; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VGPR_32 */, def undef early-clobber %2.sub1, 1835018 /* regdef:VGPR_32 */, def undef %2.sub0			; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VRegOrLds_32 */, def undef early-clobber %2.sub1, 1835018 /* regdef:VRegOrLds_32 */, def undef %2.sub0
	; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	; CHECK-NEXT: S_ENDPGM 0			; CHECK-NEXT: S_ENDPGM 0
	INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %1:vgpr_32, 1835018 /* regdef:VGPR_32 */, def %0:vgpr_32			INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %1:vgpr_32, 1835018 /* regdef:VGPR_32 */, def %0:vgpr_32
	undef %2.sub0:vreg_64 = COPY killed %0			undef %2.sub0:vreg_64 = COPY killed %0
	%2.sub1:vreg_64 = COPY killed %1			%2.sub1:vreg_64 = COPY killed %1
	FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	S_ENDPGM 0			S_ENDPGM 0

	...			...

	---			---
	name: foo3			name: foo3
	tracksRegLiveness: true			tracksRegLiveness: true
	body: |			body: |
	bb.0:			bb.0:
	liveins: $vgpr0_vgpr1			liveins: $vgpr0_vgpr1

	; CHECK-LABEL: name: foo3			; CHECK-LABEL: name: foo3
	; CHECK: liveins: $vgpr0_vgpr1			; CHECK: liveins: $vgpr0_vgpr1
	; CHECK-NEXT: {{ $}}			; CHECK-NEXT: {{ $}}
	; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def undef %2.sub0, 1835019 /* regdef-ec:VGPR_32 */, def undef early-clobber %2.sub1			; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VRegOrLds_32 */, def undef %2.sub0, 1835019 /* regdef-ec:VRegOrLds_32 */, def undef early-clobber %2.sub1
	; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	; CHECK-NEXT: S_ENDPGM 0			; CHECK-NEXT: S_ENDPGM 0
	INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %1:vgpr_32, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %0:vgpr_32			INLINEASM &"", 0 /* attdialect */, 1835018 /* regdef:VGPR_32 */, def %1:vgpr_32, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %0:vgpr_32
	undef %2.sub0:vreg_64 = COPY killed %1			undef %2.sub0:vreg_64 = COPY killed %1
	%2.sub1:vreg_64 = COPY killed %0			%2.sub1:vreg_64 = COPY killed %0
	FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	S_ENDPGM 0			S_ENDPGM 0

	...			...

	---			---
	name: foo4			name: foo4
	tracksRegLiveness: true			tracksRegLiveness: true
	body: |			body: |
	bb.0:			bb.0:
	liveins: $vgpr0_vgpr1			liveins: $vgpr0_vgpr1

	; CHECK-LABEL: name: foo4			; CHECK-LABEL: name: foo4
	; CHECK: liveins: $vgpr0_vgpr1			; CHECK: liveins: $vgpr0_vgpr1
	; CHECK-NEXT: {{ $}}			; CHECK-NEXT: {{ $}}
	; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VGPR_32 */, def undef early-clobber %2.sub1, 1835018 /* regdef:VGPR_32 */, def undef %2.sub0			; CHECK-NEXT: INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VRegOrLds_32 */, def undef early-clobber %2.sub1, 1835018 /* regdef:VRegOrLds_32 */, def undef %2.sub0
	; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			; CHECK-NEXT: FLAT_STORE_DWORDX2 $vgpr0_vgpr1, %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	; CHECK-NEXT: S_ENDPGM 0			; CHECK-NEXT: S_ENDPGM 0
	INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %0:vgpr_32, 1835018 /* regdef:VGPR_32 */, def %1:vgpr_32			INLINEASM &"", 0 /* attdialect */, 1835019 /* regdef-ec:VGPR_32 */, def early-clobber %0:vgpr_32, 1835018 /* regdef:VGPR_32 */, def %1:vgpr_32
	undef %2.sub0:vreg_64 = COPY killed %1			undef %2.sub0:vreg_64 = COPY killed %1
	%2.sub1:vreg_64 = COPY killed %0			%2.sub1:vreg_64 = COPY killed %0
	FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))			FLAT_STORE_DWORDX2 killed $vgpr0_vgpr1, killed %2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64))
	S_ENDPGM 0			S_ENDPGM 0

	...			...

llvm/test/CodeGen/AMDGPU/gfx10-shrink-mad-fma.mir

This file was moved to llvm/test/CodeGen/AMDGPU/shrink-mad-fma.mir.

llvm/test/CodeGen/AMDGPU/gfx10-twoaddr-fma.mir

This file was added.

				# RUN: llc -march=amdgcn -mcpu=gfx1010 %s -run-pass twoaddressinstruction -verify-machineinstrs -o - | FileCheck --check-prefixes=GFX10 %s

				# GFX10-LABEL: name: test_fmamk_reg_imm_f16
				# GFX10: %2:vgpr_32 = IMPLICIT_DEF
				# GFX10-NOT: V_MOV_B32
				# GFX10: V_FMAMK_F16 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec
				---
				name: test_fmamk_reg_imm_f16
				registers:
				- { id: 0, class: vreg_64 }
				- { id: 1, class: vgpr_32 }
				- { id: 2, class: vgpr_32 }
				- { id: 3, class: vgpr_32 }
				body: |
				bb.0:

				%0 = IMPLICIT_DEF
				%1 = COPY %0.sub1
				%2 = V_MOV_B32_e32 1078523331, implicit $exec
				%3 = V_FMAC_F16_e32 killed %0.sub0, %2, killed %1, implicit $mode, implicit $exec

				...

				# GFX10-LABEL: name: test_fmamk_imm_reg_f16
				# GFX10: %2:vgpr_32 = IMPLICIT_DEF
				# GFX10-NOT: V_MOV_B32
				# GFX10: V_FMAMK_F16 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec
				---
				name: test_fmamk_imm_reg_f16
				registers:
				- { id: 0, class: vreg_64 }
				- { id: 1, class: vgpr_32 }
				- { id: 2, class: vgpr_32 }
				- { id: 3, class: vgpr_32 }
				body: |
				bb.0:

				%0 = IMPLICIT_DEF
				%1 = COPY %0.sub1
				%2 = V_MOV_B32_e32 1078523331, implicit $exec
				%3 = V_FMAC_F16_e32 %2, killed %0.sub0, killed %1, implicit $mode, implicit $exec

				...

				# GFX10-LABEL: name: test_fmaak_f16
				# GFX10: %1:vgpr_32 = IMPLICIT_DEF
				# GFX10-NOT: V_MOV_B32
				# GFX10: V_FMAAK_F16 killed %0.sub0, %0.sub1, 1078523331, implicit $mode, implicit $exec
				---
				name: test_fmaak_f16
				registers:
				- { id: 0, class: vreg_64 }
				- { id: 1, class: vgpr_32 }
				- { id: 2, class: vgpr_32 }
				body: |
				bb.0:

				%0 = IMPLICIT_DEF
				%1 = V_MOV_B32_e32 1078523331, implicit $exec
				%2 = V_FMAC_F16_e32 killed %0.sub0, %0.sub1, %1, implicit $mode, implicit $exec
				...

				# GFX10-LABEL: name: test_fmaak_inline_literal_f16
				# GFX10: %1:vgpr_32 = IMPLICIT_DEF
				# GFX10-NOT: V_MOV_B32
				# GFX10: %2:vgpr_32 = V_FMAAK_F16 16384, killed %0, 49664, implicit $mode, implicit $exec

				---
				name: test_fmaak_inline_literal_f16
				tracksRegLiveness: true
				liveins:
				- { reg: '$vgpr0', virtual-reg: '%0' }
				body: |
				bb.0:
				liveins: $vgpr0

				%0:vgpr_32 = COPY killed $vgpr0

				%1:vgpr_32 = V_MOV_B32_e32 49664, implicit $exec
				%2:vgpr_32 = V_FMAC_F16_e32 16384, killed %0, %1, implicit $mode, implicit $exec
				S_ENDPGM 0

				...
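
The immediates in these tests are raw bit patterns rather than readable decimal values. As a rough aid, the sketch below reinterprets them as IEEE-754 floats using Python's `struct` module. Which width the test author had in mind for each literal is an assumption here: 1078523331 is read as binary32, and 16384 and 49664 as binary16. The helper names are illustrative.

```python
import struct


def bits_to_f32(bits):
    """Reinterpret a 32-bit integer as an IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bits_to_f16(bits):
    """Reinterpret a 16-bit integer as an IEEE-754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


if __name__ == "__main__":
    print(hex(1078523331), bits_to_f32(1078523331))  # roughly 3.14
    print(hex(16384), bits_to_f16(16384))            # 2.0
    print(hex(49664), bits_to_f16(49664))            # -3.0
```

Read this way, 16384 is the half-precision encoding of 2.0, while 49664 encodes -3.0.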

llvm/test/CodeGen/AMDGPU/gfx11-twoaddr-fma.mir

This file was added.

				# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
				# RUN: llc -march=amdgcn -mcpu=gfx1100 %s -run-pass twoaddressinstruction -verify-machineinstrs -o - | FileCheck --check-prefixes=GFX11 %s

				---
				name: test_fmamk_reg_imm_f16
				registers:
				- { id: 0, class: vreg_64 }
				- { id: 1, class: vgpr_32 }
				- { id: 2, class: vgpr_32 }
				- { id: 3, class: vgpr_32 }
				- { id: 4, class: vgpr_32 }
				body: |
				bb.0:

				; GFX11-LABEL: name: test_fmamk_reg_imm_f16
				; GFX11: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
				; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]].sub1
				; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[DEF]].sub0
				; GFX11-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1078523331, implicit $exec
				; GFX11-NEXT: [[V_FMA_F16_gfx9_e64_:%[0-9]+]]:vgpr_32 = V_FMA_F16_gfx9_e64 0, killed [[COPY1]], 0, [[V_MOV_B32_e32_]], 0, killed [[COPY]], 0, 0, implicit $mode, implicit $exec
				%0 = IMPLICIT_DEF
				%1 = COPY %0.sub1
				%2 = COPY %0.sub0
				%3 = V_MOV_B32_e32 1078523331, implicit $exec
				%4 = V_FMAC_F16_t16_e64 0, killed %2, 0, %3, 0, killed %1, 0, 0, implicit $mode, implicit $exec

				...

				---
				name: test_fmamk_imm_reg_f16
				registers:
				- { id: 0, class: vreg_64 }
				- { id: 1, class: vgpr_32 }
				- { id: 2, class: vgpr_32 }
				- { id: 3, class: vgpr_32 }
				- { id: 4, class: vgpr_32 }
				body: |
				bb.0:

				; GFX11-LABEL: name: test_fmamk_imm_reg_f16
				; GFX11: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
				; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]].sub1
				; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[DEF]].sub0
				; GFX11-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1078523331, implicit $exec
				; GFX11-NEXT: [[V_FMA_F16_gfx9_e64_:%[0-9]+]]:vgpr_32 = V_FMA_F16_gfx9_e64 0, [[COPY1]], 0, killed [[V_MOV_B32_e32_]], 0, killed [[COPY]], 0, 0, implicit $mode, implicit $exec
				%0 = IMPLICIT_DEF
				%1 = COPY %0.sub1
				%2 = COPY %0.sub0
				%3 = V_MOV_B32_e32 1078523331, implicit $exec
				%4 = V_FMAC_F16_t16_e64 0, %2, 0, killed %3, 0, killed %1, 0, 0, implicit $mode, implicit $exec

				...

				---
				name: test_fmaak_f16
				registers:
				- { id: 0, class: vreg_64 }
				- { id: 1, class: vgpr_32 }
				- { id: 2, class: vgpr_32 }
				- { id: 3, class: vgpr_32 }
				- { id: 4, class: vgpr_32 }
				body: |
				bb.0:

				; GFX11-LABEL: name: test_fmaak_f16
				; GFX11: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
				; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]].sub0
				; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[DEF]].sub1
				; GFX11-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1078523331, implicit $exec
				; GFX11-NEXT: [[V_FMA_F16_gfx9_e64_:%[0-9]+]]:vgpr_32 = V_FMA_F16_gfx9_e64 0, killed [[COPY]], 0, [[COPY1]], 0, [[V_MOV_B32_e32_]], 0, 0, implicit $mode, implicit $exec
				%0 = IMPLICIT_DEF
				%1 = COPY %0.sub0
				%2 = COPY %0.sub1
				%3 = V_MOV_B32_e32 1078523331, implicit $exec
				%4 = V_FMAC_F16_t16_e64 0, killed %1, 0, %2, 0, %3, 0, 0, implicit $mode, implicit $exec
				...

				---
				name: test_fmaak_inline_literal_f16
				tracksRegLiveness: true
				liveins:
				- { reg: '$vgpr0', virtual-reg: '%0' }
				body: |
				bb.0:
				liveins: $vgpr0

				; GFX11-LABEL: name: test_fmaak_inline_literal_f16
				; GFX11: liveins: $vgpr0
				; GFX11-NEXT: {{ $}}
				; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY killed $vgpr0
				; GFX11-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 49664, implicit $exec
				; GFX11-NEXT: [[V_FMA_F16_gfx9_e64_:%[0-9]+]]:vgpr_32 = V_FMA_F16_gfx9_e64 0, 16384, 0, killed [[COPY]], 0, [[V_MOV_B32_e32_]], 0, 0, implicit $mode, implicit $exec
				; GFX11-NEXT: S_ENDPGM 0
				%0:vgpr_32 = COPY killed $vgpr0

				%1:vgpr_32 = V_MOV_B32_e32 49664, implicit $exec
				%2:vgpr_32 = V_FMAC_F16_t16_e64 0, 16384, 0, killed %0, 0, %1, 0, 0, implicit $mode, implicit $exec
				S_ENDPGM 0

				...

llvm/test/CodeGen/AMDGPU/inline-asm.i128.ll

	; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -stop-after=finalize-isel -o - %s | FileCheck -check-prefix=GFX908 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -stop-after=finalize-isel -o - %s | FileCheck -check-prefix=GFX908 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -stop-after=finalize-isel -o - %s | FileCheck -check-prefix=GFX90A %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -stop-after=finalize-isel -o - %s | FileCheck -check-prefix=GFX90A %s

	; Make sure we only use one 128-bit register instead of 2 for i128 asm			; Make sure we only use one 128-bit register instead of 2 for i128 asm
	; constraints			; constraints

	define amdgpu_kernel void @s_input_output_i128() {			define amdgpu_kernel void @s_input_output_i128() {
	; GFX908-LABEL: name: s_input_output_i128			; GFX908-LABEL: name: s_input_output_i128
	; GFX908: bb.0 (%ir-block.0):			; GFX908: bb.0 (%ir-block.0):
	; GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 5242890 /* regdef:SGPR_128 */, def %4			; GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 6881290 /* regdef:SGPR_128 */, def %4
	; GFX908-NEXT: [[COPY:%[0-9]+]]:sgpr_128 = COPY %4			; GFX908-NEXT: [[COPY:%[0-9]+]]:sgpr_128 = COPY %4
	; GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 5242889 /* reguse:SGPR_128 */, [[COPY]]			; GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 6881289 /* reguse:SGPR_128 */, [[COPY]]
	; GFX908-NEXT: S_ENDPGM 0			; GFX908-NEXT: S_ENDPGM 0
	; GFX90A-LABEL: name: s_input_output_i128			; GFX90A-LABEL: name: s_input_output_i128
	; GFX90A: bb.0 (%ir-block.0):			; GFX90A: bb.0 (%ir-block.0):
	; GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 5242890 /* regdef:SGPR_128 */, def %4			; GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 6881290 /* regdef:SGPR_128 */, def %4
	; GFX90A-NEXT: [[COPY:%[0-9]+]]:sgpr_128 = COPY %4			; GFX90A-NEXT: [[COPY:%[0-9]+]]:sgpr_128 = COPY %4
	; GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 5242889 /* reguse:SGPR_128 */, [[COPY]]			; GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 6881289 /* reguse:SGPR_128 */, [[COPY]]
	; GFX90A-NEXT: S_ENDPGM 0			; GFX90A-NEXT: S_ENDPGM 0
	%val = tail call i128 asm sideeffect "; def $0", "=s"()			%val = tail call i128 asm sideeffect "; def $0", "=s"()
	call void asm sideeffect "; use $0", "s"(i128 %val)			call void asm sideeffect "; use $0", "s"(i128 %val)
	ret void			ret void
	}			}

	define amdgpu_kernel void @v_input_output_i128() {			define amdgpu_kernel void @v_input_output_i128() {
	; GFX908-LABEL: name: v_input_output_i128			; GFX908-LABEL: name: v_input_output_i128
	; GFX908: bb.0 (%ir-block.0):			; GFX908: bb.0 (%ir-block.0):
	; GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 4784138 /* regdef:VReg_128 */, def %4			; GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 5832714 /* regdef:VReg_128 */, def %4
	; GFX908-NEXT: [[COPY:%[0-9]+]]:vreg_128 = COPY %4			; GFX908-NEXT: [[COPY:%[0-9]+]]:vreg_128 = COPY %4
	; GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 4784137 /* reguse:VReg_128 */, [[COPY]]			; GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 5832713 /* reguse:VReg_128 */, [[COPY]]
	; GFX908-NEXT: S_ENDPGM 0			; GFX908-NEXT: S_ENDPGM 0
	; GFX90A-LABEL: name: v_input_output_i128			; GFX90A-LABEL: name: v_input_output_i128
	; GFX90A: bb.0 (%ir-block.0):			; GFX90A: bb.0 (%ir-block.0):
	; GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 4980746 /* regdef:VReg_128_Align2 */, def %4			; GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 6160394 /* regdef:VReg_128_Align2 */, def %4
	; GFX90A-NEXT: [[COPY:%[0-9]+]]:vreg_128_align2 = COPY %4			; GFX90A-NEXT: [[COPY:%[0-9]+]]:vreg_128_align2 = COPY %4
	; GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 4980745 /* reguse:VReg_128_Align2 */, [[COPY]]			; GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 6160393 /* reguse:VReg_128_Align2 */, [[COPY]]
	; GFX90A-NEXT: S_ENDPGM 0			; GFX90A-NEXT: S_ENDPGM 0
	%val = tail call i128 asm sideeffect "; def $0", "=v"()			%val = tail call i128 asm sideeffect "; def $0", "=v"()
	call void asm sideeffect "; use $0", "v"(i128 %val)			call void asm sideeffect "; use $0", "v"(i128 %val)
	ret void			ret void
	}			}

	define amdgpu_kernel void @a_input_output_i128() {			define amdgpu_kernel void @a_input_output_i128() {
	; GFX908-LABEL: name: a_input_output_i128			; GFX908-LABEL: name: a_input_output_i128
	; GFX908: bb.0 (%ir-block.0):			; GFX908: bb.0 (%ir-block.0):
	; GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 4718602 /* regdef:AReg_128 */, def %4			; GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 5767178 /* regdef:AReg_128 */, def %4
	; GFX908-NEXT: [[COPY:%[0-9]+]]:areg_128 = COPY %4			; GFX908-NEXT: [[COPY:%[0-9]+]]:areg_128 = COPY %4
	; GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 4718601 /* reguse:AReg_128 */, [[COPY]]			; GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 5767177 /* reguse:AReg_128 */, [[COPY]]
	; GFX908-NEXT: S_ENDPGM 0			; GFX908-NEXT: S_ENDPGM 0
	; GFX90A-LABEL: name: a_input_output_i128			; GFX90A-LABEL: name: a_input_output_i128
	; GFX90A: bb.0 (%ir-block.0):			; GFX90A: bb.0 (%ir-block.0):
	; GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 4915210 /* regdef:AReg_128_Align2 */, def %4			; GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 6029322 /* regdef:AReg_128_Align2 */, def %4
	; GFX90A-NEXT: [[COPY:%[0-9]+]]:areg_128_align2 = COPY %4			; GFX90A-NEXT: [[COPY:%[0-9]+]]:areg_128_align2 = COPY %4
	; GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 4915209 /* reguse:AReg_128_Align2 */, [[COPY]]			; GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 6029321 /* reguse:AReg_128_Align2 */, [[COPY]]
	; GFX90A-NEXT: S_ENDPGM 0			; GFX90A-NEXT: S_ENDPGM 0
	%val = call i128 asm sideeffect "; def $0", "=a"()			%val = call i128 asm sideeffect "; def $0", "=a"()
	call void asm sideeffect "; use $0", "a"(i128 %val)			call void asm sideeffect "; use $0", "a"(i128 %val)
	ret void			ret void
	}			}

llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll

	; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 --stop-after=greedy,1 -verify-machineinstrs < %s | FileCheck -check-prefix=REGALLOC-GFX908 %s			;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 --stop-after=greedy,1 -verify-machineinstrs < %s | FileCheck -check-prefix=REGALLOC-GFX908 %s
	;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 --stop-after=prologepilog -verify-machineinstrs < %s | FileCheck -check-prefix=PEI-GFX908 %s			;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 --stop-after=prologepilog -verify-machineinstrs < %s | FileCheck -check-prefix=PEI-GFX908 %s
	;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a --stop-after=greedy,1 -verify-machineinstrs < %s | FileCheck -check-prefix=REGALLOC-GFX90A %s			;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a --stop-after=greedy,1 -verify-machineinstrs < %s | FileCheck -check-prefix=REGALLOC-GFX90A %s
	;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a --stop-after=prologepilog -verify-machineinstrs < %s | FileCheck -check-prefix=PEI-GFX90A %s			;RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a --stop-after=prologepilog -verify-machineinstrs < %s | FileCheck -check-prefix=PEI-GFX90A %s

	; Partial reg copy and spill missed during regalloc handled later at frame lowering.			; Partial reg copy and spill missed during regalloc handled later at frame lowering.
	define amdgpu_kernel void @partial_copy(<4 x i32> %arg) #0 {			define amdgpu_kernel void @partial_copy(<4 x i32> %arg) #0 {
	; REGALLOC-GFX908-LABEL: name: partial_copy			; REGALLOC-GFX908-LABEL: name: partial_copy
	; REGALLOC-GFX908: bb.0 (%ir-block.0):			; REGALLOC-GFX908: bb.0 (%ir-block.0):
	; REGALLOC-GFX908-NEXT: liveins: $sgpr4_sgpr5			; REGALLOC-GFX908-NEXT: liveins: $sgpr4_sgpr5
	; REGALLOC-GFX908-NEXT: {{ $}}			; REGALLOC-GFX908-NEXT: {{ $}}
	; REGALLOC-GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 1769481 /* reguse:AGPR_32 */, undef %5:agpr_32			; REGALLOC-GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 1900553 /* reguse:AGPR_32 */, undef %5:agpr_32
	; REGALLOC-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 4784138 /* regdef:VReg_128 */, def %26			; REGALLOC-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 5832714 /* regdef:VReg_128 */, def %26
	; REGALLOC-GFX908-NEXT: [[COPY:%[0-9]+]]:av_128 = COPY %26			; REGALLOC-GFX908-NEXT: [[COPY:%[0-9]+]]:av_128 = COPY %26
	; REGALLOC-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 2949130 /* regdef:VReg_64 */, def %23			; REGALLOC-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 3211274 /* regdef:VReg_64 */, def %23
	; REGALLOC-GFX908-NEXT: SI_SPILL_V64_SAVE %23, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, align 4, addrspace 5)			; REGALLOC-GFX908-NEXT: SI_SPILL_V64_SAVE %23, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, align 4, addrspace 5)
	; REGALLOC-GFX908-NEXT: [[COPY1:%[0-9]+]]:vreg_128 = COPY [[COPY]]			; REGALLOC-GFX908-NEXT: [[COPY1:%[0-9]+]]:vreg_128 = COPY [[COPY]]
	; REGALLOC-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef %14:vreg_64, [[COPY1]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; REGALLOC-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef %14:vreg_64, [[COPY1]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; REGALLOC-GFX908-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)			; REGALLOC-GFX908-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)
	; REGALLOC-GFX908-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec			; REGALLOC-GFX908-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
	; REGALLOC-GFX908-NEXT: [[COPY2:%[0-9]+]]:areg_128 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3			; REGALLOC-GFX908-NEXT: [[COPY2:%[0-9]+]]:areg_128 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3
	; REGALLOC-GFX908-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2, implicit $exec			; REGALLOC-GFX908-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2, implicit $exec
	; REGALLOC-GFX908-NEXT: [[V_MFMA_I32_4X4X4I8_e64_:%[0-9]+]]:areg_128 = V_MFMA_I32_4X4X4I8_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], [[COPY2]], 0, 0, 0, implicit $mode, implicit $exec			; REGALLOC-GFX908-NEXT: [[V_MFMA_I32_4X4X4I8_e64_:%[0-9]+]]:areg_128 = V_MFMA_I32_4X4X4I8_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], [[COPY2]], 0, 0, 0, implicit $mode, implicit $exec
	; REGALLOC-GFX908-NEXT: [[SI_SPILL_V64_RESTORE:%[0-9]+]]:vreg_64 = SI_SPILL_V64_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s64) from %stack.0, align 4, addrspace 5)			; REGALLOC-GFX908-NEXT: [[SI_SPILL_V64_RESTORE:%[0-9]+]]:vreg_64 = SI_SPILL_V64_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s64) from %stack.0, align 4, addrspace 5)
	; REGALLOC-GFX908-NEXT: GLOBAL_STORE_DWORDX2 undef %16:vreg_64, [[SI_SPILL_V64_RESTORE]], 0, 0, implicit $exec :: (volatile store (s64) into `<2 x i32> addrspace(1)* undef`, addrspace 1)			; REGALLOC-GFX908-NEXT: GLOBAL_STORE_DWORDX2 undef %16:vreg_64, [[SI_SPILL_V64_RESTORE]], 0, 0, implicit $exec :: (volatile store (s64) into `<2 x i32> addrspace(1)* undef`, addrspace 1)
	; REGALLOC-GFX908-NEXT: [[COPY3:%[0-9]+]]:vreg_128 = COPY [[V_MFMA_I32_4X4X4I8_e64_]]			; REGALLOC-GFX908-NEXT: [[COPY3:%[0-9]+]]:vreg_128 = COPY [[V_MFMA_I32_4X4X4I8_e64_]]
	; REGALLOC-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef %18:vreg_64, [[COPY3]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; REGALLOC-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef %18:vreg_64, [[COPY3]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; REGALLOC-GFX908-NEXT: S_ENDPGM 0			; REGALLOC-GFX908-NEXT: S_ENDPGM 0
	; PEI-GFX908-LABEL: name: partial_copy			; PEI-GFX908-LABEL: name: partial_copy
	; PEI-GFX908: bb.0 (%ir-block.0):			; PEI-GFX908: bb.0 (%ir-block.0):
	; PEI-GFX908-NEXT: liveins: $agpr4, $sgpr4_sgpr5, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr7			; PEI-GFX908-NEXT: liveins: $agpr4, $sgpr4_sgpr5, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr7
	; PEI-GFX908-NEXT: {{ $}}			; PEI-GFX908-NEXT: {{ $}}
	; PEI-GFX908-NEXT: $sgpr8_sgpr9_sgpr10_sgpr11 = COPY killed $sgpr0_sgpr1_sgpr2_sgpr3			; PEI-GFX908-NEXT: $sgpr8_sgpr9_sgpr10_sgpr11 = COPY killed $sgpr0_sgpr1_sgpr2_sgpr3
	; PEI-GFX908-NEXT: $sgpr8 = S_ADD_U32 $sgpr8, $sgpr7, implicit-def $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11			; PEI-GFX908-NEXT: $sgpr8 = S_ADD_U32 $sgpr8, $sgpr7, implicit-def $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11
	; PEI-GFX908-NEXT: $sgpr9 = S_ADDC_U32 $sgpr9, 0, implicit-def dead $scc, implicit $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11			; PEI-GFX908-NEXT: $sgpr9 = S_ADDC_U32 $sgpr9, 0, implicit-def dead $scc, implicit $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11
	; PEI-GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect /, 1769481 / reguse:AGPR_32 */, undef renamable $agpr0			; PEI-GFX908-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect /, 1900553 / reguse:AGPR_32 */, undef renamable $agpr0
	; PEI-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect /, 4784138 / regdef:VReg_128 */, def renamable $vgpr0_vgpr1_vgpr2_vgpr3			; PEI-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect /, 5832714 / regdef:VReg_128 */, def renamable $vgpr0_vgpr1_vgpr2_vgpr3
	; PEI-GFX908-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, implicit $exec			; PEI-GFX908-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, implicit $exec
	; PEI-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 2949130 /* regdef:VReg_64 */, def renamable $vgpr0_vgpr1			; PEI-GFX908-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 3211274 /* regdef:VReg_64 */, def renamable $vgpr0_vgpr1
	; PEI-GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr8_sgpr9_sgpr10_sgpr11, 0, 4, 0, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1, implicit $vgpr0_vgpr1 :: (store (s32) into %stack.0, addrspace 5)			; PEI-GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr8_sgpr9_sgpr10_sgpr11, 0, 4, 0, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1, implicit $vgpr0_vgpr1 :: (store (s32) into %stack.0, addrspace 5)
	; PEI-GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr1, implicit $exec, implicit killed $vgpr0_vgpr1			; PEI-GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr1, implicit $exec, implicit killed $vgpr0_vgpr1
	; PEI-GFX908-NEXT: renamable $vgpr0_vgpr1_vgpr2_vgpr3 = COPY killed renamable $agpr0_agpr1_agpr2_agpr3, implicit $exec			; PEI-GFX908-NEXT: renamable $vgpr0_vgpr1_vgpr2_vgpr3 = COPY killed renamable $agpr0_agpr1_agpr2_agpr3, implicit $exec
	; PEI-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef renamable $vgpr0_vgpr1, killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; PEI-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef renamable $vgpr0_vgpr1, killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; PEI-GFX908-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)			; PEI-GFX908-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)
	; PEI-GFX908-NEXT: renamable $vgpr0 = V_MOV_B32_e32 1, implicit $exec			; PEI-GFX908-NEXT: renamable $vgpr0 = V_MOV_B32_e32 1, implicit $exec
	; PEI-GFX908-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, implicit $exec			; PEI-GFX908-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, implicit $exec
	; PEI-GFX908-NEXT: renamable $vgpr1 = V_MOV_B32_e32 2, implicit $exec			; PEI-GFX908-NEXT: renamable $vgpr1 = V_MOV_B32_e32 2, implicit $exec
	; PEI-GFX908-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = V_MFMA_I32_4X4X4I8_e64 killed $vgpr0, killed $vgpr1, killed $agpr0_agpr1_agpr2_agpr3, 0, 0, 0, implicit $mode, implicit $exec			; PEI-GFX908-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = V_MFMA_I32_4X4X4I8_e64 killed $vgpr0, killed $vgpr1, killed $agpr0_agpr1_agpr2_agpr3, 0, 0, 0, implicit $mode, implicit $exec
	; PEI-GFX908-NEXT: $vgpr0 = BUFFER_LOAD_DWORD_OFFSET $sgpr8_sgpr9_sgpr10_sgpr11, 0, 4, 0, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1 :: (load (s32) from %stack.0, addrspace 5)			; PEI-GFX908-NEXT: $vgpr0 = BUFFER_LOAD_DWORD_OFFSET $sgpr8_sgpr9_sgpr10_sgpr11, 0, 4, 0, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1 :: (load (s32) from %stack.0, addrspace 5)
	; PEI-GFX908-NEXT: $vgpr1 = V_ACCVGPR_READ_B32_e64 $agpr4, implicit $exec, implicit-def $vgpr0_vgpr1			; PEI-GFX908-NEXT: $vgpr1 = V_ACCVGPR_READ_B32_e64 $agpr4, implicit $exec, implicit-def $vgpr0_vgpr1
	; PEI-GFX908-NEXT: GLOBAL_STORE_DWORDX2 undef renamable $vgpr0_vgpr1, killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec :: (volatile store (s64) into `<2 x i32> addrspace(1)* undef`, addrspace 1)			; PEI-GFX908-NEXT: GLOBAL_STORE_DWORDX2 undef renamable $vgpr0_vgpr1, killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec :: (volatile store (s64) into `<2 x i32> addrspace(1)* undef`, addrspace 1)
	; PEI-GFX908-NEXT: renamable $vgpr0_vgpr1_vgpr2_vgpr3 = COPY killed renamable $agpr0_agpr1_agpr2_agpr3, implicit $exec			; PEI-GFX908-NEXT: renamable $vgpr0_vgpr1_vgpr2_vgpr3 = COPY killed renamable $agpr0_agpr1_agpr2_agpr3, implicit $exec
	; PEI-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef renamable $vgpr0_vgpr1, killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; PEI-GFX908-NEXT: GLOBAL_STORE_DWORDX4 undef renamable $vgpr0_vgpr1, killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; PEI-GFX908-NEXT: S_ENDPGM 0			; PEI-GFX908-NEXT: S_ENDPGM 0
	; REGALLOC-GFX90A-LABEL: name: partial_copy			; REGALLOC-GFX90A-LABEL: name: partial_copy
	; REGALLOC-GFX90A: bb.0 (%ir-block.0):			; REGALLOC-GFX90A: bb.0 (%ir-block.0):
	; REGALLOC-GFX90A-NEXT: liveins: $sgpr4_sgpr5			; REGALLOC-GFX90A-NEXT: liveins: $sgpr4_sgpr5
	; REGALLOC-GFX90A-NEXT: {{ $}}			; REGALLOC-GFX90A-NEXT: {{ $}}
	; REGALLOC-GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 1769481 /* reguse:AGPR_32 */, undef %5:agpr_32			; REGALLOC-GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 1900553 /* reguse:AGPR_32 */, undef %5:agpr_32
	; REGALLOC-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 4980746 /* regdef:VReg_128_Align2 */, def %25			; REGALLOC-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 6160394 /* regdef:VReg_128_Align2 */, def %25
	; REGALLOC-GFX90A-NEXT: [[COPY:%[0-9]+]]:av_128_align2 = COPY %25			; REGALLOC-GFX90A-NEXT: [[COPY:%[0-9]+]]:av_128_align2 = COPY %25
	; REGALLOC-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 3080202 /* regdef:VReg_64_Align2 */, def %23			; REGALLOC-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 3538954 /* regdef:VReg_64_Align2 */, def %23
	; REGALLOC-GFX90A-NEXT: SI_SPILL_V64_SAVE %23, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, align 4, addrspace 5)			; REGALLOC-GFX90A-NEXT: SI_SPILL_V64_SAVE %23, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, align 4, addrspace 5)
	; REGALLOC-GFX90A-NEXT: GLOBAL_STORE_DWORDX4 undef %14:vreg_64_align2, [[COPY]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; REGALLOC-GFX90A-NEXT: GLOBAL_STORE_DWORDX4 undef %14:vreg_64_align2, [[COPY]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; REGALLOC-GFX90A-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)			; REGALLOC-GFX90A-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)
	; REGALLOC-GFX90A-NEXT: [[COPY1:%[0-9]+]]:areg_128_align2 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3			; REGALLOC-GFX90A-NEXT: [[COPY1:%[0-9]+]]:areg_128_align2 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3
	; REGALLOC-GFX90A-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec			; REGALLOC-GFX90A-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
	; REGALLOC-GFX90A-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2, implicit $exec			; REGALLOC-GFX90A-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2, implicit $exec
	; REGALLOC-GFX90A-NEXT: [[V_MFMA_I32_4X4X4I8_e64_:%[0-9]+]]:areg_128_align2 = V_MFMA_I32_4X4X4I8_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], [[COPY1]], 0, 0, 0, implicit $mode, implicit $exec			; REGALLOC-GFX90A-NEXT: [[V_MFMA_I32_4X4X4I8_e64_:%[0-9]+]]:areg_128_align2 = V_MFMA_I32_4X4X4I8_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], [[COPY1]], 0, 0, 0, implicit $mode, implicit $exec
	; REGALLOC-GFX90A-NEXT: [[SI_SPILL_AV64_RESTORE:%[0-9]+]]:av_64_align2 = SI_SPILL_AV64_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s64) from %stack.0, align 4, addrspace 5)			; REGALLOC-GFX90A-NEXT: [[SI_SPILL_AV64_RESTORE:%[0-9]+]]:av_64_align2 = SI_SPILL_AV64_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s64) from %stack.0, align 4, addrspace 5)
	; REGALLOC-GFX90A-NEXT: GLOBAL_STORE_DWORDX2 undef %16:vreg_64_align2, [[SI_SPILL_AV64_RESTORE]], 0, 0, implicit $exec :: (volatile store (s64) into `<2 x i32> addrspace(1)* undef`, addrspace 1)			; REGALLOC-GFX90A-NEXT: GLOBAL_STORE_DWORDX2 undef %16:vreg_64_align2, [[SI_SPILL_AV64_RESTORE]], 0, 0, implicit $exec :: (volatile store (s64) into `<2 x i32> addrspace(1)* undef`, addrspace 1)
	; REGALLOC-GFX90A-NEXT: GLOBAL_STORE_DWORDX4 undef %18:vreg_64_align2, [[V_MFMA_I32_4X4X4I8_e64_]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; REGALLOC-GFX90A-NEXT: GLOBAL_STORE_DWORDX4 undef %18:vreg_64_align2, [[V_MFMA_I32_4X4X4I8_e64_]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; REGALLOC-GFX90A-NEXT: S_ENDPGM 0			; REGALLOC-GFX90A-NEXT: S_ENDPGM 0
	; PEI-GFX90A-LABEL: name: partial_copy			; PEI-GFX90A-LABEL: name: partial_copy
	; PEI-GFX90A: bb.0 (%ir-block.0):			; PEI-GFX90A: bb.0 (%ir-block.0):
	; PEI-GFX90A-NEXT: liveins: $agpr4, $sgpr4_sgpr5, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr7			; PEI-GFX90A-NEXT: liveins: $agpr4, $sgpr4_sgpr5, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr7
	; PEI-GFX90A-NEXT: {{ $}}			; PEI-GFX90A-NEXT: {{ $}}
	; PEI-GFX90A-NEXT: $sgpr8_sgpr9_sgpr10_sgpr11 = COPY killed $sgpr0_sgpr1_sgpr2_sgpr3			; PEI-GFX90A-NEXT: $sgpr8_sgpr9_sgpr10_sgpr11 = COPY killed $sgpr0_sgpr1_sgpr2_sgpr3
	; PEI-GFX90A-NEXT: $sgpr8 = S_ADD_U32 $sgpr8, $sgpr7, implicit-def $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11			; PEI-GFX90A-NEXT: $sgpr8 = S_ADD_U32 $sgpr8, $sgpr7, implicit-def $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11
	; PEI-GFX90A-NEXT: $sgpr9 = S_ADDC_U32 $sgpr9, 0, implicit-def dead $scc, implicit $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11			; PEI-GFX90A-NEXT: $sgpr9 = S_ADDC_U32 $sgpr9, 0, implicit-def dead $scc, implicit $scc, implicit-def $sgpr8_sgpr9_sgpr10_sgpr11
	; PEI-GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 1769481 /* reguse:AGPR_32 */, undef renamable $agpr0			; PEI-GFX90A-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 1900553 /* reguse:AGPR_32 */, undef renamable $agpr0
	; PEI-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 4980746 /* regdef:VReg_128_Align2 */, def renamable $vgpr0_vgpr1_vgpr2_vgpr3			; PEI-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 6160394 /* regdef:VReg_128_Align2 */, def renamable $vgpr0_vgpr1_vgpr2_vgpr3
	; PEI-GFX90A-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, implicit $exec			; PEI-GFX90A-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $vgpr0_vgpr1_vgpr2_vgpr3, implicit $exec
	; PEI-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 3080202 /* regdef:VReg_64_Align2 */, def renamable $vgpr0_vgpr1			; PEI-GFX90A-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 3538954 /* regdef:VReg_64_Align2 */, def renamable $vgpr0_vgpr1
	; PEI-GFX90A-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr8_sgpr9_sgpr10_sgpr11, 0, 4, 0, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1, implicit $vgpr0_vgpr1 :: (store (s32) into %stack.0, addrspace 5)			; PEI-GFX90A-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr8_sgpr9_sgpr10_sgpr11, 0, 4, 0, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1, implicit $vgpr0_vgpr1 :: (store (s32) into %stack.0, addrspace 5)
	; PEI-GFX90A-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr1, implicit $exec, implicit killed $vgpr0_vgpr1			; PEI-GFX90A-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr1, implicit $exec, implicit killed $vgpr0_vgpr1
	; PEI-GFX90A-NEXT: GLOBAL_STORE_DWORDX4 undef renamable $vgpr0_vgpr1, killed renamable $agpr0_agpr1_agpr2_agpr3, 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; PEI-GFX90A-NEXT: GLOBAL_STORE_DWORDX4 undef renamable $vgpr0_vgpr1, killed renamable $agpr0_agpr1_agpr2_agpr3, 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; PEI-GFX90A-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)			; PEI-GFX90A-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)
	; PEI-GFX90A-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, implicit $exec			; PEI-GFX90A-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, implicit $exec
	; PEI-GFX90A-NEXT: renamable $vgpr0 = V_MOV_B32_e32 1, implicit $exec			; PEI-GFX90A-NEXT: renamable $vgpr0 = V_MOV_B32_e32 1, implicit $exec
	; PEI-GFX90A-NEXT: renamable $vgpr1 = V_MOV_B32_e32 2, implicit $exec			; PEI-GFX90A-NEXT: renamable $vgpr1 = V_MOV_B32_e32 2, implicit $exec
	; PEI-GFX90A-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = V_MFMA_I32_4X4X4I8_e64 killed $vgpr0, killed $vgpr1, killed $agpr0_agpr1_agpr2_agpr3, 0, 0, 0, implicit $mode, implicit $exec			; PEI-GFX90A-NEXT: renamable $agpr0_agpr1_agpr2_agpr3 = V_MFMA_I32_4X4X4I8_e64 killed $vgpr0, killed $vgpr1, killed $agpr0_agpr1_agpr2_agpr3, 0, 0, 0, implicit $mode, implicit $exec

llvm/test/CodeGen/AMDGPU/preserve-hi16.ll

	; GCN-LABEL: {{^}}zext_fma_f16:			; GCN-LABEL: {{^}}zext_fma_f16:
	; GFX8: v_fma_f16 [[FMA:v[0-9]+]], v0, v1, v2			; GFX8: v_fma_f16 [[FMA:v[0-9]+]], v0, v1, v2
	; GFX8-NEXT: s_setpc_b64			; GFX8-NEXT: s_setpc_b64

	; GFX9: v_fma_f16 [[FMA:v[0-9]+]], v0, v1, v2			; GFX9: v_fma_f16 [[FMA:v[0-9]+]], v0, v1, v2
	; GFX9-NEXT: v_and_b32_e32 v0, 0xffff, [[FMA]]			; GFX9-NEXT: v_and_b32_e32 v0, 0xffff, [[FMA]]

	; GFX10Plus: v_fmac_f16_e32 [[FMA:v[0-9]+]], v0, v1			; GFX10Plus: v_fmac_f16{{_e64|_e32}} [[FMA:v[0-9]+]], v0, v1
	; GFX10Plus-NEXT: v_and_b32_e32 v0, 0xffff, [[FMA]]			; GFX10Plus-NEXT: v_and_b32_e32 v0, 0xffff, [[FMA]]
	define i32 @zext_fma_f16(half %x, half %y, half %z) {			define i32 @zext_fma_f16(half %x, half %y, half %z) {
	%fma = call half @llvm.fma.f16(half %x, half %y, half %z)			%fma = call half @llvm.fma.f16(half %x, half %y, half %z)
	%cast = bitcast half %fma to i16			%cast = bitcast half %fma to i16
	%zext = zext i16 %cast to i32			%zext = zext i16 %cast to i32
	ret i32 %zext			ret i32 %zext
	}			}

llvm/test/CodeGen/AMDGPU/shrink-mad-fma.mir

This file was moved from llvm/test/CodeGen/AMDGPU/gfx10-shrink-mad-fma.mir.

	# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	# RUN: llc -march=amdgcn -mcpu=gfx1010 -run-pass si-shrink-instructions -verify-machineinstrs %s -o - | FileCheck %s -check-prefixes=GFX10			# RUN: llc -march=amdgcn -mcpu=gfx1010 -run-pass si-shrink-instructions -verify-machineinstrs %s -o - | FileCheck %s -check-prefixes=GFX10
				# RUN: llc -march=amdgcn -mcpu=gfx1100 -run-pass si-shrink-instructions -verify-machineinstrs %s -o - | FileCheck %s -check-prefixes=GFX11

	---			---
	name: mad_cvv_f32			name: mad_cvv_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_cvv_f32			; GFX10-LABEL: name: mad_cvv_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_cvv_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F32_e64 0, 1092616192, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F32_e64 0, 1092616192, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: mad_vcv_f32			name: mad_vcv_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_vcv_f32			; GFX10-LABEL: name: mad_vcv_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_vcv_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F32_e64 0, $vgpr0, 0, 1092616192, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F32_e64 0, $vgpr0, 0, 1092616192, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: mad_vvc_f32			name: mad_vvc_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_vvc_f32			; GFX10-LABEL: name: mad_vvc_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_vvc_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: mad_vsc_f32			name: mad_vsc_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_vsc_f32			; GFX10-LABEL: name: mad_vsc_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_vsc_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $sgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$sgpr1 = IMPLICIT_DEF			$sgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_cvv_f32			name: fma_cvv_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_cvv_f32			; GFX10-LABEL: name: fma_cvv_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_cvv_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F32_e64 0, 1092616192, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F32_e64 0, 1092616192, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_vcv_f32			name: fma_vcv_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_vcv_f32			; GFX10-LABEL: name: fma_vcv_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_vcv_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAMK_F32 $vgpr0, 1092616192, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F32_e64 0, $vgpr0, 0, 1092616192, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F32_e64 0, $vgpr0, 0, 1092616192, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_vvc_f32			name: fma_vvc_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_vvc_f32			; GFX10-LABEL: name: fma_vvc_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_vvc_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_vsc_f32			name: fma_vsc_f32
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_vsc_f32			; GFX10-LABEL: name: fma_vsc_f32
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_vsc_f32
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $sgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAAK_F32 $vgpr0, $vgpr1, 1092616192, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$sgpr1 = IMPLICIT_DEF			$sgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F32_e64 0, $vgpr0, 0, $vgpr1, 0, 1092616192, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: mad_cvv_f16			name: mad_cvv_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_cvv_f16			; GFX10-LABEL: name: mad_cvv_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_cvv_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F16_e64 0, 18688, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F16_e64 0, 18688, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: mad_vcv_f16			name: mad_vcv_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_vcv_f16			; GFX10-LABEL: name: mad_vcv_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_vcv_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F16_e64 0, $vgpr0, 0, 18688, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F16_e64 0, $vgpr0, 0, 18688, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: mad_vvc_f16			name: mad_vvc_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_vvc_f16			; GFX10-LABEL: name: mad_vvc_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_vvc_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F16_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F16_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: mad_vsc_f16			name: mad_vsc_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: mad_vsc_f16			; GFX10-LABEL: name: mad_vsc_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_MADAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_MADAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: mad_vsc_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $sgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_MADAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$sgpr1 = IMPLICIT_DEF			$sgpr1 = IMPLICIT_DEF
	$vgpr2 = V_MAD_F16_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_MAD_F16_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_cvv_f16			name: fma_cvv_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_cvv_f16			; GFX10-LABEL: name: fma_cvv_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_cvv_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAMK_F16_t16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F16_gfx9_e64 0, 18688, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F16_gfx9_e64 0, 18688, 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_vcv_f16			name: fma_vcv_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_vcv_f16			; GFX10-LABEL: name: fma_vcv_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAMK_F16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_vcv_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAMK_F16_t16 $vgpr0, 18688, $vgpr1, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F16_gfx9_e64 0, $vgpr0, 0, 18688, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F16_gfx9_e64 0, $vgpr0, 0, 18688, 0, $vgpr1, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_vvc_f16			name: fma_vvc_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_vvc_f16			; GFX10-LABEL: name: fma_vvc_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $vgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_vvc_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAAK_F16_t16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$vgpr1 = IMPLICIT_DEF			$vgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F16_gfx9_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F16_gfx9_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

	---			---
	name: fma_vsc_f16			name: fma_vsc_f16
	body: |			body: |
	bb.0:			bb.0:
	; GFX10-LABEL: name: fma_vsc_f16			; GFX10-LABEL: name: fma_vsc_f16
	; GFX10: $vgpr0 = IMPLICIT_DEF			; GFX10: $vgpr0 = IMPLICIT_DEF
	; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF			; GFX10-NEXT: $sgpr1 = IMPLICIT_DEF
	; GFX10-NEXT: $vgpr2 = V_FMAAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec			; GFX10-NEXT: $vgpr2 = V_FMAAK_F16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
	; GFX10-NEXT: SI_RETURN implicit $vgpr2			; GFX10-NEXT: SI_RETURN implicit $vgpr2
				; GFX11-LABEL: name: fma_vsc_f16
				; GFX11: $vgpr0 = IMPLICIT_DEF
				; GFX11-NEXT: $sgpr1 = IMPLICIT_DEF
				; GFX11-NEXT: $vgpr2 = V_FMAAK_F16_t16 $vgpr0, $vgpr1, 18688, implicit $mode, implicit $exec
				; GFX11-NEXT: SI_RETURN implicit $vgpr2
	$vgpr0 = IMPLICIT_DEF			$vgpr0 = IMPLICIT_DEF
	$sgpr1 = IMPLICIT_DEF			$sgpr1 = IMPLICIT_DEF
	$vgpr2 = V_FMA_F16_gfx9_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec			$vgpr2 = V_FMA_F16_gfx9_e64 0, $vgpr0, 0, $vgpr1, 0, 18688, 0, 0, implicit $mode, implicit $exec
	SI_RETURN implicit $vgpr2			SI_RETURN implicit $vgpr2
	...			...

llvm/test/CodeGen/AMDGPU/spill-vector-superclass.ll

	; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -stop-after=greedy,1 -verify-machineinstrs -o - %s | FileCheck -check-prefix=GCN %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -stop-after=greedy,1 -verify-machineinstrs -o - %s | FileCheck -check-prefix=GCN %s
	; Convert AV spills into VGPR spills by introducing appropriate copies in between.			; Convert AV spills into VGPR spills by introducing appropriate copies in between.

	define amdgpu_kernel void @test_spill_av_class(<4 x i32> %arg) #0 {			define amdgpu_kernel void @test_spill_av_class(<4 x i32> %arg) #0 {
	; GCN-LABEL: name: test_spill_av_class			; GCN-LABEL: name: test_spill_av_class
	; GCN: bb.0 (%ir-block.0):			; GCN: bb.0 (%ir-block.0):
	; GCN-NEXT: liveins: $sgpr4_sgpr5			; GCN-NEXT: liveins: $sgpr4_sgpr5
	; GCN-NEXT: {{ $}}			; GCN-NEXT: {{ $}}
	; GCN-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)			; GCN-NEXT: renamable $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM killed renamable $sgpr4_sgpr5, 0, 0 :: (dereferenceable invariant load (s128) from %ir.arg.kernarg.offset.cast, addrspace 4)
	; GCN-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec			; GCN-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
	; GCN-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2, implicit $exec			; GCN-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2, implicit $exec
	; GCN-NEXT: [[COPY:%[0-9]+]]:areg_128 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3			; GCN-NEXT: [[COPY:%[0-9]+]]:areg_128 = COPY killed renamable $sgpr0_sgpr1_sgpr2_sgpr3
	; GCN-NEXT: [[V_MFMA_I32_4X4X4I8_e64_:%[0-9]+]]:areg_128 = V_MFMA_I32_4X4X4I8_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], [[COPY]], 0, 0, 0, implicit $mode, implicit $exec			; GCN-NEXT: [[V_MFMA_I32_4X4X4I8_e64_:%[0-9]+]]:areg_128 = V_MFMA_I32_4X4X4I8_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], [[COPY]], 0, 0, 0, implicit $mode, implicit $exec
	; GCN-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 1835018 /* regdef:VGPR_32 */, def undef %22.sub0			; GCN-NEXT: INLINEASM &"; def $0", 1 /* sideeffect attdialect */, 1966090 /* regdef:VGPR_32 */, def undef %22.sub0
	; GCN-NEXT: undef %24.sub0:av_64 = COPY %22.sub0			; GCN-NEXT: undef %24.sub0:av_64 = COPY %22.sub0
	; GCN-NEXT: SI_SPILL_AV64_SAVE %24, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, align 4, addrspace 5)			; GCN-NEXT: SI_SPILL_AV64_SAVE %24, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, align 4, addrspace 5)
	; GCN-NEXT: [[COPY1:%[0-9]+]]:vreg_128 = COPY [[V_MFMA_I32_4X4X4I8_e64_]]			; GCN-NEXT: [[COPY1:%[0-9]+]]:vreg_128 = COPY [[V_MFMA_I32_4X4X4I8_e64_]]
	; GCN-NEXT: GLOBAL_STORE_DWORDX4 undef %16:vreg_64, [[COPY1]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)			; GCN-NEXT: GLOBAL_STORE_DWORDX4 undef %16:vreg_64, [[COPY1]], 0, 0, implicit $exec :: (volatile store (s128) into `<4 x i32> addrspace(1)* undef`, addrspace 1)
	; GCN-NEXT: [[SI_SPILL_AV64_RESTORE:%[0-9]+]]:av_64 = SI_SPILL_AV64_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s64) from %stack.0, align 4, addrspace 5)			; GCN-NEXT: [[SI_SPILL_AV64_RESTORE:%[0-9]+]]:av_64 = SI_SPILL_AV64_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s64) from %stack.0, align 4, addrspace 5)
	; GCN-NEXT: undef %23.sub0:vreg_64 = COPY [[SI_SPILL_AV64_RESTORE]].sub0			; GCN-NEXT: undef %23.sub0:vreg_64 = COPY [[SI_SPILL_AV64_RESTORE]].sub0
	; GCN-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 2949129 /* reguse:VReg_64 */, %23			; GCN-NEXT: INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 3211273 /* reguse:VReg_64 */, %23
	; GCN-NEXT: S_ENDPGM 0			; GCN-NEXT: S_ENDPGM 0
	%v0 = call i32 asm sideeffect "; def $0", "=v"()			%v0 = call i32 asm sideeffect "; def $0", "=v"()
	%tmp = insertelement <2 x i32> undef, i32 %v0, i32 0			%tmp = insertelement <2 x i32> undef, i32 %v0, i32 0
	%mai = tail call <4 x i32> @llvm.amdgcn.mfma.i32.4x4x4i8(i32 1, i32 2, <4 x i32> %arg, i32 0, i32 0, i32 0)			%mai = tail call <4 x i32> @llvm.amdgcn.mfma.i32.4x4x4i8(i32 1, i32 2, <4 x i32> %arg, i32 0, i32 0, i32 0)
	store volatile <4 x i32> %mai, <4 x i32> addrspace(1)* undef			store volatile <4 x i32> %mai, <4 x i32> addrspace(1)* undef
	call void asm sideeffect "; use $0", "v"(<2 x i32> %tmp);			call void asm sideeffect "; use $0", "v"(<2 x i32> %tmp);
	ret void			ret void
	}			}

	declare <4 x i32> @llvm.amdgcn.mfma.i32.4x4x4i8(i32, i32, <4 x i32>, i32, i32, i32)			declare <4 x i32> @llvm.amdgcn.mfma.i32.4x4x4i8(i32, i32, <4 x i32>, i32, i32, i32)

	attributes #0 = { nounwind "amdgpu-num-vgpr"="5" }			attributes #0 = { nounwind "amdgpu-num-vgpr"="5" }
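
The only difference in this file is the immediate flag word on the two INLINEASM operands. The following is a minimal Python sketch for decoding those words. It assumes the flag-word layout used by LLVM's InlineAsm.h around the time of this review (kind in the low 3 bits, operand count in bits 3-15, register class ID plus one from bit 16 up) and ignores the matching-operand high bit. `decode_flag_word` and `KIND_NAMES` are illustrative names, not LLVM APIs. Under that assumption the kind and operand count are unchanged and only the encoded register class ID differs, which would be consistent with the patch introducing new register classes; that reading is an inference, not something stated in the diff.

```python
# Assumed layout of an INLINEASM operand flag word (not taken from this diff):
#   bits 0-2   operand kind (1 = reguse, 2 = regdef, ...)
#   bits 3-15  number of operands
#   bits 16-30 register class ID plus one (0 means "no class constraint")
KIND_NAMES = {1: "reguse", 2: "regdef", 3: "regdef-ec",
              4: "clobber", 5: "imm", 6: "mem"}


def decode_flag_word(flag):
    """Split a flag word into (kind name, operand count, register class ID)."""
    kind = flag & 0x7
    num_operands = (flag >> 3) & 0x1FFF
    stored_class = (flag >> 16) & 0x7FFF
    reg_class_id = stored_class - 1 if stored_class else None
    return KIND_NAMES.get(kind, "unknown"), num_operands, reg_class_id


# Old and new flag words from the two INLINEASM lines above.
for old, new in [(1835018, 1966090), (2949129, 3211273)]:
    print(decode_flag_word(old), "->", decode_flag_word(new))
```

The numeric class IDs are generated by TableGen and are not stable across revisions, which is why tests of this kind tend to need regenerating whenever register classes are added or reordered.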

llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll

	; GFX10-NEXT: s_waitcnt_vscnt null, 0x0			; GFX10-NEXT: s_waitcnt_vscnt null, 0x0
	; GFX10-NEXT: v_fma_f16 v0, v0, v1, v2			; GFX10-NEXT: v_fma_f16 v0, v0, v1, v2
	; GFX10-NEXT: s_setpc_b64 s[30:31]			; GFX10-NEXT: s_setpc_b64 s[30:31]
	;			;
	; GFX11-LABEL: v_constained_fma_f16_fpexcept_strict:			; GFX11-LABEL: v_constained_fma_f16_fpexcept_strict:
	; GFX11: ; %bb.0:			; GFX11: ; %bb.0:
	; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)			; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	; GFX11-NEXT: s_waitcnt_vscnt null, 0x0			; GFX11-NEXT: s_waitcnt_vscnt null, 0x0
	; GFX11-NEXT: v_fma_f16 v0, v0, v1, v2			; GFX11-NEXT: v_fma_f16 v0, v1, v0, v2
	; GFX11-NEXT: s_setpc_b64 s[30:31]			; GFX11-NEXT: s_setpc_b64 s[30:31]
	%val = call half @llvm.experimental.constrained.fma.f16(half %x, half %y, half %z, metadata !"round.tonearest", metadata !"fpexcept.strict")			%val = call half @llvm.experimental.constrained.fma.f16(half %x, half %y, half %z, metadata !"round.tonearest", metadata !"fpexcept.strict")
	ret half %val			ret half %val
	}			}

	define <2 x half> @v_constained_fma_v2f16_fpexcept_strict(<2 x half> %x, <2 x half> %y, <2 x half> %z) #0 {			define <2 x half> @v_constained_fma_v2f16_fpexcept_strict(<2 x half> %x, <2 x half> %y, <2 x half> %z) #0 {
	; GFX9-LABEL: v_constained_fma_v2f16_fpexcept_strict:			; GFX9-LABEL: v_constained_fma_v2f16_fpexcept_strict:
	; GFX9: ; %bb.0:			; GFX9: ; %bb.0:
	; GFX10-NEXT: v_fma_f16 v1, v1, v3, v5			; GFX10-NEXT: v_fma_f16 v1, v1, v3, v5
	; GFX10-NEXT: s_setpc_b64 s[30:31]			; GFX10-NEXT: s_setpc_b64 s[30:31]
	;			;
	; GFX11-LABEL: v_constained_fma_v3f16_fpexcept_strict:			; GFX11-LABEL: v_constained_fma_v3f16_fpexcept_strict:
	; GFX11: ; %bb.0:			; GFX11: ; %bb.0:
	; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)			; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	; GFX11-NEXT: s_waitcnt_vscnt null, 0x0			; GFX11-NEXT: s_waitcnt_vscnt null, 0x0
	; GFX11-NEXT: v_pk_fma_f16 v0, v0, v2, v4			; GFX11-NEXT: v_pk_fma_f16 v0, v0, v2, v4
	; GFX11-NEXT: v_fma_f16 v1, v1, v3, v5			; GFX11-NEXT: v_fma_f16 v1, v3, v1, v5
	; GFX11-NEXT: s_setpc_b64 s[30:31]			; GFX11-NEXT: s_setpc_b64 s[30:31]
	%val = call <3 x half> @llvm.experimental.constrained.fma.v3f16(<3 x half> %x, <3 x half> %y, <3 x half> %z, metadata !"round.tonearest", metadata !"fpexcept.strict")			%val = call <3 x half> @llvm.experimental.constrained.fma.v3f16(<3 x half> %x, <3 x half> %y, <3 x half> %z, metadata !"round.tonearest", metadata !"fpexcept.strict")
	ret <3 x half> %val			ret <3 x half> %val
	}			}

	define <4 x half> @v_constained_fma_v4f16_fpexcept_strict(<4 x half> %x, <4 x half> %y, <4 x half> %z) #0 {			define <4 x half> @v_constained_fma_v4f16_fpexcept_strict(<4 x half> %x, <4 x half> %y, <4 x half> %z) #0 {
	; GFX9-LABEL: v_constained_fma_v4f16_fpexcept_strict:			; GFX9-LABEL: v_constained_fma_v4f16_fpexcept_strict:
	; GFX9: ; %bb.0:			; GFX9: ; %bb.0:
	; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)			; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	; GFX11-NEXT: s_waitcnt_vscnt null, 0x0			; GFX11-NEXT: s_waitcnt_vscnt null, 0x0
	; GFX11-NEXT: v_lshrrev_b32_e32 v6, 16, v5			; GFX11-NEXT: v_lshrrev_b32_e32 v6, 16, v5
	; GFX11-NEXT: v_lshrrev_b32_e32 v7, 16, v3			; GFX11-NEXT: v_lshrrev_b32_e32 v7, 16, v3
	; GFX11-NEXT: v_lshrrev_b32_e32 v8, 16, v1			; GFX11-NEXT: v_lshrrev_b32_e32 v8, 16, v1
	; GFX11-NEXT: v_lshrrev_b32_e32 v9, 16, v4			; GFX11-NEXT: v_lshrrev_b32_e32 v9, 16, v4
	; GFX11-NEXT: v_lshrrev_b32_e32 v10, 16, v2			; GFX11-NEXT: v_lshrrev_b32_e32 v10, 16, v2
	; GFX11-NEXT: v_lshrrev_b32_e32 v11, 16, v0			; GFX11-NEXT: v_lshrrev_b32_e32 v11, 16, v0
	; GFX11-NEXT: v_fmac_f16_e32 v4, v0, v2			; GFX11-NEXT: v_fmac_f16_e64 v4, v0, v2
	; GFX11-NEXT: v_fmac_f16_e32 v5, v1, v3			; GFX11-NEXT: v_fmac_f16_e64 v5, v1, v3
	; GFX11-NEXT: v_fmac_f16_e32 v6, v8, v7			; GFX11-NEXT: v_fmac_f16_e64 v6, v8, v7
	; GFX11-NEXT: v_fmac_f16_e32 v9, v11, v10			; GFX11-NEXT: v_fmac_f16_e64 v9, v11, v10
	; GFX11-NEXT: v_and_b32_e32 v0, 0xffff, v4			; GFX11-NEXT: v_and_b32_e32 v0, 0xffff, v4
	; GFX11-NEXT: v_and_b32_e32 v1, 0xffff, v5			; GFX11-NEXT: v_and_b32_e32 v1, 0xffff, v5
	; GFX11-NEXT: v_lshl_or_b32 v0, v9, 16, v0			; GFX11-NEXT: v_lshl_or_b32 v0, v9, 16, v0
	; GFX11-NEXT: v_lshl_or_b32 v1, v6, 16, v1			; GFX11-NEXT: v_lshl_or_b32 v1, v6, 16, v1
	; GFX11-NEXT: s_setpc_b64 s[30:31]			; GFX11-NEXT: s_setpc_b64 s[30:31]
	%val = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> %x, <4 x half> %y, <4 x half> %z, metadata !"round.tonearest", metadata !"fpexcept.strict")			%val = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> %x, <4 x half> %y, <4 x half> %z, metadata !"round.tonearest", metadata !"fpexcept.strict")
	ret <4 x half> %val			ret <4 x half> %val
	}			}

llvm/test/CodeGen/AMDGPU/true16-ra-f128-fail.mir

This file was added.

				# RUN: not llc -march=amdgcn -mcpu=gfx1100 -debug-only=regalloc -start-before=greedy,0 -stop-after=virtregrewriter,1 -verify-machineinstrs -o /dev/null %s 2>&1 | FileCheck --check-prefixes=CHECK %s
				# REQUIRES: asserts

				--- |
				define amdgpu_ps void @e32() {
				ret void
				}
				...


				---
				name: e32
				tracksRegLiveness: true
				machineFunctionInfo:
				stackPtrOffsetReg: $sgpr32

				body: |
				bb.0:
				liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15, $vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23_vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31, $vgpr32_vgpr33_vgpr34_vgpr35_vgpr36_vgpr37_vgpr38_vgpr39_vgpr40_vgpr41_vgpr42_vgpr43_vgpr44_vgpr45_vgpr46_vgpr47, $vgpr48_vgpr49_vgpr50_vgpr51_vgpr52_vgpr53_vgpr54_vgpr55_vgpr56_vgpr57_vgpr58_vgpr59_vgpr60_vgpr61_vgpr62_vgpr63, $vgpr64_vgpr65_vgpr66_vgpr67_vgpr68_vgpr69_vgpr70_vgpr71_vgpr72_vgpr73_vgpr74_vgpr75_vgpr76_vgpr77_vgpr78_vgpr79, $vgpr80_vgpr81_vgpr82_vgpr83_vgpr84_vgpr85_vgpr86_vgpr87_vgpr88_vgpr89_vgpr90_vgpr91_vgpr92_vgpr93_vgpr94_vgpr95, $vgpr96_vgpr97_vgpr98_vgpr99_vgpr100_vgpr101_vgpr102_vgpr103_vgpr104_vgpr105_vgpr106_vgpr107_vgpr108_vgpr109_vgpr110_vgpr111, $vgpr112, $vgpr113, $vgpr114, $vgpr115, $vgpr116, $vgpr117, $vgpr118, $vgpr119, $vgpr120, $vgpr121, $vgpr122, $vgpr123, $vgpr124, $vgpr125, $vgpr126, $vgpr127

				; CHECK: error: ran out of registers during register allocation
				; CHECK: [[REG1:vgpr[0-9]+]] = V_ADD_F16_t16_e32
				; CHECK: SI_SPILL_V32_SAVE $[[REG1]]
				%0:vgpr_32_lo128 = V_ADD_F16_t16_e32 $vgpr0, $vgpr1, implicit $exec, implicit $mode
				S_NOP 0, implicit $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23_vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31
				S_NOP 0, implicit $vgpr32_vgpr33_vgpr34_vgpr35_vgpr36_vgpr37_vgpr38_vgpr39_vgpr40_vgpr41_vgpr42_vgpr43_vgpr44_vgpr45_vgpr46_vgpr47
				S_NOP 0, implicit $vgpr48_vgpr49_vgpr50_vgpr51_vgpr52_vgpr53_vgpr54_vgpr55_vgpr56_vgpr57_vgpr58_vgpr59_vgpr60_vgpr61_vgpr62_vgpr63
				S_NOP 0, implicit $vgpr64_vgpr65_vgpr66_vgpr67_vgpr68_vgpr69_vgpr70_vgpr71_vgpr72_vgpr73_vgpr74_vgpr75_vgpr76_vgpr77_vgpr78_vgpr79
				S_NOP 0, implicit $vgpr80_vgpr81_vgpr82_vgpr83_vgpr84_vgpr85_vgpr86_vgpr87_vgpr88_vgpr89_vgpr90_vgpr91_vgpr92_vgpr93_vgpr94_vgpr95
				S_NOP 0, implicit $vgpr96_vgpr97_vgpr98_vgpr99_vgpr100_vgpr101_vgpr102_vgpr103_vgpr104_vgpr105_vgpr106_vgpr107_vgpr108_vgpr109_vgpr110_vgpr111
				S_NOP 0, implicit $vgpr112, implicit $vgpr113, implicit $vgpr114, implicit $vgpr115, implicit $vgpr116, implicit $vgpr117, implicit $vgpr118, implicit $vgpr119, implicit $vgpr120, implicit $vgpr121, implicit $vgpr122, implicit $vgpr123, implicit $vgpr124, implicit $vgpr125, implicit $vgpr126, implicit $vgpr127
				S_ENDPGM 0, implicit %0
				...

llvm/test/CodeGen/AMDGPU/true16-ra-pre-gfx11-regression-test.mir

This file was added.

				# RUN: llc -march=amdgcn -mcpu=gfx1010 -start-before=greedy,0 -stop-after=virtregrewriter,1 -verify-machineinstrs -o - %s | FileCheck --check-prefixes=GCN %s

				--- |
				define amdgpu_ps void @e32() #0 {
				ret void
				}

				define amdgpu_ps void @e64() #0 {
				ret void
				}

				...


				---
				name: e32
				tracksRegLiveness: true

				body: |
				bb.0:
				liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15, $vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23_vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31, $vgpr32_vgpr33_vgpr34_vgpr35_vgpr36_vgpr37_vgpr38_vgpr39_vgpr40_vgpr41_vgpr42_vgpr43_vgpr44_vgpr45_vgpr46_vgpr47, $vgpr48_vgpr49_vgpr50_vgpr51_vgpr52_vgpr53_vgpr54_vgpr55_vgpr56_vgpr57_vgpr58_vgpr59_vgpr60_vgpr61_vgpr62_vgpr63, $vgpr64_vgpr65_vgpr66_vgpr67_vgpr68_vgpr69_vgpr70_vgpr71_vgpr72_vgpr73_vgpr74_vgpr75_vgpr76_vgpr77_vgpr78_vgpr79, $vgpr80_vgpr81_vgpr82_vgpr83_vgpr84_vgpr85_vgpr86_vgpr87_vgpr88_vgpr89_vgpr90_vgpr91_vgpr92_vgpr93_vgpr94_vgpr95, $vgpr96_vgpr97_vgpr98_vgpr99_vgpr100_vgpr101_vgpr102_vgpr103_vgpr104_vgpr105_vgpr106_vgpr107_vgpr108_vgpr109_vgpr110_vgpr111, $vgpr112, $vgpr113, $vgpr114, $vgpr115, $vgpr116, $vgpr117, $vgpr118, $vgpr119, $vgpr120, $vgpr121, $vgpr122, $vgpr123, $vgpr124, $vgpr125, $vgpr126, $vgpr127

				; GCN-LABEL: name: e32
				; GCN: renamable $vgpr128 = V_ADD_F16_e32 $vgpr0, $vgpr1, implicit $exec, implicit $mode
				%0:vgpr_32 = V_ADD_F16_e32 $vgpr0, $vgpr1, implicit $exec, implicit $mode
				S_NOP 0, implicit $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23_vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31
				S_NOP 0, implicit $vgpr32_vgpr33_vgpr34_vgpr35_vgpr36_vgpr37_vgpr38_vgpr39_vgpr40_vgpr41_vgpr42_vgpr43_vgpr44_vgpr45_vgpr46_vgpr47
				S_NOP 0, implicit $vgpr48_vgpr49_vgpr50_vgpr51_vgpr52_vgpr53_vgpr54_vgpr55_vgpr56_vgpr57_vgpr58_vgpr59_vgpr60_vgpr61_vgpr62_vgpr63
				S_NOP 0, implicit $vgpr64_vgpr65_vgpr66_vgpr67_vgpr68_vgpr69_vgpr70_vgpr71_vgpr72_vgpr73_vgpr74_vgpr75_vgpr76_vgpr77_vgpr78_vgpr79
				S_NOP 0, implicit $vgpr80_vgpr81_vgpr82_vgpr83_vgpr84_vgpr85_vgpr86_vgpr87_vgpr88_vgpr89_vgpr90_vgpr91_vgpr92_vgpr93_vgpr94_vgpr95
				S_NOP 0, implicit $vgpr96_vgpr97_vgpr98_vgpr99_vgpr100_vgpr101_vgpr102_vgpr103_vgpr104_vgpr105_vgpr106_vgpr107_vgpr108_vgpr109_vgpr110_vgpr111
				S_NOP 0, implicit $vgpr112, implicit $vgpr113, implicit $vgpr114, implicit $vgpr115, implicit $vgpr116, implicit $vgpr117, implicit $vgpr118, implicit $vgpr119, implicit $vgpr120, implicit $vgpr121, implicit $vgpr122, implicit $vgpr123, implicit $vgpr124, implicit $vgpr125, implicit $vgpr126, implicit $vgpr127
				S_ENDPGM 0, implicit %0
				...

				---
				name: e64
				tracksRegLiveness: true

				body: |
				bb.0:
				liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15, $vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23_vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31, $vgpr32_vgpr33_vgpr34_vgpr35_vgpr36_vgpr37_vgpr38_vgpr39_vgpr40_vgpr41_vgpr42_vgpr43_vgpr44_vgpr45_vgpr46_vgpr47, $vgpr48_vgpr49_vgpr50_vgpr51_vgpr52_vgpr53_vgpr54_vgpr55_vgpr56_vgpr57_vgpr58_vgpr59_vgpr60_vgpr61_vgpr62_vgpr63, $vgpr64_vgpr65_vgpr66_vgpr67_vgpr68_vgpr69_vgpr70_vgpr71_vgpr72_vgpr73_vgpr74_vgpr75_vgpr76_vgpr77_vgpr78_vgpr79, $vgpr80_vgpr81_vgpr82_vgpr83_vgpr84_vgpr85_vgpr86_vgpr87_vgpr88_vgpr89_vgpr90_vgpr91_vgpr92_vgpr93_vgpr94_vgpr95, $vgpr96_vgpr97_vgpr98_vgpr99_vgpr100_vgpr101_vgpr102_vgpr103_vgpr104_vgpr105_vgpr106_vgpr107_vgpr108_vgpr109_vgpr110_vgpr111, $vgpr112, $vgpr113, $vgpr114, $vgpr115, $vgpr116, $vgpr117, $vgpr118, $vgpr119, $vgpr120, $vgpr121, $vgpr122, $vgpr123, $vgpr124, $vgpr125, $vgpr126, $vgpr127

				; GCN-LABEL: name: e64
				; GCN: renamable $vgpr128 = V_ADD_F16_e64 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $exec, implicit $mode
				%0:vgpr_32 = V_ADD_F16_e64 0, $vgpr0, 0, $vgpr1, 0, 0, implicit $exec, implicit $mode
				S_NOP 0, implicit $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23_vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31
				S_NOP 0, implicit $vgpr32_vgpr33_vgpr34_vgpr35_vgpr36_vgpr37_vgpr38_vgpr39_vgpr40_vgpr41_vgpr42_vgpr43_vgpr44_vgpr45_vgpr46_vgpr47
				S_NOP 0, implicit $vgpr48_vgpr49_vgpr50_vgpr51_vgpr52_vgpr53_vgpr54_vgpr55_vgpr56_vgpr57_vgpr58_vgpr59_vgpr60_vgpr61_vgpr62_vgpr63
				S_NOP 0, implicit $vgpr64_vgpr65_vgpr66_vgpr67_vgpr68_vgpr69_vgpr70_vgpr71_vgpr72_vgpr73_vgpr74_vgpr75_vgpr76_vgpr77_vgpr78_vgpr79
				S_NOP 0, implicit $vgpr80_vgpr81_vgpr82_vgpr83_vgpr84_vgpr85_vgpr86_vgpr87_vgpr88_vgpr89_vgpr90_vgpr91_vgpr92_vgpr93_vgpr94_vgpr95
				S_NOP 0, implicit $vgpr96_vgpr97_vgpr98_vgpr99_vgpr100_vgpr101_vgpr102_vgpr103_vgpr104_vgpr105_vgpr106_vgpr107_vgpr108_vgpr109_vgpr110_vgpr111
				S_NOP 0, implicit $vgpr112, implicit $vgpr113, implicit $vgpr114, implicit $vgpr115, implicit $vgpr116, implicit $vgpr117, implicit $vgpr118, implicit $vgpr119, implicit $vgpr120, implicit $vgpr121, implicit $vgpr122, implicit $vgpr123, implicit $vgpr124, implicit $vgpr125, implicit $vgpr126, implicit $vgpr127
				S_ENDPGM 0, implicit %0
				...

llvm/test/CodeGen/AMDGPU/twoaddr-fma.mir

# RUN: llc -march=amdgcn -mcpu=gfx1010 %s -run-pass twoaddressinstruction -verify-machineinstrs -o - | FileCheck -check-prefix=GCN %s		# RUN: llc -march=amdgcn -mcpu=gfx1010 %s -run-pass twoaddressinstruction -verify-machineinstrs -o - | FileCheck --check-prefixes=GCN %s
# RUN: llc -march=amdgcn -mcpu=gfx1100 %s -run-pass twoaddressinstruction -verify-machineinstrs -o - | FileCheck -check-prefix=GCN %s		# RUN: llc -march=amdgcn -mcpu=gfx1100 %s -run-pass twoaddressinstruction -verify-machineinstrs -o - | FileCheck --check-prefixes=GCN %s

# GCN-LABEL: name: test_fmamk_reg_imm_f32		# GCN-LABEL: name: test_fmamk_reg_imm_f32
# GCN: %2:vgpr_32 = IMPLICIT_DEF		# GCN: %2:vgpr_32 = IMPLICIT_DEF
# GCN-NOT: V_MOV_B32		# GCN-NOT: V_MOV_B32
# GCN: V_FMAMK_F32 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec		# GCN: V_FMAMK_F32 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec
---		---
name: test_fmamk_reg_imm_f32		name: test_fmamk_reg_imm_f32
registers:		registers:
bb.0:		bb.0:

%0 = IMPLICIT_DEF		%0 = IMPLICIT_DEF
%1 = V_MOV_B32_e32 1078523331, implicit $exec		%1 = V_MOV_B32_e32 1078523331, implicit $exec
%2 = V_FMAC_F32_e32 killed %0.sub0, %0.sub1, %1, implicit $mode, implicit $exec		%2 = V_FMAC_F32_e32 killed %0.sub0, %0.sub1, %1, implicit $mode, implicit $exec

...		...

# GCN-LABEL: name: test_fmamk_reg_imm_f16
# GCN: %2:vgpr_32 = IMPLICIT_DEF
# GCN-NOT: V_MOV_B32
# GCN: V_FMAMK_F16 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec
---
name: test_fmamk_reg_imm_f16
registers:
- { id: 0, class: vreg_64 }
- { id: 1, class: vgpr_32 }
- { id: 2, class: vgpr_32 }
- { id: 3, class: vgpr_32 }
body: |
bb.0:

%0 = IMPLICIT_DEF
%1 = COPY %0.sub1
%2 = V_MOV_B32_e32 1078523331, implicit $exec
%3 = V_FMAC_F16_e32 killed %0.sub0, %2, killed %1, implicit $mode, implicit $exec

...

# GCN-LABEL: name: test_fmamk_imm_reg_f16
# GCN: %2:vgpr_32 = IMPLICIT_DEF
# GCN-NOT: V_MOV_B32
# GCN: V_FMAMK_F16 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec
---
name: test_fmamk_imm_reg_f16
registers:
- { id: 0, class: vreg_64 }
- { id: 1, class: vgpr_32 }
- { id: 2, class: vgpr_32 }
- { id: 3, class: vgpr_32 }
body: |
bb.0:

%0 = IMPLICIT_DEF
%1 = COPY %0.sub1
%2 = V_MOV_B32_e32 1078523331, implicit $exec
%3 = V_FMAC_F16_e32 %2, killed %0.sub0, killed %1, implicit $mode, implicit $exec

...

# GCN-LABEL: name: test_fmaak_f16
# GCN: %1:vgpr_32 = IMPLICIT_DEF
# GCN-NOT: V_MOV_B32
# GCN: V_FMAAK_F16 killed %0.sub0, %0.sub1, 1078523331, implicit $mode, implicit $exec
---
name: test_fmaak_f16
registers:
- { id: 0, class: vreg_64 }
- { id: 1, class: vgpr_32 }
- { id: 2, class: vgpr_32 }
body: |
bb.0:

%0 = IMPLICIT_DEF
%1 = V_MOV_B32_e32 1078523331, implicit $exec
%2 = V_FMAC_F16_e32 killed %0.sub0, %0.sub1, %1, implicit $mode, implicit $exec
...

# GCN-LABEL: name: test_fmaak_sgpr_src0_f32		# GCN-LABEL: name: test_fmaak_sgpr_src0_f32
# GCN: %1:vgpr_32 = IMPLICIT_DEF		# GCN: %1:vgpr_32 = IMPLICIT_DEF
# GCN-NOT: V_MOV_B32		# GCN-NOT: V_MOV_B32
# GCN: %3:vgpr_32 = V_FMAMK_F32 killed %0, 1078523331, %2, implicit $mode, implicit $exec		# GCN: %3:vgpr_32 = V_FMAMK_F32 killed %0, 1078523331, %2, implicit $mode, implicit $exec

---		---
name: test_fmaak_sgpr_src0_f32		name: test_fmaak_sgpr_src0_f32
bb.0:		bb.0:

%0 = V_MOV_B32_e32 1078523331, implicit $exec		%0 = V_MOV_B32_e32 1078523331, implicit $exec
%1 = IMPLICIT_DEF		%1 = IMPLICIT_DEF
%2 = V_FMAC_F32_e32 %stack.0, %0, %1, implicit $mode, implicit $exec		%2 = V_FMAC_F32_e32 %stack.0, %0, %1, implicit $mode, implicit $exec

...		...

# GCN-LABEL: name: test_fmaak_inline_literal_f16
# GCN: %1:vgpr_32 = IMPLICIT_DEF
# GCN-NOT: V_MOV_B32
# GCN: %2:vgpr_32 = V_FMAAK_F16 16384, killed %0, 49664, implicit $mode, implicit $exec

---
name: test_fmaak_inline_literal_f16
tracksRegLiveness: true
liveins:
- { reg: '$vgpr0', virtual-reg: '%0' }
body: |
bb.0:
liveins: $vgpr0

%0:vgpr_32 = COPY killed $vgpr0

%1:vgpr_32 = V_MOV_B32_e32 49664, implicit $exec
%2:vgpr_32 = V_FMAC_F16_e32 16384, killed %0, %1, implicit $mode, implicit $exec
S_ENDPGM 0

...
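
The integer operands in the f16 tests are easier to read once decoded. The following is a small Python sketch, assuming the operands are raw IEEE-754 binary16 bit patterns; `f16_from_bits` is a hypothetical helper name. It decodes the two immediates used in test_fmaak_inline_literal_f16.

```python
import struct


def f16_from_bits(bits):
    """Reinterpret a 16-bit integer as an IEEE-754 half-precision float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


# Immediates from test_fmaak_inline_literal_f16.
for bits in (16384, 49664):
    print(bits, hex(bits), f16_from_bits(bits))
```

Under that assumption 16384 is 2.0 and 49664 is -3.0. As far as I know, 2.0 is one of the float values AMDGPU can encode as an inline constant while -3.0 is not, which would match the test keeping 16384 as a source operand and folding 49664 into the V_FMAAK_F16 literal slot.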

# GCN-LABEL: name: test_fmamk_reg_imm_f32_2_folds		# GCN-LABEL: name: test_fmamk_reg_imm_f32_2_folds
# GCN: %2:vgpr_32 = IMPLICIT_DEF		# GCN: %2:vgpr_32 = IMPLICIT_DEF
# GCN-NOT: V_MOV_B32		# GCN-NOT: V_MOV_B32
# GCN: V_FMAMK_F32 %0.sub0, 1078523331, %1, implicit $mode, implicit $exec		# GCN: V_FMAMK_F32 %0.sub0, 1078523331, %1, implicit $mode, implicit $exec
# GCN: V_FMAMK_F32 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec		# GCN: V_FMAMK_F32 killed %0.sub0, 1078523331, killed %1, implicit $mode, implicit $exec
---		---
name: test_fmamk_reg_imm_f32_2_folds		name: test_fmamk_reg_imm_f32_2_folds

llvm/test/CodeGen/AMDGPU/vopc_dpp.mir

; GCN: liveins: $vgpr0, $vgpr1, $vgpr2		; GCN: liveins: $vgpr0, $vgpr1, $vgpr2
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GCN-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2		; GCN-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GCN-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF		; GCN-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NEXT: V_CMP_LT_F32_e32_dpp 0, [[COPY1]], 0, [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $mode, implicit $exec		; GCN-NEXT: V_CMP_LT_F32_e32_dpp 0, [[COPY1]], 0, [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $mode, implicit $exec
; GCN-NEXT: [[V_MOV_B32_dpp:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[DEF]], [[COPY1]], 1, 15, 15, 1, implicit $exec		; GCN-NEXT: [[V_MOV_B32_dpp:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[DEF]], [[COPY1]], 1, 15, 15, 1, implicit $exec
; GCN-NEXT: V_CMPX_EQ_I16_e32 [[V_MOV_B32_dpp]], [[COPY]], implicit-def $exec, implicit-def $vcc, implicit $mode, implicit $exec		; GCN-NEXT: V_CMPX_EQ_I16_t16_nosdst_e64 [[V_MOV_B32_dpp]], [[COPY]], implicit-def $exec, implicit-def $vcc, implicit $mode, implicit $exec
; GCN-NEXT: V_CMP_CLASS_F16_e32_dpp 0, [[COPY1]], [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $exec		; GCN-NEXT: [[V_CMP_CLASS_F16_t16_e64_dpp:%[0-9]+]]:sgpr_32 = V_CMP_CLASS_F16_t16_e64_dpp 0, [[COPY1]], [[COPY]], 1, 15, 15, 1, implicit $exec
; GCN-NEXT: [[V_CMP_GE_F16_e64_dpp:%[0-9]+]]:sgpr_32 = V_CMP_GE_F16_e64_dpp 1, [[COPY1]], 0, [[COPY]], 1, 1, 15, 15, 1, implicit $mode, implicit $exec		; GCN-NEXT: [[V_CMP_GE_F16_t16_e64_dpp:%[0-9]+]]:sgpr_32 = V_CMP_GE_F16_t16_e64_dpp 1, [[COPY1]], 0, [[COPY]], 1, 1, 15, 15, 1, implicit $mode, implicit $exec
; GCN-NEXT: [[V_MOV_B32_dpp1:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[DEF]], [[COPY1]], 1, 15, 15, 1, implicit $exec		; GCN-NEXT: [[V_MOV_B32_dpp1:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[DEF]], [[COPY1]], 1, 15, 15, 1, implicit $exec
; GCN-NEXT: V_CMPX_GT_U32_nosdst_e64 [[V_MOV_B32_dpp1]], [[COPY]], implicit-def $exec, implicit $mode, implicit $exec		; GCN-NEXT: V_CMPX_GT_U32_nosdst_e64 [[V_MOV_B32_dpp1]], [[COPY]], implicit-def $exec, implicit $mode, implicit $exec
; GCN-NEXT: V_CMP_CLASS_F32_e32_dpp 2, [[COPY1]], [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $exec		; GCN-NEXT: V_CMP_CLASS_F32_e32_dpp 2, [[COPY1]], [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $exec
; GCN-NEXT: V_CMP_NGE_F16_e32_dpp 0, [[COPY1]], 0, [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $mode, implicit $exec		; GCN-NEXT: V_CMP_NGE_F32_e32_dpp 0, [[COPY1]], 0, [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $mode, implicit $exec
; GCN-NEXT: [[V_CMP_NGE_F16_e64_dpp:%[0-9]+]]:sgpr_32 = V_CMP_NGE_F16_e64_dpp 0, [[COPY1]], 0, [[COPY]], 0, 1, 15, 15, 1, implicit $mode, implicit $exec		; GCN-NEXT: [[V_MOV_B32_dpp2:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[DEF]], [[COPY1]], 1, 15, 15, 1, implicit $exec
; GCN-NEXT: [[S_AND_B32_:%[0-9]+]]:sgpr_32 = S_AND_B32 [[V_CMP_NGE_F16_e64_dpp]], 10101, implicit-def $scc		; GCN-NEXT: [[V_CMP_NGE_F16_t16_e64_:%[0-9]+]]:sgpr_32 = V_CMP_NGE_F16_t16_e64 0, [[V_CMP_NGE_F16_t16_e64_]], 0, [[COPY]], 0, implicit $mode, implicit $exec
		; GCN-NEXT: [[V_CMP_NGE_F32_e64_dpp:%[0-9]+]]:sgpr_32 = V_CMP_NGE_F32_e64_dpp 0, [[COPY1]], 0, [[COPY]], 0, 1, 15, 15, 1, implicit $mode, implicit $exec
		; GCN-NEXT: [[S_AND_B32_:%[0-9]+]]:sgpr_32 = S_AND_B32 [[V_CMP_NGE_F32_e64_dpp]], 10101, implicit-def $scc
; GCN-NEXT: V_CMP_GT_I32_e32_dpp [[COPY1]], [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $exec		; GCN-NEXT: V_CMP_GT_I32_e32_dpp [[COPY1]], [[COPY]], 1, 15, 15, 1, implicit-def $vcc, implicit $exec
%0:vgpr_32 = COPY $vgpr0		%0:vgpr_32 = COPY $vgpr0
%1:vgpr_32 = COPY $vgpr1		%1:vgpr_32 = COPY $vgpr1
%2:vgpr_32 = COPY $vgpr2		%2:vgpr_32 = COPY $vgpr2
%3:vgpr_32 = IMPLICIT_DEF		%3:vgpr_32 = IMPLICIT_DEF

%4:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%4:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
V_CMP_LT_F32_e32 %4, %0, implicit-def $vcc, implicit $mode, implicit $exec		V_CMP_LT_F32_e32 %4, %0, implicit-def $vcc, implicit $mode, implicit $exec

; unsafe to combine cmpx		; unsafe to combine cmpx
%5:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%5:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
V_CMPX_EQ_I16_e32 %5, %0, implicit-def $exec, implicit-def $vcc, implicit $mode, implicit $exec		V_CMPX_EQ_I16_t16_nosdst_e64 %5, %0, implicit-def $exec, implicit-def $vcc, implicit $mode, implicit $exec

%6:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%6:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
V_CMP_CLASS_F16_e32 %6, %0, implicit-def $vcc, implicit $mode, implicit $exec		%7:sgpr_32 = V_CMP_CLASS_F16_t16_e64 0, %6, %0, implicit-def $vcc, implicit $mode, implicit $exec

%7:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%8:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
%8:sgpr_32 = V_CMP_GE_F16_e64 1, %7, 0, %0, 1, implicit $mode, implicit $exec		%9:sgpr_32 = V_CMP_GE_F16_t16_e64 1, %8, 0, %0, 1, implicit $mode, implicit $exec

; unsafe to combine cmpx		; unsafe to combine cmpx
%9:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%10:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
V_CMPX_GT_U32_nosdst_e64 %9, %0, implicit-def $exec, implicit $mode, implicit $exec		V_CMPX_GT_U32_nosdst_e64 %10, %0, implicit-def $exec, implicit $mode, implicit $exec

%11:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%11:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
%12:sgpr_32 = V_CMP_CLASS_F32_e64 2, %11, %0, implicit $mode, implicit $exec		%12:sgpr_32 = V_CMP_CLASS_F32_e64 2, %11, %0, implicit $mode, implicit $exec

; shrink		; shrink
%13:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%13:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
%14:sgpr_32 = V_CMP_NGE_F16_e64 0, %13, 0, %0, 0, implicit $mode, implicit $exec		%14:sgpr_32 = V_CMP_NGE_F32_e64 0, %13, 0, %0, 0, implicit $mode, implicit $exec

; do not shrink, sdst used		; do not shrink True16 instructions
%15:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%15:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
%16:sgpr_32 = V_CMP_NGE_F16_e64 0, %15, 0, %0, 0, implicit $mode, implicit $exec		%16:sgpr_32 = V_CMP_NGE_F16_t16_e64 0, %16, 0, %0, 0, implicit $mode, implicit $exec
%17:sgpr_32 = S_AND_B32 %16, 10101, implicit-def $scc
		; do not shrink, sdst used
		%17:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
		%18:sgpr_32 = V_CMP_NGE_F32_e64 0, %17, 0, %0, 0, implicit $mode, implicit $exec
		%19:sgpr_32 = S_AND_B32 %18, 10101, implicit-def $scc

; commute		; commute
%18:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec		%20:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 15, 15, 1, implicit $exec
V_CMP_LT_I32_e32 %0, %18, implicit-def $vcc, implicit $exec		V_CMP_LT_I32_e32 %0, %20, implicit-def $vcc, implicit $exec

...		...
---		---

name: mask_not_full		name: mask_not_full
tracksRegLiveness: true		tracksRegLiveness: true
body: |		body: |
bb.0:		bb.0:
liveins: $vgpr0, $vgpr1, $vgpr2		liveins: $vgpr0, $vgpr1, $vgpr2

; GCN-LABEL: name: mask_not_full		; GCN-LABEL: name: mask_not_full
; GCN: liveins: $vgpr0, $vgpr1, $vgpr2		; GCN: liveins: $vgpr0, $vgpr1, $vgpr2
; GCN-NEXT: {{ $}}		; GCN-NEXT: {{ $}}
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0		; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1		; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GCN-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF		; GCN-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec		; GCN-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GCN-NEXT: [[V_MOV_B32_dpp:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[DEF]], [[COPY1]], 1, 15, 14, 1, implicit $exec		; GCN-NEXT: [[V_MOV_B32_dpp:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[DEF]], [[COPY1]], 1, 15, 14, 1, implicit $exec
; GCN-NEXT: V_CMP_CLASS_F16_e32 [[V_MOV_B32_dpp]], [[COPY]], implicit-def $vcc, implicit $mode, implicit $exec		; GCN-NEXT: [[V_CMP_CLASS_F16_t16_e64_:%[0-9]+]]:sgpr_32 = V_CMP_CLASS_F16_t16_e64 0, [[V_MOV_B32_dpp]], [[COPY]], implicit-def $vcc, implicit $mode, implicit $exec
; GCN-NEXT: [[V_MOV_B32_dpp1:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[V_MOV_B32_e32_]], [[COPY1]], 1, 13, 15, 1, implicit $exec		; GCN-NEXT: [[V_MOV_B32_dpp1:%[0-9]+]]:vgpr_32 = V_MOV_B32_dpp [[V_MOV_B32_e32_]], [[COPY1]], 1, 13, 15, 1, implicit $exec
; GCN-NEXT: [[V_CMP_GE_F16_e64_:%[0-9]+]]:sgpr_32 = V_CMP_GE_F16_e64 1, [[V_MOV_B32_dpp1]], 0, [[COPY]], 1, implicit $mode, implicit $exec		; GCN-NEXT: [[V_CMP_GE_F32_e64_:%[0-9]+]]:sgpr_32 = V_CMP_GE_F32_e64 1, [[V_MOV_B32_dpp1]], 0, [[COPY]], 1, implicit $mode, implicit $exec
%0:vgpr_32 = COPY $vgpr0		%0:vgpr_32 = COPY $vgpr0
%1:vgpr_32 = COPY $vgpr1		%1:vgpr_32 = COPY $vgpr1
%2:vgpr_32 = IMPLICIT_DEF		%2:vgpr_32 = IMPLICIT_DEF
%3:vgpr_32 = V_MOV_B32_e32 0, implicit $exec		%3:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

; Do not combine VOPC when row_mask or bank_mask is not 0xf		; Do not combine VOPC when row_mask or bank_mask is not 0xf
; All cases are covered by generic rules for creating DPP instructions		; All cases are covered by generic rules for creating DPP instructions
%4:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 15, 14, 1, implicit $exec		%4:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 15, 14, 1, implicit $exec
V_CMP_CLASS_F16_e32 %4, %0, implicit-def $vcc, implicit $mode, implicit $exec		%99:sgpr_32 = V_CMP_CLASS_F16_t16_e64 0, %4, %0, implicit-def $vcc, implicit $mode, implicit $exec

%5:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 13, 15, 1, implicit $exec		%5:vgpr_32 = V_MOV_B32_dpp %3, %1, 1, 13, 15, 1, implicit $exec
%6:sgpr_32 = V_CMP_GE_F16_e64 1, %5, 0, %0, 1, implicit $mode, implicit $exec		%6:sgpr_32 = V_CMP_GE_F32_e64 1, %5, 0, %0, 1, implicit $mode, implicit $exec

...		...

llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_err.s

This file was added.

				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,-wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error %s
				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=-wavefrontsize32,+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error %s

				v_ceil_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_ceil_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_ceil_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cos_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_cos_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cos_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_f32_e32 v128, 0xaf123456
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_f32_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_f32_e32 v255, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_i16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_i16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_i16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_u16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_u16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f16_u16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_f32_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_i16_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_i16_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_i16_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_i32_i16_e32 v5, v199
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_i16_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_norm_i16_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_norm_i16_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_norm_u16_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_norm_u16_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_norm_u16_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_u16_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_u16_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_u16_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_cvt_u32_u16_e32 v5, v199
				// GFX11: error: invalid operand for instruction

				v_exp_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_exp_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_exp_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_floor_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_floor_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_floor_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_fract_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_fract_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_fract_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_frexp_exp_i16_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_frexp_exp_i16_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_frexp_exp_i16_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_frexp_mant_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_frexp_mant_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_frexp_mant_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_log_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_log_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_log_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_not_b16_e32 v128, 0xfe0b
				// GFX11: error: invalid operand for instruction

				v_not_b16_e32 v255, v1
				// GFX11: error: invalid operand for instruction

				v_not_b16_e32 v5, v199
				// GFX11: error: invalid operand for instruction

				v_rcp_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_rcp_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_rcp_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_rndne_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_rndne_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_rndne_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_rsq_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_rsq_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_rsq_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_sin_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_sin_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_sin_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_sqrt_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_sqrt_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_sqrt_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_trunc_f16_e32 v128, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_trunc_f16_e32 v255, v1
				// GFX11: error: operands are not valid for this GPU or mode

				v_trunc_f16_e32 v5, v199
				// GFX11: error: operands are not valid for this GPU or mode

				v_ceil_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_ceil_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cos_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cos_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_f32_e32 v128, 0xaf123456 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_f32_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_f32_e32 v255, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_i16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_i16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_u16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_u16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f32_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_i16_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_i16_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_i32_i16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_i16_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_i16_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_u16_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_u16_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_u16_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_u16_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_u32_u16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_exp_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_exp_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_floor_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_floor_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_fract_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_fract_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_exp_i16_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_exp_i16_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_mant_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_mant_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_log_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_log_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_not_b16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_not_b16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rcp_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rcp_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rndne_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rndne_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rsq_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rsq_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sin_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sin_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sqrt_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sqrt_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_trunc_f16_e32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_trunc_f16_e32 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_ceil_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_ceil_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cos_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cos_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_f32_e32 v128, 0xaf123456 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_f32_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_f32_e32 v255, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_i16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_i16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_u16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f16_u16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_f32_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_i16_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_i16_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_i32_i16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_i16_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_i16_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_u16_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_norm_u16_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_u16_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_u16_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cvt_u32_u16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_exp_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_exp_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_floor_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_floor_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_fract_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_fract_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_exp_i16_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_exp_i16_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_mant_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_frexp_mant_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_log_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_log_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_not_b16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_not_b16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rcp_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rcp_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rndne_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rndne_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rsq_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_rsq_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sin_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sin_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sqrt_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_sqrt_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_trunc_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_trunc_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_promote.s

This file was added.

				// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,-wavefrontsize64 -show-encoding %s | FileCheck --check-prefix=GFX11 --implicit-check-not=_e32 %s
				// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=-wavefrontsize32,+wavefrontsize64 -show-encoding %s | FileCheck --check-prefix=GFX11 --implicit-check-not=_e32 %s

				v_ceil_f16 v128, 0xfe0b
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, -1
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, 0.5
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, exec_hi
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, exec_lo
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, m0
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, null
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, s1
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, s105
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, src_scc
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, ttmp15
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, v1
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, v127
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, vcc_hi
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, vcc_lo
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v5, v199
				// GFX11: v_ceil_f16_e64

				v_cos_f16 v128, 0xfe0b
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, -1
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, 0.5
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, exec_hi
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, exec_lo
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, m0
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, null
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, s1
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, s105
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, src_scc
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, ttmp15
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, v1
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, v127
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, vcc_hi
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, vcc_lo
				// GFX11: v_cos_f16_e64

				v_cos_f16 v5, v199
				// GFX11: v_cos_f16_e64

				v_cvt_f16_f32 v128, 0xaf123456
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, -1
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, 0.5
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, exec_hi
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, exec_lo
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, m0
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, null
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, s1
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, s105
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, src_scc
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, ttmp15
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, v1
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, v255
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, vcc_hi
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, vcc_lo
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_i16 v128, 0xfe0b
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, -1
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, 0.5
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, exec_hi
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, exec_lo
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, m0
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, null
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, s1
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, s105
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, src_scc
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, ttmp15
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, v1
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, v127
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, vcc_hi
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, vcc_lo
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v5, v199
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_u16 v128, 0xfe0b
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, -1
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, 0.5
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, exec_hi
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, exec_lo
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, m0
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, null
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, s1
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, s105
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, src_scc
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, ttmp15
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, v1
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, v127
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, vcc_hi
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, vcc_lo
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v5, v199
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f32_f16 v5, v199
				// GFX11: v_cvt_f32_f16_e64

				v_cvt_i16_f16 v128, 0xfe0b
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, -1
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, 0.5
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, exec_hi
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, exec_lo
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, m0
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, null
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, s1
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, s105
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, src_scc
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, ttmp15
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, v1
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, v127
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, vcc_hi
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, vcc_lo
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v5, v199
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i32_i16 v5, v199
				// GFX11: v_cvt_i32_i16_e64

				v_cvt_norm_i16_f16 v128, 0xfe0b
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, -1
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, 0.5
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, exec_hi
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, exec_lo
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, m0
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, null
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, s1
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, s105
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, src_scc
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, ttmp15
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, v1
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, v127
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, vcc_hi
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, vcc_lo
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v5, v199
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_u16_f16 v128, 0xfe0b
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, -1
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, 0.5
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, exec_hi
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, exec_lo
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, m0
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, null
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, s1
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, s105
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, src_scc
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, ttmp15
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, v1
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, v127
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, vcc_hi
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, vcc_lo
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v5, v199
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_u16_f16 v128, 0xfe0b
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, -1
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, 0.5
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, exec_hi
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, exec_lo
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, m0
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, null
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, s1
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, s105
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, src_scc
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, ttmp15
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, v1
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, v127
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, vcc_hi
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, vcc_lo
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v5, v199
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u32_u16 v5, v199
				// GFX11: v_cvt_u32_u16_e64

				v_exp_f16 v128, 0xfe0b
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, -1
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, 0.5
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, exec_hi
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, exec_lo
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, m0
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, null
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, s1
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, s105
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, src_scc
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, ttmp15
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, v1
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, v127
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, vcc_hi
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, vcc_lo
				// GFX11: v_exp_f16_e64

				v_exp_f16 v5, v199
				// GFX11: v_exp_f16_e64

				v_floor_f16 v128, 0xfe0b
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, -1
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, 0.5
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, exec_hi
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, exec_lo
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, m0
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, null
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, s1
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, s105
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, src_scc
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, ttmp15
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, v1
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, v127
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, vcc_hi
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, vcc_lo
				// GFX11: v_floor_f16_e64

				v_floor_f16 v5, v199
				// GFX11: v_floor_f16_e64

				v_fract_f16 v128, 0xfe0b
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, -1
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, 0.5
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, exec_hi
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, exec_lo
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, m0
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, null
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, s1
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, s105
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, src_scc
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, ttmp15
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, v1
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, v127
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, vcc_hi
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, vcc_lo
				// GFX11: v_fract_f16_e64

				v_fract_f16 v5, v199
				// GFX11: v_fract_f16_e64

				v_frexp_exp_i16_f16 v128, 0xfe0b
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, -1
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, 0.5
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, exec_hi
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, exec_lo
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, m0
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, null
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, s1
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, s105
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, src_scc
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, ttmp15
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, v1
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, v127
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, vcc_hi
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, vcc_lo
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v5, v199
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_mant_f16 v128, 0xfe0b
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, -1
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, 0.5
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, exec_hi
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, exec_lo
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, m0
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, null
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, s1
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, s105
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, src_scc
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, ttmp15
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, v1
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, v127
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, vcc_hi
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, vcc_lo
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v5, v199
				// GFX11: v_frexp_mant_f16_e64

				v_log_f16 v128, 0xfe0b
				// GFX11: v_log_f16_e64

				v_log_f16 v255, -1
				// GFX11: v_log_f16_e64

				v_log_f16 v255, 0.5
				// GFX11: v_log_f16_e64

				v_log_f16 v255, exec_hi
				// GFX11: v_log_f16_e64

				v_log_f16 v255, exec_lo
				// GFX11: v_log_f16_e64

				v_log_f16 v255, m0
				// GFX11: v_log_f16_e64

				v_log_f16 v255, null
				// GFX11: v_log_f16_e64

				v_log_f16 v255, s1
				// GFX11: v_log_f16_e64

				v_log_f16 v255, s105
				// GFX11: v_log_f16_e64

				v_log_f16 v255, src_scc
				// GFX11: v_log_f16_e64

				v_log_f16 v255, ttmp15
				// GFX11: v_log_f16_e64

				v_log_f16 v255, v1
				// GFX11: v_log_f16_e64

				v_log_f16 v255, v127
				// GFX11: v_log_f16_e64

				v_log_f16 v255, vcc_hi
				// GFX11: v_log_f16_e64

				v_log_f16 v255, vcc_lo
				// GFX11: v_log_f16_e64

				v_log_f16 v5, v199
				// GFX11: v_log_f16_e64

				v_not_b16 v128, 0xfe0b
				// GFX11: v_not_b16_e64

				v_not_b16 v255, -1
				// GFX11: v_not_b16_e64

				v_not_b16 v255, 0.5
				// GFX11: v_not_b16_e64

				v_not_b16 v255, exec_hi
				// GFX11: v_not_b16_e64

				v_not_b16 v255, exec_lo
				// GFX11: v_not_b16_e64

				v_not_b16 v255, m0
				// GFX11: v_not_b16_e64

				v_not_b16 v255, null
				// GFX11: v_not_b16_e64

				v_not_b16 v255, s1
				// GFX11: v_not_b16_e64

				v_not_b16 v255, s105
				// GFX11: v_not_b16_e64

				v_not_b16 v255, src_scc
				// GFX11: v_not_b16_e64

				v_not_b16 v255, ttmp15
				// GFX11: v_not_b16_e64

				v_not_b16 v255, v1
				// GFX11: v_not_b16_e64

				v_not_b16 v255, v127
				// GFX11: v_not_b16_e64

				v_not_b16 v255, vcc_hi
				// GFX11: v_not_b16_e64

				v_not_b16 v255, vcc_lo
				// GFX11: v_not_b16_e64

				v_not_b16 v5, v199
				// GFX11: v_not_b16_e64

				v_rcp_f16 v128, 0xfe0b
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, -1
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, 0.5
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, exec_hi
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, exec_lo
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, m0
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, null
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, s1
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, s105
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, src_scc
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, ttmp15
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, v1
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, v127
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, vcc_hi
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, vcc_lo
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v5, v199
				// GFX11: v_rcp_f16_e64

				v_rndne_f16 v128, 0xfe0b
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, -1
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, 0.5
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, exec_hi
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, exec_lo
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, m0
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, null
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, s1
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, s105
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, src_scc
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, ttmp15
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, v1
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, v127
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, vcc_hi
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, vcc_lo
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v5, v199
				// GFX11: v_rndne_f16_e64

				v_rsq_f16 v128, 0xfe0b
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, -1
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, 0.5
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, exec_hi
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, exec_lo
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, m0
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, null
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, s1
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, s105
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, src_scc
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, ttmp15
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, v1
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, v127
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, vcc_hi
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, vcc_lo
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v5, v199
				// GFX11: v_rsq_f16_e64

				v_sin_f16 v128, 0xfe0b
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, -1
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, 0.5
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, exec_hi
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, exec_lo
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, m0
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, null
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, s1
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, s105
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, src_scc
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, ttmp15
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, v1
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, v127
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, vcc_hi
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, vcc_lo
				// GFX11: v_sin_f16_e64

				v_sin_f16 v5, v199
				// GFX11: v_sin_f16_e64

				v_sqrt_f16 v128, 0xfe0b
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, -1
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, 0.5
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, exec_hi
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, exec_lo
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, m0
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, null
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, s1
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, s105
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, src_scc
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, ttmp15
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, v1
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, v127
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, vcc_hi
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, vcc_lo
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v5, v199
				// GFX11: v_sqrt_f16_e64

				v_trunc_f16 v128, 0xfe0b
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, -1
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, 0.5
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, exec_hi
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, exec_lo
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, m0
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, null
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, s1
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, s105
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, src_scc
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, ttmp15
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, v1
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, v127
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, vcc_hi
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, vcc_lo
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v5, v199
				// GFX11: v_trunc_f16_e64

				v_ceil_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_ceil_f16_e64

				v_cos_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_cos_f16_e64

				v_cos_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cos_f16_e64

				v_cvt_f16_f32 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_i16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_u16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f32_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_f32_f16_e64

				v_cvt_i16_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i32_i16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_i32_i16_e64

				v_cvt_norm_i16_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_u16_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_u16_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u32_u16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_cvt_u32_u16_e64

				v_exp_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_exp_f16_e64

				v_exp_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_exp_f16_e64

				v_floor_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_floor_f16_e64

				v_floor_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_floor_f16_e64

				v_fract_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_fract_f16_e64

				v_fract_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_fract_f16_e64

				v_frexp_exp_i16_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_mant_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_frexp_mant_f16_e64

				v_log_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_log_f16_e64

				v_log_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_log_f16_e64

				v_log_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_log_f16_e64

				v_not_b16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_not_b16_e64

				v_not_b16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_not_b16_e64

				v_not_b16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_not_b16_e64

				v_rcp_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_rcp_f16_e64

				v_rndne_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_rndne_f16_e64

				v_rsq_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_rsq_f16_e64

				v_sin_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_sin_f16_e64

				v_sin_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_sin_f16_e64

				v_sqrt_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_sqrt_f16_e64

				v_trunc_f16 v255, v1 quad_perm:[3,2,1,0]
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, v127 quad_perm:[3,2,1,0]
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v5, v199 quad_perm:[3,2,1,0]
				// GFX11: v_trunc_f16_e64

				v_ceil_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_ceil_f16_e64

				v_ceil_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_ceil_f16_e64

				v_cos_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cos_f16_e64

				v_cos_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cos_f16_e64

				v_cos_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cos_f16_e64

				v_cvt_f16_f32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_f32 v255, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_f32_e64

				v_cvt_f16_i16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_i16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_i16_e64

				v_cvt_f16_u16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f16_u16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f16_u16_e64

				v_cvt_f32_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_f32_f16_e64

				v_cvt_i16_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i16_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_i16_f16_e64

				v_cvt_i32_i16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_i32_i16_e64

				v_cvt_norm_i16_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_i16_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_norm_i16_f16_e64

				v_cvt_norm_u16_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_norm_u16_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_norm_u16_f16_e64

				v_cvt_u16_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u16_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_u16_f16_e64

				v_cvt_u32_u16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cvt_u32_u16_e64

				v_exp_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_exp_f16_e64

				v_exp_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_exp_f16_e64

				v_exp_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_exp_f16_e64

				v_floor_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_floor_f16_e64

				v_floor_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_floor_f16_e64

				v_floor_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_floor_f16_e64

				v_fract_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_fract_f16_e64

				v_fract_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_fract_f16_e64

				v_fract_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_fract_f16_e64

				v_frexp_exp_i16_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_exp_i16_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_frexp_exp_i16_f16_e64

				v_frexp_mant_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_frexp_mant_f16_e64

				v_frexp_mant_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_frexp_mant_f16_e64

				v_log_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_log_f16_e64

				v_log_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_log_f16_e64

				v_log_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_log_f16_e64

				v_not_b16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_not_b16_e64

				v_not_b16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_not_b16_e64

				v_not_b16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_not_b16_e64

				v_rcp_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rcp_f16_e64

				v_rcp_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rcp_f16_e64

				v_rndne_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rndne_f16_e64

				v_rndne_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rndne_f16_e64

				v_rsq_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rsq_f16_e64

				v_rsq_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_rsq_f16_e64

				v_sin_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sin_f16_e64

				v_sin_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sin_f16_e64

				v_sin_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sin_f16_e64

				v_sqrt_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sqrt_f16_e64

				v_sqrt_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sqrt_f16_e64

				v_trunc_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_trunc_f16_e64

				v_trunc_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_trunc_f16_e64

llvm/test/MC/AMDGPU/gfx11_asm_vop2_t16_err.s

This file was added.

				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,-wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error %s
				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=-wavefrontsize32,+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error %s

				v_add_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmaak_f16_e32 v255, v1, v2, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmamk_f16_e32 v255, v1, 0xfe0b, v3
				// GFX11: error: operands are not valid for this GPU or mode

				v_ldexp_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_e32 v255, v1, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmaak_f16_e32 v5, v255, v2, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmamk_f16_e32 v5, v255, 0xfe0b, v3
				// GFX11: error: operands are not valid for this GPU or mode

				v_ldexp_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_e32 v5, v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_e32 v5, v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmaak_f16_e32 v5, v1, v255, 0xfe0b
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_e32 v5, v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmamk_f16_e32 v5, v1, 0xfe0b, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_e32 v5, v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_e32 v5, v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_e32 v5, v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_e32 v5, v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_e32 v5, v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_ldexp_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_dpp v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_ldexp_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_dpp v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_dpp v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_dpp v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_dpp v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_dpp v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_dpp v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_dpp v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_dpp v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_ldexp_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_dpp v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_ldexp_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_dpp v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_add_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_fmac_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_max_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_min_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_mul_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_sub_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

				v_subrev_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: operands are not valid for this GPU or mode

llvm/test/MC/AMDGPU/gfx11_asm_vop2_t16_promote.s

This file was added.

				// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,-wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=_e32 %s
				// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=-wavefrontsize32,+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=_e32 %s

				v_add_f16 v255, v1, v2
				// GFX11: v_add_f16_e64

				v_fmac_f16 v255, v1, v2
				// GFX11: v_fmac_f16_e64

				v_ldexp_f16 v255, v1, v2
				// GFX11: v_ldexp_f16_e64

				v_max_f16 v255, v1, v2
				// GFX11: v_max_f16_e64

				v_min_f16 v255, v1, v2
				// GFX11: v_min_f16_e64

				v_mul_f16 v255, v1, v2
				// GFX11: v_mul_f16_e64

				v_sub_f16 v255, v1, v2
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v255, v1, v2
				// GFX11: v_subrev_f16_e64

				v_add_f16 v5, v255, v2
				// GFX11: v_add_f16_e64

				v_fmac_f16 v5, v255, v2
				// GFX11: v_fmac_f16_e64

				v_ldexp_f16 v5, v255, v2
				// GFX11: v_ldexp_f16_e64

				v_max_f16 v5, v255, v2
				// GFX11: v_max_f16_e64

				v_min_f16 v5, v255, v2
				// GFX11: v_min_f16_e64

				v_mul_f16 v5, v255, v2
				// GFX11: v_mul_f16_e64

				v_sub_f16 v5, v255, v2
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v5, v255, v2
				// GFX11: v_subrev_f16_e64

				v_add_f16 v5, v1, v255
				// GFX11: v_add_f16_e64

				v_fmac_f16 v5, v1, v255
				// GFX11: v_fmac_f16_e64

				v_max_f16 v5, v1, v255
				// GFX11: v_max_f16_e64

				v_min_f16 v5, v1, v255
				// GFX11: v_min_f16_e64

				v_mul_f16 v5, v1, v255
				// GFX11: v_mul_f16_e64

				v_sub_f16 v5, v1, v255
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v5, v1, v255
				// GFX11: v_subrev_f16_e64

				v_add_f16 v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: v_add_f16_e64

				v_ldexp_f16 v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: v_ldexp_f16_e64

				v_max_f16 v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: v_max_f16_e64

				v_min_f16 v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: v_min_f16_e64

				v_mul_f16 v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: v_mul_f16_e64

				v_sub_f16 v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v255, v1, v2 quad_perm:[3,2,1,0]
				// GFX11: v_subrev_f16_e64

				v_add_f16 v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_add_f16_e64

				v_ldexp_f16 v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_ldexp_f16_e64

				v_max_f16 v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_max_f16_e64

				v_min_f16 v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_min_f16_e64

				v_mul_f16 v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_mul_f16_e64

				v_sub_f16 v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v5, v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_subrev_f16_e64

				v_add_f16 v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_add_f16_e64

				v_max_f16 v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_max_f16_e64

				v_min_f16 v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_min_f16_e64

				v_mul_f16 v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_mul_f16_e64

				v_sub_f16 v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v5, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_subrev_f16_e64

				v_add_f16 v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_add_f16_e64

				v_ldexp_f16 v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_ldexp_f16_e64

				v_max_f16 v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_max_f16_e64

				v_min_f16 v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_min_f16_e64

				v_mul_f16 v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_mul_f16_e64

				v_sub_f16 v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v255, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_subrev_f16_e64

				v_add_f16 v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_add_f16_e64

				v_ldexp_f16 v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_ldexp_f16_e64

				v_max_f16 v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_max_f16_e64

				v_min_f16 v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_min_f16_e64

				v_mul_f16 v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_mul_f16_e64

				v_sub_f16 v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v5, v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_subrev_f16_e64

				v_add_f16 v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_add_f16_e64

				v_max_f16 v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_max_f16_e64

				v_min_f16 v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_min_f16_e64

				v_mul_f16 v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_mul_f16_e64

				v_sub_f16 v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_sub_f16_e64

				v_subrev_f16 v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_subrev_f16_e64

llvm/test/MC/AMDGPU/gfx11_asm_vopc_t16_err.s

This file was added.

				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error %s

				v_cmp_class_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v127, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, vcc_hi, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, vcc_lo, v255
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v128, v2
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_class_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_i16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_eq_u16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_f_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_i16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ge_u16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_i16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_gt_u16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_i16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_le_u16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lg_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_i16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_lt_u16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_i16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ne_u16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_neq_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nge_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_ngt_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nle_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlg_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_nlt_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_o_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_t_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_tru_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmp_u_f16_e32 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

llvm/test/MC/AMDGPU/gfx11_asm_vopc_t16_promote.s

This file was added.

				// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 %s

				v_cmp_class_f16 vcc, v1, v255
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc, v127, v255
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_class_f16_e64

				v_cmp_eq_f16 vcc, v1, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc, v127, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_i16 vcc, v1, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc, v127, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc, vcc_hi, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc, vcc_lo, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v1, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v127, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_u16 vcc, v1, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc, v127, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc, vcc_hi, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc, vcc_lo, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v1, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v127, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_f_f16 vcc, v1, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc, v127, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_f_f16_e64

				v_cmp_ge_f16 vcc, v1, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc, v127, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_i16 vcc, v1, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc, v127, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc, vcc_hi, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc, vcc_lo, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v1, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v127, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_u16 vcc, v1, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc, v127, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc, vcc_hi, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc, vcc_lo, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v1, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v127, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_gt_f16 vcc, v1, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc, v127, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_i16 vcc, v1, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc, v127, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc, vcc_hi, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc, vcc_lo, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v1, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v127, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_u16 vcc, v1, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc, v127, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc, vcc_hi, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc, vcc_lo, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v1, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v127, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_le_f16 vcc, v1, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc, v127, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_i16 vcc, v1, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc, v127, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc, vcc_hi, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc, vcc_lo, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v1, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v127, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_u16 vcc, v1, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc, v127, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc, vcc_hi, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc, vcc_lo, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v1, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v127, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_le_u16_e64

				v_cmp_lg_f16 vcc, v1, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc, v127, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lt_f16 vcc, v1, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc, v127, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_i16 vcc, v1, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc, v127, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc, vcc_hi, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc, vcc_lo, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v1, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v127, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_u16 vcc, v1, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc, v127, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc, vcc_hi, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc, vcc_lo, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v1, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v127, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_ne_i16 vcc, v1, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc, v127, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc, vcc_hi, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc, vcc_lo, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v1, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v127, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_u16 vcc, v1, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc, v127, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc, vcc_hi, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc, vcc_lo, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v1, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v127, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_neq_f16 vcc, v1, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc, v127, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_nge_f16 vcc, v1, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc, v127, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_ngt_f16 vcc, v1, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc, v127, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_nle_f16 vcc, v1, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc, v127, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nlg_f16 vcc, v1, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc, v127, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlt_f16 vcc, v1, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc, v127, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_o_f16 vcc, v1, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc, v127, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_o_f16_e64

				v_cmp_t_f16 vcc, v1, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc, v127, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v1, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v127, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_t_f16_e64

				v_cmp_u_f16 vcc, v1, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc, v127, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc, vcc_hi, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc, vcc_lo, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v1, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v127, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, vcc_hi, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, vcc_lo, v255
				// GFX11: v_cmp_u_f16_e64

				v_cmp_class_f16 vcc, v128, v2
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_class_f16_e64

				v_cmp_eq_f16 vcc, v128, v2
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_i16 vcc, v128, v2
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v128, v2
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_u16 vcc, v128, v2
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v128, v2
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_f_f16 vcc, v128, v2
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_f_f16_e64

				v_cmp_ge_f16 vcc, v128, v2
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_i16 vcc, v128, v2
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v128, v2
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_u16 vcc, v128, v2
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v128, v2
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_gt_f16 vcc, v128, v2
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_i16 vcc, v128, v2
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v128, v2
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_u16 vcc, v128, v2
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v128, v2
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_le_f16 vcc, v128, v2
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_i16 vcc, v128, v2
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v128, v2
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_u16 vcc, v128, v2
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v128, v2
				// GFX11: v_cmp_le_u16_e64

				v_cmp_lg_f16 vcc, v128, v2
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lt_f16 vcc, v128, v2
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_i16 vcc, v128, v2
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v128, v2
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_u16 vcc, v128, v2
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v128, v2
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_ne_i16 vcc, v128, v2
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v128, v2
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_u16 vcc, v128, v2
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v128, v2
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_neq_f16 vcc, v128, v2
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_nge_f16 vcc, v128, v2
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_ngt_f16 vcc, v128, v2
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_nle_f16 vcc, v128, v2
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nlg_f16 vcc, v128, v2
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlt_f16 vcc, v128, v2
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_o_f16 vcc, v128, v2
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_o_f16_e64

				v_cmp_t_f16 vcc, v128, v2
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v128, v2
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_t_f16_e64

				v_cmp_u_f16 vcc, v128, v2
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v128, v2
				// GFX11: v_cmp_u_f16_e64

				v_cmp_class_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_eq_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_i16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_u16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_f_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_ge_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_i16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_u16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_gt_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_i16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_u16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_le_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_i16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_u16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_lg_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lt_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_i16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_u16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_ne_i16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_u16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_neq_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_nge_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_ngt_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_nle_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nlg_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlt_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_o_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_t_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_u_f16 vcc, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v127, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_class_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_eq_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_i16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_u16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_f_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_ge_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_i16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_u16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_gt_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_i16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_u16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_le_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_i16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_u16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_lg_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lt_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_i16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_u16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_ne_i16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_u16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_neq_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_nge_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_ngt_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_nle_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nlg_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlt_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_o_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_t_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_u_f16 vcc, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v128, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_class_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_eq_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_i16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_u16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_f_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_ge_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_i16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_u16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_gt_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_i16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_u16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_le_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_i16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_u16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_lg_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lt_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_i16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_u16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_ne_i16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_u16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_neq_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_nge_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_ngt_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_nle_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nlg_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlt_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_o_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_t_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_u_f16 vcc, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v127, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_class_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_class_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_class_f16_e64

				v_cmp_eq_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_f16_e64

				v_cmp_eq_i16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_i16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_i16_e64

				v_cmp_eq_u16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_eq_u16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_eq_u16_e64

				v_cmp_f_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_f_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_f_f16_e64

				v_cmp_ge_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_f16_e64

				v_cmp_ge_i16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_i16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_i16_e64

				v_cmp_ge_u16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_ge_u16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ge_u16_e64

				v_cmp_gt_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_f16_e64

				v_cmp_gt_i16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_i16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_i16_e64

				v_cmp_gt_u16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_gt_u16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_gt_u16_e64

				v_cmp_le_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_f16_e64

				v_cmp_le_i16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_i16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_i16_e64

				v_cmp_le_u16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_le_u16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_le_u16_e64

				v_cmp_lg_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lg_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lg_f16_e64

				v_cmp_lt_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_f16_e64

				v_cmp_lt_i16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_i16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_i16_e64

				v_cmp_lt_u16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_lt_u16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_lt_u16_e64

				v_cmp_ne_i16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_i16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_i16_e64

				v_cmp_ne_u16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_ne_u16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ne_u16_e64

				v_cmp_neq_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_neq_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_neq_f16_e64

				v_cmp_nge_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_nge_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nge_f16_e64

				v_cmp_ngt_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_ngt_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_ngt_f16_e64

				v_cmp_nle_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nle_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nle_f16_e64

				v_cmp_nlg_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlg_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlg_f16_e64

				v_cmp_nlt_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_nlt_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_nlt_f16_e64

				v_cmp_o_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_o_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_o_f16_e64

				v_cmp_t_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_t_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_tru_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_t_f16_e64

				v_cmp_u_f16 vcc, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

				v_cmp_u_f16 vcc_lo, v128, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmp_u_f16_e64

llvm/test/MC/AMDGPU/gfx11_asm_vopcx_t16_err.s

This file was added.

				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error %s

				v_cmpx_class_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_eq_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_eq_i16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_eq_u16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_f_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ge_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ge_i16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ge_u16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_gt_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_gt_i16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_gt_u16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_le_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_le_i16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_le_u16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lg_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lt_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lt_i16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lt_u16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ne_i16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ne_u16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_neq_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nge_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ngt_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nle_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nlg_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nlt_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_o_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_t_f16_e32 v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmpx_tru_f16_e32 v1, v255
				// GFX11: error: invalid operand for instruction

				v_cmpx_u_f16_e32 v1, v255
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_class_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_eq_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_eq_i16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_eq_u16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_f_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ge_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ge_i16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ge_u16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_gt_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_gt_i16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_gt_u16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_le_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_le_i16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_le_u16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lg_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lt_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lt_i16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_lt_u16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ne_i16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ne_u16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_neq_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nge_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_ngt_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nle_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nlg_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_nlt_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_o_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_t_f16_e32 v255, v2
				// GFX11: error: invalid operand for instruction

				v_cmpx_tru_f16_e32 v255, v2
				// GFX11: error: invalid operand for instruction

				v_cmpx_u_f16_e32 v255, v2
				// GFX11: error: operands are not valid for this GPU or mode

				v_cmpx_class_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_i16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_u16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_f_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_i16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_u16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_i16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_u16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_i16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_u16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lg_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_i16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_u16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_i16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_u16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_neq_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nge_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ngt_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nle_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlg_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlt_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_o_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_t_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_tru_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_u_f16_e32 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_class_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_i16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_u16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_f_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_i16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_u16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_i16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_u16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_i16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_u16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lg_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_i16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_u16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_i16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_u16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_neq_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nge_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ngt_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nle_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlg_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlt_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_o_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_t_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_tru_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_u_f16_e32 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_class_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_i16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_u16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_f_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_i16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_u16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_i16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_u16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_i16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_u16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lg_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_i16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_u16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_i16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_u16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_neq_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nge_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ngt_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nle_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlg_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlt_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_o_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_t_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_tru_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_u_f16_e32 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_class_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_i16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_eq_u16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_f_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_i16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ge_u16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_i16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_gt_u16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_i16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_le_u16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lg_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_i16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_lt_u16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_i16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ne_u16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_neq_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nge_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_ngt_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nle_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlg_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_nlt_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_o_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_t_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_tru_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

				v_cmpx_u_f16_e32 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: error: invalid operand for instruction

llvm/test/MC/AMDGPU/gfx11_asm_vopcx_t16_promote.s

This file was added.

				// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 %s

				v_cmpx_class_f16 v1, v255
				// GFX11: v_cmpx_class_f16_e64

				v_cmpx_eq_f16 v1, v255
				// GFX11: v_cmpx_eq_f16_e64

				v_cmpx_eq_i16 v1, v255
				// GFX11: v_cmpx_eq_i16_e64

				v_cmpx_eq_u16 v1, v255
				// GFX11: v_cmpx_eq_u16_e64

				v_cmpx_f_f16 v1, v255
				// GFX11: v_cmpx_f_f16_e64

				v_cmpx_ge_f16 v1, v255
				// GFX11: v_cmpx_ge_f16_e64

				v_cmpx_ge_i16 v1, v255
				// GFX11: v_cmpx_ge_i16_e64

				v_cmpx_ge_u16 v1, v255
				// GFX11: v_cmpx_ge_u16_e64

				v_cmpx_gt_f16 v1, v255
				// GFX11: v_cmpx_gt_f16_e64

				v_cmpx_gt_i16 v1, v255
				// GFX11: v_cmpx_gt_i16_e64

				v_cmpx_gt_u16 v1, v255
				// GFX11: v_cmpx_gt_u16_e64

				v_cmpx_le_f16 v1, v255
				// GFX11: v_cmpx_le_f16_e64

				v_cmpx_le_i16 v1, v255
				// GFX11: v_cmpx_le_i16_e64

				v_cmpx_le_u16 v1, v255
				// GFX11: v_cmpx_le_u16_e64

				v_cmpx_lg_f16 v1, v255
				// GFX11: v_cmpx_lg_f16_e64

				v_cmpx_lt_f16 v1, v255
				// GFX11: v_cmpx_lt_f16_e64

				v_cmpx_lt_i16 v1, v255
				// GFX11: v_cmpx_lt_i16_e64

				v_cmpx_lt_u16 v1, v255
				// GFX11: v_cmpx_lt_u16_e64

				v_cmpx_ne_i16 v1, v255
				// GFX11: v_cmpx_ne_i16_e64

				v_cmpx_ne_u16 v1, v255
				// GFX11: v_cmpx_ne_u16_e64

				v_cmpx_neq_f16 v1, v255
				// GFX11: v_cmpx_neq_f16_e64

				v_cmpx_nge_f16 v1, v255
				// GFX11: v_cmpx_nge_f16_e64

				v_cmpx_ngt_f16 v1, v255
				// GFX11: v_cmpx_ngt_f16_e64

				v_cmpx_nle_f16 v1, v255
				// GFX11: v_cmpx_nle_f16_e64

				v_cmpx_nlg_f16 v1, v255
				// GFX11: v_cmpx_nlg_f16_e64

				v_cmpx_nlt_f16 v1, v255
				// GFX11: v_cmpx_nlt_f16_e64

				v_cmpx_o_f16 v1, v255
				// GFX11: v_cmpx_o_f16_e64

				v_cmpx_t_f16 v1, v255
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_tru_f16 v1, v255
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_u_f16 v1, v255
				// GFX11: v_cmpx_u_f16_e64

				v_cmpx_class_f16 v255, v2
				// GFX11: v_cmpx_class_f16_e64

				v_cmpx_eq_f16 v255, v2
				// GFX11: v_cmpx_eq_f16_e64

				v_cmpx_eq_i16 v255, v2
				// GFX11: v_cmpx_eq_i16_e64

				v_cmpx_eq_u16 v255, v2
				// GFX11: v_cmpx_eq_u16_e64

				v_cmpx_f_f16 v255, v2
				// GFX11: v_cmpx_f_f16_e64

				v_cmpx_ge_f16 v255, v2
				// GFX11: v_cmpx_ge_f16_e64

				v_cmpx_ge_i16 v255, v2
				// GFX11: v_cmpx_ge_i16_e64

				v_cmpx_ge_u16 v255, v2
				// GFX11: v_cmpx_ge_u16_e64

				v_cmpx_gt_f16 v255, v2
				// GFX11: v_cmpx_gt_f16_e64

				v_cmpx_gt_i16 v255, v2
				// GFX11: v_cmpx_gt_i16_e64

				v_cmpx_gt_u16 v255, v2
				// GFX11: v_cmpx_gt_u16_e64

				v_cmpx_le_f16 v255, v2
				// GFX11: v_cmpx_le_f16_e64

				v_cmpx_le_i16 v255, v2
				// GFX11: v_cmpx_le_i16_e64

				v_cmpx_le_u16 v255, v2
				// GFX11: v_cmpx_le_u16_e64

				v_cmpx_lg_f16 v255, v2
				// GFX11: v_cmpx_lg_f16_e64

				v_cmpx_lt_f16 v255, v2
				// GFX11: v_cmpx_lt_f16_e64

				v_cmpx_lt_i16 v255, v2
				// GFX11: v_cmpx_lt_i16_e64

				v_cmpx_lt_u16 v255, v2
				// GFX11: v_cmpx_lt_u16_e64

				v_cmpx_ne_i16 v255, v2
				// GFX11: v_cmpx_ne_i16_e64

				v_cmpx_ne_u16 v255, v2
				// GFX11: v_cmpx_ne_u16_e64

				v_cmpx_neq_f16 v255, v2
				// GFX11: v_cmpx_neq_f16_e64

				v_cmpx_nge_f16 v255, v2
				// GFX11: v_cmpx_nge_f16_e64

				v_cmpx_ngt_f16 v255, v2
				// GFX11: v_cmpx_ngt_f16_e64

				v_cmpx_nle_f16 v255, v2
				// GFX11: v_cmpx_nle_f16_e64

				v_cmpx_nlg_f16 v255, v2
				// GFX11: v_cmpx_nlg_f16_e64

				v_cmpx_nlt_f16 v255, v2
				// GFX11: v_cmpx_nlt_f16_e64

				v_cmpx_o_f16 v255, v2
				// GFX11: v_cmpx_o_f16_e64

				v_cmpx_t_f16 v255, v2
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_tru_f16 v255, v2
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_u_f16 v255, v2
				// GFX11: v_cmpx_u_f16_e64

				v_cmpx_class_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_class_f16_e64

				v_cmpx_eq_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_eq_f16_e64

				v_cmpx_eq_i16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_eq_i16_e64

				v_cmpx_eq_u16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_eq_u16_e64

				v_cmpx_f_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_f_f16_e64

				v_cmpx_ge_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ge_f16_e64

				v_cmpx_ge_i16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ge_i16_e64

				v_cmpx_ge_u16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ge_u16_e64

				v_cmpx_gt_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_gt_f16_e64

				v_cmpx_gt_i16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_gt_i16_e64

				v_cmpx_gt_u16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_gt_u16_e64

				v_cmpx_le_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_le_f16_e64

				v_cmpx_le_i16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_le_i16_e64

				v_cmpx_le_u16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_le_u16_e64

				v_cmpx_lg_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lg_f16_e64

				v_cmpx_lt_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lt_f16_e64

				v_cmpx_lt_i16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lt_i16_e64

				v_cmpx_lt_u16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lt_u16_e64

				v_cmpx_ne_i16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ne_i16_e64

				v_cmpx_ne_u16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ne_u16_e64

				v_cmpx_neq_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_neq_f16_e64

				v_cmpx_nge_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nge_f16_e64

				v_cmpx_ngt_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ngt_f16_e64

				v_cmpx_nle_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nle_f16_e64

				v_cmpx_nlg_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nlg_f16_e64

				v_cmpx_nlt_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nlt_f16_e64

				v_cmpx_o_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_o_f16_e64

				v_cmpx_t_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_tru_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_u_f16 v1, v255 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_u_f16_e64

				v_cmpx_class_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_class_f16_e64

				v_cmpx_eq_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_eq_f16_e64

				v_cmpx_eq_i16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_eq_i16_e64

				v_cmpx_eq_u16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_eq_u16_e64

				v_cmpx_f_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_f_f16_e64

				v_cmpx_ge_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ge_f16_e64

				v_cmpx_ge_i16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ge_i16_e64

				v_cmpx_ge_u16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ge_u16_e64

				v_cmpx_gt_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_gt_f16_e64

				v_cmpx_gt_i16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_gt_i16_e64

				v_cmpx_gt_u16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_gt_u16_e64

				v_cmpx_le_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_le_f16_e64

				v_cmpx_le_i16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_le_i16_e64

				v_cmpx_le_u16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_le_u16_e64

				v_cmpx_lg_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lg_f16_e64

				v_cmpx_lt_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lt_f16_e64

				v_cmpx_lt_i16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lt_i16_e64

				v_cmpx_lt_u16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_lt_u16_e64

				v_cmpx_ne_i16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ne_i16_e64

				v_cmpx_ne_u16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ne_u16_e64

				v_cmpx_neq_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_neq_f16_e64

				v_cmpx_nge_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nge_f16_e64

				v_cmpx_ngt_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_ngt_f16_e64

				v_cmpx_nle_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nle_f16_e64

				v_cmpx_nlg_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nlg_f16_e64

				v_cmpx_nlt_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_nlt_f16_e64

				v_cmpx_o_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_o_f16_e64

				v_cmpx_t_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_tru_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_u_f16 v255, v2 quad_perm:[3,2,1,0]
				// GFX11: v_cmpx_u_f16_e64

				v_cmpx_class_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_class_f16_e64

				v_cmpx_eq_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_eq_f16_e64

				v_cmpx_eq_i16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_eq_i16_e64

				v_cmpx_eq_u16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_eq_u16_e64

				v_cmpx_f_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_f_f16_e64

				v_cmpx_ge_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ge_f16_e64

				v_cmpx_ge_i16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ge_i16_e64

				v_cmpx_ge_u16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ge_u16_e64

				v_cmpx_gt_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_gt_f16_e64

				v_cmpx_gt_i16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_gt_i16_e64

				v_cmpx_gt_u16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_gt_u16_e64

				v_cmpx_le_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_le_f16_e64

				v_cmpx_le_i16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_le_i16_e64

				v_cmpx_le_u16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_le_u16_e64

				v_cmpx_lg_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lg_f16_e64

				v_cmpx_lt_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lt_f16_e64

				v_cmpx_lt_i16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lt_i16_e64

				v_cmpx_lt_u16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lt_u16_e64

				v_cmpx_ne_i16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ne_i16_e64

				v_cmpx_ne_u16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ne_u16_e64

				v_cmpx_neq_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_neq_f16_e64

				v_cmpx_nge_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nge_f16_e64

				v_cmpx_ngt_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ngt_f16_e64

				v_cmpx_nle_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nle_f16_e64

				v_cmpx_nlg_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nlg_f16_e64

				v_cmpx_nlt_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nlt_f16_e64

				v_cmpx_o_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_o_f16_e64

				v_cmpx_t_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_tru_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_u_f16 v1, v255 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_u_f16_e64

				v_cmpx_class_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_class_f16_e64

				v_cmpx_eq_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_eq_f16_e64

				v_cmpx_eq_i16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_eq_i16_e64

				v_cmpx_eq_u16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_eq_u16_e64

				v_cmpx_f_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_f_f16_e64

				v_cmpx_ge_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ge_f16_e64

				v_cmpx_ge_i16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ge_i16_e64

				v_cmpx_ge_u16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ge_u16_e64

				v_cmpx_gt_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_gt_f16_e64

				v_cmpx_gt_i16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_gt_i16_e64

				v_cmpx_gt_u16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_gt_u16_e64

				v_cmpx_le_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_le_f16_e64

				v_cmpx_le_i16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_le_i16_e64

				v_cmpx_le_u16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_le_u16_e64

				v_cmpx_lg_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lg_f16_e64

				v_cmpx_lt_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lt_f16_e64

				v_cmpx_lt_i16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lt_i16_e64

				v_cmpx_lt_u16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_lt_u16_e64

				v_cmpx_ne_i16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ne_i16_e64

				v_cmpx_ne_u16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ne_u16_e64

				v_cmpx_neq_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_neq_f16_e64

				v_cmpx_nge_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nge_f16_e64

				v_cmpx_ngt_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_ngt_f16_e64

				v_cmpx_nle_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nle_f16_e64

				v_cmpx_nlg_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nlg_f16_e64

				v_cmpx_nlt_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_nlt_f16_e64

				v_cmpx_o_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_o_f16_e64

				v_cmpx_t_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_tru_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_t_f16_e64

				v_cmpx_u_f16 v255, v2 dpp8:[7,6,5,4,3,2,1,0]
				// GFX11: v_cmpx_u_f16_e64
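
Taken together, the `_err.s` and `_promote.s` listings above exercise one rule from both sides. The sketch below is a rough model of that rule, not LLVM's implementation. It assumes, going by the revision title, that the 32-bit encodings of these 16-bit compares accept only v0 through v127 as VGPR operands. The names `select_encoding` and `LO128_LIMIT` and the error text are hypothetical, and DPP modifiers and the two distinct diagnostic strings seen in the tests are ignored.

```python
import re

LO128_LIMIT = 128  # assumption: e32 forms accept only v0..v127


def select_encoding(mnemonic, operands):
    """Pick 'e32' or 'e64' for a 16-bit VOPC-style instruction (illustrative only).

    Raises ValueError when an explicit _e32 suffix is combined with a VGPR
    outside the assumed low-128 range, mirroring the *_t16_err.s cases.
    """
    # True if any operand is a VGPR numbered at or above the assumed limit.
    uses_high_vgpr = any(
        int(match.group(1)) >= LO128_LIMIT
        for match in (re.fullmatch(r"v(\d+)", op) for op in operands)
        if match
    )
    if mnemonic.endswith("_e64"):
        return "e64"
    if mnemonic.endswith("_e32"):
        if uses_high_vgpr:
            # Hypothetical message; the real assembler diagnostics differ.
            raise ValueError("operand not encodable in e32")
        return "e32"
    # No suffix: prefer the short encoding, promote when it cannot be used.
    return "e64" if uses_high_vgpr else "e32"
```

For example, `select_encoding("v_cmpx_eq_f16", ["v1", "v255"])` selects the 64-bit form, matching the `v_cmpx_eq_f16_e64` check line above, while the same operands with `v_cmpx_eq_f16_e32` raise an error.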


Diff 460250
