This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][True16] Support emitting copies between different register sizes.
ClosedPublic

Authored by kosarev on Jul 24 2023, 4:12 AM.


Details

Reviewers

arsenm
rampitec
Joe_Nash
foad

Commits

rG758df22bcf21: [AMDGPU][True16] Support emitting copies between different register sizes.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

kosarev created this revision. · Jul 24 2023, 4:12 AM

Herald added a project: Restricted Project. · Jul 24 2023, 4:12 AM

Herald added subscribers: StephenFan, kerbowa, hiraditya and 5 others.

kosarev requested review of this revision. · Jul 24 2023, 4:12 AM

Herald added a project: Restricted Project. · Jul 24 2023, 4:12 AM

Herald added subscribers: llvm-commits, wdng.

kosarev added a parent revision: D156104: [AMDGPU] Switch to using real True16 operands. · Jul 24 2023, 4:23 AM

kosarev added a child revision: D156106: [AMDGPU] Test codegen'ing True16 additions..

arsenm added inline comments. · Jul 24 2023, 6:15 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
742–746	Is this actually reachable? I forget how exactly we ended up with this partial 16-bit register thing

Harbormaster completed remote builds in B247621: Diff 543469. · Jul 24 2023, 7:38 AM

rampitec added inline comments. · Jul 24 2023, 10:52 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
743	Are these tabs?

kosarev added inline comments. · Jul 24 2023, 11:44 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
742–746	I'm going to try to understand more about what's going on here, but maybe @Joe_Nash already knows the answer as he was working on that bit.
743	Nope, just re-indenting with proper spaces.

Joe_Nash added inline comments. · Jul 27 2023, 1:22 PM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
742–746	It is covered by lo16-32bit-physreg-copy.mir. In theory it is reachable on any target with 16-bit instructions (excluding GFX11 with True16 instructions). This code converts COPY %1:vgpr_32 = %2:vgpr_lo16 or COPY %1:vgpr_lo16 = %2:vgpr_32 into v_mov_b32 on those targets. On GFX11 it converts to v_mov_b16. Now, as to whether the code sequence in that MIR test (or any use of a VGPR_LO16 register) is produced in any legitimate shader, I don't think so. That probably needs to be verified empirically on a test corpus, but then this code and, I think, VGPR_LO16/VGPR_HI16 can be removed.
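
To make the two cases in the comment above concrete, here is a minimal, self-contained sketch of the size-fixing step. It is not the code from the patch: `Part`, `Reg` and `fixMixedSizeCopy` are hypothetical stand-ins for LLVM's `MCRegister` and the `SIRegisterInfo` helpers, and only VGPRs are modelled. Under those assumptions, a True16 target narrows the 32-bit side of a mixed copy to its lo16 half, while other targets widen the 16-bit side to its 32-bit super-register.

```cpp
#include <cassert>

// Hypothetical toy model: a VGPR index plus the part of it being addressed.
// This is an illustration only, not LLVM's MCRegister.
enum class Part { Full32, Lo16, Hi16 };

struct Reg {
  unsigned Index;
  Part P;
  unsigned sizeInBits() const { return P == Part::Full32 ? 32 : 16; }
};

// Make both sides of a copy the same size, mirroring the two branches
// discussed in the review.
void fixMixedSizeCopy(Reg &Dst, Reg &Src, bool HasTrue16) {
  // Nothing to do unless exactly one side is 16 bits wide.
  if ((Dst.sizeInBits() == 16) == (Src.sizeInBits() == 16))
    return;
  if (HasTrue16) {
    // True16: narrow the 32-bit side to its lo16 subregister.
    Reg &ToFix = (Dst.sizeInBits() == 32) ? Dst : Src;
    ToFix.P = Part::Lo16;
  } else {
    // Otherwise: widen the 16-bit side to its 32-bit super-register.
    Reg &ToFix = (Dst.sizeInBits() == 16) ? Dst : Src;
    ToFix.P = Part::Full32;
  }
}
```

After this step both registers have the same width, so the rest of the copy logic can assume equal sizes. If narrowing makes the two registers identical, the copy becomes an identity copy, which the patch handles separately.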

Joe_Nash mentioned this in D156102: [AMDGPU] Don't suppress printing the .l and .h register suffixes. · Jul 27 2023, 1:40 PM

Support generating differently-sized register transfers.

They are called copies, not transfers. How about: "Support emitting copies between different register sizes"?

kosarev removed a child revision: D156106: [AMDGPU] Test codegen'ing True16 additions. · Jul 28 2023, 11:52 AM

kosarev added a child revision: D156529: [AMDGPU][True16] Pre-commit addition tests..

kosarev edited parent revisions, added: D156782: [AMDGPU] Test disassembling of some basic True16 VOP2 instructions.; removed: D156104: [AMDGPU] Switch to using real True16 operands. · Aug 1 2023, 2:49 AM

Change the commit title as suggested and remove the unused code handling
the non-True16 case along with the test covering it.

kosarev retitled this revision from [AMDGPU][True16] Support generating differently-sized register transfers. to [AMDGPU][True16] Support emitting copies between different register sizes. · Aug 1 2023, 4:57 AM

Harbormaster completed remote builds in B249453: Diff 546005. · Aug 1 2023, 4:57 AM

In D156105#4550060, @kosarev wrote:

Change the commit title as suggested and remove the unused code handling
the non-True16 case along with the test covering it.

Can you please do this case removal as a separate patch?

Restore the non-True16 case branch to then remove it with a separate patch.

kosarev added a child revision: D156985: [AMDGPU] Remove the support for non-True16 copies between different register sizes. · Aug 3 2023, 3:45 AM

kosarev removed a child revision: D156529: [AMDGPU][True16] Pre-commit addition tests..

Harbormaster completed remote builds in B250014: Diff 546788. · Aug 3 2023, 6:54 AM

Ping.

LGTM

This revision is now accepted and ready to land. · Aug 22 2023, 6:31 AM

Closed by commit rG758df22bcf21: [AMDGPU][True16] Support emitting copies between different register sizes. (authored by kosarev). · Sep 26 2023, 4:15 AM

This revision was automatically updated to reflect the committed changes.

kosarev added a commit: rG758df22bcf21: [AMDGPU][True16] Support emitting copies between different register sizes..

Revision Contents

Path

Size

llvm/

lib/

Target/

AMDGPU/

SIInstrInfo.cpp

77 lines

VOP1Instructions.td

2 lines

Diff 557352

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp


Show First 20 Lines • Show All 718 Lines • ▼ Show 20 Lines	if (KillSrc)
LastMI->addRegisterKilled(SrcReg, &RI);		LastMI->addRegisterKilled(SrcReg, &RI);
}		}

void SIInstrInfo::copyPhysReg(MachineBasicBlock &MBB,		void SIInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MI,		MachineBasicBlock::iterator MI,
const DebugLoc &DL, MCRegister DestReg,		const DebugLoc &DL, MCRegister DestReg,
MCRegister SrcReg, bool KillSrc) const {		MCRegister SrcReg, bool KillSrc) const {
const TargetRegisterClass *RC = RI.getPhysRegBaseClass(DestReg);		const TargetRegisterClass *RC = RI.getPhysRegBaseClass(DestReg);
		unsigned Size = RI.getRegSizeInBits(*RC);
		const TargetRegisterClass *SrcRC = RI.getPhysRegBaseClass(SrcReg);
		unsigned SrcSize = RI.getRegSizeInBits(*SrcRC);

// FIXME: This is hack to resolve copies between 16 bit and 32 bit		// The rest of copyPhysReg assumes Src and Dst size are the same size.
// registers until all patterns are fixed.		// TODO-GFX11_16BIT If all true 16 bit instruction patterns are completed can
if (Fix16BitCopies &&		// we remove Fix16BitCopies and this code block?
((RI.getRegSizeInBits(*RC) == 16) ^		if (Fix16BitCopies) {
(RI.getRegSizeInBits(*RI.getPhysRegBaseClass(SrcReg)) == 16))) {		if (((Size == 16) != (SrcSize == 16))) {
MCRegister &RegToFix = (RI.getRegSizeInBits(*RC) == 16) ? DestReg : SrcReg;		if (ST.hasTrue16BitInsts()) {
		// Non-VGPR Src and Dst will later be expanded back to 32 bits.
		MCRegister &RegToFix = (Size == 32) ? DestReg : SrcReg;
		MCRegister SubReg = RI.getSubReg(RegToFix, AMDGPU::lo16);
		RegToFix = SubReg;
		} else {
		MCRegister &RegToFix = (Size == 16) ? DestReg : SrcReg;
MCRegister Super = RI.get32BitRegister(RegToFix);		MCRegister Super = RI.get32BitRegister(RegToFix);
assert(RI.getSubReg(Super, AMDGPU::lo16) == RegToFix);		assert(RI.getSubReg(Super, AMDGPU::lo16) == RegToFix ||
		RI.getSubReg(Super, AMDGPU::hi16) == RegToFix);
RegToFix = Super;		RegToFix = Super;
		}

if (DestReg == SrcReg) {		if (DestReg == SrcReg) {
// Insert empty bundle since ExpandPostRA expects an instruction here.		// Identity copy. Insert empty bundle since ExpandPostRA expects an
		// instruction here.
BuildMI(MBB, MI, DL, get(AMDGPU::BUNDLE));		BuildMI(MBB, MI, DL, get(AMDGPU::BUNDLE));
return;		return;
}		}

RC = RI.getPhysRegBaseClass(DestReg);		RC = RI.getPhysRegBaseClass(DestReg);
		Size = RI.getRegSizeInBits(*RC);
		SrcRC = RI.getPhysRegBaseClass(SrcReg);
		SrcSize = RI.getRegSizeInBits(*SrcRC);
		}
}		}

if (RC == &AMDGPU::VGPR_32RegClass) {		if (RC == &AMDGPU::VGPR_32RegClass) {
assert(AMDGPU::VGPR_32RegClass.contains(SrcReg) ||		assert(AMDGPU::VGPR_32RegClass.contains(SrcReg) ||
AMDGPU::SReg_32RegClass.contains(SrcReg) ||		AMDGPU::SReg_32RegClass.contains(SrcReg) ||
AMDGPU::AGPR_32RegClass.contains(SrcReg));		AMDGPU::AGPR_32RegClass.contains(SrcReg));
unsigned Opc = AMDGPU::AGPR_32RegClass.contains(SrcReg) ?		unsigned Opc = AMDGPU::AGPR_32RegClass.contains(SrcReg) ?
AMDGPU::V_ACCVGPR_READ_B32_e64 : AMDGPU::V_MOV_B32_e32;		AMDGPU::V_ACCVGPR_READ_B32_e64 : AMDGPU::V_MOV_B32_e32;
▲ Show 20 Lines • Show All 107 Lines • ▼ Show 20 Lines	if (RC == &AMDGPU::AGPR_32RegClass) {
// FIXME: Pass should maintain scavenger to avoid scan through the block on		// FIXME: Pass should maintain scavenger to avoid scan through the block on
// every AGPR spill.		// every AGPR spill.
RegScavenger RS;		RegScavenger RS;
const bool Overlap = RI.regsOverlap(SrcReg, DestReg);		const bool Overlap = RI.regsOverlap(SrcReg, DestReg);
indirectCopyToAGPR(*this, MBB, MI, DL, DestReg, SrcReg, KillSrc, RS, Overlap);		indirectCopyToAGPR(*this, MBB, MI, DL, DestReg, SrcReg, KillSrc, RS, Overlap);
return;		return;
}		}

const unsigned Size = RI.getRegSizeInBits(*RC);
if (Size == 16) {		if (Size == 16) {
assert(AMDGPU::VGPR_LO16RegClass.contains(SrcReg) ||		assert(AMDGPU::VGPR_16RegClass.contains(SrcReg) ||
AMDGPU::VGPR_HI16RegClass.contains(SrcReg) ||
AMDGPU::SReg_LO16RegClass.contains(SrcReg) ||		AMDGPU::SReg_LO16RegClass.contains(SrcReg) ||
AMDGPU::AGPR_LO16RegClass.contains(SrcReg));		AMDGPU::AGPR_LO16RegClass.contains(SrcReg));

bool IsSGPRDst = AMDGPU::SReg_LO16RegClass.contains(DestReg);		bool IsSGPRDst = AMDGPU::SReg_LO16RegClass.contains(DestReg);
bool IsSGPRSrc = AMDGPU::SReg_LO16RegClass.contains(SrcReg);		bool IsSGPRSrc = AMDGPU::SReg_LO16RegClass.contains(SrcReg);
bool IsAGPRDst = AMDGPU::AGPR_LO16RegClass.contains(DestReg);		bool IsAGPRDst = AMDGPU::AGPR_LO16RegClass.contains(DestReg);
bool IsAGPRSrc = AMDGPU::AGPR_LO16RegClass.contains(SrcReg);		bool IsAGPRSrc = AMDGPU::AGPR_LO16RegClass.contains(SrcReg);
bool DstLow = AMDGPU::VGPR_LO16RegClass.contains(DestReg) ||		bool DstLow = AMDGPU::VGPR_LO16RegClass.contains(DestReg) ||
Show All 21 Lines	if (IsAGPRDst || IsAGPRSrc) {
reportIllegalCopy(this, MBB, MI, DL, DestReg, SrcReg, KillSrc,		reportIllegalCopy(this, MBB, MI, DL, DestReg, SrcReg, KillSrc,
"Cannot use hi16 subreg with an AGPR!");		"Cannot use hi16 subreg with an AGPR!");
}		}

copyPhysReg(MBB, MI, DL, NewDestReg, NewSrcReg, KillSrc);		copyPhysReg(MBB, MI, DL, NewDestReg, NewSrcReg, KillSrc);
return;		return;
}		}

		if (ST.hasTrue16BitInsts()) {
		if (IsSGPRSrc) {
		assert(SrcLow);
		SrcReg = NewSrcReg;
		}
		// Use the smaller instruction encoding if possible.
		if (AMDGPU::VGPR_16_Lo128RegClass.contains(DestReg) &&
		(IsSGPRSrc || AMDGPU::VGPR_16_Lo128RegClass.contains(SrcReg))) {
		BuildMI(MBB, MI, DL, get(AMDGPU::V_MOV_B16_t16_e32), DestReg)
		.addReg(SrcReg);
		} else {
		BuildMI(MBB, MI, DL, get(AMDGPU::V_MOV_B16_t16_e64), DestReg)
		.addImm(0) // src0_modifiers
		.addReg(SrcReg)
		.addImm(0); // op_sel
		}
		return;
		}

if (IsSGPRSrc && !ST.hasSDWAScalar()) {		if (IsSGPRSrc && !ST.hasSDWAScalar()) {
if (!DstLow || !SrcLow) {		if (!DstLow || !SrcLow) {
reportIllegalCopy(this, MBB, MI, DL, DestReg, SrcReg, KillSrc,		reportIllegalCopy(this, MBB, MI, DL, DestReg, SrcReg, KillSrc,
"Cannot use hi16 subreg on VI!");		"Cannot use hi16 subreg on VI!");
}		}

BuildMI(MBB, MI, DL, get(AMDGPU::V_MOV_B32_e32), NewDestReg)		BuildMI(MBB, MI, DL, get(AMDGPU::V_MOV_B32_e32), NewDestReg)
.addReg(NewSrcReg, getKillRegState(KillSrc));		.addReg(NewSrcReg, getKillRegState(KillSrc));
Show All 10 Lines	auto MIB = BuildMI(MBB, MI, DL, get(AMDGPU::V_MOV_B32_sdwa), NewDestReg)
.addImm(SrcLow ? AMDGPU::SDWA::SdwaSel::WORD_0		.addImm(SrcLow ? AMDGPU::SDWA::SdwaSel::WORD_0
: AMDGPU::SDWA::SdwaSel::WORD_1)		: AMDGPU::SDWA::SdwaSel::WORD_1)
.addReg(NewDestReg, RegState::Implicit | RegState::Undef);		.addReg(NewDestReg, RegState::Implicit | RegState::Undef);
// First implicit operand is $exec.		// First implicit operand is $exec.
MIB->tieOperands(0, MIB->getNumOperands() - 1);		MIB->tieOperands(0, MIB->getNumOperands() - 1);
return;		return;
}		}

const TargetRegisterClass *SrcRC = RI.getPhysRegBaseClass(SrcReg);
if (RC == RI.getVGPR64Class() && (SrcRC == RC || RI.isSGPRClass(SrcRC))) {		if (RC == RI.getVGPR64Class() && (SrcRC == RC || RI.isSGPRClass(SrcRC))) {
if (ST.hasMovB64()) {		if (ST.hasMovB64()) {
BuildMI(MBB, MI, DL, get(AMDGPU::V_MOV_B64_e32), DestReg)		BuildMI(MBB, MI, DL, get(AMDGPU::V_MOV_B64_e32), DestReg)
.addReg(SrcReg, getKillRegState(KillSrc));		.addReg(SrcReg, getKillRegState(KillSrc));
return;		return;
}		}
if (ST.hasPkMovB32()) {		if (ST.hasPkMovB32()) {
BuildMI(MBB, MI, DL, get(AMDGPU::V_PK_MOV_B32), DestReg)		BuildMI(MBB, MI, DL, get(AMDGPU::V_PK_MOV_B32), DestReg)
▲ Show 20 Lines • Show All 339 Lines • ▼ Show 20 Lines	Register SIInstrInfo::insertNE(MachineBasicBlock *MBB,

return Reg;		return Reg;
}		}

unsigned SIInstrInfo::getMovOpcode(const TargetRegisterClass *DstRC) const {		unsigned SIInstrInfo::getMovOpcode(const TargetRegisterClass *DstRC) const {

if (RI.isAGPRClass(DstRC))		if (RI.isAGPRClass(DstRC))
return AMDGPU::COPY;		return AMDGPU::COPY;
if (RI.getRegSizeInBits(*DstRC) == 32) {		if (RI.getRegSizeInBits(*DstRC) == 16) {
		// Assume hi bits are unneeded. Only _e64 true16 instructions are legal
		// before RA.
		return RI.isSGPRClass(DstRC) ? AMDGPU::COPY : AMDGPU::V_MOV_B16_t16_e64;
		} else if (RI.getRegSizeInBits(*DstRC) == 32) {
return RI.isSGPRClass(DstRC) ? AMDGPU::S_MOV_B32 : AMDGPU::V_MOV_B32_e32;		return RI.isSGPRClass(DstRC) ? AMDGPU::S_MOV_B32 : AMDGPU::V_MOV_B32_e32;
} else if (RI.getRegSizeInBits(*DstRC) == 64 && RI.isSGPRClass(DstRC)) {		} else if (RI.getRegSizeInBits(*DstRC) == 64 && RI.isSGPRClass(DstRC)) {
return AMDGPU::S_MOV_B64;		return AMDGPU::S_MOV_B64;
} else if (RI.getRegSizeInBits(*DstRC) == 64 && !RI.isSGPRClass(DstRC)) {		} else if (RI.getRegSizeInBits(*DstRC) == 64 && !RI.isSGPRClass(DstRC)) {
return AMDGPU::V_MOV_B64_PSEUDO;		return AMDGPU::V_MOV_B64_PSEUDO;
}		}
return AMDGPU::COPY;		return AMDGPU::COPY;
}		}
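
The updated `getMovOpcode` dispatch can be read as a small decision table. The sketch below restates it in standalone form, assuming a hypothetical `RegClassInfo` summary in place of `TargetRegisterClass` and a local `Opc` enum in place of the `AMDGPU::` opcode constants. It is meant only to show the new 16-bit row, not to replace the code in the diff.

```cpp
#include <cassert>

// Local stand-ins for the AMDGPU opcode constants named in the diff.
enum class Opc {
  COPY,
  V_MOV_B16_t16_e64,
  S_MOV_B32,
  V_MOV_B32_e32,
  S_MOV_B64,
  V_MOV_B64_PSEUDO
};

// Hypothetical summary of a register class; the real code queries
// SIRegisterInfo instead.
struct RegClassInfo {
  unsigned SizeInBits;
  bool IsSGPR;
  bool IsAGPR;
};

Opc movOpcodeFor(const RegClassInfo &RC) {
  if (RC.IsAGPR)
    return Opc::COPY;
  switch (RC.SizeInBits) {
  case 16:
    // New in this revision: 16-bit VGPR destinations get the _e64 True16 move.
    return RC.IsSGPR ? Opc::COPY : Opc::V_MOV_B16_t16_e64;
  case 32:
    return RC.IsSGPR ? Opc::S_MOV_B32 : Opc::V_MOV_B32_e32;
  case 64:
    return RC.IsSGPR ? Opc::S_MOV_B64 : Opc::V_MOV_B64_PSEUDO;
  default:
    return Opc::COPY;
  }
}
```

The 16-bit row returns the `_e64` form unconditionally for VGPRs because, as the comment in the diff says, only `_e64` True16 instructions are legal before register allocation.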
▲ Show 20 Lines • Show All 7,873 Lines • Show Last 20 Lines
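
The True16 branch added to `copyPhysReg` above chooses between the `_e32` and `_e64` encodings of `v_mov_b16`. The following sketch isolates that choice. It assumes, for illustration, that membership in `VGPR_16_Lo128` means "a 16-bit half of one of the first 128 VGPRs"; `Reg16`, `inLo128` and `pickMov16` are hypothetical names that do not exist in LLVM.

```cpp
#include <cassert>

enum class MovOpc { V_MOV_B16_t16_e32, V_MOV_B16_t16_e64 };

// Hypothetical 16-bit register: either an SGPR half or a half of the VGPR
// with the given parent index.
struct Reg16 {
  unsigned VgprIndex; // parent 32-bit VGPR index; ignored when IsSGPR is true
  bool IsSGPR;
};

// Assumption for this sketch: Lo128 covers halves of VGPRs 0..127.
bool inLo128(const Reg16 &R) { return !R.IsSGPR && R.VgprIndex < 128; }

// Use the smaller encoding when the operands allow it, as in the patch:
// the destination must be in Lo128 and the source must be an SGPR or in Lo128.
MovOpc pickMov16(const Reg16 &Dst, const Reg16 &Src) {
  if (inLo128(Dst) && (Src.IsSGPR || inLo128(Src)))
    return MovOpc::V_MOV_B16_t16_e32;
  return MovOpc::V_MOV_B16_t16_e64;
}
```

In the patch itself, the `_e64` form additionally carries `src0_modifiers` and `op_sel` immediates, both set to 0; the sketch leaves operand construction out.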

llvm/lib/Target/AMDGPU/VOP1Instructions.td

Show First 20 Lines • Show All 650 Lines • ▼ Show 20 Lines
}		}

let SubtargetPredicate = isGFX11Plus in {		let SubtargetPredicate = isGFX11Plus in {
// Restrict src0 to be VGPR		// Restrict src0 to be VGPR
def V_PERMLANE64_B32 : VOP1_Pseudo<"v_permlane64_b32", VOP_MOVRELS,		def V_PERMLANE64_B32 : VOP1_Pseudo<"v_permlane64_b32", VOP_MOVRELS,
getVOP1Pat64<int_amdgcn_permlane64,		getVOP1Pat64<int_amdgcn_permlane64,
VOP_MOVRELS>.ret,		VOP_MOVRELS>.ret,
/*VOP1Only=*/ 1>;		/*VOP1Only=*/ 1>;
		defm V_MOV_B16_t16 : VOP1Inst<"v_mov_b16_t16", VOPProfile_True16<VOP_I16_I16>>;
defm V_NOT_B16 : VOP1Inst_t16<"v_not_b16", VOP_I16_I16>;		defm V_NOT_B16 : VOP1Inst_t16<"v_not_b16", VOP_I16_I16>;
defm V_CVT_I32_I16 : VOP1Inst_t16<"v_cvt_i32_i16", VOP_I32_I16>;		defm V_CVT_I32_I16 : VOP1Inst_t16<"v_cvt_i32_i16", VOP_I32_I16>;
defm V_CVT_U32_U16 : VOP1Inst_t16<"v_cvt_u32_u16", VOP_I32_I16>;		defm V_CVT_U32_U16 : VOP1Inst_t16<"v_cvt_u32_u16", VOP_I32_I16>;
} // End SubtargetPredicate = isGFX11Plus		} // End SubtargetPredicate = isGFX11Plus

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Target-specific instruction encodings.		// Target-specific instruction encodings.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
▲ Show 20 Lines • Show All 132 Lines • ▼ Show 20 Lines	defm V_CVT_FLOOR_I32_F32 : VOP1_Real_FULL_with_name_gfx11<0x00d,
"V_CVT_FLR_I32_F32", "v_cvt_floor_i32_f32">;		"V_CVT_FLR_I32_F32", "v_cvt_floor_i32_f32">;
defm V_CLZ_I32_U32 : VOP1_Real_FULL_with_name_gfx11<0x039,		defm V_CLZ_I32_U32 : VOP1_Real_FULL_with_name_gfx11<0x039,
"V_FFBH_U32", "v_clz_i32_u32">;		"V_FFBH_U32", "v_clz_i32_u32">;
defm V_CTZ_I32_B32 : VOP1_Real_FULL_with_name_gfx11<0x03a,		defm V_CTZ_I32_B32 : VOP1_Real_FULL_with_name_gfx11<0x03a,
"V_FFBL_B32", "v_ctz_i32_b32">;		"V_FFBL_B32", "v_ctz_i32_b32">;
defm V_CLS_I32 : VOP1_Real_FULL_with_name_gfx11<0x03b,		defm V_CLS_I32 : VOP1_Real_FULL_with_name_gfx11<0x03b,
"V_FFBH_I32", "v_cls_i32">;		"V_FFBH_I32", "v_cls_i32">;
defm V_PERMLANE64_B32 : VOP1Only_Real_gfx11<0x067>;		defm V_PERMLANE64_B32 : VOP1Only_Real_gfx11<0x067>;
		defm V_MOV_B16_t16 : VOP1_Real_FULL_t16_gfx11<0x01c, "v_mov_b16">;
defm V_NOT_B16_t16 : VOP1_Real_FULL_t16_gfx11<0x069, "v_not_b16">;		defm V_NOT_B16_t16 : VOP1_Real_FULL_t16_gfx11<0x069, "v_not_b16">;
defm V_CVT_I32_I16_t16 : VOP1_Real_FULL_t16_gfx11<0x06a, "v_cvt_i32_i16">;		defm V_CVT_I32_I16_t16 : VOP1_Real_FULL_t16_gfx11<0x06a, "v_cvt_i32_i16">;
defm V_CVT_U32_U16_t16 : VOP1_Real_FULL_t16_gfx11<0x06b, "v_cvt_u32_u16">;		defm V_CVT_U32_U16_t16 : VOP1_Real_FULL_t16_gfx11<0x06b, "v_cvt_u32_u16">;

defm V_CVT_F16_U16_t16 : VOP1_Real_FULL_t16_gfx11<0x050, "v_cvt_f16_u16">;		defm V_CVT_F16_U16_t16 : VOP1_Real_FULL_t16_gfx11<0x050, "v_cvt_f16_u16">;
defm V_CVT_F16_I16_t16 : VOP1_Real_FULL_t16_gfx11<0x051, "v_cvt_f16_i16">;		defm V_CVT_F16_I16_t16 : VOP1_Real_FULL_t16_gfx11<0x051, "v_cvt_f16_i16">;
defm V_CVT_U16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x052, "v_cvt_u16_f16">;		defm V_CVT_U16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x052, "v_cvt_u16_f16">;
defm V_CVT_I16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x053, "v_cvt_i16_f16">;		defm V_CVT_I16_F16_t16 : VOP1_Real_FULL_t16_gfx11<0x053, "v_cvt_i16_f16">;
▲ Show 20 Lines • Show All 506 Lines • Show Last 20 Lines