This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][MC] Corrected decoding of 16-bit literals
Closed Public

Authored by dp on Jul 18 2020, 6:43 AM.


Details

Reviewers

arsenm
rampitec

Commits

rG0b8fd77ad953: [AMDGPU][MC] Corrected decoding of 16-bit literals

Summary

16-bit literals are encoded as 32-bit values. If the high 16 bits of the value are 0xFFFF, the decoded instruction cannot be reassembled.
For example, the following code

0xff,0x04,0x04,0x52,0xcd,0xab,0xff,0xff

is currently decoded as

v_mul_lo_u16_e32 v2, 0xffffabcd, v2

However, this literal is actually the 64-bit constant 0x00000000ffffabcd, which violates the requirements described in the documentation: the truncation to 16 bits is not safe.

This change corrects decoding to make reassembly possible.

Codegen should probably be corrected as well to zero-extend 16-bit values.
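To make the example in the summary concrete, the following is a minimal C++ sketch of how the trailing literal dword is read from the encoding and split into the half a 16-bit operand uses and the half it ignores. The helper names (readLiteral32, low16, high16) and the SummaryEncoding constant are hypothetical and are not LLVM APIs; only the byte sequence and the resulting literal value come from the summary.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helpers for illustration; these are not LLVM APIs.

// Assemble the 32-bit literal that follows the 4-byte instruction word.
// The literal dword is stored little-endian in bytes 4..7.
uint32_t readLiteral32(const std::array<uint8_t, 8> &Bytes) {
  return static_cast<uint32_t>(Bytes[4]) |
         (static_cast<uint32_t>(Bytes[5]) << 8) |
         (static_cast<uint32_t>(Bytes[6]) << 16) |
         (static_cast<uint32_t>(Bytes[7]) << 24);
}

// The half of the literal that a 16-bit operand actually uses.
uint16_t low16(uint32_t Literal) { return static_cast<uint16_t>(Literal); }

// The half of the literal that a 16-bit operand does not use.
uint16_t high16(uint32_t Literal) {
  return static_cast<uint16_t>(Literal >> 16);
}

// The encoding quoted in the summary.
const std::array<uint8_t, 8> SummaryEncoding = {0xff, 0x04, 0x04, 0x52,
                                                0xcd, 0xab, 0xff, 0xff};
```

For SummaryEncoding, readLiteral32 yields 0xffffabcd, low16 yields 0xabcd and high16 yields 0xffff. Printing all 32 bits produces a hex constant whose upper bits are neither all zeros nor a sign extension of a 16-bit value, which is why the assembler refuses to truncate it.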

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dp created this revision. Jul 18 2020, 6:43 AM

Herald added a project: Restricted Project. Jul 18 2020, 6:43 AM

Herald added subscribers: llvm-commits, kerbowa, hiraditya and 8 others.

arsenm added inline comments. Jul 20 2020, 6:24 AM

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	We currently don't match source modifiers on integer instructions, so there's no situation where a - should appear here

dp marked 2 inline comments as done. Jul 20 2020, 6:58 AM

dp added inline comments.

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	This is not a modifier. '-' is part of the constant. It is handled uniformly for integer and fp operands. See the comment at AMDGPUAsmParser::parseSP3NegModifier:

Currently this modifier is allowed in the following context:
1. Before a register, e.g. "-v0", "-v[...]" or "-[v0,v1]".
2. Before an 'abs' modifier: -abs(...)
3. Before an SP3 'abs' modifier: -|...|
In all other cases "-" is handled as a part of an expression that follows the sign.

Do you prefer 0xFFFFFFFFFFFFFCB3?

arsenm added inline comments. Jul 20 2020, 7:00 AM

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	Yes, but truncated to the 32-bit value, so 0xfffffcb3

dp marked 3 inline comments as done. Jul 20 2020, 7:15 AM

dp added inline comments.

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	Why do you want it to be truncated to 32 bits? Note that all assembler constants are 64-bit. Currently we have uniform truncation rules which work for both 16-bit and 32-bit literals. They have been working fine for several years. See https://llvm.org/docs/AMDGPUOperandSyntax.html#conversion-of-integer-values

Do you propose to have separate truncation rules for 16-bit values? This does not seem logical, and changing the rules may break existing code.

dp marked 2 inline comments as done. Jul 20 2020, 7:18 AM

dp added inline comments.

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	Did you mean 'truncated to 16 bits'?

arsenm added inline comments. Jul 20 2020, 7:19 AM

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	Because this is encoded as a 32-bit value, so it shouldn't print 16 digits. We don't print signs on hex values anywhere

dp marked an inline comment as done. Jul 20 2020, 7:33 AM

dp added inline comments.

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	The assembler currently allows truncation in two cases:
- when the truncated bits are all 0;
- when the truncated bits are all 1 and the value after truncation has its MSB set.
Do you suggest changing the rules for 16-bit literals to silently truncate to the low 32 bits without any checks?
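The two truncation cases listed above can be written down as a small predicate. This is a minimal sketch, not the assembler's actual code: the function name isSafeTruncationSketch and its signature are hypothetical, and it assumes 0 < Size < 64.

```cpp
#include <cstdint>

// Hypothetical stand-in for the assembler's truncation check (not an LLVM
// API). Returns true if the 64-bit constant Val may be truncated to Size
// bits under the two rules quoted in the review.
// Assumption: 0 < Size < 64.
bool isSafeTruncationSketch(int64_t Val, unsigned Size) {
  const uint64_t Bits = static_cast<uint64_t>(Val);
  // Bits that truncation would drop, shifted down to bit 0.
  const uint64_t Dropped = Bits >> Size;
  // The same number of bits, all set to 1.
  const uint64_t AllOnes = ~uint64_t{0} >> Size;
  // Most significant bit of the value that remains after truncation.
  const bool MsbSet = ((Bits >> (Size - 1)) & 1) != 0;

  // Rule 1: the truncated bits are all 0.
  // Rule 2: the truncated bits are all 1 and the remaining MSB is set.
  return Dropped == 0 || (Dropped == AllOnes && MsbSet);
}
```

Under this sketch, 0xabcd and -845 are accepted for a 16-bit operand, while 0xffffabcd is rejected because its dropped bits are a mix of zeros and ones. The same 0xffffabcd is accepted for a 32-bit operand, which is why the issue only shows up for 16-bit literals.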

arsenm added inline comments. Jul 20 2020, 7:58 AM

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	I think we're talking about different things. The asm parser should probably error if you can't truncate to 32 bits without losing bits. I'm saying the printer should never print something that looks like an 8-byte value by printing more than 8 digits, and should not print a - in front of a hex value

dp marked an inline comment as done. Jul 20 2020, 8:54 AM

dp added inline comments.

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	Currently clBuildProgram generates output which cannot be assembled. Either the asm parser or the printer needs to be corrected.

I believe the printer should print a 16-bit value, not a 32-bit one. (Otherwise why do we pretend this is a 16-bit operand?) The high 16 bits are unused and truncated by the parser (except for special cases like packed operands). We are printing the high 16 bits only to show what was actually encoded. So when we have non-zero high 16 bits, this is a corner case. I do not think we should mutilate the assembler truncation rules for the sake of handling a special case.

I like neither long hex literals nor negative hex literals, but this is the lesser evil for my taste.

arsenm added inline comments. Jul 20 2020, 9:17 AM

llvm/test/CodeGen/AMDGPU/add.i16.ll
39	16-bit/4 digits is fine, but having a sign doesn't make sense here

Corrected after a discussion with Matt:

added 16-bit truncation;
printed as unsigned.

Are there any more issues with this change?
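The agreed behaviour (truncate to 16 bits, print as unsigned hex) can be sketched as a standalone function. This is an illustration only: formatInt16Operand is a hypothetical name, it returns a string instead of writing to a raw_ostream, and the inline-constant range -16..64 is an assumption about what isInlinableIntLiteral accepts.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the printer logic discussed in this review
// (not the LLVM function itself). Returns the operand text for a 16-bit
// integer operand whose encoded literal is Imm.
std::string formatInt16Operand(uint32_t Imm) {
  const int16_t SImm = static_cast<int16_t>(Imm);
  // Assumption: integer inline constants are the values -16..64 and are
  // printed in decimal.
  if (SImm >= -16 && SImm <= 64)
    return std::to_string(SImm);

  // Otherwise keep only the low 16 bits and print them as unsigned hex,
  // so the output never carries a sign or more than 4 hex digits.
  const unsigned Imm16 = static_cast<uint16_t>(Imm);
  char Buf[8]; // "0x" + up to 4 digits + terminating NUL
  std::snprintf(Buf, sizeof(Buf), "0x%x", Imm16);
  return Buf;
}
```

With this sketch, the encoded literal 0xfffffcb3 is rendered as 0xfcb3 and 0xffffff85 as 0xff85, matching the updated test expectations, while a literal whose low half is an inline constant (for example 0xffffffff) is still printed in decimal as -1.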

arsenm accepted this revision. Jul 21 2020, 3:58 PM

This revision is now accepted and ready to land. Jul 21 2020, 3:58 PM

Closed by commit rG0b8fd77ad953: [AMDGPU][MC] Corrected decoding of 16-bit literals (authored by dp). Jul 22 2020, 7:21 AM

This revision was automatically updated to reflect the committed changes.

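Revision Contents

Path	Size
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h	2 lines
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp	14 lines
llvm/test/CodeGen/AMDGPU/add.i16.ll	2 lines
llvm/test/CodeGen/AMDGPU/add.v2i16.ll	2 lines
llvm/test/CodeGen/AMDGPU/imm16.ll	14 lines
llvm/test/CodeGen/AMDGPU/sub.i16.ll	2 lines
llvm/test/CodeGen/AMDGPU/sub.v2i16.ll	2 lines
llvm/test/MC/Disassembler/AMDGPU/literal16_vi.txt	10 lines
llvm/test/MC/Disassembler/AMDGPU/vop1.txt	2 lines
llvm/test/MC/Disassembler/AMDGPU/vop3-literal.txt	9 lines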

Diff 279818

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h

@@ ... @@ private:
   void printVOPDst(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
                    raw_ostream &O);
   void printVINTRPDst(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
                       raw_ostream &O);
   void printImmediateInt16(uint32_t Imm, const MCSubtargetInfo &STI,
                            raw_ostream &O);
   void printImmediate16(uint32_t Imm, const MCSubtargetInfo &STI,
                         raw_ostream &O);
-  void printImmediateIntV216(uint32_t Imm, const MCSubtargetInfo &STI,
-                             raw_ostream &O);
   void printImmediateV216(uint32_t Imm, const MCSubtargetInfo &STI,
                           raw_ostream &O);
   void printImmediate32(uint32_t Imm, const MCSubtargetInfo &STI,
                         raw_ostream &O);
   void printImmediate64(uint64_t Imm, const MCSubtargetInfo &STI,
                         raw_ostream &O);
   void printOperand(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
                     raw_ostream &O);

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp

@@ ... @@ void AMDGPUInstPrinter::printVINTRPDst(const MCInst *MI, unsigned OpNo,

   printOperand(MI, OpNo, STI, O);
 }

 void AMDGPUInstPrinter::printImmediateInt16(uint32_t Imm,
                                             const MCSubtargetInfo &STI,
                                             raw_ostream &O) {
   int16_t SImm = static_cast<int16_t>(Imm);
-  if (isInlinableIntLiteral(SImm))
+  if (isInlinableIntLiteral(SImm)) {
     O << SImm;
-  else
-    O << formatHex(static_cast<uint64_t>(Imm));
+  } else {
+    uint64_t Imm16 = static_cast<uint16_t>(Imm);
+    O << formatHex(Imm16);
+  }
 }

 void AMDGPUInstPrinter::printImmediate16(uint32_t Imm,
                                          const MCSubtargetInfo &STI,
                                          raw_ostream &O) {
   int16_t SImm = static_cast<int16_t>(Imm);
   if (isInlinableIntLiteral(SImm)) {
     O << SImm;
@@ ... @@ else if (Imm == 0xC000)
     O<< "-2.0";
   else if (Imm == 0x4400)
     O<< "4.0";
   else if (Imm == 0xC400)
     O<< "-4.0";
   else if (Imm == 0x3118) {
     assert(STI.getFeatureBits()[AMDGPU::FeatureInv2PiInlineImm]);
     O << "0.15915494";
-  } else
-    O << formatHex(static_cast<uint64_t>(Imm));
+  } else {
+    uint64_t Imm16 = static_cast<uint16_t>(Imm);
+    O << formatHex(Imm16);
+  }
 }

 void AMDGPUInstPrinter::printImmediateV216(uint32_t Imm,
                                            const MCSubtargetInfo &STI,
                                            raw_ostream &O) {
   uint16_t Lo16 = static_cast<uint16_t>(Imm);
   printImmediate16(Lo16, STI, O);
 }

llvm/test/CodeGen/AMDGPU/add.i16.ll

define amdgpu_kernel void @v_test_add_i16_constant(i16 addrspace(1)* %out, i16 addrspace(1)* %in0) #1 {
%add = add i16 %a, 123		%add = add i16 %a, 123
store i16 %add, i16 addrspace(1)* %out		store i16 %add, i16 addrspace(1)* %out
ret void		ret void
}		}

; FIXME: Need to handle non-uniform case for function below (load without gep).		; FIXME: Need to handle non-uniform case for function below (load without gep).
; GCN-LABEL: {{^}}v_test_add_i16_neg_constant:		; GCN-LABEL: {{^}}v_test_add_i16_neg_constant:
; VI: flat_load_ushort [[A:v[0-9]+]]		; VI: flat_load_ushort [[A:v[0-9]+]]
; VI: v_add_u16_e32 [[ADD:v[0-9]+]], 0xfffffcb3, [[A]]		; VI: v_add_u16_e32 [[ADD:v[0-9]+]], 0xfcb3, [[A]]
; VI-NEXT: buffer_store_short [[ADD]]		; VI-NEXT: buffer_store_short [[ADD]]
define amdgpu_kernel void @v_test_add_i16_neg_constant(i16 addrspace(1)* %out, i16 addrspace(1)* %in0) #1 {		define amdgpu_kernel void @v_test_add_i16_neg_constant(i16 addrspace(1)* %out, i16 addrspace(1)* %in0) #1 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()		%tid = call i32 @llvm.amdgcn.workitem.id.x()
%gep.out = getelementptr inbounds i16, i16 addrspace(1)* %out, i32 %tid		%gep.out = getelementptr inbounds i16, i16 addrspace(1)* %out, i32 %tid
%gep.in0 = getelementptr inbounds i16, i16 addrspace(1)* %in0, i32 %tid		%gep.in0 = getelementptr inbounds i16, i16 addrspace(1)* %in0, i32 %tid
%a = load volatile i16, i16 addrspace(1)* %gep.in0		%a = load volatile i16, i16 addrspace(1)* %gep.in0
%add = add i16 %a, -845		%add = add i16 %a, -845
store i16 %add, i16 addrspace(1)* %out		store i16 %add, i16 addrspace(1)* %out

llvm/test/CodeGen/AMDGPU/add.v2i16.ll

define amdgpu_kernel void @v_test_add_v2i16_constant(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in0) #1 {
ret void		ret void
}		}

; FIXME: Need to handle non-uniform case for function below (load without gep).		; FIXME: Need to handle non-uniform case for function below (load without gep).
; GCN-LABEL: {{^}}v_test_add_v2i16_neg_constant:		; GCN-LABEL: {{^}}v_test_add_v2i16_neg_constant:
; GFX9: s_mov_b32 [[CONST:s[0-9]+]], 0xfc21fcb3{{$}}		; GFX9: s_mov_b32 [[CONST:s[0-9]+]], 0xfc21fcb3{{$}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, [[CONST]]		; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, [[CONST]]

; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, 0xfffffcb3, v{{[0-9]+}}		; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, 0xfcb3, v{{[0-9]+}}
; VI-DAG: v_mov_b32_e32 v[[SCONST:[0-9]+]], 0xfffffc21		; VI-DAG: v_mov_b32_e32 v[[SCONST:[0-9]+]], 0xfffffc21
; VI-DAG: v_add_u16_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v[[SCONST]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD		; VI-DAG: v_add_u16_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v[[SCONST]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
define amdgpu_kernel void @v_test_add_v2i16_neg_constant(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in0) #1 {		define amdgpu_kernel void @v_test_add_v2i16_neg_constant(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in0) #1 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()		%tid = call i32 @llvm.amdgcn.workitem.id.x()
%gep.out = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %out, i32 %tid		%gep.out = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %out, i32 %tid
%gep.in0 = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in0, i32 %tid		%gep.in0 = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in0, i32 %tid
%a = load volatile <2 x i16>, <2 x i16> addrspace(1)* %gep.in0		%a = load volatile <2 x i16>, <2 x i16> addrspace(1)* %gep.in0
%add = add <2 x i16> %a, <i16 -845, i16 -991>		%add = add <2 x i16> %a, <i16 -845, i16 -991>

llvm/test/CodeGen/AMDGPU/imm16.ll

; SI-NEXT: s_setpc_b64 s[30:31]
ret void		ret void
}		}

define void @mul_inline_imm_neg_0.5_i16(i16 addrspace(1)* %out, i16 %x) {		define void @mul_inline_imm_neg_0.5_i16(i16 addrspace(1)* %out, i16 %x) {
; GFX10-LABEL: mul_inline_imm_neg_0.5_i16:		; GFX10-LABEL: mul_inline_imm_neg_0.5_i16:
; GFX10: ; %bb.0:		; GFX10: ; %bb.0:
; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: v_mul_lo_u16_e64 v2, 0xffffb800, v2 ; encoding: [0x02,0x00,0x05,0xd7,0xff,0x04,0x02,0x00,0x00,0xb8,0xff,0xff]		; GFX10-NEXT: v_mul_lo_u16_e64 v2, 0xb800, v2 ; encoding: [0x02,0x00,0x05,0xd7,0xff,0x04,0x02,0x00,0x00,0xb8,0xff,0xff]
; GFX10-NEXT: ; implicit-def: $vcc_hi		; GFX10-NEXT: ; implicit-def: $vcc_hi
; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]		; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]		; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]
;		;
; VI-LABEL: mul_inline_imm_neg_0.5_i16:		; VI-LABEL: mul_inline_imm_neg_0.5_i16:
; VI: ; %bb.0:		; VI: ; %bb.0:
; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
; VI-NEXT: v_mul_lo_u16_e32 v2, 0xffffb800, v2 ; encoding: [0xff,0x04,0x04,0x52,0x00,0xb8,0xff,0xff]		; VI-NEXT: v_mul_lo_u16_e32 v2, 0xb800, v2 ; encoding: [0xff,0x04,0x04,0x52,0x00,0xb8,0xff,0xff]
; VI-NEXT: flat_store_short v[0:1], v2 ; encoding: [0x00,0x00,0x68,0xdc,0x00,0x02,0x00,0x00]		; VI-NEXT: flat_store_short v[0:1], v2 ; encoding: [0x00,0x00,0x68,0xdc,0x00,0x02,0x00,0x00]
; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) ; encoding: [0x70,0x00,0x8c,0xbf]		; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) ; encoding: [0x70,0x00,0x8c,0xbf]
; VI-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x1d,0x80,0xbe]		; VI-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x1d,0x80,0xbe]
;		;
; SI-LABEL: mul_inline_imm_neg_0.5_i16:		; SI-LABEL: mul_inline_imm_neg_0.5_i16:
; SI: ; %bb.0:		; SI: ; %bb.0:
; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)		; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; SI-NEXT: s_mov_b32 s6, 0		; SI-NEXT: s_mov_b32 s6, 0
...
; SI-NEXT: s_setpc_b64 s[30:31]
ret void		ret void
}		}

define void @mul_inline_imm_neg_1.0_i16(i16 addrspace(1)* %out, i16 %x) {		define void @mul_inline_imm_neg_1.0_i16(i16 addrspace(1)* %out, i16 %x) {
; GFX10-LABEL: mul_inline_imm_neg_1.0_i16:		; GFX10-LABEL: mul_inline_imm_neg_1.0_i16:
; GFX10: ; %bb.0:		; GFX10: ; %bb.0:
; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: v_mul_lo_u16_e64 v2, 0xffffbc00, v2 ; encoding: [0x02,0x00,0x05,0xd7,0xff,0x04,0x02,0x00,0x00,0xbc,0xff,0xff]		; GFX10-NEXT: v_mul_lo_u16_e64 v2, 0xbc00, v2 ; encoding: [0x02,0x00,0x05,0xd7,0xff,0x04,0x02,0x00,0x00,0xbc,0xff,0xff]
; GFX10-NEXT: ; implicit-def: $vcc_hi		; GFX10-NEXT: ; implicit-def: $vcc_hi
; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]		; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]		; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]
;		;
; VI-LABEL: mul_inline_imm_neg_1.0_i16:		; VI-LABEL: mul_inline_imm_neg_1.0_i16:
; VI: ; %bb.0:		; VI: ; %bb.0:
; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
; VI-NEXT: v_mul_lo_u16_e32 v2, 0xffffbc00, v2 ; encoding: [0xff,0x04,0x04,0x52,0x00,0xbc,0xff,0xff]		; VI-NEXT: v_mul_lo_u16_e32 v2, 0xbc00, v2 ; encoding: [0xff,0x04,0x04,0x52,0x00,0xbc,0xff,0xff]
; VI-NEXT: flat_store_short v[0:1], v2 ; encoding: [0x00,0x00,0x68,0xdc,0x00,0x02,0x00,0x00]		; VI-NEXT: flat_store_short v[0:1], v2 ; encoding: [0x00,0x00,0x68,0xdc,0x00,0x02,0x00,0x00]
; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) ; encoding: [0x70,0x00,0x8c,0xbf]		; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) ; encoding: [0x70,0x00,0x8c,0xbf]
; VI-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x1d,0x80,0xbe]		; VI-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x1d,0x80,0xbe]
;		;
; SI-LABEL: mul_inline_imm_neg_1.0_i16:		; SI-LABEL: mul_inline_imm_neg_1.0_i16:
; SI: ; %bb.0:		; SI: ; %bb.0:
; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)		; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; SI-NEXT: s_mov_b32 s6, 0		; SI-NEXT: s_mov_b32 s6, 0
...
; SI-NEXT: s_setpc_b64 s[30:31]
ret void		ret void
}		}

define void @shl_inline_imm_neg_2.0_i16(i16 addrspace(1)* %out, i16 %x) {		define void @shl_inline_imm_neg_2.0_i16(i16 addrspace(1)* %out, i16 %x) {
; GFX10-LABEL: shl_inline_imm_neg_2.0_i16:		; GFX10-LABEL: shl_inline_imm_neg_2.0_i16:
; GFX10: ; %bb.0:		; GFX10: ; %bb.0:
; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: v_lshlrev_b16_e64 v2, v2, 0xffffc000 ; encoding: [0x02,0x00,0x14,0xd7,0x02,0xff,0x01,0x00,0x00,0xc0,0xff,0xff]		; GFX10-NEXT: v_lshlrev_b16_e64 v2, v2, 0xc000 ; encoding: [0x02,0x00,0x14,0xd7,0x02,0xff,0x01,0x00,0x00,0xc0,0xff,0xff]
; GFX10-NEXT: ; implicit-def: $vcc_hi		; GFX10-NEXT: ; implicit-def: $vcc_hi
; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]		; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]		; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]
;		;
; VI-LABEL: shl_inline_imm_neg_2.0_i16:		; VI-LABEL: shl_inline_imm_neg_2.0_i16:
; VI: ; %bb.0:		; VI: ; %bb.0:
; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
...
; SI-NEXT: s_setpc_b64 s[30:31]
ret void		ret void
}		}

define void @mul_inline_imm_neg_4.0_i16(i16 addrspace(1)* %out, i16 %x) {		define void @mul_inline_imm_neg_4.0_i16(i16 addrspace(1)* %out, i16 %x) {
; GFX10-LABEL: mul_inline_imm_neg_4.0_i16:		; GFX10-LABEL: mul_inline_imm_neg_4.0_i16:
; GFX10: ; %bb.0:		; GFX10: ; %bb.0:
; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: v_mul_lo_u16_e64 v2, 0xffffc400, v2 ; encoding: [0x02,0x00,0x05,0xd7,0xff,0x04,0x02,0x00,0x00,0xc4,0xff,0xff]		; GFX10-NEXT: v_mul_lo_u16_e64 v2, 0xc400, v2 ; encoding: [0x02,0x00,0x05,0xd7,0xff,0x04,0x02,0x00,0x00,0xc4,0xff,0xff]
; GFX10-NEXT: ; implicit-def: $vcc_hi		; GFX10-NEXT: ; implicit-def: $vcc_hi
; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]		; GFX10-NEXT: global_store_short v[0:1], v2, off ; encoding: [0x00,0x80,0x68,0xdc,0x00,0x02,0x7d,0x00]
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]		; GFX10-NEXT: s_waitcnt_vscnt null, 0x0 ; encoding: [0x00,0x00,0xfd,0xbb]
; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]		; GFX10-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x20,0x80,0xbe]
;		;
; VI-LABEL: mul_inline_imm_neg_4.0_i16:		; VI-LABEL: mul_inline_imm_neg_4.0_i16:
; VI: ; %bb.0:		; VI: ; %bb.0:
; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]		; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
; VI-NEXT: v_mul_lo_u16_e32 v2, 0xffffc400, v2 ; encoding: [0xff,0x04,0x04,0x52,0x00,0xc4,0xff,0xff]		; VI-NEXT: v_mul_lo_u16_e32 v2, 0xc400, v2 ; encoding: [0xff,0x04,0x04,0x52,0x00,0xc4,0xff,0xff]
; VI-NEXT: flat_store_short v[0:1], v2 ; encoding: [0x00,0x00,0x68,0xdc,0x00,0x02,0x00,0x00]		; VI-NEXT: flat_store_short v[0:1], v2 ; encoding: [0x00,0x00,0x68,0xdc,0x00,0x02,0x00,0x00]
; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) ; encoding: [0x70,0x00,0x8c,0xbf]		; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) ; encoding: [0x70,0x00,0x8c,0xbf]
; VI-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x1d,0x80,0xbe]		; VI-NEXT: s_setpc_b64 s[30:31] ; encoding: [0x1e,0x1d,0x80,0xbe]
;		;
; SI-LABEL: mul_inline_imm_neg_4.0_i16:		; SI-LABEL: mul_inline_imm_neg_4.0_i16:
; SI: ; %bb.0:		; SI: ; %bb.0:
; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)		; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; SI-NEXT: s_mov_b32 s6, 0		; SI-NEXT: s_mov_b32 s6, 0

llvm/test/CodeGen/AMDGPU/sub.i16.ll

define amdgpu_kernel void @v_test_sub_i16(i16 addrspace(1)* %out, i16 addrspace(1)* %in0, i16 addrspace(1)* %in1) #1 {
%add = sub i16 %a, %b		%add = sub i16 %a, %b
store i16 %add, i16 addrspace(1)* %out		store i16 %add, i16 addrspace(1)* %out
ret void		ret void
}		}

; FIXME: Need to handle non-uniform case for function below (load without gep).		; FIXME: Need to handle non-uniform case for function below (load without gep).
; GCN-LABEL: {{^}}v_test_sub_i16_constant:		; GCN-LABEL: {{^}}v_test_sub_i16_constant:
; VI: flat_load_ushort [[A:v[0-9]+]]		; VI: flat_load_ushort [[A:v[0-9]+]]
; VI: v_add_u16_e32 [[ADD:v[0-9]+]], 0xffffff85, [[A]]		; VI: v_add_u16_e32 [[ADD:v[0-9]+]], 0xff85, [[A]]
; VI-NEXT: buffer_store_short [[ADD]]		; VI-NEXT: buffer_store_short [[ADD]]
define amdgpu_kernel void @v_test_sub_i16_constant(i16 addrspace(1)* %out, i16 addrspace(1)* %in0) #1 {		define amdgpu_kernel void @v_test_sub_i16_constant(i16 addrspace(1)* %out, i16 addrspace(1)* %in0) #1 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()		%tid = call i32 @llvm.amdgcn.workitem.id.x()
%gep.out = getelementptr inbounds i16, i16 addrspace(1)* %out, i32 %tid		%gep.out = getelementptr inbounds i16, i16 addrspace(1)* %out, i32 %tid
%gep.in0 = getelementptr inbounds i16, i16 addrspace(1)* %in0, i32 %tid		%gep.in0 = getelementptr inbounds i16, i16 addrspace(1)* %in0, i32 %tid
%a = load volatile i16, i16 addrspace(1)* %gep.in0		%a = load volatile i16, i16 addrspace(1)* %gep.in0
%add = sub i16 %a, 123		%add = sub i16 %a, 123
store i16 %add, i16 addrspace(1)* %out		store i16 %add, i16 addrspace(1)* %out

llvm/test/CodeGen/AMDGPU/sub.v2i16.ll

	; VI-NEXT: v_mov_b32_e32 v1, s3			; VI-NEXT: v_mov_b32_e32 v1, s3
	; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v0			; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v0
	; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc			; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
	; VI-NEXT: flat_load_dword v0, v[0:1]			; VI-NEXT: flat_load_dword v0, v[0:1]
	; VI-NEXT: v_mov_b32_e32 v1, 0xfffffe38			; VI-NEXT: v_mov_b32_e32 v1, 0xfffffe38
	; VI-NEXT: s_mov_b32 s3, 0xf000			; VI-NEXT: s_mov_b32 s3, 0xf000
	; VI-NEXT: s_mov_b32 s2, -1			; VI-NEXT: s_mov_b32 s2, -1
	; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)			; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
	; VI-NEXT: v_add_u16_e32 v2, 0xffffff85, v0			; VI-NEXT: v_add_u16_e32 v2, 0xff85, v0
	; VI-NEXT: v_add_u16_sdwa v0, v0, v1 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD			; VI-NEXT: v_add_u16_sdwa v0, v0, v1 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
	; VI-NEXT: v_or_b32_e32 v0, v2, v0			; VI-NEXT: v_or_b32_e32 v0, v2, v0
	; VI-NEXT: buffer_store_dword v0, off, s[0:3], 0			; VI-NEXT: buffer_store_dword v0, off, s[0:3], 0
	; VI-NEXT: s_endpgm			; VI-NEXT: s_endpgm
	%tid = call i32 @llvm.amdgcn.workitem.id.x()			%tid = call i32 @llvm.amdgcn.workitem.id.x()
	%gep.out = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %out, i32 %tid			%gep.out = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %out, i32 %tid
	%gep.in0 = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in0, i32 %tid			%gep.in0 = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in0, i32 %tid
	%a = load volatile <2 x i16>, <2 x i16> addrspace(1)* %gep.in0			%a = load volatile <2 x i16>, <2 x i16> addrspace(1)* %gep.in0

llvm/test/MC/Disassembler/AMDGPU/literal16_vi.txt


	# VI: v_add_f16_e32 v1, 0x41, v3 ; encoding: [0xff,0x06,0x02,0x3e,0x41,0x00,0x00,0x00]
	0xff 0x06 0x02 0x3e 0x41 0x00 0x00 0x00

	# VI: v_add_f16_e32 v1, 0x100, v3 ; encoding: [0xff,0x06,0x02,0x3e,0x00,0x01,0x00,0x00]
	0xff 0x06 0x02 0x3e 0x00 0x01 0x00 0x00

	# non-zero unused bits in constant
-	# VI: v_add_f16_e32 v1, 0x10041, v3 ; encoding: [0xff,0x06,0x02,0x3e,0x41,0x00,0x01,0x00]
+	# VI: v_add_f16_e32 v1, 0x41, v3 ; encoding: [0xff,0x06,0x02,0x3e,0x41,0x00,0x01,0x00]
	0xff 0x06 0x02 0x3e 0x41 0x00 0x01 0x00

-	# VI: v_add_f16_e32 v1, 0x1000041, v3 ; encoding: [0xff,0x06,0x02,0x3e,0x41,0x00,0x00,0x01]
+	# VI: v_add_f16_e32 v1, 0x41, v3 ; encoding: [0xff,0x06,0x02,0x3e,0x41,0x00,0x00,0x01]
	0xff 0x06 0x02 0x3e 0x41 0x00 0x00 0x01

	# FIXME: This should be able to round trip with literal after instruction
	# VI: v_add_f16_e32 v1, 0, v3 ; encoding: [0x80,0x06,0x02,0x3e]
	0xff 0x06 0x02 0x3e 0x00 0x00 0x00 0x00

+	# VI: v_add_f16_e32 v1, 0xffcd, v3 ; encoding: [0xff,0x06,0x02,0x3e,0xcd,0xff,0xff,0xff]
+	0xff 0x06 0x02 0x3e 0xcd 0xff 0xff 0xff
+
+	# VI: v_mul_lo_u16_e32 v2, 0xffcd, v2 ; encoding: [0xff,0x04,0x04,0x52,0xcd,0xff,0xff,0xff]
+	0xff 0x04 0x04 0x52 0xcd 0xff 0xff 0xff
+
	# VI: v_madmk_f16 v1, v2, 0x41, v3 ; encoding: [0x02,0x07,0x02,0x48,0x41,0x00,0x00,0x00]
	0x02 0x07 0x02 0x48 0x41 0x00 0x00 0x00

	# VI: v_madmk_f16 v1, v2, 0x10041, v3 ; encoding: [0x02,0x07,0x02,0x48,0x41,0x00,0x01,0x00]
	0x02 0x07 0x02 0x48 0x41 0x00 0x01 0x00

	# VI: v_madmk_f16 v1, v2, 0x1000041, v3 ; encoding: [0x02,0x07,0x02,0x48,0x41,0x00,0x00,0x01]
	0x02 0x07 0x02 0x48 0x41 0x00 0x00 0x01
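
The changed v_add_f16_e32 and v_mul_lo_u16_e32 expectations in literal16_vi.txt come down to a small piece of arithmetic: the four literal bytes are read little-endian, and for a 16-bit operand only the low 16 bits are printed. The following is a minimal Python sketch of that arithmetic, assuming this reading of the test expectations. `decode_literal` is a hypothetical helper for illustration only; it is not the code in AMDGPUInstPrinter.cpp. The v_madmk_f16 expectations are unchanged in this diff, so the sketch does not model them.

```python
def decode_literal(literal_bytes, operand_bits=32):
    """Return the value printed for a 4-byte literal used by an operand
    of the given width (hypothetical helper, not LLVM code)."""
    if len(literal_bytes) != 4:
        raise ValueError("a literal occupies exactly four bytes")
    # The literal is stored least-significant byte first.
    value = int.from_bytes(bytes(literal_bytes), "little")
    # Keep only the bits the operand actually uses.
    return value & ((1 << operand_bits) - 1)


# Bytes taken from the test lines in this diff.
print(hex(decode_literal([0xcd, 0xff, 0xff, 0xff], operand_bits=16)))  # 0xffcd
print(hex(decode_literal([0x41, 0x00, 0x01, 0x00], operand_bits=16)))  # 0x41
print(hex(decode_literal([0x32, 0xb7, 0x36, 0x42], operand_bits=32)))  # 0x4236b732
```

With `operand_bits=16`, the non-zero upper bytes no longer appear in the printed literal, which matches the new `0x41` and `0xffcd` expectations.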

llvm/test/MC/Disassembler/AMDGPU/vop1.txt

	0xff 0x02 0x04 0x7e 0x15 0xcd 0x5b 0x07

	# CHECK: v_cvt_f32_u32_e32 v33, 0x4236b732 ; encoding: [0xff,0x0c,0x42,0x7e,0x32,0xb7,0x36,0x42]
	0xff 0x0c 0x42 0x7e 0x32 0xb7 0x36 0x42

	# CHECK: v_cvt_i32_f64_e32 v2, 0x4236b732 ; encoding: [0xff,0x06,0x04,0x7e,0x32,0xb7,0x36,0x42]
	0xff 0x06 0x04 0x7e 0x32 0xb7 0x36 0x42

-	# CHECK: v_cvt_f16_u16_e32 v123, 0x3ade68b1 ; encoding: [0xff,0x72,0xf6,0x7e,0xb1,0x68,0xde,0x3a]
+	# CHECK: v_cvt_f16_u16_e32 v123, 0x68b1 ; encoding: [0xff,0x72,0xf6,0x7e,0xb1,0x68,0xde,0x3a]
	0xff 0x72 0xf6 0x7e 0xb1 0x68 0xde 0x3a

	# CHECK: v_cvt_f16_i16_e32 v123, 0x21c2 ; encoding: [0xff,0x74,0xf6,0x7e,0xc2,0x21,0x00,0x00]
	0xff 0x74 0xf6 0x7e 0xc2 0x21 0x00 0x00

	# CHECK: v_cvt_u16_f16_e32 v123, 0x3f20 ; encoding: [0xff,0x76,0xf6,0x7e,0x20,0x3f,0x00,0x00]
	0xff 0x76 0xf6 0x7e 0x20 0x3f 0x00 0x00

llvm/test/MC/Disassembler/AMDGPU/vop3-literal.txt

	# GFX10: v_pk_add_u16 v1, -1, v2 ; encoding: [0x01,0x00,0x0a,0xcc,0xc1,0x04,0x02,0x18]
	0x01,0x00,0x0a,0xcc,0xc1,0x04,0x02,0x18

	# GFX10: v_pk_add_u16 v1, -5, v2 ; encoding: [0x01,0x00,0x0a,0xcc,0xc5,0x04,0x02,0x18]
	0x01,0x00,0x0a,0xcc,0xc5,0x04,0x02,0x18

	# GFX10: v_pk_add_u16 v1, 0xffffff9c, v2 ; encoding: [0x01,0x00,0x0a,0xcc,0xff,0x04,0x02,0x18,0x9c,0xff,0xff,0xff]
	0x01,0x00,0x0a,0xcc,0xff,0x04,0x02,0x18,0x9c,0xff,0xff,0xff

+	# GFX10: v_add_nc_i16 v5, v1, 0xcdab ; encoding: [0x05,0x00,0x0d,0xd7,0x01,0xff,0x01,0x00,0xab,0xcd,0xff,0xff]
+	0x05,0x00,0x0d,0xd7,0x01,0xff,0x01,0x00,0xab,0xcd,0xff,0xff
+
+	# GFX10: v_ceil_f16_e64 v255, 0xabcd clamp ; encoding: [0xff,0x80,0xdc,0xd5,0xff,0x00,0x00,0x00,0xcd,0xab,0xff,0xff]
+	0xff,0x80,0xdc,0xd5,0xff,0x00,0x00,0x00,0xcd,0xab,0xff,0xff
+
+	# GFX10: v_min_u16_e64 v5, v1, 0xabcd ; encoding: [0x05,0x00,0x0b,0xd7,0x01,0xff,0x01,0x00,0xcd,0xab,0xff,0xff]
+	0x05,0x00,0x0b,0xd7,0x01,0xff,0x01,0x00,0xcd,0xab,0xff,0xff
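
The review summary says that a printed literal such as 0xffffabcd cannot be reassembled for a 16-bit operand because truncating it is not safe. The sketch below illustrates why printing 0xabcd avoids the problem. It assumes that a literal is acceptable for a 16-bit operand only if it is representable as either an unsigned or a signed 16-bit integer. That rule is an assumption paraphrasing the summary, and `fits_in_bits` is a hypothetical name, not an LLVM function.

```python
def fits_in_bits(value, bits=16):
    """True if value is representable as an unsigned or a signed integer
    of the given width (assumed acceptance rule, for illustration)."""
    as_unsigned = 0 <= value < (1 << bits)
    as_signed = -(1 << (bits - 1)) <= value < (1 << (bits - 1))
    return as_unsigned or as_signed


# Trailing literal bytes from the v_min_u16_e64 test line, least-significant byte first.
raw = int.from_bytes(bytes([0xcd, 0xab, 0xff, 0xff]), "little")
low16 = raw & 0xFFFF

print(hex(raw), fits_in_bits(raw))      # 0xffffabcd False
print(hex(low16), fits_in_bits(low16))  # 0xabcd True
```

Under this assumption the full 32-bit value is rejected on reassembly, while its low 16 bits are accepted, which is consistent with the new `0xabcd` and `0xcdab` expectations.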

Revision Contents

Diff 279818

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp

llvm/test/CodeGen/AMDGPU/add.i16.ll

llvm/test/CodeGen/AMDGPU/add.v2i16.ll

llvm/test/CodeGen/AMDGPU/imm16.ll

llvm/test/CodeGen/AMDGPU/sub.i16.ll

llvm/test/CodeGen/AMDGPU/sub.v2i16.ll

llvm/test/MC/Disassembler/AMDGPU/literal16_vi.txt

llvm/test/MC/Disassembler/AMDGPU/vop1.txt

llvm/test/MC/Disassembler/AMDGPU/vop3-literal.txt
