This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] refactor ds instruction definitions (proposal)
ClosedPublic

Authored by vpykhtin on Jul 19 2016, 10:51 AM.


Details

Reviewers

artem.tamazov
nhaustov
• tstellarAMD
arsenm
SamWot

Commits

rG902db3101b2b: [AMDGPU] refactor DS instruction definitions. NFC.
rL277344: [AMDGPU] refactor DS instruction definitions. NFC.

Summary

Hi,

this is my first attempt to improve our td instruction definitions.

All DS related definitions are moved to the new DSInstructions.td. This is done to reduce the number of definitions we currently have in a single td file.

Multiclasses that define Pseudo, SI and VI instructions are removed. Instead, there is a DS_Pseudo instruction that is supposed to handle all CodeGen-related things and carry some hint flags for the MC layer. Its counterpart, DS_Real, copies the relevant data from its origin DS_Pseudo. So a typical definition consists of two stages:

// CodeGen part - DSInstructions.td

DS_INSTRUCTION : DS_PSEUDO<"DS_MNEMONIC", outs, ins, asmString>;

// Assembler, disassembler, encoding part

// SIInstructions.td

DS_INSTRUCTION_si : DS_REAL < 0x123 /* SI opcode */, DS_INSTRUCTION /* origin pseudo op*/ >;

// VIInstructions.td

DS_INSTRUCTION_vi : DS_REAL < 0x456 /* VI opcode */, DS_INSTRUCTION /* origin pseudo op*/ >;

Having these things split allows us to:

Simplify CodeGen-related definitions and pseudo ops.
Avoid dummy "real" instructions. Previously, defining a new VI instruction also created a dummy SI instruction with opcode 0.
Remove workaround flags such as DisableSIDecoder/DisableVIDecoder.
Diff subtarget td files easily, because all real instructions are grouped the same way.
Change real instruction naming to prefix it with the subtarget tag (example: DS_INST_SI to SI_DS_INST) so that all subtarget opcodes are grouped together. This would allow a direct translation table PseudoOp <-> RealOp. Currently there is a log N translation map.
I'm also thinking about the possibility of splitting the subtarget generated tables so that there are no mixed subtarget instructions. This would allow us to avoid subtarget predicate checks.
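
The two-stage scheme and the "no dummy real instruction" point can be modeled in a few lines of standalone TableGen. This is only a minimal sketch under stated assumptions, not the patch's code: the class names Pseudo and Real, the record names, the 16-bit Inst layout, and all opcode values are hypothetical and chosen for illustration. Only the mechanisms (passing the pseudo record as a class argument, copying its fields, and guarding an encoding field with !if on a hint flag) mirror what DS_Pseudo and DS_Real do in the diff.

```tablegen
// Hypothetical, simplified model of the pseudo/real split.
// Class names, record names, and opcode values are illustrative only.

// CodeGen-side record: mnemonic, operand string, and a hint flag for the MC layer.
class Pseudo<string opName, string asmOps> {
  string Mnemonic = opName;
  string AsmOperands = asmOps;
  bits<1> has_data1 = 1;
}

// MC-side record: copies data from its origin pseudo and adds the encoding.
class Real<bits<8> op, Pseudo ps> {
  string AsmString = ps.Mnemonic # " " # ps.AsmOperands;

  bits<8> data1;
  bits<16> Inst;
  // Encode data1 only if the pseudo has that operand; otherwise emit zeros.
  let Inst{7-0} = !if(ps.has_data1, data1, 0);
  let Inst{15-8} = op;
}

def DS_TWO_DATA : Pseudo<"ds_two_data", "$addr, $data0, $data1">;
def DS_ONE_DATA : Pseudo<"ds_one_data", "$addr, $data0"> {
  let has_data1 = 0;
}

// Available on both subtargets, each with its own opcode.
def DS_TWO_DATA_si : Real<0x01, DS_TWO_DATA>;
def DS_TWO_DATA_vi : Real<0x41, DS_TWO_DATA>;

// VI-only: no placeholder SI record with opcode 0 is needed.
def DS_ONE_DATA_vi : Real<0x42, DS_ONE_DATA>;
```

Assuming an llvm-tblgen binary is available, running it on a file with these contents should print the expanded records, which makes it easy to check which fields each real record picked up from its pseudo.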

Currently I broke the following tests but I'll fix them tomorrow:

LLVM :: CodeGen/AMDGPU/amdgpu.private-memory.ll
LLVM :: CodeGen/AMDGPU/atomic_load_add.ll
LLVM :: CodeGen/AMDGPU/atomic_load_sub.ll
LLVM :: CodeGen/AMDGPU/extload.ll
LLVM :: CodeGen/AMDGPU/lds-oqap-crash.ll
LLVM :: CodeGen/AMDGPU/lds-output-queue.ll
LLVM :: CodeGen/AMDGPU/load-local-f32.ll
LLVM :: CodeGen/AMDGPU/load-local-f64.ll
LLVM :: CodeGen/AMDGPU/load-local-i1.ll
LLVM :: CodeGen/AMDGPU/load-local-i16.ll
LLVM :: CodeGen/AMDGPU/load-local-i32.ll
LLVM :: CodeGen/AMDGPU/load-local-i64.ll
LLVM :: CodeGen/AMDGPU/load-local-i8.ll
LLVM :: CodeGen/AMDGPU/local-atomics.ll
LLVM :: CodeGen/AMDGPU/local-memory.r600.ll
LLVM :: CodeGen/AMDGPU/private-memory-r600.ll
LLVM :: CodeGen/AMDGPU/store.ll

Diff Detail

Event Timeline

vpykhtin updated this revision to Diff 64517. Jul 19 2016, 10:51 AM

vpykhtin retitled this revision to [AMDGPU] refactor ds instruction definitions (proposal).

vpykhtin updated this object.

vpykhtin added reviewers: arsenm, • tstellarAMD, nhaustov, SamWot, artem.tamazov.

vpykhtin set the repository for this revision to rL LLVM.

vpykhtin added a project: Restricted Project.

Herald added subscribers: kzhuravl, arsenm. Jul 19 2016, 10:51 AM

I think in general this is a nice improvement. The main drawback is the duplicate definitions of the real instructions for SI/VI, but it seems like this is necessary in order to support the assembler/disassembler without more hacks. And I think overall this makes the .td files less complicated.

lib/Target/AMDGPU/DSInstructions.td
735–741	There are a few places like this with lots of extra whitespace that could be cleaned up.

In D22522#489593, @tstellarAMD wrote:

I think in general this is a nice improvement. The main drawback is the duplicate definitions of the real instructions for SI/VI, but it seems like this is necessary in order to support the assembler/disassembler without more hacks. And I think overall this makes the .td files less complicated.

I agree, I don't like the duplication either. On the other hand, the real definition part is really straightforward; I hope there will be no need to modify it often.

I generally like these changes, but there are some drawbacks:

Pseudo instructions ideally should not contain AsmStrings, AsmMatcherConverters, and other fields that are used only in the MC layer. But I can't imagine how to move those fields to the real instruction without breaking the whole idea of this change and without huge code duplication.
There would be a lot of problems later with other types of instructions like VOP (including sdwa and dpp) or MUBUF atomics.

In D22522#489650, @SamWot wrote:

I generally like these changes, but there are some drawbacks:

Pseudo instructions ideally should not contain AsmStrings, AsmMatcherConverters, and other fields that are used only in the MC layer. But I can't imagine how to move those fields to the real instruction without breaking the whole idea of this change and without huge code duplication.

Right, generally it's a good idea to split CodeGen and MC layer data, leaving the Pseudo instruction for CodeGen and the Real instruction for MC layer info; however, it would probably create another level of indirection to remove the asm duplication.

One of my thoughts was to do asm parsing on Pseudo instructions and translate them into Real ones upon encoding. This way, having the asm string in the Pseudo would significantly reduce the asm parsing tables.

There would be a lot of problems later with other types of instructions like VOP (including sdwa and dpp) or MUBUF atomics.

I'm actually interested in collecting those problems beforehand.

fixed test failures
removed extra spaces

Guys,

I would like to submit this, if there are no objections, to avoid potential merge conflicts. Lit tests are now passing; should we perform any additional testing on this?

SamWot accepted this revision. Jul 28 2016, 8:33 AM

SamWot edited edge metadata.

This revision is now accepted and ready to land. Jul 28 2016, 8:33 AM

In D22522#499272, @vpykhtin wrote:

Guys,

I would like to submit this, if there are no objections, to avoid potential merge conflicts. Lit tests are now passing; should we perform any additional testing on this?

Let me test it out on our graphics stack first.

Tests pass. LGTM.

Closed by commit rL277344: [AMDGPU] refactor DS instruction definitions. NFC. (authored by vpykhtin). Aug 1 2016, 7:29 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
lib/Target/AMDGPU/ (7 files)	17, 894, 37, 262, 261, 20, and 11 lines
test/MC/AMDGPU/ds.s	19 lines
test/MC/Disassembler/AMDGPU/ds_vi.txt	9 lines

Diff 65774

lib/Target/AMDGPU/CIInstructions.td

	//===-- CIInstructions.td - CI Instruction Defintions ---------------------===//
	//
	// The LLVM Compiler Infrastructure
	//
	// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.
	//
	//===----------------------------------------------------------------------===//
	// Instruction definitions for CI and newer.
	//===----------------------------------------------------------------------===//
	// Remaining instructions:
	// S_CBRANCH_CDBGUSER
	// S_CBRANCH_CDBGSYS
	// S_CBRANCH_CDBGSYS_OR_USER
	// S_CBRANCH_CDBGSYS_AND_USER
-	// DS_NOP
-	// DS_GWS_SEMA_RELEASE_ALL
-	// DS_WRAP_RTN_B32
-	// DS_CNDXCHG32_RTN_B64
-	// DS_WRITE_B96
-	// DS_WRITE_B128
-	// DS_CONDXCHG32_RTN_B128
-	// DS_READ_B96
-	// DS_READ_B128
	// BUFFER_LOAD_DWORDX3
	// BUFFER_STORE_DWORDX3

	//===----------------------------------------------------------------------===//
	// VOP1 Instructions
	//===----------------------------------------------------------------------===//

	let SubtargetPredicate = isCIVI in {
	[44 lines not shown]
	// XXX - Does this set VCC?
	defm V_MAD_I64_I32 : VOP3Inst <vop3<0x177>, "v_mad_i64_i32",
	VOP_I64_I32_I32_I64
	>;
	} // End isCommutable = 1


	//===----------------------------------------------------------------------===//
-	// DS Instructions
-	//===----------------------------------------------------------------------===//
-	defm DS_WRAP_RTN_F32 : DS_1A1D_RET <0x34, "ds_wrap_rtn_f32", VGPR_32, "ds_wrap_f32">;

-	// DS_CONDXCHG32_RTN_B64
-	// DS_CONDXCHG32_RTN_B128

-	//===----------------------------------------------------------------------===//
	// SMRD Instructions
	//===----------------------------------------------------------------------===//

	defm S_DCACHE_INV_VOL : SMRD_Inval <smrd<0x1d, 0x22>,
	"s_dcache_inv_vol", int_amdgcn_s_dcache_inv_vol>;

	//===----------------------------------------------------------------------===//
	// MUBUF Instructions
	[249 lines not shown]

lib/Target/AMDGPU/DSInstructions.td

This file was added.

				//===-- DSInstructions.td - DS Instruction Defintions ---------------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				class DS_Pseudo <string opName, dag outs, dag ins, string asmOps, list<dag> pattern=[]> :
				InstSI <outs, ins, "", pattern>,
				SIMCInstr <opName, SIEncodingFamily.NONE> {

				let SubtargetPredicate = isGCN;

				let LGKM_CNT = 1;
				let DS = 1;
				let UseNamedOperandTable = 1;
				let Uses = [M0, EXEC];

				// Most instruction load and store data, so set this as the default.
				let mayLoad = 1;
				let mayStore = 1;

				let hasSideEffects = 0;
				let SchedRW = [WriteLDS];

				let isPseudo = 1;
				let isCodeGenOnly = 1;

				let AsmMatchConverter = "cvtDS";

				string Mnemonic = opName;
				string AsmOperands = asmOps;

				// Well these bits a kind of hack because it would be more natural
				// to test "outs" and "ins" dags for the presence of particular operands
				bits<1> has_vdst = 1;
				bits<1> has_addr = 1;
				bits<1> has_data0 = 1;
				bits<1> has_data1 = 1;

				bits<1> has_offset = 1; // has "offset" that should be split to offset0,1
				bits<1> has_offset0 = 1;
				bits<1> has_offset1 = 1;

				bits<1> has_gds = 1;
				bits<1> gdsValue = 0; // if has_gds == 0 set gds to this value
				}

				class DS_Real <DS_Pseudo ds> :
				InstSI <ds.OutOperandList, ds.InOperandList, ds.Mnemonic # " " # ds.AsmOperands, []>,
				Enc64 {

				let isPseudo = 0;
				let isCodeGenOnly = 0;

				// copy relevant pseudo op flags
				let SubtargetPredicate = ds.SubtargetPredicate;
				let AsmMatchConverter = ds.AsmMatchConverter;

				// encoding fields
				bits<8> vdst;
				bits<1> gds;
				bits<8> addr;
				bits<8> data0;
				bits<8> data1;
				bits<8> offset0;
				bits<8> offset1;

				bits<16> offset;
				let offset0 = !if(ds.has_offset, offset{7-0}, ?);
				let offset1 = !if(ds.has_offset, offset{15-8}, ?);
				}


				// DS Pseudo instructions

				class DS_1A1D_NORET<string opName, RegisterClass rc = VGPR_32>
				: DS_Pseudo<opName,
				(outs),
				(ins VGPR_32:$addr, rc:$data0, offset:$offset, gds:$gds),
				"$addr, $data0$offset$gds">,
				AtomicNoRet<opName, 0> {

				let has_data1 = 0;
				let has_vdst = 0;
				}

				class DS_1A_Off8_NORET<string opName> : DS_Pseudo<opName,
				(outs),
				(ins VGPR_32:$addr, offset0:$offset0, offset1:$offset1, gds:$gds),
				"$addr $offset0$offset1$gds"> {

				let has_data0 = 0;
				let has_data1 = 0;
				let has_vdst = 0;
				let has_offset = 0;
				let AsmMatchConverter = "cvtDSOffset01";
				}

				class DS_1A2D_NORET<string opName, RegisterClass rc = VGPR_32>
				: DS_Pseudo<opName,
				(outs),
				(ins VGPR_32:$addr, rc:$data0, rc:$data1, offset:$offset, gds:$gds),
				"$addr, $data0, $data1"#"$offset"#"$gds">,
				AtomicNoRet<opName, 0> {

				let has_vdst = 0;
				}

				class DS_1A2D_Off8_NORET <string opName, RegisterClass rc = VGPR_32>
				: DS_Pseudo<opName,
				(outs),
				(ins VGPR_32:$addr, rc:$data0, rc:$data1,
				offset0:$offset0, offset1:$offset1, gds:$gds),
				"$addr, $data0, $data1$offset0$offset1$gds"> {

				let has_vdst = 0;
				let has_offset = 0;
				let AsmMatchConverter = "cvtDSOffset01";
				}

				class DS_1A1D_RET <string opName, RegisterClass rc = VGPR_32>
				: DS_Pseudo<opName,
				(outs rc:$vdst),
				(ins VGPR_32:$addr, rc:$data0, offset:$offset, gds:$gds),
				"$vdst, $addr, $data0$offset$gds"> {

				let hasPostISelHook = 1;
				let has_data1 = 0;
				}

				class DS_1A2D_RET<string opName,
				RegisterClass rc = VGPR_32,
				RegisterClass src = rc>
				: DS_Pseudo<opName,
				(outs rc:$vdst),
				(ins VGPR_32:$addr, src:$data0, src:$data1, offset:$offset, gds:$gds),
				"$vdst, $addr, $data0, $data1$offset$gds"> {

				let hasPostISelHook = 1;
				}

				class DS_1A_RET<string opName, RegisterClass rc = VGPR_32>
				: DS_Pseudo<opName,
				(outs rc:$vdst),
				(ins VGPR_32:$addr, offset:$offset, gds:$gds),
				"$vdst, $addr$offset$gds"> {

				let has_data0 = 0;
				let has_data1 = 0;
				}

				class DS_1A_Off8_RET <string opName, RegisterClass rc = VGPR_32>
				: DS_Pseudo<opName,
				(outs rc:$vdst),
				(ins VGPR_32:$addr, offset0:$offset0, offset1:$offset1, gds:$gds),
				"$vdst, $addr$offset0$offset1$gds"> {

				let has_offset = 0;
				let has_data0 = 0;
				let has_data1 = 0;
				let AsmMatchConverter = "cvtDSOffset01";
				}

				class DS_1A_RET_GDS <string opName> : DS_Pseudo<opName,
				(outs VGPR_32:$vdst),
				(ins VGPR_32:$addr, offset:$offset),
				"$vdst, $addr$offset gds"> {

				let has_data0 = 0;
				let has_data1 = 0;
				let has_gds = 0;
				let gdsValue = 1;
				}

				class DS_0A_RET <string opName> : DS_Pseudo<opName,
				(outs VGPR_32:$vdst),
				(ins offset:$offset, gds:$gds),
				"$vdst$offset$gds"> {

				let mayLoad = 1;
				let mayStore = 1;

				let has_addr = 0;
				let has_data0 = 0;
				let has_data1 = 0;
				}

				class DS_1A <string opName> : DS_Pseudo<opName,
				(outs),
				(ins VGPR_32:$addr, offset:$offset, gds:$gds),
				"$addr$offset$gds"> {

				let mayLoad = 1;
				let mayStore = 1;

				let has_vdst = 0;
				let has_data0 = 0;
				let has_data1 = 0;
				}

				class DS_1A_GDS <string opName> : DS_Pseudo<opName,
				(outs),
				(ins VGPR_32:$addr),
				"$addr gds"> {

				let has_vdst = 0;
				let has_data0 = 0;
				let has_data1 = 0;
				let has_offset = 0;
				let has_offset0 = 0;
				let has_offset1 = 0;

				let has_gds = 0;
				let gdsValue = 1;
				}

				class DS_1A1D_PERMUTE <string opName, SDPatternOperator node = null_frag>
				: DS_Pseudo<opName,
				(outs VGPR_32:$vdst),
				(ins VGPR_32:$addr, VGPR_32:$data0, offset:$offset),
				"$vdst, $addr, $data0$offset",
				[(set i32:$vdst,
				(node (DS1Addr1Offset i32:$addr, i16:$offset), i32:$data0))] > {

				let mayLoad = 0;
				let mayStore = 0;
				let isConvergent = 1;

				let has_data1 = 0;
				let has_gds = 0;
				}

				def DS_ADD_U32 : DS_1A1D_NORET<"ds_add_u32">;
				def DS_SUB_U32 : DS_1A1D_NORET<"ds_sub_u32">;
				def DS_RSUB_U32 : DS_1A1D_NORET<"ds_rsub_u32">;
				def DS_INC_U32 : DS_1A1D_NORET<"ds_inc_u32">;
				def DS_DEC_U32 : DS_1A1D_NORET<"ds_dec_u32">;
				def DS_MIN_I32 : DS_1A1D_NORET<"ds_min_i32">;
				def DS_MAX_I32 : DS_1A1D_NORET<"ds_max_i32">;
				def DS_MIN_U32 : DS_1A1D_NORET<"ds_min_u32">;
				def DS_MAX_U32 : DS_1A1D_NORET<"ds_max_u32">;
				def DS_AND_B32 : DS_1A1D_NORET<"ds_and_b32">;
				def DS_OR_B32 : DS_1A1D_NORET<"ds_or_b32">;
				def DS_XOR_B32 : DS_1A1D_NORET<"ds_xor_b32">;

				let mayLoad = 0 in {
				def DS_WRITE_B8 : DS_1A1D_NORET<"ds_write_b8">;
				def DS_WRITE_B16 : DS_1A1D_NORET<"ds_write_b16">;
				def DS_WRITE_B32 : DS_1A1D_NORET<"ds_write_b32">;
				def DS_WRITE2_B32 : DS_1A2D_Off8_NORET<"ds_write2_b32">;
				def DS_WRITE2ST64_B32 : DS_1A2D_Off8_NORET<"ds_write2st64_b32">;
				}

				def DS_MSKOR_B32 : DS_1A2D_NORET<"ds_mskor_b32">;
				def DS_CMPST_B32 : DS_1A2D_NORET<"ds_cmpst_b32">;
				def DS_CMPST_F32 : DS_1A2D_NORET<"ds_cmpst_f32">;
				def DS_MIN_F32 : DS_1A2D_NORET<"ds_min_f32">;
				def DS_MAX_F32 : DS_1A2D_NORET<"ds_max_f32">;

				def DS_ADD_U64 : DS_1A1D_NORET<"ds_add_u64", VReg_64>;
				def DS_SUB_U64 : DS_1A1D_NORET<"ds_sub_u64", VReg_64>;
				def DS_RSUB_U64 : DS_1A1D_NORET<"ds_rsub_u64", VReg_64>;
				def DS_INC_U64 : DS_1A1D_NORET<"ds_inc_u64", VReg_64>;
				def DS_DEC_U64 : DS_1A1D_NORET<"ds_dec_u64", VReg_64>;
				def DS_MIN_I64 : DS_1A1D_NORET<"ds_min_i64", VReg_64>;
				def DS_MAX_I64 : DS_1A1D_NORET<"ds_max_i64", VReg_64>;
				def DS_MIN_U64 : DS_1A1D_NORET<"ds_min_u64", VReg_64>;
				def DS_MAX_U64 : DS_1A1D_NORET<"ds_max_u64", VReg_64>;
				def DS_AND_B64 : DS_1A1D_NORET<"ds_and_b64", VReg_64>;
				def DS_OR_B64 : DS_1A1D_NORET<"ds_or_b64", VReg_64>;
				def DS_XOR_B64 : DS_1A1D_NORET<"ds_xor_b64", VReg_64>;
				def DS_MSKOR_B64 : DS_1A2D_NORET<"ds_mskor_b64", VReg_64>;
				let mayLoad = 0 in {
				def DS_WRITE_B64 : DS_1A1D_NORET<"ds_write_b64", VReg_64>;
				def DS_WRITE2_B64 : DS_1A2D_Off8_NORET<"ds_write2_b64", VReg_64>;
				def DS_WRITE2ST64_B64 : DS_1A2D_Off8_NORET<"ds_write2st64_b64", VReg_64>;
				}
				def DS_CMPST_B64 : DS_1A2D_NORET<"ds_cmpst_b64", VReg_64>;
				def DS_CMPST_F64 : DS_1A2D_NORET<"ds_cmpst_f64", VReg_64>;
				def DS_MIN_F64 : DS_1A1D_NORET<"ds_min_f64", VReg_64>;
				def DS_MAX_F64 : DS_1A1D_NORET<"ds_max_f64", VReg_64>;

				def DS_ADD_RTN_U32 : DS_1A1D_RET<"ds_add_rtn_u32">,
				AtomicNoRet<"ds_add_u32", 1>;
				def DS_SUB_RTN_U32 : DS_1A1D_RET<"ds_sub_rtn_u32">,
				AtomicNoRet<"ds_sub_u32", 1>;
				def DS_RSUB_RTN_U32 : DS_1A1D_RET<"ds_rsub_rtn_u32">,
				AtomicNoRet<"ds_rsub_u32", 1>;
				def DS_INC_RTN_U32 : DS_1A1D_RET<"ds_inc_rtn_u32">,
				AtomicNoRet<"ds_inc_u32", 1>;
				def DS_DEC_RTN_U32 : DS_1A1D_RET<"ds_dec_rtn_u32">,
				AtomicNoRet<"ds_dec_u32", 1>;
				def DS_MIN_RTN_I32 : DS_1A1D_RET<"ds_min_rtn_i32">,
				AtomicNoRet<"ds_min_i32", 1>;
				def DS_MAX_RTN_I32 : DS_1A1D_RET<"ds_max_rtn_i32">,
				AtomicNoRet<"ds_max_i32", 1>;
				def DS_MIN_RTN_U32 : DS_1A1D_RET<"ds_min_rtn_u32">,
				AtomicNoRet<"ds_min_u32", 1>;
				def DS_MAX_RTN_U32 : DS_1A1D_RET<"ds_max_rtn_u32">,
				AtomicNoRet<"ds_max_u32", 1>;
				def DS_AND_RTN_B32 : DS_1A1D_RET<"ds_and_rtn_b32">,
				AtomicNoRet<"ds_and_b32", 1>;
				def DS_OR_RTN_B32 : DS_1A1D_RET<"ds_or_rtn_b32">,
				AtomicNoRet<"ds_or_b32", 1>;
				def DS_XOR_RTN_B32 : DS_1A1D_RET<"ds_xor_rtn_b32">,
				AtomicNoRet<"ds_xor_b32", 1>;
				def DS_MSKOR_RTN_B32 : DS_1A2D_RET<"ds_mskor_rtn_b32">,
				AtomicNoRet<"ds_mskor_b32", 1>;
				def DS_CMPST_RTN_B32 : DS_1A2D_RET <"ds_cmpst_rtn_b32">,
				AtomicNoRet<"ds_cmpst_b32", 1>;
				def DS_CMPST_RTN_F32 : DS_1A2D_RET <"ds_cmpst_rtn_f32">,
				AtomicNoRet<"ds_cmpst_f32", 1>;
				def DS_MIN_RTN_F32 : DS_1A2D_RET <"ds_min_rtn_f32">,
				AtomicNoRet<"ds_min_f32", 1>;
				def DS_MAX_RTN_F32 : DS_1A2D_RET <"ds_max_rtn_f32">,
				AtomicNoRet<"ds_max_f32", 1>;

				def DS_WRXCHG_RTN_B32 : DS_1A1D_RET<"ds_wrxchg_rtn_b32">,
				AtomicNoRet<"", 1>;
				def DS_WRXCHG2_RTN_B32 : DS_1A2D_RET<"ds_wrxchg2_rtn_b32", VReg_64, VGPR_32>,
				AtomicNoRet<"", 1>;
				def DS_WRXCHG2ST64_RTN_B32 : DS_1A2D_RET<"ds_wrxchg2st64_rtn_b32", VReg_64, VGPR_32>,
				AtomicNoRet<"", 1>;

				def DS_ADD_RTN_U64 : DS_1A1D_RET<"ds_add_rtn_u64", VReg_64>,
				AtomicNoRet<"ds_add_u64", 1>;
				def DS_SUB_RTN_U64 : DS_1A1D_RET<"ds_sub_rtn_u64", VReg_64>,
				AtomicNoRet<"ds_sub_u64", 1>;
				def DS_RSUB_RTN_U64 : DS_1A1D_RET<"ds_rsub_rtn_u64", VReg_64>,
				AtomicNoRet<"ds_rsub_u64", 1>;
				def DS_INC_RTN_U64 : DS_1A1D_RET<"ds_inc_rtn_u64", VReg_64>,
				AtomicNoRet<"ds_inc_u64", 1>;
				def DS_DEC_RTN_U64 : DS_1A1D_RET<"ds_dec_rtn_u64", VReg_64>,
				AtomicNoRet<"ds_dec_u64", 1>;
				def DS_MIN_RTN_I64 : DS_1A1D_RET<"ds_min_rtn_i64", VReg_64>,
				AtomicNoRet<"ds_min_i64", 1>;
				def DS_MAX_RTN_I64 : DS_1A1D_RET<"ds_max_rtn_i64", VReg_64>,
				AtomicNoRet<"ds_max_i64", 1>;
				def DS_MIN_RTN_U64 : DS_1A1D_RET<"ds_min_rtn_u64", VReg_64>,
				AtomicNoRet<"ds_min_u64", 1>;
				def DS_MAX_RTN_U64 : DS_1A1D_RET<"ds_max_rtn_u64", VReg_64>,
				AtomicNoRet<"ds_max_u64", 1>;
				def DS_AND_RTN_B64 : DS_1A1D_RET<"ds_and_rtn_b64", VReg_64>,
				AtomicNoRet<"ds_and_b64", 1>;
				def DS_OR_RTN_B64 : DS_1A1D_RET<"ds_or_rtn_b64", VReg_64>,
				AtomicNoRet<"ds_or_b64", 1>;
				def DS_XOR_RTN_B64 : DS_1A1D_RET<"ds_xor_rtn_b64", VReg_64>,
				AtomicNoRet<"ds_xor_b64", 1>;
				def DS_MSKOR_RTN_B64 : DS_1A2D_RET<"ds_mskor_rtn_b64", VReg_64>,
				AtomicNoRet<"ds_mskor_b64", 1>;
				def DS_CMPST_RTN_B64 : DS_1A2D_RET<"ds_cmpst_rtn_b64", VReg_64>,
				AtomicNoRet<"ds_cmpst_b64", 1>;
				def DS_CMPST_RTN_F64 : DS_1A2D_RET<"ds_cmpst_rtn_f64", VReg_64>,
				AtomicNoRet<"ds_cmpst_f64", 1>;
				def DS_MIN_RTN_F64 : DS_1A1D_RET<"ds_min_rtn_f64", VReg_64>,
				AtomicNoRet<"ds_min_f64", 1>;
				def DS_MAX_RTN_F64 : DS_1A1D_RET<"ds_max_rtn_f64", VReg_64>,
				AtomicNoRet<"ds_max_f64", 1>;

				def DS_WRXCHG_RTN_B64 : DS_1A1D_RET<"ds_wrxchg_rtn_b64", VReg_64>,
				AtomicNoRet<"ds_wrxchg_b64", 1>;
				def DS_WRXCHG2_RTN_B64 : DS_1A2D_RET<"ds_wrxchg2_rtn_b64", VReg_128, VReg_64>,
				AtomicNoRet<"ds_wrxchg2_b64", 1>;
				def DS_WRXCHG2ST64_RTN_B64 : DS_1A2D_RET<"ds_wrxchg2st64_rtn_b64", VReg_128, VReg_64>,
				AtomicNoRet<"ds_wrxchg2st64_b64", 1>;

				def DS_GWS_INIT : DS_1A_GDS<"ds_gws_init">;
				def DS_GWS_SEMA_V : DS_1A_GDS<"ds_gws_sema_v">;
				def DS_GWS_SEMA_BR : DS_1A_GDS<"ds_gws_sema_br">;
				def DS_GWS_SEMA_P : DS_1A_GDS<"ds_gws_sema_p">;
				def DS_GWS_BARRIER : DS_1A_GDS<"ds_gws_barrier">;

				def DS_ADD_SRC2_U32 : DS_1A<"ds_add_src2_u32">;
				def DS_SUB_SRC2_U32 : DS_1A<"ds_sub_src2_u32">;
				def DS_RSUB_SRC2_U32 : DS_1A<"ds_rsub_src2_u32">;
				def DS_INC_SRC2_U32 : DS_1A<"ds_inc_src2_u32">;
				def DS_DEC_SRC2_U32 : DS_1A<"ds_dec_src2_u32">;
				def DS_MIN_SRC2_I32 : DS_1A<"ds_min_src2_i32">;
				def DS_MAX_SRC2_I32 : DS_1A<"ds_max_src2_i32">;
				def DS_MIN_SRC2_U32 : DS_1A<"ds_min_src2_u32">;
				def DS_MAX_SRC2_U32 : DS_1A<"ds_max_src2_u32">;
				def DS_AND_SRC2_B32 : DS_1A<"ds_and_src_b32">;
				def DS_OR_SRC2_B32 : DS_1A<"ds_or_src2_b32">;
				def DS_XOR_SRC2_B32 : DS_1A<"ds_xor_src2_b32">;
				def DS_MIN_SRC2_F32 : DS_1A<"ds_min_src2_f32">;
				def DS_MAX_SRC2_F32 : DS_1A<"ds_max_src2_f32">;

				def DS_ADD_SRC2_U64 : DS_1A<"ds_add_src2_u64">;
				def DS_SUB_SRC2_U64 : DS_1A<"ds_sub_src2_u64">;
				def DS_RSUB_SRC2_U64 : DS_1A<"ds_rsub_src2_u64">;
				def DS_INC_SRC2_U64 : DS_1A<"ds_inc_src2_u64">;
				def DS_DEC_SRC2_U64 : DS_1A<"ds_dec_src2_u64">;
				def DS_MIN_SRC2_I64 : DS_1A<"ds_min_src2_i64">;
				def DS_MAX_SRC2_I64 : DS_1A<"ds_max_src2_i64">;
				def DS_MIN_SRC2_U64 : DS_1A<"ds_min_src2_u64">;
				def DS_MAX_SRC2_U64 : DS_1A<"ds_max_src2_u64">;
				def DS_AND_SRC2_B64 : DS_1A<"ds_and_src2_b64">;
				def DS_OR_SRC2_B64 : DS_1A<"ds_or_src2_b64">;
				def DS_XOR_SRC2_B64 : DS_1A<"ds_xor_src2_b64">;
				def DS_MIN_SRC2_F64 : DS_1A<"ds_min_src2_f64">;
				def DS_MAX_SRC2_F64 : DS_1A<"ds_max_src2_f64">;

				def DS_WRITE_SRC2_B32 : DS_1A_Off8_NORET<"ds_write_src2_b32">;
				def DS_WRITE_SRC2_B64 : DS_1A_Off8_NORET<"ds_write_src2_b64">;

				let Uses = [EXEC], mayLoad = 0, mayStore = 0, isConvergent = 1 in {
				def DS_SWIZZLE_B32 : DS_1A_RET <"ds_swizzle_b32">;
				}

				let mayStore = 0 in {
				def DS_READ_I8 : DS_1A_RET<"ds_read_i8">;
				def DS_READ_U8 : DS_1A_RET<"ds_read_u8">;
				def DS_READ_I16 : DS_1A_RET<"ds_read_i16">;
				def DS_READ_U16 : DS_1A_RET<"ds_read_u16">;
				def DS_READ_B32 : DS_1A_RET<"ds_read_b32">;
				def DS_READ_B64 : DS_1A_RET<"ds_read_b64", VReg_64>;

				def DS_READ2_B32 : DS_1A_Off8_RET<"ds_read2_b32", VReg_64>;
				def DS_READ2ST64_B32 : DS_1A_Off8_RET<"ds_read2st64_b32", VReg_64>;

				def DS_READ2_B64 : DS_1A_Off8_RET<"ds_read2_b64", VReg_128>;
				def DS_READ2ST64_B64 : DS_1A_Off8_RET<"ds_read2st64_b64", VReg_128>;
				}

				let SubtargetPredicate = isSICI in {
				def DS_CONSUME : DS_0A_RET<"ds_consume">;
				def DS_APPEND : DS_0A_RET<"ds_append">;
				def DS_ORDERED_COUNT : DS_1A_RET_GDS<"ds_ordered_count">;
				}

				//===----------------------------------------------------------------------===//
				// Instruction definitions for CI and newer.
				//===----------------------------------------------------------------------===//
				// Remaining instructions:
				// DS_NOP
				// DS_GWS_SEMA_RELEASE_ALL
				// DS_WRAP_RTN_B32
				// DS_CNDXCHG32_RTN_B64
				// DS_WRITE_B96
				// DS_WRITE_B128
				// DS_CONDXCHG32_RTN_B128
				// DS_READ_B96
				// DS_READ_B128

				let SubtargetPredicate = isCIVI in {

				def DS_WRAP_RTN_F32 : DS_1A1D_RET <"ds_wrap_rtn_f32">,
				AtomicNoRet<"ds_wrap_f32", 1>;

				} // let SubtargetPredicate = isCIVI

				//===----------------------------------------------------------------------===//
				// Instruction definitions for VI and newer.
				//===----------------------------------------------------------------------===//

				let SubtargetPredicate = isVI in {

				let Uses = [EXEC] in {
				def DS_PERMUTE_B32 : DS_1A1D_PERMUTE <"ds_permute_b32",
				int_amdgcn_ds_permute>;
				def DS_BPERMUTE_B32 : DS_1A1D_PERMUTE <"ds_bpermute_b32",
				int_amdgcn_ds_bpermute>;
				}

				} // let SubtargetPredicate = isVI

				//===----------------------------------------------------------------------===//
				// DS Patterns
				//===----------------------------------------------------------------------===//

				let Predicates = [isGCN] in {

				def : Pat <
				(int_amdgcn_ds_swizzle i32:$src, imm:$offset16),
				(DS_SWIZZLE_B32 $src, (as_i16imm $offset16), (i1 0))
				>;

				class DSReadPat <DS_Pseudo inst, ValueType vt, PatFrag frag> : Pat <
				(vt (frag (DS1Addr1Offset i32:$ptr, i32:$offset))),
				(inst $ptr, (as_i16imm $offset), (i1 0))
				>;

				def : DSReadPat <DS_READ_I8, i32, si_sextload_local_i8>;
				def : DSReadPat <DS_READ_U8, i32, si_az_extload_local_i8>;
				def : DSReadPat <DS_READ_I16, i32, si_sextload_local_i16>;
				def : DSReadPat <DS_READ_U16, i32, si_az_extload_local_i16>;
				def : DSReadPat <DS_READ_B32, i32, si_load_local>;

				let AddedComplexity = 100 in {

				def : DSReadPat <DS_READ_B64, v2i32, si_load_local_align8>;

				} // End AddedComplexity = 100

				def : Pat <
				(v2i32 (si_load_local (DS64Bit4ByteAligned i32:$ptr, i8:$offset0,
				i8:$offset1))),
				(DS_READ2_B32 $ptr, $offset0, $offset1, (i1 0))
				>;

				class DSWritePat <DS_Pseudo inst, ValueType vt, PatFrag frag> : Pat <
				(frag vt:$value, (DS1Addr1Offset i32:$ptr, i32:$offset)),
				(inst $ptr, $value, (as_i16imm $offset), (i1 0))
				>;

				def : DSWritePat <DS_WRITE_B8, i32, si_truncstore_local_i8>;
				def : DSWritePat <DS_WRITE_B16, i32, si_truncstore_local_i16>;
				def : DSWritePat <DS_WRITE_B32, i32, si_store_local>;

				let AddedComplexity = 100 in {

				def : DSWritePat <DS_WRITE_B64, v2i32, si_store_local_align8>;
				} // End AddedComplexity = 100

				def : Pat <
				(si_store_local v2i32:$value, (DS64Bit4ByteAligned i32:$ptr, i8:$offset0,
				i8:$offset1)),
				(DS_WRITE2_B32 $ptr, (EXTRACT_SUBREG $value, sub0),
				(EXTRACT_SUBREG $value, sub1), $offset0, $offset1,
				(i1 0))
				>;

				class DSAtomicRetPat<DS_Pseudo inst, ValueType vt, PatFrag frag> : Pat <
				(frag (DS1Addr1Offset i32:$ptr, i32:$offset), vt:$value),
				(inst $ptr, $value, (as_i16imm $offset), (i1 0))
				>;

				class DSAtomicCmpXChg<DS_Pseudo inst, ValueType vt, PatFrag frag> : Pat <
				(frag (DS1Addr1Offset i32:$ptr, i32:$offset), vt:$cmp, vt:$swap),
				(inst $ptr, $cmp, $swap, (as_i16imm $offset), (i1 0))
				>;


				// 32-bit atomics.
				def : DSAtomicRetPat<DS_WRXCHG_RTN_B32, i32, si_atomic_swap_local>;
				def : DSAtomicRetPat<DS_ADD_RTN_U32, i32, si_atomic_load_add_local>;
				def : DSAtomicRetPat<DS_SUB_RTN_U32, i32, si_atomic_load_sub_local>;
				def : DSAtomicRetPat<DS_INC_RTN_U32, i32, si_atomic_inc_local>;
				def : DSAtomicRetPat<DS_DEC_RTN_U32, i32, si_atomic_dec_local>;
				def : DSAtomicRetPat<DS_AND_RTN_B32, i32, si_atomic_load_and_local>;
				def : DSAtomicRetPat<DS_OR_RTN_B32, i32, si_atomic_load_or_local>;
				def : DSAtomicRetPat<DS_XOR_RTN_B32, i32, si_atomic_load_xor_local>;
				def : DSAtomicRetPat<DS_MIN_RTN_I32, i32, si_atomic_load_min_local>;
				def : DSAtomicRetPat<DS_MAX_RTN_I32, i32, si_atomic_load_max_local>;
				def : DSAtomicRetPat<DS_MIN_RTN_U32, i32, si_atomic_load_umin_local>;
				def : DSAtomicRetPat<DS_MAX_RTN_U32, i32, si_atomic_load_umax_local>;
				def : DSAtomicCmpXChg<DS_CMPST_RTN_B32, i32, si_atomic_cmp_swap_32_local>;

				// 64-bit atomics.
				def : DSAtomicRetPat<DS_WRXCHG_RTN_B64, i64, si_atomic_swap_local>;
				def : DSAtomicRetPat<DS_ADD_RTN_U64, i64, si_atomic_load_add_local>;
				def : DSAtomicRetPat<DS_SUB_RTN_U64, i64, si_atomic_load_sub_local>;
				def : DSAtomicRetPat<DS_INC_RTN_U64, i64, si_atomic_inc_local>;
				def : DSAtomicRetPat<DS_DEC_RTN_U64, i64, si_atomic_dec_local>;
				def : DSAtomicRetPat<DS_AND_RTN_B64, i64, si_atomic_load_and_local>;
				def : DSAtomicRetPat<DS_OR_RTN_B64, i64, si_atomic_load_or_local>;
				def : DSAtomicRetPat<DS_XOR_RTN_B64, i64, si_atomic_load_xor_local>;
				def : DSAtomicRetPat<DS_MIN_RTN_I64, i64, si_atomic_load_min_local>;
				def : DSAtomicRetPat<DS_MAX_RTN_I64, i64, si_atomic_load_max_local>;
				def : DSAtomicRetPat<DS_MIN_RTN_U64, i64, si_atomic_load_umin_local>;
				def : DSAtomicRetPat<DS_MAX_RTN_U64, i64, si_atomic_load_umax_local>;

				def : DSAtomicCmpXChg<DS_CMPST_RTN_B64, i64, si_atomic_cmp_swap_64_local>;

				} // let Predicates = [isGCN]

				//===----------------------------------------------------------------------===//
				// Real instructions
				//===----------------------------------------------------------------------===//

				//===----------------------------------------------------------------------===//
				// SIInstructions.td
				//===----------------------------------------------------------------------===//

				class DS_Real_si <bits<8> op, DS_Pseudo ds> :
				DS_Real <ds>,
				SIMCInstr <ds.Mnemonic, SIEncodingFamily.SI> {
				let AssemblerPredicates=[isSICI];
				let DecoderNamespace="SICI";

				// encoding
				let Inst{7-0} = !if(ds.has_offset0, offset0, 0);
				let Inst{15-8} = !if(ds.has_offset1, offset1, 0);
				let Inst{17} = !if(ds.has_gds, gds, ds.gdsValue);
				let Inst{25-18} = op;
				let Inst{31-26} = 0x36; // ds prefix
				let Inst{39-32} = !if(ds.has_addr, addr, 0);
				let Inst{47-40} = !if(ds.has_data0, data0, 0);
				let Inst{55-48} = !if(ds.has_data1, data1, 0);
				let Inst{63-56} = !if(ds.has_vdst, vdst, 0);
				}
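As an illustration of the bit layout in `DS_Real_si` above, the following is a minimal Python sketch that packs the same fields into a 64-bit word. The function name `encode_ds_si` is hypothetical and not part of the patch, and the operands are assumed to be plain unsigned integers that already fit their fields (the real register encoding is handled by the MC layer). Operands absent from a given instruction are simply left at 0, which mirrors the `!if(ds.has_*, ..., 0)` selections.

```python
def encode_ds_si(op, offset0=0, offset1=0, gds=0, addr=0, data0=0, data1=0, vdst=0):
    """Pack DS fields into a 64-bit word using the SI/CI bit ranges of DS_Real_si."""
    # (value, lowest bit, width) for each field; names follow the TableGen class.
    fields = [
        (offset0, 0, 8),   # Inst{7-0}
        (offset1, 8, 8),   # Inst{15-8}
        (gds, 17, 1),      # Inst{17}
        (op, 18, 8),       # Inst{25-18}
        (0x36, 26, 6),     # Inst{31-26}, ds prefix
        (addr, 32, 8),     # Inst{39-32}
        (data0, 40, 8),    # Inst{47-40}
        (data1, 48, 8),    # Inst{55-48}
        (vdst, 56, 8),     # Inst{63-56}
    ]
    word = 0
    for value, shift, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word |= value << shift
    return word


if __name__ == "__main__":
    # Opcode 0xd is DS_WRITE_B32 in the SI list; operand values are arbitrary.
    print(hex(encode_ds_si(0xd, offset0=4, addr=1, data0=2)))
```

Bit 16 is not assigned by the class, so the sketch leaves it at 0 as well.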

				def DS_ADD_U32_si : DS_Real_si<0x0, DS_ADD_U32>;
				def DS_SUB_U32_si : DS_Real_si<0x1, DS_SUB_U32>;
				def DS_RSUB_U32_si : DS_Real_si<0x2, DS_RSUB_U32>;
				def DS_INC_U32_si : DS_Real_si<0x3, DS_INC_U32>;
				def DS_DEC_U32_si : DS_Real_si<0x4, DS_DEC_U32>;
				def DS_MIN_I32_si : DS_Real_si<0x5, DS_MIN_I32>;
				def DS_MAX_I32_si : DS_Real_si<0x6, DS_MAX_I32>;
				def DS_MIN_U32_si : DS_Real_si<0x7, DS_MIN_U32>;
				def DS_MAX_U32_si : DS_Real_si<0x8, DS_MAX_U32>;
				def DS_AND_B32_si : DS_Real_si<0x9, DS_AND_B32>;
				def DS_OR_B32_si : DS_Real_si<0xa, DS_OR_B32>;
				def DS_XOR_B32_si : DS_Real_si<0xb, DS_XOR_B32>;
				def DS_MSKOR_B32_si : DS_Real_si<0xc, DS_MSKOR_B32>;
				def DS_WRITE_B32_si : DS_Real_si<0xd, DS_WRITE_B32>;
				def DS_WRITE2_B32_si : DS_Real_si<0xe, DS_WRITE2_B32>;
				def DS_WRITE2ST64_B32_si : DS_Real_si<0xf, DS_WRITE2ST64_B32>;
				def DS_CMPST_B32_si : DS_Real_si<0x10, DS_CMPST_B32>;
				def DS_CMPST_F32_si : DS_Real_si<0x11, DS_CMPST_F32>;
				def DS_MIN_F32_si : DS_Real_si<0x12, DS_MIN_F32>;
				def DS_MAX_F32_si : DS_Real_si<0x13, DS_MAX_F32>;
				def DS_GWS_INIT_si : DS_Real_si<0x19, DS_GWS_INIT>;
				def DS_GWS_SEMA_V_si : DS_Real_si<0x1a, DS_GWS_SEMA_V>;
				def DS_GWS_SEMA_BR_si : DS_Real_si<0x1b, DS_GWS_SEMA_BR>;
				def DS_GWS_SEMA_P_si : DS_Real_si<0x1c, DS_GWS_SEMA_P>;
				def DS_GWS_BARRIER_si : DS_Real_si<0x1d, DS_GWS_BARRIER>;
				def DS_WRITE_B8_si : DS_Real_si<0x1e, DS_WRITE_B8>;
				def DS_WRITE_B16_si : DS_Real_si<0x1f, DS_WRITE_B16>;
				def DS_ADD_RTN_U32_si : DS_Real_si<0x20, DS_ADD_RTN_U32>;
				def DS_SUB_RTN_U32_si : DS_Real_si<0x21, DS_SUB_RTN_U32>;
				def DS_RSUB_RTN_U32_si : DS_Real_si<0x22, DS_RSUB_RTN_U32>;
				def DS_INC_RTN_U32_si : DS_Real_si<0x23, DS_INC_RTN_U32>;
				def DS_DEC_RTN_U32_si : DS_Real_si<0x24, DS_DEC_RTN_U32>;
				def DS_MIN_RTN_I32_si : DS_Real_si<0x25, DS_MIN_RTN_I32>;
				def DS_MAX_RTN_I32_si : DS_Real_si<0x26, DS_MAX_RTN_I32>;
				def DS_MIN_RTN_U32_si : DS_Real_si<0x27, DS_MIN_RTN_U32>;
				def DS_MAX_RTN_U32_si : DS_Real_si<0x28, DS_MAX_RTN_U32>;
				def DS_AND_RTN_B32_si : DS_Real_si<0x29, DS_AND_RTN_B32>;
				def DS_OR_RTN_B32_si : DS_Real_si<0x2a, DS_OR_RTN_B32>;
				def DS_XOR_RTN_B32_si : DS_Real_si<0x2b, DS_XOR_RTN_B32>;
				def DS_MSKOR_RTN_B32_si : DS_Real_si<0x2c, DS_MSKOR_RTN_B32>;
				def DS_WRXCHG_RTN_B32_si : DS_Real_si<0x2d, DS_WRXCHG_RTN_B32>;
				def DS_WRXCHG2_RTN_B32_si : DS_Real_si<0x2e, DS_WRXCHG2_RTN_B32>;
				def DS_WRXCHG2ST64_RTN_B32_si : DS_Real_si<0x2f, DS_WRXCHG2ST64_RTN_B32>;
				def DS_CMPST_RTN_B32_si : DS_Real_si<0x30, DS_CMPST_RTN_B32>;
				def DS_CMPST_RTN_F32_si : DS_Real_si<0x31, DS_CMPST_RTN_F32>;
				def DS_MIN_RTN_F32_si : DS_Real_si<0x32, DS_MIN_RTN_F32>;
				def DS_MAX_RTN_F32_si : DS_Real_si<0x33, DS_MAX_RTN_F32>;

				// FIXME: this instruction is actually CI/VI
				def DS_WRAP_RTN_F32_si : DS_Real_si<0x34, DS_WRAP_RTN_F32>;

				def DS_SWIZZLE_B32_si : DS_Real_si<0x35, DS_SWIZZLE_B32>;
				def DS_READ_B32_si : DS_Real_si<0x36, DS_READ_B32>;
				def DS_READ2_B32_si : DS_Real_si<0x37, DS_READ2_B32>;
				def DS_READ2ST64_B32_si : DS_Real_si<0x38, DS_READ2ST64_B32>;
				def DS_READ_I8_si : DS_Real_si<0x39, DS_READ_I8>;
				def DS_READ_U8_si : DS_Real_si<0x3a, DS_READ_U8>;
				def DS_READ_I16_si : DS_Real_si<0x3b, DS_READ_I16>;
				def DS_READ_U16_si : DS_Real_si<0x3c, DS_READ_U16>;
				def DS_CONSUME_si : DS_Real_si<0x3d, DS_CONSUME>;
				def DS_APPEND_si : DS_Real_si<0x3e, DS_APPEND>;
				def DS_ORDERED_COUNT_si : DS_Real_si<0x3f, DS_ORDERED_COUNT>;
				def DS_ADD_U64_si : DS_Real_si<0x40, DS_ADD_U64>;
				def DS_SUB_U64_si : DS_Real_si<0x41, DS_SUB_U64>;
				def DS_RSUB_U64_si : DS_Real_si<0x42, DS_RSUB_U64>;
				def DS_INC_U64_si : DS_Real_si<0x43, DS_INC_U64>;
				def DS_DEC_U64_si : DS_Real_si<0x44, DS_DEC_U64>;
				def DS_MIN_I64_si : DS_Real_si<0x45, DS_MIN_I64>;
				def DS_MAX_I64_si : DS_Real_si<0x46, DS_MAX_I64>;
				def DS_MIN_U64_si : DS_Real_si<0x47, DS_MIN_U64>;
				def DS_MAX_U64_si : DS_Real_si<0x48, DS_MAX_U64>;
				def DS_AND_B64_si : DS_Real_si<0x49, DS_AND_B64>;
				def DS_OR_B64_si : DS_Real_si<0x4a, DS_OR_B64>;
				def DS_XOR_B64_si : DS_Real_si<0x4b, DS_XOR_B64>;
				def DS_MSKOR_B64_si : DS_Real_si<0x4c, DS_MSKOR_B64>;
				def DS_WRITE_B64_si : DS_Real_si<0x4d, DS_WRITE_B64>;
				def DS_WRITE2_B64_si : DS_Real_si<0x4E, DS_WRITE2_B64>;
				def DS_WRITE2ST64_B64_si : DS_Real_si<0x4f, DS_WRITE2ST64_B64>;
				def DS_CMPST_B64_si : DS_Real_si<0x50, DS_CMPST_B64>;
				def DS_CMPST_F64_si : DS_Real_si<0x51, DS_CMPST_F64>;
				def DS_MIN_F64_si : DS_Real_si<0x52, DS_MIN_F64>;
				def DS_MAX_F64_si : DS_Real_si<0x53, DS_MAX_F64>;

				def DS_ADD_RTN_U64_si : DS_Real_si<0x60, DS_ADD_RTN_U64>;
				def DS_SUB_RTN_U64_si : DS_Real_si<0x61, DS_SUB_RTN_U64>;
				def DS_RSUB_RTN_U64_si : DS_Real_si<0x62, DS_RSUB_RTN_U64>;
				def DS_INC_RTN_U64_si : DS_Real_si<0x63, DS_INC_RTN_U64>;
				def DS_DEC_RTN_U64_si : DS_Real_si<0x64, DS_DEC_RTN_U64>;
				def DS_MIN_RTN_I64_si : DS_Real_si<0x65, DS_MIN_RTN_I64>;
				def DS_MAX_RTN_I64_si : DS_Real_si<0x66, DS_MAX_RTN_I64>;
				def DS_MIN_RTN_U64_si : DS_Real_si<0x67, DS_MIN_RTN_U64>;
				def DS_MAX_RTN_U64_si : DS_Real_si<0x68, DS_MAX_RTN_U64>;
				def DS_AND_RTN_B64_si : DS_Real_si<0x69, DS_AND_RTN_B64>;
				def DS_OR_RTN_B64_si : DS_Real_si<0x6a, DS_OR_RTN_B64>;
				def DS_XOR_RTN_B64_si : DS_Real_si<0x6b, DS_XOR_RTN_B64>;
				def DS_MSKOR_RTN_B64_si : DS_Real_si<0x6c, DS_MSKOR_RTN_B64>;
				def DS_WRXCHG_RTN_B64_si : DS_Real_si<0x6d, DS_WRXCHG_RTN_B64>;
				def DS_WRXCHG2_RTN_B64_si : DS_Real_si<0x6e, DS_WRXCHG2_RTN_B64>;
				def DS_WRXCHG2ST64_RTN_B64_si : DS_Real_si<0x6f, DS_WRXCHG2ST64_RTN_B64>;
				def DS_CMPST_RTN_B64_si : DS_Real_si<0x70, DS_CMPST_RTN_B64>;
				def DS_CMPST_RTN_F64_si : DS_Real_si<0x71, DS_CMPST_RTN_F64>;
				def DS_MIN_RTN_F64_si : DS_Real_si<0x72, DS_MIN_RTN_F64>;
				def DS_MAX_RTN_F64_si : DS_Real_si<0x73, DS_MAX_RTN_F64>;

				def DS_READ_B64_si : DS_Real_si<0x76, DS_READ_B64>;
				def DS_READ2_B64_si : DS_Real_si<0x77, DS_READ2_B64>;
				def DS_READ2ST64_B64_si : DS_Real_si<0x78, DS_READ2ST64_B64>;

				def DS_ADD_SRC2_U32_si : DS_Real_si<0x80, DS_ADD_SRC2_U32>;
				def DS_SUB_SRC2_U32_si : DS_Real_si<0x81, DS_SUB_SRC2_U32>;
				def DS_RSUB_SRC2_U32_si : DS_Real_si<0x82, DS_RSUB_SRC2_U32>;
				def DS_INC_SRC2_U32_si : DS_Real_si<0x83, DS_INC_SRC2_U32>;
				def DS_DEC_SRC2_U32_si : DS_Real_si<0x84, DS_DEC_SRC2_U32>;
				def DS_MIN_SRC2_I32_si : DS_Real_si<0x85, DS_MIN_SRC2_I32>;
				def DS_MAX_SRC2_I32_si : DS_Real_si<0x86, DS_MAX_SRC2_I32>;
				def DS_MIN_SRC2_U32_si : DS_Real_si<0x87, DS_MIN_SRC2_U32>;
				def DS_MAX_SRC2_U32_si : DS_Real_si<0x88, DS_MAX_SRC2_U32>;
				def DS_AND_SRC2_B32_si : DS_Real_si<0x89, DS_AND_SRC2_B32>;
				def DS_OR_SRC2_B32_si : DS_Real_si<0x8a, DS_OR_SRC2_B32>;
				def DS_XOR_SRC2_B32_si : DS_Real_si<0x8b, DS_XOR_SRC2_B32>;
				def DS_WRITE_SRC2_B32_si : DS_Real_si<0x8d, DS_WRITE_SRC2_B32>;

				def DS_MIN_SRC2_F32_si : DS_Real_si<0x92, DS_MIN_SRC2_F32>;
				def DS_MAX_SRC2_F32_si : DS_Real_si<0x93, DS_MAX_SRC2_F32>;

				def DS_ADD_SRC2_U64_si : DS_Real_si<0xc0, DS_ADD_SRC2_U64>;
				def DS_SUB_SRC2_U64_si : DS_Real_si<0xc1, DS_SUB_SRC2_U64>;
				def DS_RSUB_SRC2_U64_si : DS_Real_si<0xc2, DS_RSUB_SRC2_U64>;
				def DS_INC_SRC2_U64_si : DS_Real_si<0xc3, DS_INC_SRC2_U64>;
				def DS_DEC_SRC2_U64_si : DS_Real_si<0xc4, DS_DEC_SRC2_U64>;
				def DS_MIN_SRC2_I64_si : DS_Real_si<0xc5, DS_MIN_SRC2_I64>;
				def DS_MAX_SRC2_I64_si : DS_Real_si<0xc6, DS_MAX_SRC2_I64>;
				def DS_MIN_SRC2_U64_si : DS_Real_si<0xc7, DS_MIN_SRC2_U64>;
				def DS_MAX_SRC2_U64_si : DS_Real_si<0xc8, DS_MAX_SRC2_U64>;
				def DS_AND_SRC2_B64_si : DS_Real_si<0xc9, DS_AND_SRC2_B64>;
				def DS_OR_SRC2_B64_si : DS_Real_si<0xca, DS_OR_SRC2_B64>;
				def DS_XOR_SRC2_B64_si : DS_Real_si<0xcb, DS_XOR_SRC2_B64>;
				def DS_WRITE_SRC2_B64_si : DS_Real_si<0xcd, DS_WRITE_SRC2_B64>;

				def DS_MIN_SRC2_F64_si : DS_Real_si<0xd2, DS_MIN_SRC2_F64>;
				def DS_MAX_SRC2_F64_si : DS_Real_si<0xd3, DS_MAX_SRC2_F64>;

				//===----------------------------------------------------------------------===//
				// VIInstructions.td
				//===----------------------------------------------------------------------===//

				tstellarAMD: There are a few places like this with lots of extra whitespace that could be cleaned up.
				class DS_Real_vi <bits<8> op, DS_Pseudo ds> :
				DS_Real <ds>,
				SIMCInstr <ds.Mnemonic, SIEncodingFamily.VI> {
				let AssemblerPredicates = [isVI];
				let DecoderNamespace="VI";

				// encoding
				let Inst{7-0} = !if(ds.has_offset0, offset0, 0);
				let Inst{15-8} = !if(ds.has_offset1, offset1, 0);
				let Inst{16} = !if(ds.has_gds, gds, ds.gdsValue);
				let Inst{24-17} = op;
				let Inst{31-26} = 0x36; // ds prefix
				let Inst{39-32} = !if(ds.has_addr, addr, 0);
				let Inst{47-40} = !if(ds.has_data0, data0, 0);
				let Inst{55-48} = !if(ds.has_data1, data1, 0);
				let Inst{63-56} = !if(ds.has_vdst, vdst, 0);
				}
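The VI class places `gds` at bit 16 and the opcode at bits 24-17, unlike the SI layout (bit 17 and bits 25-18). A minimal Python sketch of the reverse direction, pulling the fields back out of a 64-bit word with the `DS_Real_vi` bit ranges, may make the difference easier to see. `decode_ds_vi` is a hypothetical helper, not the generated disassembler, and it returns raw field values without interpreting them.

```python
def decode_ds_vi(word):
    """Split a 64-bit DS instruction word into fields using the DS_Real_vi bit ranges."""

    def bits(hi, lo):
        # Extract the inclusive bit range word{hi-lo}, as in TableGen's Inst{hi-lo}.
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "offset0": bits(7, 0),
        "offset1": bits(15, 8),
        "gds": bits(16, 16),
        "op": bits(24, 17),
        "prefix": bits(31, 26),  # expected to be 0x36 for DS instructions
        "addr": bits(39, 32),
        "data0": bits(47, 40),
        "data1": bits(55, 48),
        "vdst": bits(63, 56),
    }


if __name__ == "__main__":
    fields = decode_ds_vi(0xD81B0000)
    print(fields["prefix"] == 0x36, hex(fields["op"]), fields["gds"])
```

Bit 25 is not assigned by the class, so the sketch ignores it.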

				def DS_ADD_U32_vi : DS_Real_vi<0x0, DS_ADD_U32>;
				def DS_SUB_U32_vi : DS_Real_vi<0x1, DS_SUB_U32>;
				def DS_RSUB_U32_vi : DS_Real_vi<0x2, DS_RSUB_U32>;
				def DS_INC_U32_vi : DS_Real_vi<0x3, DS_INC_U32>;
				def DS_DEC_U32_vi : DS_Real_vi<0x4, DS_DEC_U32>;
				def DS_MIN_I32_vi : DS_Real_vi<0x5, DS_MIN_I32>;
				def DS_MAX_I32_vi : DS_Real_vi<0x6, DS_MAX_I32>;
				def DS_MIN_U32_vi : DS_Real_vi<0x7, DS_MIN_U32>;
				def DS_MAX_U32_vi : DS_Real_vi<0x8, DS_MAX_U32>;
				def DS_AND_B32_vi : DS_Real_vi<0x9, DS_AND_B32>;
				def DS_OR_B32_vi : DS_Real_vi<0xa, DS_OR_B32>;
				def DS_XOR_B32_vi : DS_Real_vi<0xb, DS_XOR_B32>;
				def DS_MSKOR_B32_vi : DS_Real_vi<0xc, DS_MSKOR_B32>;
				def DS_WRITE_B32_vi : DS_Real_vi<0xd, DS_WRITE_B32>;
				def DS_WRITE2_B32_vi : DS_Real_vi<0xe, DS_WRITE2_B32>;
				def DS_WRITE2ST64_B32_vi : DS_Real_vi<0xf, DS_WRITE2ST64_B32>;
				def DS_CMPST_B32_vi : DS_Real_vi<0x10, DS_CMPST_B32>;
				def DS_CMPST_F32_vi : DS_Real_vi<0x11, DS_CMPST_F32>;
				def DS_MIN_F32_vi : DS_Real_vi<0x12, DS_MIN_F32>;
				def DS_MAX_F32_vi : DS_Real_vi<0x13, DS_MAX_F32>;
				def DS_GWS_INIT_vi : DS_Real_vi<0x19, DS_GWS_INIT>;
				def DS_GWS_SEMA_V_vi : DS_Real_vi<0x1a, DS_GWS_SEMA_V>;
				def DS_GWS_SEMA_BR_vi : DS_Real_vi<0x1b, DS_GWS_SEMA_BR>;
				def DS_GWS_SEMA_P_vi : DS_Real_vi<0x1c, DS_GWS_SEMA_P>;
				def DS_GWS_BARRIER_vi : DS_Real_vi<0x1d, DS_GWS_BARRIER>;
				def DS_WRITE_B8_vi : DS_Real_vi<0x1e, DS_WRITE_B8>;
				def DS_WRITE_B16_vi : DS_Real_vi<0x1f, DS_WRITE_B16>;
				def DS_ADD_RTN_U32_vi : DS_Real_vi<0x20, DS_ADD_RTN_U32>;
				def DS_SUB_RTN_U32_vi : DS_Real_vi<0x21, DS_SUB_RTN_U32>;
				def DS_RSUB_RTN_U32_vi : DS_Real_vi<0x22, DS_RSUB_RTN_U32>;
				def DS_INC_RTN_U32_vi : DS_Real_vi<0x23, DS_INC_RTN_U32>;
				def DS_DEC_RTN_U32_vi : DS_Real_vi<0x24, DS_DEC_RTN_U32>;
				def DS_MIN_RTN_I32_vi : DS_Real_vi<0x25, DS_MIN_RTN_I32>;
				def DS_MAX_RTN_I32_vi : DS_Real_vi<0x26, DS_MAX_RTN_I32>;
				def DS_MIN_RTN_U32_vi : DS_Real_vi<0x27, DS_MIN_RTN_U32>;
				def DS_MAX_RTN_U32_vi : DS_Real_vi<0x28, DS_MAX_RTN_U32>;
				def DS_AND_RTN_B32_vi : DS_Real_vi<0x29, DS_AND_RTN_B32>;
				def DS_OR_RTN_B32_vi : DS_Real_vi<0x2a, DS_OR_RTN_B32>;
				def DS_XOR_RTN_B32_vi : DS_Real_vi<0x2b, DS_XOR_RTN_B32>;
				def DS_MSKOR_RTN_B32_vi : DS_Real_vi<0x2c, DS_MSKOR_RTN_B32>;
				def DS_WRXCHG_RTN_B32_vi : DS_Real_vi<0x2d, DS_WRXCHG_RTN_B32>;
				def DS_WRXCHG2_RTN_B32_vi : DS_Real_vi<0x2e, DS_WRXCHG2_RTN_B32>;
				def DS_WRXCHG2ST64_RTN_B32_vi : DS_Real_vi<0x2f, DS_WRXCHG2ST64_RTN_B32>;
				def DS_CMPST_RTN_B32_vi : DS_Real_vi<0x30, DS_CMPST_RTN_B32>;
				def DS_CMPST_RTN_F32_vi : DS_Real_vi<0x31, DS_CMPST_RTN_F32>;
				def DS_MIN_RTN_F32_vi : DS_Real_vi<0x32, DS_MIN_RTN_F32>;
				def DS_MAX_RTN_F32_vi : DS_Real_vi<0x33, DS_MAX_RTN_F32>;
				def DS_WRAP_RTN_F32_vi : DS_Real_vi<0x34, DS_WRAP_RTN_F32>;
				def DS_READ_B32_vi : DS_Real_vi<0x36, DS_READ_B32>;
				def DS_READ2_B32_vi : DS_Real_vi<0x37, DS_READ2_B32>;
				def DS_READ2ST64_B32_vi : DS_Real_vi<0x38, DS_READ2ST64_B32>;
				def DS_READ_I8_vi : DS_Real_vi<0x39, DS_READ_I8>;
				def DS_READ_U8_vi : DS_Real_vi<0x3a, DS_READ_U8>;
				def DS_READ_I16_vi : DS_Real_vi<0x3b, DS_READ_I16>;
				def DS_READ_U16_vi : DS_Real_vi<0x3c, DS_READ_U16>;
				def DS_SWIZZLE_B32_vi : DS_Real_vi<0x3d, DS_SWIZZLE_B32>;
				def DS_PERMUTE_B32_vi : DS_Real_vi<0x3e, DS_PERMUTE_B32>;
				def DS_BPERMUTE_B32_vi : DS_Real_vi<0x3f, DS_BPERMUTE_B32>;

				def DS_ADD_U64_vi : DS_Real_vi<0x40, DS_ADD_U64>;
				def DS_SUB_U64_vi : DS_Real_vi<0x41, DS_SUB_U64>;
				def DS_RSUB_U64_vi : DS_Real_vi<0x42, DS_RSUB_U64>;
				def DS_INC_U64_vi : DS_Real_vi<0x43, DS_INC_U64>;
				def DS_DEC_U64_vi : DS_Real_vi<0x44, DS_DEC_U64>;
				def DS_MIN_I64_vi : DS_Real_vi<0x45, DS_MIN_I64>;
				def DS_MAX_I64_vi : DS_Real_vi<0x46, DS_MAX_I64>;
				def DS_MIN_U64_vi : DS_Real_vi<0x47, DS_MIN_U64>;
				def DS_MAX_U64_vi : DS_Real_vi<0x48, DS_MAX_U64>;
				def DS_AND_B64_vi : DS_Real_vi<0x49, DS_AND_B64>;
				def DS_OR_B64_vi : DS_Real_vi<0x4a, DS_OR_B64>;
				def DS_XOR_B64_vi : DS_Real_vi<0x4b, DS_XOR_B64>;
				def DS_MSKOR_B64_vi : DS_Real_vi<0x4c, DS_MSKOR_B64>;
				def DS_WRITE_B64_vi : DS_Real_vi<0x4d, DS_WRITE_B64>;
				def DS_WRITE2_B64_vi : DS_Real_vi<0x4E, DS_WRITE2_B64>;
				def DS_WRITE2ST64_B64_vi : DS_Real_vi<0x4f, DS_WRITE2ST64_B64>;
				def DS_CMPST_B64_vi : DS_Real_vi<0x50, DS_CMPST_B64>;
				def DS_CMPST_F64_vi : DS_Real_vi<0x51, DS_CMPST_F64>;
				def DS_MIN_F64_vi : DS_Real_vi<0x52, DS_MIN_F64>;
				def DS_MAX_F64_vi : DS_Real_vi<0x53, DS_MAX_F64>;

				def DS_ADD_RTN_U64_vi : DS_Real_vi<0x60, DS_ADD_RTN_U64>;
				def DS_SUB_RTN_U64_vi : DS_Real_vi<0x61, DS_SUB_RTN_U64>;
				def DS_RSUB_RTN_U64_vi : DS_Real_vi<0x62, DS_RSUB_RTN_U64>;
				def DS_INC_RTN_U64_vi : DS_Real_vi<0x63, DS_INC_RTN_U64>;
				def DS_DEC_RTN_U64_vi : DS_Real_vi<0x64, DS_DEC_RTN_U64>;
				def DS_MIN_RTN_I64_vi : DS_Real_vi<0x65, DS_MIN_RTN_I64>;
				def DS_MAX_RTN_I64_vi : DS_Real_vi<0x66, DS_MAX_RTN_I64>;
				def DS_MIN_RTN_U64_vi : DS_Real_vi<0x67, DS_MIN_RTN_U64>;
				def DS_MAX_RTN_U64_vi : DS_Real_vi<0x68, DS_MAX_RTN_U64>;
				def DS_AND_RTN_B64_vi : DS_Real_vi<0x69, DS_AND_RTN_B64>;
				def DS_OR_RTN_B64_vi : DS_Real_vi<0x6a, DS_OR_RTN_B64>;
				def DS_XOR_RTN_B64_vi : DS_Real_vi<0x6b, DS_XOR_RTN_B64>;
				def DS_MSKOR_RTN_B64_vi : DS_Real_vi<0x6c, DS_MSKOR_RTN_B64>;
				def DS_WRXCHG_RTN_B64_vi : DS_Real_vi<0x6d, DS_WRXCHG_RTN_B64>;
				def DS_WRXCHG2_RTN_B64_vi : DS_Real_vi<0x6e, DS_WRXCHG2_RTN_B64>;
				def DS_WRXCHG2ST64_RTN_B64_vi : DS_Real_vi<0x6f, DS_WRXCHG2ST64_RTN_B64>;
				def DS_CMPST_RTN_B64_vi : DS_Real_vi<0x70, DS_CMPST_RTN_B64>;
				def DS_CMPST_RTN_F64_vi : DS_Real_vi<0x71, DS_CMPST_RTN_F64>;
				def DS_MIN_RTN_F64_vi : DS_Real_vi<0x72, DS_MIN_RTN_F64>;
				def DS_MAX_RTN_F64_vi : DS_Real_vi<0x73, DS_MAX_RTN_F64>;

				def DS_READ_B64_vi : DS_Real_vi<0x76, DS_READ_B64>;
				def DS_READ2_B64_vi : DS_Real_vi<0x77, DS_READ2_B64>;
				def DS_READ2ST64_B64_vi : DS_Real_vi<0x78, DS_READ2ST64_B64>;

				def DS_ADD_SRC2_U32_vi : DS_Real_vi<0x80, DS_ADD_SRC2_U32>;
				def DS_SUB_SRC2_U32_vi : DS_Real_vi<0x81, DS_SUB_SRC2_U32>;
				def DS_RSUB_SRC2_U32_vi : DS_Real_vi<0x82, DS_RSUB_SRC2_U32>;
				def DS_INC_SRC2_U32_vi : DS_Real_vi<0x83, DS_INC_SRC2_U32>;
				def DS_DEC_SRC2_U32_vi : DS_Real_vi<0x84, DS_DEC_SRC2_U32>;
				def DS_MIN_SRC2_I32_vi : DS_Real_vi<0x85, DS_MIN_SRC2_I32>;
				def DS_MAX_SRC2_I32_vi : DS_Real_vi<0x86, DS_MAX_SRC2_I32>;
				def DS_MIN_SRC2_U32_vi : DS_Real_vi<0x87, DS_MIN_SRC2_U32>;
				def DS_MAX_SRC2_U32_vi : DS_Real_vi<0x88, DS_MAX_SRC2_U32>;
				def DS_AND_SRC2_B32_vi : DS_Real_vi<0x89, DS_AND_SRC2_B32>;
				def DS_OR_SRC2_B32_vi : DS_Real_vi<0x8a, DS_OR_SRC2_B32>;
				def DS_XOR_SRC2_B32_vi : DS_Real_vi<0x8b, DS_XOR_SRC2_B32>;
				def DS_WRITE_SRC2_B32_vi : DS_Real_vi<0x8d, DS_WRITE_SRC2_B32>;
				def DS_MIN_SRC2_F32_vi : DS_Real_vi<0x92, DS_MIN_SRC2_F32>;
				def DS_MAX_SRC2_F32_vi : DS_Real_vi<0x93, DS_MAX_SRC2_F32>;
				def DS_ADD_SRC2_U64_vi : DS_Real_vi<0xc0, DS_ADD_SRC2_U64>;
				def DS_SUB_SRC2_U64_vi : DS_Real_vi<0xc1, DS_SUB_SRC2_U64>;
				def DS_RSUB_SRC2_U64_vi : DS_Real_vi<0xc2, DS_RSUB_SRC2_U64>;
				def DS_INC_SRC2_U64_vi : DS_Real_vi<0xc3, DS_INC_SRC2_U64>;
				def DS_DEC_SRC2_U64_vi : DS_Real_vi<0xc4, DS_DEC_SRC2_U64>;
				def DS_MIN_SRC2_I64_vi : DS_Real_vi<0xc5, DS_MIN_SRC2_I64>;
				def DS_MAX_SRC2_I64_vi : DS_Real_vi<0xc6, DS_MAX_SRC2_I64>;
				def DS_MIN_SRC2_U64_vi : DS_Real_vi<0xc7, DS_MIN_SRC2_U64>;
				def DS_MAX_SRC2_U64_vi : DS_Real_vi<0xc8, DS_MAX_SRC2_U64>;
				def DS_AND_SRC2_B64_vi : DS_Real_vi<0xc9, DS_AND_SRC2_B64>;
				def DS_OR_SRC2_B64_vi : DS_Real_vi<0xca, DS_OR_SRC2_B64>;
				def DS_XOR_SRC2_B64_vi : DS_Real_vi<0xcb, DS_XOR_SRC2_B64>;
				def DS_WRITE_SRC2_B64_vi : DS_Real_vi<0xcd, DS_WRITE_SRC2_B64>;
				def DS_MIN_SRC2_F64_vi : DS_Real_vi<0xd2, DS_MIN_SRC2_F64>;
				def DS_MAX_SRC2_F64_vi : DS_Real_vi<0xd3, DS_MAX_SRC2_F64>;

lib/Target/AMDGPU/SIInstrFormats.td

[... 476 lines collapsed ...]
class VINTRPe <bits<2> op> : Enc32 {
let Inst{7-0} = vsrc;		let Inst{7-0} = vsrc;
let Inst{9-8} = attrchan;		let Inst{9-8} = attrchan;
let Inst{15-10} = attr;		let Inst{15-10} = attr;
let Inst{17-16} = op;		let Inst{17-16} = op;
let Inst{25-18} = vdst;		let Inst{25-18} = vdst;
let Inst{31-26} = 0x32; // encoding		let Inst{31-26} = 0x32; // encoding
}		}

class DSe <bits<8> op> : Enc64 {
bits<8> vdst;
bits<1> gds;
bits<8> addr;
bits<8> data0;
bits<8> data1;
bits<8> offset0;
bits<8> offset1;

let Inst{7-0} = offset0;
let Inst{15-8} = offset1;
let Inst{17} = gds;
let Inst{25-18} = op;
let Inst{31-26} = 0x36; //encoding
let Inst{39-32} = addr;
let Inst{47-40} = data0;
let Inst{55-48} = data1;
let Inst{63-56} = vdst;
}

class MUBUFe <bits<7> op> : Enc64 {		class MUBUFe <bits<7> op> : Enc64 {
bits<12> offset;		bits<12> offset;
bits<1> offen;		bits<1> offen;
bits<1> idxen;		bits<1> idxen;
bits<1> glc;		bits<1> glc;
bits<1> addr64;		bits<1> addr64;
bits<1> lds;		bits<1> lds;
bits<8> vaddr;		bits<8> vaddr;
[... 148 lines collapsed ...]
}		}

} // End Uses = [EXEC]		} // End Uses = [EXEC]

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Vector I/O operations		// Vector I/O operations
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class DS <dag outs, dag ins, string asm, list<dag> pattern> :
InstSI <outs, ins, asm, pattern> {

let LGKM_CNT = 1;
let DS = 1;
let UseNamedOperandTable = 1;
let Uses = [M0, EXEC];

// Most instruction load and store data, so set this as the default.
let mayLoad = 1;
let mayStore = 1;

let hasSideEffects = 0;
let AsmMatchConverter = "cvtDS";
let SchedRW = [WriteLDS];
}

class MUBUF <dag outs, dag ins, string asm, list<dag> pattern> :		class MUBUF <dag outs, dag ins, string asm, list<dag> pattern> :
InstSI<outs, ins, asm, pattern> {		InstSI<outs, ins, asm, pattern> {

let VM_CNT = 1;		let VM_CNT = 1;
let EXP_CNT = 1;		let EXP_CNT = 1;
let MUBUF = 1;		let MUBUF = 1;
let Uses = [EXEC];		let Uses = [EXEC];

[... 46 lines collapsed ...]

lib/Target/AMDGPU/SIInstrInfo.td

[... 2,580 lines collapsed ...]
multiclass VINTRP_m <bits <2> op, dag outs, dag ins, string asm,
def "" : VINTRP_Pseudo <NAME, outs, ins, pattern>;		def "" : VINTRP_Pseudo <NAME, outs, ins, pattern>;

def _si : VINTRP_Real_si <op, NAME, outs, ins, asm>;		def _si : VINTRP_Real_si <op, NAME, outs, ins, asm>;

def _vi : VINTRP_Real_vi <op, NAME, outs, ins, asm>;		def _vi : VINTRP_Real_vi <op, NAME, outs, ins, asm>;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Vector I/O classes
//===----------------------------------------------------------------------===//

class DS_Pseudo <string opName, dag outs, dag ins, list<dag> pattern> :
DS <outs, ins, "", pattern>,
SIMCInstr <opName, SIEncodingFamily.NONE> {
let isPseudo = 1;
let isCodeGenOnly = 1;
}

class DS_Real_si <bits<8> op, string opName, dag outs, dag ins, string asm> :
DS <outs, ins, asm, []>,
DSe <op>,
SIMCInstr <opName, SIEncodingFamily.SI> {
let isCodeGenOnly = 0;
let AssemblerPredicates = [isSICI];
let DecoderNamespace="SICI";
let DisableDecoder = DisableSIDecoder;
}

class DS_Real_vi <bits<8> op, string opName, dag outs, dag ins, string asm> :
DS <outs, ins, asm, []>,
DSe_vi <op>,
SIMCInstr <opName, SIEncodingFamily.VI> {
let isCodeGenOnly = 0;
let AssemblerPredicates = [isVI];
let DecoderNamespace="VI";
let DisableDecoder = DisableVIDecoder;
}

class DS_Off16_Real_si <bits<8> op, string opName, dag outs, dag ins, string asm> :
DS_Real_si <op,opName, outs, ins, asm> {

// Single load interpret the 2 i8imm operands as a single i16 offset.
bits<16> offset;
let offset0 = offset{7-0};
let offset1 = offset{15-8};
}

class DS_Off16_Real_vi <bits<8> op, string opName, dag outs, dag ins, string asm> :
DS_Real_vi <op, opName, outs, ins, asm> {

// Single load interpret the 2 i8imm operands as a single i16 offset.
bits<16> offset;
let offset0 = offset{7-0};
let offset1 = offset{15-8};
}
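The two `DS_Off16_Real_*` classes above express a single 16-bit offset through the two 8-bit encoding fields: `offset0` takes the low byte and `offset1` the high byte. A minimal Python sketch of that split, with the hypothetical helper name `split_offset16`:

```python
def split_offset16(offset):
    """Split a 16-bit offset into (offset0, offset1) as the DS_Off16_Real_* classes do."""
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    offset0 = offset & 0xFF          # offset{7-0}
    offset1 = (offset >> 8) & 0xFF   # offset{15-8}
    return offset0, offset1


if __name__ == "__main__":
    low, high = split_offset16(0x1234)
    print(hex(low), hex(high))
```

Because `offset0` occupies `Inst{7-0}` and `offset1` occupies `Inst{15-8}`, the two bytes land in the instruction word in the same order as the original 16-bit value.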

multiclass DS_1A_RET_ <dsop op, string opName, RegisterClass rc,
dag outs = (outs rc:$vdst),
dag ins = (ins VGPR_32:$addr, offset:$offset, gds:$gds),
string asm = opName#" $vdst, $addr"#"$offset$gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>;

let data0 = 0, data1 = 0 in {
def _si : DS_Off16_Real_si <op.SI, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op.VI, opName, outs, ins, asm>;
}
}

// TODO: DS_1A_RET can be inherited from DS_1A_RET_ but its not working
// for some reason. In fact we can remove this class if use dsop everywhere
multiclass DS_1A_RET <bits<8> op, string opName, RegisterClass rc,
dag outs = (outs rc:$vdst),
dag ins = (ins VGPR_32:$addr, offset:$offset, gds:$gds),
string asm = opName#" $vdst, $addr"#"$offset$gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>;

let data0 = 0, data1 = 0 in {
def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
}
}

multiclass DS_1A_Off8_RET <bits<8> op, string opName, RegisterClass rc,
dag outs = (outs rc:$vdst),
dag ins = (ins VGPR_32:$addr, offset0:$offset0, offset1:$offset1,
gds:$gds),
string asm = opName#" $vdst, $addr"#"$offset0"#"$offset1$gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>;

let data0 = 0, data1 = 0, AsmMatchConverter = "cvtDSOffset01" in {
def _si : DS_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Real_vi <op, opName, outs, ins, asm>;
}
}

multiclass DS_1A1D_NORET <bits<8> op, string opName, RegisterClass rc,
dag outs = (outs),
dag ins = (ins VGPR_32:$addr, rc:$data0, offset:$offset, gds:$gds),
string asm = opName#" $addr, $data0"#"$offset$gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>,
AtomicNoRet<opName, 0>;

let data1 = 0, vdst = 0 in {
def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
}
}

multiclass DS_1A_Off8_NORET <bits<8> op, string opName,
dag outs = (outs),
dag ins = (ins VGPR_32:$addr,
offset0:$offset0, offset1:$offset1, gds:$gds),
string asm = opName#" $addr $offset0"#"$offset1$gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>;

let data0 = 0, data1 = 0, vdst = 0, AsmMatchConverter = "cvtDSOffset01" in {
def _si : DS_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Real_vi <op, opName, outs, ins, asm>;
}
}

multiclass DS_1A2D_Off8_NORET <bits<8> op, string opName, RegisterClass rc,
dag outs = (outs),
dag ins = (ins VGPR_32:$addr, rc:$data0, rc:$data1,
offset0:$offset0, offset1:$offset1, gds:$gds),
string asm = opName#" $addr, $data0, $data1$offset0$offset1$gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>;

let vdst = 0, AsmMatchConverter = "cvtDSOffset01" in {
def _si : DS_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Real_vi <op, opName, outs, ins, asm>;
}
}

multiclass DS_1A1D_RET <bits<8> op, string opName, RegisterClass rc,
string noRetOp = "",
dag outs = (outs rc:$vdst),
dag ins = (ins VGPR_32:$addr, rc:$data0, offset:$offset, gds:$gds),
string asm = opName#" $vdst, $addr, $data0"#"$offset$gds"> {

let hasPostISelHook = 1 in {
def "" : DS_Pseudo <opName, outs, ins, []>,
AtomicNoRet<noRetOp, 1>;

let data1 = 0 in {
def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
}
}
}

multiclass DS_1A1D_PERMUTE <bits<8> op, string opName, RegisterClass rc,
SDPatternOperator node = null_frag,
dag outs = (outs rc:$vdst),
dag ins = (ins VGPR_32:$addr, rc:$data0, offset:$offset),
string asm = opName#" $vdst, $addr, $data0"#"$offset"> {

let mayLoad = 0, mayStore = 0, isConvergent = 1 in {
def "" : DS_Pseudo <opName, outs, ins,
[(set i32:$vdst,
(node (DS1Addr1Offset i32:$addr, i16:$offset), i32:$data0))]>;

let data1 = 0, gds = 0 in {
def "_vi" : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
}
}
}

multiclass DS_1A2D_RET_m <bits<8> op, string opName, RegisterClass rc,
string noRetOp = "", dag ins,
dag outs = (outs rc:$vdst),
string asm = opName#" $vdst, $addr, $data0, $data1"#"$offset"#"$gds"> {

let hasPostISelHook = 1 in {
def "" : DS_Pseudo <opName, outs, ins, []>,
AtomicNoRet<noRetOp, 1>;

def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
}
}

multiclass DS_1A2D_RET <bits<8> op, string asm, RegisterClass rc,
string noRetOp = "", RegisterClass src = rc> :
DS_1A2D_RET_m <op, asm, rc, noRetOp,
(ins VGPR_32:$addr, src:$data0, src:$data1,
offset:$offset, gds:$gds)
>;

multiclass DS_1A2D_NORET <bits<8> op, string opName, RegisterClass rc,
string noRetOp = opName,
dag outs = (outs),
dag ins = (ins VGPR_32:$addr, rc:$data0, rc:$data1,
offset:$offset, gds:$gds),
string asm = opName#" $addr, $data0, $data1"#"$offset"#"$gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>,
AtomicNoRet<noRetOp, 0>;

let vdst = 0 in {
def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
}
}

multiclass DS_0A_RET <bits<8> op, string opName,
dag outs = (outs VGPR_32:$vdst),
dag ins = (ins offset:$offset, gds:$gds),
string asm = opName#" $vdst"#"$offset"#"$gds"> {

let mayLoad = 1, mayStore = 1 in {
def "" : DS_Pseudo <opName, outs, ins, []>;

let addr = 0, data0 = 0, data1 = 0 in {
def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
} // end addr = 0, data0 = 0, data1 = 0
} // end mayLoad = 1, mayStore = 1
}

multiclass DS_1A_RET_GDS <bits<8> op, string opName,
dag outs = (outs VGPR_32:$vdst),
dag ins = (ins VGPR_32:$addr, offset:$offset),
string asm = opName#" $vdst, $addr"#"$offset gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>;

let data0 = 0, data1 = 0, gds = 1 in {
def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
} // end data0 = 0, data1 = 0, gds = 1
}

multiclass DS_1A_GDS <bits<8> op, string opName,
dag outs = (outs),
dag ins = (ins VGPR_32:$addr),
string asm = opName#" $addr gds"> {

def "" : DS_Pseudo <opName, outs, ins, []>;

let vdst = 0, data0 = 0, data1 = 0, offset0 = 0, offset1 = 0, gds = 1 in {
def _si : DS_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Real_vi <op, opName, outs, ins, asm>;
} // end vdst = 0, data = 0, data1 = 0, gds = 1
}

multiclass DS_1A <bits<8> op, string opName,
dag outs = (outs),
dag ins = (ins VGPR_32:$addr, offset:$offset, gds:$gds),
string asm = opName#" $addr"#"$offset"#"$gds"> {

let mayLoad = 1, mayStore = 1 in {
def "" : DS_Pseudo <opName, outs, ins, []>;

let vdst = 0, data0 = 0, data1 = 0 in {
def _si : DS_Off16_Real_si <op, opName, outs, ins, asm>;
def _vi : DS_Off16_Real_vi <op, opName, outs, ins, asm>;
} // let vdst = 0, data0 = 0, data1 = 0
} // end mayLoad = 1, mayStore = 1
}

//===----------------------------------------------------------------------===//
// MTBUF classes		// MTBUF classes
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class MTBUF_Pseudo <string opName, dag outs, dag ins, list<dag> pattern> :		class MTBUF_Pseudo <string opName, dag outs, dag ins, list<dag> pattern> :
MTBUF <outs, ins, "", pattern>,		MTBUF <outs, ins, "", pattern>,
SIMCInstr<opName, SIEncodingFamily.NONE> {		SIMCInstr<opName, SIEncodingFamily.NONE> {
let isPseudo = 1;		let isPseudo = 1;
let isCodeGenOnly = 1;		let isCodeGenOnly = 1;
[... 866 lines collapsed ...]
def getAtomicNoRetOp : InstrMapping {
let ColFields = ["IsRet"];		let ColFields = ["IsRet"];
let KeyCol = ["1"];		let KeyCol = ["1"];
let ValueCols = [["0"]];		let ValueCols = [["0"]];
}		}

include "SIInstructions.td"		include "SIInstructions.td"
include "CIInstructions.td"		include "CIInstructions.td"
include "VIInstructions.td"		include "VIInstructions.td"

		include "DSInstructions.td"

lib/Target/AMDGPU/SIInstructions.td

	[... 751 lines collapsed ...]
	} // End isCompare = 1, isCommutable = 1			} // End isCompare = 1, isCommutable = 1

	defm V_CMP_CLASS_F32 : VOPC_CLASS_F32 <vopc<0x88, 0x10>, "v_cmp_class_f32">;			defm V_CMP_CLASS_F32 : VOPC_CLASS_F32 <vopc<0x88, 0x10>, "v_cmp_class_f32">;
	defm V_CMPX_CLASS_F32 : VOPCX_CLASS_F32 <vopc<0x98, 0x11>, "v_cmpx_class_f32">;			defm V_CMPX_CLASS_F32 : VOPCX_CLASS_F32 <vopc<0x98, 0x11>, "v_cmpx_class_f32">;
	defm V_CMP_CLASS_F64 : VOPC_CLASS_F64 <vopc<0xa8, 0x12>, "v_cmp_class_f64">;			defm V_CMP_CLASS_F64 : VOPC_CLASS_F64 <vopc<0xa8, 0x12>, "v_cmp_class_f64">;
	defm V_CMPX_CLASS_F64 : VOPCX_CLASS_F64 <vopc<0xb8, 0x13>, "v_cmpx_class_f64">;			defm V_CMPX_CLASS_F64 : VOPCX_CLASS_F64 <vopc<0xb8, 0x13>, "v_cmpx_class_f64">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// DS Instructions
	//===----------------------------------------------------------------------===//

	defm DS_ADD_U32 : DS_1A1D_NORET <0x0, "ds_add_u32", VGPR_32>;
	defm DS_SUB_U32 : DS_1A1D_NORET <0x1, "ds_sub_u32", VGPR_32>;
	defm DS_RSUB_U32 : DS_1A1D_NORET <0x2, "ds_rsub_u32", VGPR_32>;
	defm DS_INC_U32 : DS_1A1D_NORET <0x3, "ds_inc_u32", VGPR_32>;
	defm DS_DEC_U32 : DS_1A1D_NORET <0x4, "ds_dec_u32", VGPR_32>;
	defm DS_MIN_I32 : DS_1A1D_NORET <0x5, "ds_min_i32", VGPR_32>;
	defm DS_MAX_I32 : DS_1A1D_NORET <0x6, "ds_max_i32", VGPR_32>;
	defm DS_MIN_U32 : DS_1A1D_NORET <0x7, "ds_min_u32", VGPR_32>;
	defm DS_MAX_U32 : DS_1A1D_NORET <0x8, "ds_max_u32", VGPR_32>;
	defm DS_AND_B32 : DS_1A1D_NORET <0x9, "ds_and_b32", VGPR_32>;
	defm DS_OR_B32 : DS_1A1D_NORET <0xa, "ds_or_b32", VGPR_32>;
	defm DS_XOR_B32 : DS_1A1D_NORET <0xb, "ds_xor_b32", VGPR_32>;
	defm DS_MSKOR_B32 : DS_1A2D_NORET <0xc, "ds_mskor_b32", VGPR_32>;
	let mayLoad = 0 in {
	defm DS_WRITE_B32 : DS_1A1D_NORET <0xd, "ds_write_b32", VGPR_32>;
	defm DS_WRITE2_B32 : DS_1A2D_Off8_NORET <0xe, "ds_write2_b32", VGPR_32>;
	defm DS_WRITE2ST64_B32 : DS_1A2D_Off8_NORET <0xf, "ds_write2st64_b32", VGPR_32>;
	}
	defm DS_CMPST_B32 : DS_1A2D_NORET <0x10, "ds_cmpst_b32", VGPR_32>;
	defm DS_CMPST_F32 : DS_1A2D_NORET <0x11, "ds_cmpst_f32", VGPR_32>;
	defm DS_MIN_F32 : DS_1A2D_NORET <0x12, "ds_min_f32", VGPR_32>;
	defm DS_MAX_F32 : DS_1A2D_NORET <0x13, "ds_max_f32", VGPR_32>;

	defm DS_GWS_INIT : DS_1A_GDS <0x19, "ds_gws_init">;
	defm DS_GWS_SEMA_V : DS_1A_GDS <0x1a, "ds_gws_sema_v">;
	defm DS_GWS_SEMA_BR : DS_1A_GDS <0x1b, "ds_gws_sema_br">;
	defm DS_GWS_SEMA_P : DS_1A_GDS <0x1c, "ds_gws_sema_p">;
	defm DS_GWS_BARRIER : DS_1A_GDS <0x1d, "ds_gws_barrier">;
	let mayLoad = 0 in {
	defm DS_WRITE_B8 : DS_1A1D_NORET <0x1e, "ds_write_b8", VGPR_32>;
	defm DS_WRITE_B16 : DS_1A1D_NORET <0x1f, "ds_write_b16", VGPR_32>;
	}
	defm DS_ADD_RTN_U32 : DS_1A1D_RET <0x20, "ds_add_rtn_u32", VGPR_32, "ds_add_u32">;
	defm DS_SUB_RTN_U32 : DS_1A1D_RET <0x21, "ds_sub_rtn_u32", VGPR_32, "ds_sub_u32">;
	defm DS_RSUB_RTN_U32 : DS_1A1D_RET <0x22, "ds_rsub_rtn_u32", VGPR_32, "ds_rsub_u32">;
	defm DS_INC_RTN_U32 : DS_1A1D_RET <0x23, "ds_inc_rtn_u32", VGPR_32, "ds_inc_u32">;
	defm DS_DEC_RTN_U32 : DS_1A1D_RET <0x24, "ds_dec_rtn_u32", VGPR_32, "ds_dec_u32">;
	defm DS_MIN_RTN_I32 : DS_1A1D_RET <0x25, "ds_min_rtn_i32", VGPR_32, "ds_min_i32">;
	defm DS_MAX_RTN_I32 : DS_1A1D_RET <0x26, "ds_max_rtn_i32", VGPR_32, "ds_max_i32">;
	defm DS_MIN_RTN_U32 : DS_1A1D_RET <0x27, "ds_min_rtn_u32", VGPR_32, "ds_min_u32">;
	defm DS_MAX_RTN_U32 : DS_1A1D_RET <0x28, "ds_max_rtn_u32", VGPR_32, "ds_max_u32">;
	defm DS_AND_RTN_B32 : DS_1A1D_RET <0x29, "ds_and_rtn_b32", VGPR_32, "ds_and_b32">;
	defm DS_OR_RTN_B32 : DS_1A1D_RET <0x2a, "ds_or_rtn_b32", VGPR_32, "ds_or_b32">;
	defm DS_XOR_RTN_B32 : DS_1A1D_RET <0x2b, "ds_xor_rtn_b32", VGPR_32, "ds_xor_b32">;
	defm DS_MSKOR_RTN_B32 : DS_1A2D_RET <0x2c, "ds_mskor_rtn_b32", VGPR_32, "ds_mskor_b32">;
	defm DS_WRXCHG_RTN_B32 : DS_1A1D_RET <0x2d, "ds_wrxchg_rtn_b32", VGPR_32>;
	defm DS_WRXCHG2_RTN_B32 : DS_1A2D_RET <
	0x2e, "ds_wrxchg2_rtn_b32", VReg_64, "", VGPR_32
	>;
	defm DS_WRXCHG2ST64_RTN_B32 : DS_1A2D_RET <
	0x2f, "ds_wrxchg2st64_rtn_b32", VReg_64, "", VGPR_32
	>;
	defm DS_CMPST_RTN_B32 : DS_1A2D_RET <0x30, "ds_cmpst_rtn_b32", VGPR_32, "ds_cmpst_b32">;
	defm DS_CMPST_RTN_F32 : DS_1A2D_RET <0x31, "ds_cmpst_rtn_f32", VGPR_32, "ds_cmpst_f32">;
	defm DS_MIN_RTN_F32 : DS_1A2D_RET <0x32, "ds_min_rtn_f32", VGPR_32, "ds_min_f32">;
	defm DS_MAX_RTN_F32 : DS_1A2D_RET <0x33, "ds_max_rtn_f32", VGPR_32, "ds_max_f32">;

	let Uses = [EXEC], mayLoad =0, mayStore = 0, isConvergent = 1 in {
	defm DS_SWIZZLE_B32 : DS_1A_RET_ <dsop<0x35, 0x3d>, "ds_swizzle_b32", VGPR_32>;
	}

	let mayStore = 0 in {
	defm DS_READ_B32 : DS_1A_RET <0x36, "ds_read_b32", VGPR_32>;
	defm DS_READ2_B32 : DS_1A_Off8_RET <0x37, "ds_read2_b32", VReg_64>;
	defm DS_READ2ST64_B32 : DS_1A_Off8_RET <0x38, "ds_read2st64_b32", VReg_64>;
	defm DS_READ_I8 : DS_1A_RET <0x39, "ds_read_i8", VGPR_32>;
	defm DS_READ_U8 : DS_1A_RET <0x3a, "ds_read_u8", VGPR_32>;
	defm DS_READ_I16 : DS_1A_RET <0x3b, "ds_read_i16", VGPR_32>;
	defm DS_READ_U16 : DS_1A_RET <0x3c, "ds_read_u16", VGPR_32>;
	}
	defm DS_CONSUME : DS_0A_RET <0x3d, "ds_consume">;
	defm DS_APPEND : DS_0A_RET <0x3e, "ds_append">;
	defm DS_ORDERED_COUNT : DS_1A_RET_GDS <0x3f, "ds_ordered_count">;
	defm DS_ADD_U64 : DS_1A1D_NORET <0x40, "ds_add_u64", VReg_64>;
	defm DS_SUB_U64 : DS_1A1D_NORET <0x41, "ds_sub_u64", VReg_64>;
	defm DS_RSUB_U64 : DS_1A1D_NORET <0x42, "ds_rsub_u64", VReg_64>;
	defm DS_INC_U64 : DS_1A1D_NORET <0x43, "ds_inc_u64", VReg_64>;
	defm DS_DEC_U64 : DS_1A1D_NORET <0x44, "ds_dec_u64", VReg_64>;
	defm DS_MIN_I64 : DS_1A1D_NORET <0x45, "ds_min_i64", VReg_64>;
	defm DS_MAX_I64 : DS_1A1D_NORET <0x46, "ds_max_i64", VReg_64>;
	defm DS_MIN_U64 : DS_1A1D_NORET <0x47, "ds_min_u64", VReg_64>;
	defm DS_MAX_U64 : DS_1A1D_NORET <0x48, "ds_max_u64", VReg_64>;
	defm DS_AND_B64 : DS_1A1D_NORET <0x49, "ds_and_b64", VReg_64>;
	defm DS_OR_B64 : DS_1A1D_NORET <0x4a, "ds_or_b64", VReg_64>;
	defm DS_XOR_B64 : DS_1A1D_NORET <0x4b, "ds_xor_b64", VReg_64>;
	defm DS_MSKOR_B64 : DS_1A2D_NORET <0x4c, "ds_mskor_b64", VReg_64>;
	let mayLoad = 0 in {
	defm DS_WRITE_B64 : DS_1A1D_NORET <0x4d, "ds_write_b64", VReg_64>;
	defm DS_WRITE2_B64 : DS_1A2D_Off8_NORET <0x4E, "ds_write2_b64", VReg_64>;
	defm DS_WRITE2ST64_B64 : DS_1A2D_Off8_NORET <0x4f, "ds_write2st64_b64", VReg_64>;
	}
	defm DS_CMPST_B64 : DS_1A2D_NORET <0x50, "ds_cmpst_b64", VReg_64>;
	defm DS_CMPST_F64 : DS_1A2D_NORET <0x51, "ds_cmpst_f64", VReg_64>;
	defm DS_MIN_F64 : DS_1A1D_NORET <0x52, "ds_min_f64", VReg_64>;
	defm DS_MAX_F64 : DS_1A1D_NORET <0x53, "ds_max_f64", VReg_64>;

	defm DS_ADD_RTN_U64 : DS_1A1D_RET <0x60, "ds_add_rtn_u64", VReg_64, "ds_add_u64">;
	defm DS_SUB_RTN_U64 : DS_1A1D_RET <0x61, "ds_sub_rtn_u64", VReg_64, "ds_sub_u64">;
	defm DS_RSUB_RTN_U64 : DS_1A1D_RET <0x62, "ds_rsub_rtn_u64", VReg_64, "ds_rsub_u64">;
	defm DS_INC_RTN_U64 : DS_1A1D_RET <0x63, "ds_inc_rtn_u64", VReg_64, "ds_inc_u64">;
	defm DS_DEC_RTN_U64 : DS_1A1D_RET <0x64, "ds_dec_rtn_u64", VReg_64, "ds_dec_u64">;
	defm DS_MIN_RTN_I64 : DS_1A1D_RET <0x65, "ds_min_rtn_i64", VReg_64, "ds_min_i64">;
	defm DS_MAX_RTN_I64 : DS_1A1D_RET <0x66, "ds_max_rtn_i64", VReg_64, "ds_max_i64">;
	defm DS_MIN_RTN_U64 : DS_1A1D_RET <0x67, "ds_min_rtn_u64", VReg_64, "ds_min_u64">;
	defm DS_MAX_RTN_U64 : DS_1A1D_RET <0x68, "ds_max_rtn_u64", VReg_64, "ds_max_u64">;
	defm DS_AND_RTN_B64 : DS_1A1D_RET <0x69, "ds_and_rtn_b64", VReg_64, "ds_and_b64">;
	defm DS_OR_RTN_B64 : DS_1A1D_RET <0x6a, "ds_or_rtn_b64", VReg_64, "ds_or_b64">;
	defm DS_XOR_RTN_B64 : DS_1A1D_RET <0x6b, "ds_xor_rtn_b64", VReg_64, "ds_xor_b64">;
	defm DS_MSKOR_RTN_B64 : DS_1A2D_RET <0x6c, "ds_mskor_rtn_b64", VReg_64, "ds_mskor_b64">;
	defm DS_WRXCHG_RTN_B64 : DS_1A1D_RET <0x6d, "ds_wrxchg_rtn_b64", VReg_64, "ds_wrxchg_b64">;
	defm DS_WRXCHG2_RTN_B64 : DS_1A2D_RET <0x6e, "ds_wrxchg2_rtn_b64", VReg_128, "ds_wrxchg2_b64", VReg_64>;
	defm DS_WRXCHG2ST64_RTN_B64 : DS_1A2D_RET <0x6f, "ds_wrxchg2st64_rtn_b64", VReg_128, "ds_wrxchg2st64_b64", VReg_64>;
	defm DS_CMPST_RTN_B64 : DS_1A2D_RET <0x70, "ds_cmpst_rtn_b64", VReg_64, "ds_cmpst_b64">;
	defm DS_CMPST_RTN_F64 : DS_1A2D_RET <0x71, "ds_cmpst_rtn_f64", VReg_64, "ds_cmpst_f64">;
	defm DS_MIN_RTN_F64 : DS_1A1D_RET <0x72, "ds_min_rtn_f64", VReg_64, "ds_min_f64">;
	defm DS_MAX_RTN_F64 : DS_1A1D_RET <0x73, "ds_max_rtn_f64", VReg_64, "ds_max_f64">;

	let mayStore = 0 in {
	defm DS_READ_B64 : DS_1A_RET <0x76, "ds_read_b64", VReg_64>;
	defm DS_READ2_B64 : DS_1A_Off8_RET <0x77, "ds_read2_b64", VReg_128>;
	defm DS_READ2ST64_B64 : DS_1A_Off8_RET <0x78, "ds_read2st64_b64", VReg_128>;
	}

	defm DS_ADD_SRC2_U32 : DS_1A <0x80, "ds_add_src2_u32">;
	defm DS_SUB_SRC2_U32 : DS_1A <0x81, "ds_sub_src2_u32">;
	defm DS_RSUB_SRC2_U32 : DS_1A <0x82, "ds_rsub_src2_u32">;
	defm DS_INC_SRC2_U32 : DS_1A <0x83, "ds_inc_src2_u32">;
	defm DS_DEC_SRC2_U32 : DS_1A <0x84, "ds_dec_src2_u32">;
	defm DS_MIN_SRC2_I32 : DS_1A <0x85, "ds_min_src2_i32">;
	defm DS_MAX_SRC2_I32 : DS_1A <0x86, "ds_max_src2_i32">;
	defm DS_MIN_SRC2_U32 : DS_1A <0x87, "ds_min_src2_u32">;
	defm DS_MAX_SRC2_U32 : DS_1A <0x88, "ds_max_src2_u32">;
	defm DS_AND_SRC2_B32 : DS_1A <0x89, "ds_and_src_b32">;
	defm DS_OR_SRC2_B32 : DS_1A <0x8a, "ds_or_src2_b32">;
	defm DS_XOR_SRC2_B32 : DS_1A <0x8b, "ds_xor_src2_b32">;
	defm DS_WRITE_SRC2_B32 : DS_1A_Off8_NORET <0x8d, "ds_write_src2_b32">;

	defm DS_MIN_SRC2_F32 : DS_1A <0x92, "ds_min_src2_f32">;
	defm DS_MAX_SRC2_F32 : DS_1A <0x93, "ds_max_src2_f32">;

	defm DS_ADD_SRC2_U64 : DS_1A <0xc0, "ds_add_src2_u64">;
	defm DS_SUB_SRC2_U64 : DS_1A <0xc1, "ds_sub_src2_u64">;
	defm DS_RSUB_SRC2_U64 : DS_1A <0xc2, "ds_rsub_src2_u64">;
	defm DS_INC_SRC2_U64 : DS_1A <0xc3, "ds_inc_src2_u64">;
	defm DS_DEC_SRC2_U64 : DS_1A <0xc4, "ds_dec_src2_u64">;
	defm DS_MIN_SRC2_I64 : DS_1A <0xc5, "ds_min_src2_i64">;
	defm DS_MAX_SRC2_I64 : DS_1A <0xc6, "ds_max_src2_i64">;
	defm DS_MIN_SRC2_U64 : DS_1A <0xc7, "ds_min_src2_u64">;
	defm DS_MAX_SRC2_U64 : DS_1A <0xc8, "ds_max_src2_u64">;
	defm DS_AND_SRC2_B64 : DS_1A <0xc9, "ds_and_src2_b64">;
	defm DS_OR_SRC2_B64 : DS_1A <0xca, "ds_or_src2_b64">;
	defm DS_XOR_SRC2_B64 : DS_1A <0xcb, "ds_xor_src2_b64">;
	defm DS_WRITE_SRC2_B64 : DS_1A_Off8_NORET <0xcd, "ds_write_src2_b64">;

	defm DS_MIN_SRC2_F64 : DS_1A <0xd2, "ds_min_src2_f64">;
	defm DS_MAX_SRC2_F64 : DS_1A <0xd3, "ds_max_src2_f64">;

	//===----------------------------------------------------------------------===//
	// MUBUF Instructions			// MUBUF Instructions
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	defm BUFFER_LOAD_FORMAT_X : MUBUF_Load_Helper <			defm BUFFER_LOAD_FORMAT_X : MUBUF_Load_Helper <
	mubuf<0x00>, "buffer_load_format_x", VGPR_32			mubuf<0x00>, "buffer_load_format_x", VGPR_32
	>;			>;
	defm BUFFER_LOAD_FORMAT_XY : MUBUF_Load_Helper <			defm BUFFER_LOAD_FORMAT_XY : MUBUF_Load_Helper <
	mubuf<0x01>, "buffer_load_format_xy", VReg_64			mubuf<0x01>, "buffer_load_format_xy", VReg_64
	[...]
	// S_GETREG_B32 Intrinsic Pattern.			// S_GETREG_B32 Intrinsic Pattern.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	def : Pat <			def : Pat <
	(int_amdgcn_s_getreg imm:$simm16),			(int_amdgcn_s_getreg imm:$simm16),
	(S_GETREG_B32 (as_i16imm $simm16))			(S_GETREG_B32 (as_i16imm $simm16))
	>;			>;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// DS_SWIZZLE Intrinsic Pattern.
	//===----------------------------------------------------------------------===//
	def : Pat <
	(int_amdgcn_ds_swizzle i32:$src, imm:$offset16),
	(DS_SWIZZLE_B32 $src, (as_i16imm $offset16), (i1 0))
	>;

	//===----------------------------------------------------------------------===//
	// SMRD Patterns			// SMRD Patterns
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	multiclass SMRD_Pattern <string Instr, ValueType vt> {			multiclass SMRD_Pattern <string Instr, ValueType vt> {

	// 1. IMM offset			// 1. IMM offset
	def : Pat <			def : Pat <
	(smrd_load (SMRDImm i64:$sbase, i32:$offset)),			(smrd_load (SMRDImm i64:$sbase, i32:$offset)),
	[...]
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def : IMad24Pat<V_MAD_I32_I24>;			def : IMad24Pat<V_MAD_I32_I24>;
	def : UMad24Pat<V_MAD_U32_U24>;			def : UMad24Pat<V_MAD_U32_U24>;

	defm : BFIPatterns <V_BFI_B32, S_MOV_B32, SReg_64>;			defm : BFIPatterns <V_BFI_B32, S_MOV_B32, SReg_64>;
	def : ROTRPattern <V_ALIGNBIT_B32>;			def : ROTRPattern <V_ALIGNBIT_B32>;

	/******** ======================= ********/
	/******** Load/Store Patterns ********/
	/******** ======================= ********/

	class DSReadPat <DS inst, ValueType vt, PatFrag frag> : Pat <
	(vt (frag (DS1Addr1Offset i32:$ptr, i32:$offset))),
	(inst $ptr, (as_i16imm $offset), (i1 0))
	>;

	def : DSReadPat <DS_READ_I8, i32, si_sextload_local_i8>;
	def : DSReadPat <DS_READ_U8, i32, si_az_extload_local_i8>;
	def : DSReadPat <DS_READ_I16, i32, si_sextload_local_i16>;
	def : DSReadPat <DS_READ_U16, i32, si_az_extload_local_i16>;
	def : DSReadPat <DS_READ_B32, i32, si_load_local>;

	let AddedComplexity = 100 in {

	def : DSReadPat <DS_READ_B64, v2i32, si_load_local_align8>;

	} // End AddedComplexity = 100

	def : Pat <
	(v2i32 (si_load_local (DS64Bit4ByteAligned i32:$ptr, i8:$offset0,
	i8:$offset1))),
	(DS_READ2_B32 $ptr, $offset0, $offset1, (i1 0))
	>;

	class DSWritePat <DS inst, ValueType vt, PatFrag frag> : Pat <
	(frag vt:$value, (DS1Addr1Offset i32:$ptr, i32:$offset)),
	(inst $ptr, $value, (as_i16imm $offset), (i1 0))
	>;

	def : DSWritePat <DS_WRITE_B8, i32, si_truncstore_local_i8>;
	def : DSWritePat <DS_WRITE_B16, i32, si_truncstore_local_i16>;
	def : DSWritePat <DS_WRITE_B32, i32, si_store_local>;

	let AddedComplexity = 100 in {

	def : DSWritePat <DS_WRITE_B64, v2i32, si_store_local_align8>;
	} // End AddedComplexity = 100

	def : Pat <
	(si_store_local v2i32:$value, (DS64Bit4ByteAligned i32:$ptr, i8:$offset0,
	i8:$offset1)),
	(DS_WRITE2_B32 $ptr, (EXTRACT_SUBREG $value, sub0),
	(EXTRACT_SUBREG $value, sub1), $offset0, $offset1,
	(i1 0))
	>;

	class DSAtomicRetPat<DS inst, ValueType vt, PatFrag frag> : Pat <
	(frag (DS1Addr1Offset i32:$ptr, i32:$offset), vt:$value),
	(inst $ptr, $value, (as_i16imm $offset), (i1 0))
	>;

	class DSAtomicCmpXChg <DS inst, ValueType vt, PatFrag frag> : Pat <
	(frag (DS1Addr1Offset i32:$ptr, i32:$offset), vt:$cmp, vt:$swap),
	(inst $ptr, $cmp, $swap, (as_i16imm $offset), (i1 0))
	>;


	// 32-bit atomics.
	def : DSAtomicRetPat<DS_WRXCHG_RTN_B32, i32, si_atomic_swap_local>;
	def : DSAtomicRetPat<DS_ADD_RTN_U32, i32, si_atomic_load_add_local>;
	def : DSAtomicRetPat<DS_SUB_RTN_U32, i32, si_atomic_load_sub_local>;
	def : DSAtomicRetPat<DS_INC_RTN_U32, i32, si_atomic_inc_local>;
	def : DSAtomicRetPat<DS_DEC_RTN_U32, i32, si_atomic_dec_local>;
	def : DSAtomicRetPat<DS_AND_RTN_B32, i32, si_atomic_load_and_local>;
	def : DSAtomicRetPat<DS_OR_RTN_B32, i32, si_atomic_load_or_local>;
	def : DSAtomicRetPat<DS_XOR_RTN_B32, i32, si_atomic_load_xor_local>;
	def : DSAtomicRetPat<DS_MIN_RTN_I32, i32, si_atomic_load_min_local>;
	def : DSAtomicRetPat<DS_MAX_RTN_I32, i32, si_atomic_load_max_local>;
	def : DSAtomicRetPat<DS_MIN_RTN_U32, i32, si_atomic_load_umin_local>;
	def : DSAtomicRetPat<DS_MAX_RTN_U32, i32, si_atomic_load_umax_local>;
	def : DSAtomicCmpXChg<DS_CMPST_RTN_B32, i32, si_atomic_cmp_swap_32_local>;

	// 64-bit atomics.
	def : DSAtomicRetPat<DS_WRXCHG_RTN_B64, i64, si_atomic_swap_local>;
	def : DSAtomicRetPat<DS_ADD_RTN_U64, i64, si_atomic_load_add_local>;
	def : DSAtomicRetPat<DS_SUB_RTN_U64, i64, si_atomic_load_sub_local>;
	def : DSAtomicRetPat<DS_INC_RTN_U64, i64, si_atomic_inc_local>;
	def : DSAtomicRetPat<DS_DEC_RTN_U64, i64, si_atomic_dec_local>;
	def : DSAtomicRetPat<DS_AND_RTN_B64, i64, si_atomic_load_and_local>;
	def : DSAtomicRetPat<DS_OR_RTN_B64, i64, si_atomic_load_or_local>;
	def : DSAtomicRetPat<DS_XOR_RTN_B64, i64, si_atomic_load_xor_local>;
	def : DSAtomicRetPat<DS_MIN_RTN_I64, i64, si_atomic_load_min_local>;
	def : DSAtomicRetPat<DS_MAX_RTN_I64, i64, si_atomic_load_max_local>;
	def : DSAtomicRetPat<DS_MIN_RTN_U64, i64, si_atomic_load_umin_local>;
	def : DSAtomicRetPat<DS_MAX_RTN_U64, i64, si_atomic_load_umax_local>;

	def : DSAtomicCmpXChg<DS_CMPST_RTN_B64, i64, si_atomic_cmp_swap_64_local>;


	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// MUBUF Patterns			// MUBUF Patterns
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	class MUBUFLoad_Pattern <MUBUF Instr_ADDR64, ValueType vt,			class MUBUFLoad_Pattern <MUBUF Instr_ADDR64, ValueType vt,
	PatFrag constant_ld> : Pat <			PatFrag constant_ld> : Pat <
	(vt (constant_ld (MUBUFAddr64 v4i32:$srsrc, i64:$vaddr, i32:$soffset,			(vt (constant_ld (MUBUFAddr64 v4i32:$srsrc, i64:$vaddr, i32:$soffset,
	i16:$offset, i1:$glc, i1:$slc, i1:$tfe))),			i16:$offset, i1:$glc, i1:$slc, i1:$tfe))),
	[...]

lib/Target/AMDGPU/VIInstrFormats.td

	//===-- VIInstrFormats.td - VI Instruction Encodings ----------------------===//			//===-- VIInstrFormats.td - VI Instruction Encodings ----------------------===//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// VI Instruction format definitions.			// VI Instruction format definitions.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	class DSe_vi <bits<8> op> : Enc64 {
	bits<8> vdst;
	bits<1> gds;
	bits<8> addr;
	bits<8> data0;
	bits<8> data1;
	bits<8> offset0;
	bits<8> offset1;

	let Inst{7-0} = offset0;
	let Inst{15-8} = offset1;
	let Inst{16} = gds;
	let Inst{24-17} = op;
	let Inst{31-26} = 0x36; //encoding
	let Inst{39-32} = addr;
	let Inst{47-40} = data0;
	let Inst{55-48} = data1;
	let Inst{63-56} = vdst;
	}

	class MUBUFe_vi <bits<7> op> : Enc64 {			class MUBUFe_vi <bits<7> op> : Enc64 {
	bits<12> offset;			bits<12> offset;
	bits<1> offen;			bits<1> offen;
	bits<1> idxen;			bits<1> idxen;
	bits<1> glc;			bits<1> glc;
	bits<1> lds;			bits<1> lds;
	bits<8> vaddr;			bits<8> vaddr;
	bits<8> vdata;			bits<8> vdata;
	[...]
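
For readers who want to check the `DSe_vi` layout against the assembler tests, here is a minimal Python sketch (not part of the patch) that packs the fields into the 64-bit word as the `let Inst{...}` lines above specify. The helper name `encode_ds_vi` is hypothetical. The sketch assumes that a VGPR operand `vN` is encoded as the plain integer `N` in its 8-bit field, which agrees with the VI check lines in test/MC/AMDGPU/ds.s.

```python
def encode_ds_vi(op, addr=0, data0=0, data1=0, vdst=0,
                 offset0=0, offset1=0, gds=0):
    """Pack DS fields per the DSe_vi bit layout; return 8 little-endian bytes."""
    inst = 0
    inst |= (offset0 & 0xFF) << 0   # Inst{7-0}
    inst |= (offset1 & 0xFF) << 8   # Inst{15-8}
    inst |= (gds & 0x1) << 16       # Inst{16}
    inst |= (op & 0xFF) << 17       # Inst{24-17}
    inst |= 0x36 << 26              # Inst{31-26}, fixed DS encoding tag
    inst |= (addr & 0xFF) << 32     # Inst{39-32}
    inst |= (data0 & 0xFF) << 40    # Inst{47-40}
    inst |= (data1 & 0xFF) << 48    # Inst{55-48}
    inst |= (vdst & 0xFF) << 56     # Inst{63-56}
    return inst.to_bytes(8, "little")


# ds_read_i16 v8, v2: opcode 0x3b, addr = v2, vdst = v8 (assumed vN -> N).
encoded = encode_ds_vi(op=0x3B, addr=2, vdst=8)
print(",".join(f"0x{b:02x}" for b in encoded))
# prints: 0x00,0x00,0x76,0xd8,0x02,0x00,0x00,0x08
```

The printed bytes match the `// VI: ds_read_i16 v8, v2` check line. Because `op` starts at bit 17, the opcode appears shifted left by one in the third byte (0x3b becomes 0x76).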

lib/Target/AMDGPU/VIInstructions.td

	[...]
	// Misc Patterns			// Misc Patterns
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def : Pat <			def : Pat <
	(i64 (readcyclecounter)),			(i64 (readcyclecounter)),
	(S_MEMREALTIME)			(S_MEMREALTIME)
	>;			>;

	//===----------------------------------------------------------------------===//
	// DS_PERMUTE/DS_BPERMUTE Instructions.
	//===----------------------------------------------------------------------===//

	let Uses = [EXEC] in {
	defm DS_PERMUTE_B32 : DS_1A1D_PERMUTE <0x3e, "ds_permute_b32", VGPR_32,
	int_amdgcn_ds_permute>;
	defm DS_BPERMUTE_B32 : DS_1A1D_PERMUTE <0x3f, "ds_bpermute_b32", VGPR_32,
	int_amdgcn_ds_bpermute>;
	}

	} // End Predicates = [isVI]			} // End Predicates = [isVI]

test/MC/AMDGPU/ds.s

	[...]
	ds_read_i16 v8, v2			ds_read_i16 v8, v2
	// SICI: ds_read_i16 v8, v2 ; encoding: [0x00,0x00,0xec,0xd8,0x02,0x00,0x00,0x08]			// SICI: ds_read_i16 v8, v2 ; encoding: [0x00,0x00,0xec,0xd8,0x02,0x00,0x00,0x08]
	// VI: ds_read_i16 v8, v2 ; encoding: [0x00,0x00,0x76,0xd8,0x02,0x00,0x00,0x08]			// VI: ds_read_i16 v8, v2 ; encoding: [0x00,0x00,0x76,0xd8,0x02,0x00,0x00,0x08]

	ds_read_u16 v8, v2			ds_read_u16 v8, v2
	// SICI: ds_read_u16 v8, v2 ; encoding: [0x00,0x00,0xf0,0xd8,0x02,0x00,0x00,0x08]			// SICI: ds_read_u16 v8, v2 ; encoding: [0x00,0x00,0xf0,0xd8,0x02,0x00,0x00,0x08]
	// VI: ds_read_u16 v8, v2 ; encoding: [0x00,0x00,0x78,0xd8,0x02,0x00,0x00,0x08]			// VI: ds_read_u16 v8, v2 ; encoding: [0x00,0x00,0x78,0xd8,0x02,0x00,0x00,0x08]

	ds_consume v8
	// SICI: ds_consume v8 ; encoding: [0x00,0x00,0xf4,0xd8,0x00,0x00,0x00,0x08]			//ds_consume v8
	// VI: ds_consume v8 ; encoding: [0x00,0x00,0x7a,0xd8,0x00,0x00,0x00,0x08]			// FIXMESICI: ds_consume v8 ; encoding: [0x00,0x00,0xf4,0xd8,0x00,0x00,0x00,0x08]
				// FIXMEVI: ds_consume v8 ; encoding: [0x00,0x00,0x7a,0xd8,0x00,0x00,0x00,0x08]
	ds_append v8
	// SICI: ds_append v8 ; encoding: [0x00,0x00,0xf8,0xd8,0x00,0x00,0x00,0x08]			//ds_append v8
	// VI: ds_append v8 ; encoding: [0x00,0x00,0x7c,0xd8,0x00,0x00,0x00,0x08]			// FIXMESICI: ds_append v8 ; encoding: [0x00,0x00,0xf8,0xd8,0x00,0x00,0x00,0x08]
				// FIXMEVI: ds_append v8 ; encoding: [0x00,0x00,0x7c,0xd8,0x00,0x00,0x00,0x08]
	ds_ordered_count v8, v2 gds
	// SICI: ds_ordered_count v8, v2 gds ; encoding: [0x00,0x00,0xfe,0xd8,0x02,0x00,0x00,0x08]			//ds_ordered_count v8, v2 gds
	// VI: ds_ordered_count v8, v2 gds ; encoding: [0x00,0x00,0x7f,0xd8,0x02,0x00,0x00,0x08]			// FIXMESICI: ds_ordered_count v8, v2 gds ; encoding: [0x00,0x00,0xfe,0xd8,0x02,0x00,0x00,0x08]
				// FIXMEVI: ds_ordered_count v8, v2 gds ; encoding: [0x00,0x00,0x7f,0xd8,0x02,0x00,0x00,0x08]

	ds_add_u64 v2, v[4:5]			ds_add_u64 v2, v[4:5]
	// SICI: ds_add_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x00,0xd9,0x02,0x04,0x00,0x00]			// SICI: ds_add_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x00,0xd9,0x02,0x04,0x00,0x00]
	// VI: ds_add_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x80,0xd8,0x02,0x04,0x00,0x00]			// VI: ds_add_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x80,0xd8,0x02,0x04,0x00,0x00]

	ds_sub_u64 v2, v[4:5]			ds_sub_u64 v2, v[4:5]
	// SICI: ds_sub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x04,0xd9,0x02,0x04,0x00,0x00]			// SICI: ds_sub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x04,0xd9,0x02,0x04,0x00,0x00]
	// VI: ds_sub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x82,0xd8,0x02,0x04,0x00,0x00]			// VI: ds_sub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x82,0xd8,0x02,0x04,0x00,0x00]
	[...]

test/MC/Disassembler/AMDGPU/ds_vi.txt

	[...]
	0x00 0x00 0x74 0xd8 0x02 0x00 0x00 0x08			0x00 0x00 0x74 0xd8 0x02 0x00 0x00 0x08

	# VI: ds_read_i16 v8, v2 ; encoding: [0x00,0x00,0x76,0xd8,0x02,0x00,0x00,0x08]			# VI: ds_read_i16 v8, v2 ; encoding: [0x00,0x00,0x76,0xd8,0x02,0x00,0x00,0x08]
	0x00 0x00 0x76 0xd8 0x02 0x00 0x00 0x08			0x00 0x00 0x76 0xd8 0x02 0x00 0x00 0x08

	# VI: ds_read_u16 v8, v2 ; encoding: [0x00,0x00,0x78,0xd8,0x02,0x00,0x00,0x08]			# VI: ds_read_u16 v8, v2 ; encoding: [0x00,0x00,0x78,0xd8,0x02,0x00,0x00,0x08]
	0x00 0x00 0x78 0xd8 0x02 0x00 0x00 0x08			0x00 0x00 0x78 0xd8 0x02 0x00 0x00 0x08

	# VI: ds_consume v8 ; encoding: [0x00,0x00,0x7a,0xd8,0x00,0x00,0x00,0x08]
	0x00 0x00 0x7a 0xd8 0x00 0x00 0x00 0x08

	# FIXME: ds_append v8 ; encoding: [0x00,0x00,0x7c,0xd8,0x00,0x00,0x00,0x08]
	0x00 0x00 0x7c 0xd8 0x00 0x00 0x00 0x08

	# VI: ds_ordered_count v8, v2 gds ; encoding: [0x00,0x00,0x7f,0xd8,0x02,0x00,0x00,0x08]
	0x00 0x00 0x7f 0xd8 0x02 0x00 0x00 0x08

	# VI: ds_add_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x80,0xd8,0x02,0x04,0x00,0x00]			# VI: ds_add_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x80,0xd8,0x02,0x04,0x00,0x00]
	0x00 0x00 0x80 0xd8 0x02 0x04 0x00 0x00			0x00 0x00 0x80 0xd8 0x02 0x04 0x00 0x00

	# VI: ds_sub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x82,0xd8,0x02,0x04,0x00,0x00]			# VI: ds_sub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x82,0xd8,0x02,0x04,0x00,0x00]
	0x00 0x00 0x82 0xd8 0x02 0x04 0x00 0x00			0x00 0x00 0x82 0xd8 0x02 0x04 0x00 0x00

	# VI: ds_rsub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x84,0xd8,0x02,0x04,0x00,0x00]			# VI: ds_rsub_u64 v2, v[4:5] ; encoding: [0x00,0x00,0x84,0xd8,0x02,0x04,0x00,0x00]
	0x00 0x00 0x84 0xd8 0x02 0x04 0x00 0x00			0x00 0x00 0x84 0xd8 0x02 0x04 0x00 0x00
	[...]
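
The byte rows in this disassembler test can be read by hand by reversing the `DSe_vi` layout from lib/Target/AMDGPU/VIInstrFormats.td. The following Python sketch is illustrative only and is not how the LLVM disassembler is implemented. The function names `parse_row` and `decode_ds_vi` are hypothetical, and register fields are reported as raw integers.

```python
def parse_row(row):
    """Turn a test row such as '0x00 0x00 0x7f 0xd8 ...' into bytes."""
    return bytes(int(tok, 16) for tok in row.split())


def decode_ds_vi(raw):
    """Split 8 little-endian bytes into the DSe_vi fields."""
    inst = int.from_bytes(raw, "little")

    def bits(hi, lo):
        # Extract the inclusive bit range Inst{hi-lo}.
        return (inst >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "offset0": bits(7, 0),
        "offset1": bits(15, 8),
        "gds": bits(16, 16),
        "op": bits(24, 17),
        "encoding": bits(31, 26),
        "addr": bits(39, 32),
        "data0": bits(47, 40),
        "data1": bits(55, 48),
        "vdst": bits(63, 56),
    }


# Row listed for "ds_ordered_count v8, v2 gds" in this test file.
fields = decode_ds_vi(parse_row("0x00 0x00 0x7f 0xd8 0x02 0x00 0x00 0x08"))
print(f"op=0x{fields['op']:02x} gds={fields['gds']} "
      f"addr=v{fields['addr']} vdst=v{fields['vdst']}")
# prints: op=0x3f gds=1 addr=v2 vdst=v8
```

Here the third byte 0x7f carries both the `gds` bit (bit 16) and the low seven bits of the opcode, so the row decodes to opcode 0x3f with `gds` set.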

Revision Contents

Diff 65774

lib/Target/AMDGPU/CIInstructions.td

lib/Target/AMDGPU/DSInstructions.td

lib/Target/AMDGPU/SIInstrFormats.td

lib/Target/AMDGPU/SIInstrInfo.td

lib/Target/AMDGPU/SIInstructions.td

lib/Target/AMDGPU/VIInstrFormats.td

lib/Target/AMDGPU/VIInstructions.td

test/MC/AMDGPU/ds.s

test/MC/Disassembler/AMDGPU/ds_vi.txt
