Details

Reviewers

tstellarAMD
arsenm

Commits

rGd48445d51392: AMDGPU/SI: Implement sendmsghalt intrinsic
rL290977: AMDGPU/SI: Implement sendmsghalt intrinsic

Diff Detail

Repository: rL LLVM

Event Timeline

jvesely updated this revision to Diff 68039. Aug 15 2016, 8:19 AM

jvesely retitled this revision to AMDGPU/SI: Implement sendmsghalt intrinsic.

jvesely updated this object.

jvesely added a reviewer: tstellarAMD.

jvesely set the repository for this revision to rL LLVM.

Herald added subscribers: kzhuravl, arsenm. Aug 15 2016, 8:19 AM

New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0

In D23511#515604, @arsenm wrote:

New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0

I'd like to keep it as close to sendmsg as possible. Do you know if there are compatibility issues if I rename SI_sendmsg to amdgcn_sendmsg?

In D23511#515722, @jvesely wrote:

I'd like to keep it as close to sendmsg as possible. Do you know if there are compatibility issues if I rename SI_sendmsg to amdgcn_sendmsg?

SI.sendmsg also needs to be changed and replaced (as well as the rest of the intrinsics in the backend). The goal is to eventually fix any intrinsic design issues and fix the names when moving them to the public intrinsics

rename and expose both sendmsg intrinsics

In D23511#515739, @arsenm wrote:

SI.sendmsg also needs to be changed and replaced (as well as the rest of the intrinsics in the backend). The goal is to eventually fix any intrinsic design issues and fix the names when moving them to the public intrinsics.

I understand that, although having a list of approaches considered deprecated would reduce some wasted effort.
My question was whether there are users of the old sendmsg intrinsic name that would break.

I'm not sure how to unbundle writing m0 if we want to expose sendmsg as __builtin.function. Is there a generic way to write to m0 from a high-level language?

In D23511#516775, @jvesely wrote:

I understand that, although having a list of approaches considered deprecated would reduce some wasted effort.
My question was whether there are users of the old sendmsg intrinsic name that would break.

I'm not sure how to unbundle writing m0 if we want to expose sendmsg as __builtin.function. Is there a generic way to write to m0 from a high-level language?

We can keep the old intrinsic working while adding the new one. A builtin would be needed for emitting the write to m0 since read/write_register are not directly exposed. My concern about this is what happens if you have something like:

llvm.write_register(m0)
%foo = load i32, i32 addrspace(3)*
llvm.amdgcn.s.sendmsg()

The lowering of the LDS access will insert an initialization of m0 to -1, clobbering the old value. I'm not sure whether it would be better to switch the M0 initialization lowering to copy the pre-existing value and restore it afterwards. I'm also not sure whether we should keep considering m0 an allocatable register.
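To make the concern above concrete, here is a toy C++ model of the sequence (write m0, LDS load, sendmsg). It is only a sketch: `Wave`, `ldsLoadClobbering`, `ldsLoadPreserving`, and the other names are hypothetical and do not exist in LLVM. It assumes, as the comment says, that the LDS lowering sets m0 to -1 before the access, and contrasts that with the save/restore alternative being considered.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a wave's scalar register m0; this is not LLVM code.
struct Wave {
    uint32_t m0 = 0;
};

// Models the LDS lowering described above: m0 is set to -1 and left that way.
void ldsLoadClobbering(Wave &w) { w.m0 = 0xFFFFFFFFu; }

// Models the alternative under discussion: save m0, set it to -1, then restore.
void ldsLoadPreserving(Wave &w) {
    const uint32_t saved = w.m0;
    w.m0 = 0xFFFFFFFFu;  // the LDS access would execute here
    w.m0 = saved;
}

// Models s_sendmsg reading m0 implicitly; returns the value it would observe.
uint32_t sendMsg(const Wave &w) { return w.m0; }

// write_register(m0), LDS load, sendmsg, with the clobbering lowering.
uint32_t sendAfterClobberingLoad(uint32_t payload) {
    Wave w;
    w.m0 = payload;  // stands in for llvm.write_register(m0, payload)
    ldsLoadClobbering(w);
    return sendMsg(w);
}

// The same sequence with the save/restore lowering.
uint32_t sendAfterPreservingLoad(uint32_t payload) {
    Wave w;
    w.m0 = payload;
    ldsLoadPreserving(w);
    return sendMsg(w);
}
```

In this model the first sequence sends with m0 equal to -1 rather than the payload, which is the clobbering described above; the second preserves the payload at the cost of an extra copy around every LDS access.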

In D23511#517122, @arsenm wrote:

We can keep the old intrinsic working while adding the new one.

If we don't know of any users now, we probably won't know more in the future. I don't mind updating mine.
Let me know your preference, I have an alternative patch that keeps the old name ready.

A builtin would be needed for emitting the write to m0 since read/write_register are not directly exposed. My concern about this is what happens if you have something like:

llvm.write_register(m0)
%foo = load i32, i32 addrspace(3)*
llvm.amdgcn.s.sendmsg()

The lowering of the LDS access will insert an initialization of m0 to -1, clobbering the old value. I'm not sure whether it would be better to switch the M0 initialization lowering to copy the pre-existing value and restore it afterwards. I'm also not sure whether we should keep considering m0 an allocatable register.

Shouldn't register allocation (picking the same physical m0) and instruction scheduling detect the hazard and either schedule the instructions correctly or spill?

Anyway, it looks like proper handling of m0 would need a separate patch. I'd prefer to leave that for another time. This change just adds a halting copy of sendmsg, so both can be modified at the same time if necessary.

keep the old intrinsic around

Herald added subscribers: nhaehnle, wdng. Sep 14 2016, 8:21 PM

arsenm added inline comments. Dec 20 2016, 9:27 AM

include/llvm/IR/IntrinsicsAMDGPU.td
Lines 107–110 (on Diff #71473): These should have comments explaining the arguments, at least mentioning that the implicit m0 argument is the last one.
test/CodeGen/AMDGPU/amdgcn.sendmsg.ll
Line 23 (on Diff #71473): A better name would be test_sendmsg or something.
Lines 31–35 (on Diff #71473): I would split each of these sets into its own function.

add comment
improve test

Herald edited edge metadata. Dec 23 2016, 11:36 AM

Herald added subscribers: tony-tye, yaxunl.

jvesely marked 3 inline comments as done. Dec 23 2016, 11:37 AM

LGTM

test/CodeGen/AMDGPU/amdgcn.sendmsg.ll
Lines 1–2 (on Diff #82416): Can you change these to use GCN as the check prefix?

This revision is now accepted and ready to land. Jan 4 2017, 8:35 AM

Closed by commit rL290977: AMDGPU/SI: Implement sendmsghalt intrinsic (authored by jvesely). Jan 4 2017, 10:17 AM

This revision was automatically updated to reflect the committed changes.

Diff 68039

lib/Target/AMDGPU/AMDGPUISelLowering.h

@@ enum NodeType : unsigned {
   /// T0|v.x| | | |
   /// T1|v.y| | | |
   /// T2|v.z| | | |
   /// T3|v.w| | | |
   BUILD_VERTICAL_VECTOR,
   /// Pointer to the start of the shader's constant data.
   CONST_DATA_PTR,
   SENDMSG,
+  SENDMSGHALT,
   INTERP_MOV,
   INTERP_P1,
   INTERP_P2,
   PC_ADD_REL_OFFSET,
   KILL,
   FIRST_MEM_OPCODE_NUMBER = ISD::FIRST_TARGET_MEMORY_OPCODE,
   STORE_MSKOR,
   LOAD_CONSTANT,

lib/Target/AMDGPU/AMDGPUISelLowering.cpp

@@ const char* AMDGPUTargetLowering::getTargetNodeName(unsigned Opcode) const {
   NODE_NAME_CASE(CVT_F32_UBYTE2)
   NODE_NAME_CASE(CVT_F32_UBYTE3)
   NODE_NAME_CASE(BUILD_VERTICAL_VECTOR)
   NODE_NAME_CASE(CONST_DATA_PTR)
   NODE_NAME_CASE(PC_ADD_REL_OFFSET)
   NODE_NAME_CASE(KILL)
   case AMDGPUISD::FIRST_MEM_OPCODE_NUMBER: break;
   NODE_NAME_CASE(SENDMSG)
+  NODE_NAME_CASE(SENDMSGHALT)
   NODE_NAME_CASE(INTERP_MOV)
   NODE_NAME_CASE(INTERP_P1)
   NODE_NAME_CASE(INTERP_P2)
   NODE_NAME_CASE(STORE_MSKOR)
   NODE_NAME_CASE(LOAD_CONSTANT)
   NODE_NAME_CASE(TBUFFER_STORE_FORMAT)
   NODE_NAME_CASE(ATOMIC_CMP_SWAP)
   NODE_NAME_CASE(ATOMIC_INC)

lib/Target/AMDGPU/AMDGPUInstrInfo.td

 >;

 def AMDGPUfmed3 : SDNode<"AMDGPUISD::FMED3", SDTFPTernaryOp, []>;

 def AMDGPUsendmsg : SDNode<"AMDGPUISD::SENDMSG",
                     SDTypeProfile<0, 1, [SDTCisInt<0>]>,
                     [SDNPHasChain, SDNPInGlue]>;

+def AMDGPUsendmsghalt : SDNode<"AMDGPUISD::SENDMSGHALT",
+                    SDTypeProfile<0, 1, [SDTCisInt<0>]>,
+                    [SDNPHasChain, SDNPInGlue]>;
+
 def AMDGPUinterp_mov : SDNode<"AMDGPUISD::INTERP_MOV",
                         SDTypeProfile<1, 3, [SDTCisFP<0>]>,
                         [SDNPInGlue]>;

 def AMDGPUinterp_p1 : SDNode<"AMDGPUISD::INTERP_P1",
                       SDTypeProfile<1, 3, [SDTCisFP<0>]>,
                       [SDNPInGlue, SDNPOutGlue]>;

lib/Target/AMDGPU/SIISelLowering.cpp

@@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op,

   switch (IntrinsicID) {
   case AMDGPUIntrinsic::SI_sendmsg: {
     Chain = copyToM0(DAG, Chain, DL, Op.getOperand(3));
     SDValue Glue = Chain.getValue(1);
     return DAG.getNode(AMDGPUISD::SENDMSG, DL, MVT::Other, Chain,
                        Op.getOperand(2), Glue);
   }
+  case AMDGPUIntrinsic::SI_sendmsghalt: {
+    Chain = copyToM0(DAG, Chain, DL, Op.getOperand(3));
+    SDValue Glue = Chain.getValue(1);
+    return DAG.getNode(AMDGPUISD::SENDMSGHALT, DL, MVT::Other, Chain,
+                       Op.getOperand(2), Glue);
+  }
   case AMDGPUIntrinsic::SI_tbuffer_store: {
     SDValue Ops[] = {
       Chain,
       Op.getOperand(2),
       Op.getOperand(3),
       Op.getOperand(4),
       Op.getOperand(5),
       Op.getOperand(6),

lib/Target/AMDGPU/SIInsertWaits.cpp

 }

 void SIInsertWaits::handleSendMsg(MachineBasicBlock &MBB,
                                   MachineBasicBlock::iterator I) {
   if (ST->getGeneration() < SISubtarget::VOLCANIC_ISLANDS)
     return;

   // There must be "S_NOP 0" between an instruction writing M0 and S_SENDMSG.
-  if (LastInstWritesM0 && I->getOpcode() == AMDGPU::S_SENDMSG) {
+  if (LastInstWritesM0 && (I->getOpcode() == AMDGPU::S_SENDMSG || I->getOpcode() == AMDGPU::S_SENDMSGHALT)) {
     BuildMI(MBB, I, DebugLoc(), TII->get(AMDGPU::S_NOP)).addImm(0);
     LastInstWritesM0 = false;
     return;
   }

   // Set whether this instruction sets M0
   LastInstWritesM0 = false;

@@ for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end();
     Counters Required;

     // Wait for everything before a barrier.
     //
     // S_SENDMSG implicitly waits for all outstanding LGKM transfers to finish,
     // but we also want to wait for any other outstanding transfers before
     // signalling other hardware blocks
     if (I->getOpcode() == AMDGPU::S_BARRIER ||
-        I->getOpcode() == AMDGPU::S_SENDMSG)
+        I->getOpcode() == AMDGPU::S_SENDMSG ||
+        I->getOpcode() == AMDGPU::S_SENDMSGHALT)
       Required = LastIssued;
     else
       Required = handleOperands(*I);

     Counters Increment = getHwCounts(*I);

     if (countersNonZero(Required) || countersNonZero(Increment))
       increaseCounters(Required, DelayedWaitOn);
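The first hunk above extends the "S_NOP 0 between an M0 write and S_SENDMSG" rule to S_SENDMSGHALT. The standalone C++ sketch below models that rule over a flat instruction list. It is not the pass itself: the `Op` enum, `isSendMsg`, and `insertSendMsgNops` are hypothetical names, and a single `SMovM0` opcode stands in for "any instruction writing M0". A boolean parameter replaces the subtarget generation check.

```cpp
#include <cassert>
#include <vector>

// Hypothetical opcode set; only the distinctions the hazard rule needs.
enum class Op { SMovM0, SSendMsg, SSendMsgHalt, SNop, Other };

// Both message opcodes read M0 implicitly, so both are subject to the rule.
bool isSendMsg(Op op) { return op == Op::SSendMsg || op == Op::SSendMsgHalt; }

// Returns a copy of `in` with an SNop inserted wherever a sendmsg-style
// instruction directly follows an M0 write. Targets older than Volcanic
// Islands are returned unchanged, mirroring the early return in the diff.
std::vector<Op> insertSendMsgNops(const std::vector<Op> &in,
                                  bool isVolcanicIslandsOrLater) {
    std::vector<Op> out;
    bool lastInstWritesM0 = false;
    for (Op op : in) {
        if (isVolcanicIslandsOrLater && lastInstWritesM0 && isSendMsg(op))
            out.push_back(Op::SNop);
        out.push_back(op);
        lastInstWritesM0 = (op == Op::SMovM0);
    }
    return out;
}
```

Under this model, `{SMovM0, SSendMsgHalt}` becomes `{SMovM0, SNop, SSendMsgHalt}` on a VI-class target and is left alone on older ones, which matches the `VI-NEXT: s_nop 0` check in the m0 test.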

lib/Target/AMDGPU/SIInstructions.td

 def S_SETPRIO : SOPP <0x0000000f, (ins i16imm:$simm16), "s_setprio $simm16">;

 let Uses = [EXEC, M0] in {
   // FIXME: Should this be mayLoad+mayStore?
   def S_SENDMSG : SOPP <0x00000010, (ins SendMsgImm:$simm16), "s_sendmsg $simm16",
     [(AMDGPUsendmsg (i32 imm:$simm16))]
   >;
-} // End Uses = [EXEC, M0]

-def S_SENDMSGHALT : SOPP <0x00000011, (ins SendMsgImm:$simm16), "s_sendmsghalt $simm16">;
+  def S_SENDMSGHALT : SOPP <0x00000011, (ins SendMsgImm:$simm16), "s_sendmsghalt $simm16",
+    [(AMDGPUsendmsghalt (i32 imm:$simm16))]
+  >;
+} // End Uses = [EXEC, M0]
 def S_TRAP : SOPP <0x00000012, (ins i16imm:$simm16), "s_trap $simm16">;
 def S_ICACHE_INV : SOPP <0x00000013, (ins), "s_icache_inv"> {
   let simm16 = 0;
 }
 def S_INCPERFLEVEL : SOPP <0x00000014, (ins i16imm:$simm16), "s_incperflevel $simm16">;
 def S_DECPERFLEVEL : SOPP <0x00000015, (ins i16imm:$simm16), "s_decperflevel $simm16">;
 def S_TTRACEDATA : SOPP <0x00000016, (ins), "s_ttracedata"> {
   let simm16 = 0;

lib/Target/AMDGPU/SIIntrinsics.td

@@ def int_SI_buffer_load_dword : Intrinsic <
      llvm_i32_ty,   // offen(imm)
      llvm_i32_ty,   // idxen(imm)
      llvm_i32_ty,   // glc(imm)
      llvm_i32_ty,   // slc(imm)
      llvm_i32_ty],  // tfe(imm)
     [IntrReadMem, IntrArgMemOnly]>;

   def int_SI_sendmsg : Intrinsic <[], [llvm_i32_ty, llvm_i32_ty], []>;
+  def int_SI_sendmsghalt : Intrinsic <[], [llvm_i32_ty, llvm_i32_ty], []>;

   // Fully-flexible SAMPLE instruction.
   class SampleRaw : Intrinsic <
     [llvm_v4f32_ty],    // vdata(VGPR)
     [llvm_anyint_ty,    // vaddr(VGPR)
      llvm_v8i32_ty,     // rsrc(SGPR)
      llvm_v4i32_ty,     // sampler(SGPR)
      llvm_i32_ty,       // dmask(imm)

test/CodeGen/AMDGPU/llvm.SI.sendmsg-m0.ll

 ; RUN: llc -march=amdgcn -mcpu=verde -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=GCN %s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=VI -check-prefix=GCN %s

 ; GCN-LABEL: {{^}}main:
 ; GCN: s_mov_b32 m0, s0
 ; VI-NEXT: s_nop 0
 ; GCN-NEXT: sendmsg(MSG_GS_DONE, GS_OP_NOP)
 ; GCN-NEXT: s_endpgm

 define amdgpu_gs void @main(i32 inreg %a) #0 {
   call void @llvm.SI.sendmsg(i32 3, i32 %a)
   ret void
 }

+; GCN-LABEL: {{^}}main_halt:
+; GCN: s_mov_b32 m0, s0
+; VI-NEXT: s_nop 0
+; GCN-NEXT: s_sendmsghalt sendmsg(MSG_INTERRUPT)
+; GCN-NEXT: s_endpgm
+
+define void @main_halt(i32 inreg %a) #0 {
+  call void @llvm.SI.sendmsghalt(i32 1, i32 %a)
+  ret void
+}
+
 declare void @llvm.SI.sendmsg(i32, i32) #0
+declare void @llvm.SI.sendmsghalt(i32, i32) #0

 attributes #0 = { nounwind }

test/CodeGen/AMDGPU/llvm.SI.sendmsg.ll

 ;RUN: llc < %s -march=amdgcn -mcpu=verde -verify-machineinstrs | FileCheck %s
 ;RUN: llc < %s -march=amdgcn -mcpu=tonga -verify-machineinstrs | FileCheck %s

 ; CHECK-LABEL: {{^}}main:
 ; CHECK: s_mov_b32 m0, 0
 ; CHECK-NOT: s_mov_b32 m0
+; CHECK: s_sendmsg sendmsg(MSG_INTERRUPT)
 ; CHECK: s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT, 0)
 ; CHECK: s_sendmsg sendmsg(MSG_GS, GS_OP_CUT, 1)
 ; CHECK: s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT_CUT, 2)
 ; CHECK: s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_NOP)
+; CHECK: s_sendmsghalt sendmsg(MSG_INTERRUPT)
+; CHECK: s_sendmsghalt sendmsg(MSG_GS, GS_OP_EMIT, 0)
+; CHECK: s_sendmsghalt sendmsg(MSG_GS, GS_OP_CUT, 1)
+; CHECK: s_sendmsghalt sendmsg(MSG_GS, GS_OP_EMIT_CUT, 2)
+; CHECK: s_sendmsghalt sendmsg(MSG_GS_DONE, GS_OP_NOP)

 define void @main() {
 main_body:
+  call void @llvm.SI.sendmsg(i32 1, i32 0);
   call void @llvm.SI.sendmsg(i32 34, i32 0);
   call void @llvm.SI.sendmsg(i32 274, i32 0);
   call void @llvm.SI.sendmsg(i32 562, i32 0);
   call void @llvm.SI.sendmsg(i32 3, i32 0);

+  call void @llvm.SI.sendmsghalt(i32 1, i32 0);
+  call void @llvm.SI.sendmsghalt(i32 34, i32 0);
+  call void @llvm.SI.sendmsghalt(i32 274, i32 0);
+  call void @llvm.SI.sendmsghalt(i32 562, i32 0);
+  call void @llvm.SI.sendmsghalt(i32 3, i32 0);
   ret void
 }

 ; Function Attrs: nounwind
 declare void @llvm.SI.sendmsg(i32, i32) #0
+declare void @llvm.SI.sendmsghalt(i32, i32) #0

 attributes #0 = { nounwind }
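The immediates in this test (1, 34, 274, 562, 3) map to the symbolic `sendmsg(...)` operands in the CHECK lines. The C++ sketch below decodes them under an assumed field layout: message ID in bits [3:0], GS operation in bits [5:4], stream ID in bits [9:8]. That layout is inferred from the values in the test, not from anything in this diff, and `decodeSendMsg` is a hypothetical helper, not the backend's printer. It handles only the message types the test exercises.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Decodes a sendmsg immediate into the symbolic form used in the CHECK lines.
// Assumed layout: bits [3:0] message ID, bits [5:4] GS op, bits [9:8] stream.
std::string decodeSendMsg(uint32_t imm) {
    static const char *const gsOps[] = {"GS_OP_NOP", "GS_OP_CUT",
                                        "GS_OP_EMIT", "GS_OP_EMIT_CUT"};
    const uint32_t msg = imm & 0xF;
    const uint32_t op = (imm >> 4) & 0x3;
    const uint32_t stream = (imm >> 8) & 0x3;

    switch (msg) {
    case 1:
        return "sendmsg(MSG_INTERRUPT)";
    case 2:
        return std::string("sendmsg(MSG_GS, ") + gsOps[op] + ", " +
               std::to_string(stream) + ")";
    case 3:
        // The test only shows the GS_OP_NOP form, which has no stream field.
        if (op == 0)
            return "sendmsg(MSG_GS_DONE, GS_OP_NOP)";
        return std::string("sendmsg(MSG_GS_DONE, ") + gsOps[op] + ", " +
               std::to_string(stream) + ")";
    default:
        return "unknown";  // message types outside this test are not modelled
    }
}
```

For example, 274 is 0x112: message ID 2 (MSG_GS), GS op 1 (GS_OP_CUT), stream 1, matching the `sendmsg(MSG_GS, GS_OP_CUT, 1)` line.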

This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Implement sendmsghalt intrinsic
ClosedPublic

