This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Implement sendmsghalt intrinsic
ClosedPublic

Authored by jvesely on Aug 15 2016, 8:19 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

jvesely updated this revision to Diff 68039.Aug 15 2016, 8:19 AM
jvesely retitled this revision from to AMDGPU/SI: Implement sendmsghalt intrinsic.
jvesely updated this object.
jvesely added a reviewer: tstellarAMD.
jvesely set the repository for this revision to rL LLVM.

New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0

New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0

I'd like it to keep it as close to sendmsg as possible. Do you know if there are compatibility issues if I rename SI_sendmsg to amdgcn_sendmsg?

New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0

I'd like it to keep it as close to sendmsg as possible. Do you know if there are compatibility issues if I rename SI_sendmsg to amdgcn_sendmsg?

SI.sendmsg also needs to be changed and replaced (as well as the rest of the intrinsics in the backend). The goal is to eventually fix any intrinsic design issues and fix the names when moving them to the public intrinsics

jvesely updated this revision to Diff 68200.Aug 16 2016, 9:30 AM

rename and expose both sendmsg intrinsics

New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0

I'd like it to keep it as close to sendmsg as possible. Do you know if there are compatibility issues if I rename SI_sendmsg to amdgcn_sendmsg?

SI.sendmsg also needs to be changed and replaced (as well as the rest of the intrinsics in the backend). The goal is to eventually fix any intrinsic design issues and fix the names when moving them to the public intrinsics

I understand that, although having a list of approaches considered deprecated would reduce some wasted effort.
My question was whether there are users of the old sendmsg intrinsic name that would break.

I'm not sure how to unbundle writing m0 if we want to expose sendmsg as __builtin.function, is there a generic way to write to m0 from high level language?

New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0

I'd like it to keep it as close to sendmsg as possible. Do you know if there are compatibility issues if I rename SI_sendmsg to amdgcn_sendmsg?

SI.sendmsg also needs to be changed and replaced (as well as the rest of the intrinsics in the backend). The goal is to eventually fix any intrinsic design issues and fix the names when moving them to the public intrinsics

I understand that, although having a list of approaches considered deprecated would reduce some wasted effort.
My question was whether there are users of the old sendmsg intrinsic name that would break.

I'm not sure how to unbundle writing m0 if we want to expose sendmsg as __builtin.function, is there a generic way to write to m0 from high level language?

We can keep the old intrinsic working while adding the new one. A builtin would be needed for emitting the write to m0 since read/write_register are not directly exposed. My concern about this is what happens if you have something like:

llvm.write_register(m0)
%foo = load i32, i32 addrspace(3)*
llvm.amdgcn.s.sendmsg()

The lowering of the LDS access will insert initialization of m0 to -1, clobbering the old value. I'm not sure if it's better to either switch the M0 initialization lowering to copy the pre-existing value and restore after. I'm also not sure if we should keep considering m0 as an allocatable register

We can keep the old intrinsic working while adding the new one.

If we don't know of any users now, we probably won't know more in the future. I don't mind updating mine.
Let me know your preference, I have an alternative patch that keeps the old name ready.

A builtin would be needed for emitting the write to m0 since read/write_register are not directly exposed. My concern about this is what happens if you have something like:

llvm.write_register(m0)
%foo = load i32, i32 addrspace(3)*
llvm.amdgcn.s.sendmsg()

The lowering of the LDS access will insert initialization of m0 to -1, clobbering the old value. I'm not sure if it's better to either switch the M0 initialization lowering to copy the pre-existing value and restore after. I'm also not sure if we should keep considering m0 as an allocatable register

shouldn't register allocation (picking the same phys m0) and instruction scheduling figure out the hazard and either schedule the instructions correctly or spill?

anyway, It looks like proper handling of m0 would need a separate patch. I'd prefer to leave that for another time. This change just adds a halting copy of sendmsg, so both can be modified at the same time if necessary.

jvesely updated this revision to Diff 71473.Sep 14 2016, 8:21 PM

keep the old intrinsic around

arsenm added inline comments.Dec 20 2016, 9:27 AM
include/llvm/IR/IntrinsicsAMDGPU.td
107–110 ↗(On Diff #71473)

These should have a comments explaining the arguments, at least mentioning that the implicit m0 argument is the last one

test/CodeGen/AMDGPU/amdgcn.sendmsg.ll
23 ↗(On Diff #71473)

better name would be test_sendmsg or something

31–35 ↗(On Diff #71473)

I would split each of these sets into its own function

jvesely updated this revision to Diff 82416.Dec 23 2016, 11:36 AM

add comment
improve test

jvesely marked 3 inline comments as done.Dec 23 2016, 11:37 AM
arsenm accepted this revision.Jan 4 2017, 8:35 AM
arsenm added a reviewer: arsenm.

LGTM

test/CodeGen/AMDGPU/amdgcn.sendmsg.ll
1–2 ↗(On Diff #82416)

Can you change these to use GCN as the check prefix

This revision is now accepted and ready to land.Jan 4 2017, 8:35 AM
This revision was automatically updated to reflect the committed changes.