Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
New intrinsics should go in include/llvm/IR/IntrinsicsAMDGPU.td, and have an amdgcn prefix. I also think it probably should not have a separate parameter for m0, and instead rely on llvm.write_register setting m0
I'd like it to keep it as close to sendmsg as possible. Do you know if there are compatibility issues if I rename SI_sendmsg to amdgcn_sendmsg?
SI.sendmsg also needs to be changed and replaced (as well as the rest of the intrinsics in the backend). The goal is to eventually fix any intrinsic design issues and fix the names when moving them to the public intrinsics
I understand that, although having a list of approaches considered deprecated would reduce some wasted effort.
My question was whether there are users of the old sendmsg intrinsic name that would break.
I'm not sure how to unbundle writing m0 if we want to expose sendmsg as __builtin.function, is there a generic way to write to m0 from high level language?
We can keep the old intrinsic working while adding the new one. A builtin would be needed for emitting the write to m0 since read/write_register are not directly exposed. My concern about this is what happens if you have something like:
llvm.write_register(m0)
%foo = load i32, i32 addrspace(3)*
llvm.amdgcn.s.sendmsg()
The lowering of the LDS access will insert initialization of m0 to -1, clobbering the old value. I'm not sure if it's better to either switch the M0 initialization lowering to copy the pre-existing value and restore after. I'm also not sure if we should keep considering m0 as an allocatable register
If we don't know of any users now, we probably won't know more in the future. I don't mind updating mine.
Let me know your preference, I have an alternative patch that keeps the old name ready.
A builtin would be needed for emitting the write to m0 since read/write_register are not directly exposed. My concern about this is what happens if you have something like:
llvm.write_register(m0)
%foo = load i32, i32 addrspace(3)*
llvm.amdgcn.s.sendmsg()The lowering of the LDS access will insert initialization of m0 to -1, clobbering the old value. I'm not sure if it's better to either switch the M0 initialization lowering to copy the pre-existing value and restore after. I'm also not sure if we should keep considering m0 as an allocatable register
shouldn't register allocation (picking the same phys m0) and instruction scheduling figure out the hazard and either schedule the instructions correctly or spill?
anyway, It looks like proper handling of m0 would need a separate patch. I'd prefer to leave that for another time. This change just adds a halting copy of sendmsg, so both can be modified at the same time if necessary.
include/llvm/IR/IntrinsicsAMDGPU.td | ||
---|---|---|
107–110 | These should have a comments explaining the arguments, at least mentioning that the implicit m0 argument is the last one | |
test/CodeGen/AMDGPU/amdgcn.sendmsg.ll | ||
23 | better name would be test_sendmsg or something | |
31–35 | I would split each of these sets into its own function |
LGTM
test/CodeGen/AMDGPU/amdgcn.sendmsg.ll | ||
---|---|---|
2–3 | Can you change these to use GCN as the check prefix |
These should have a comments explaining the arguments, at least mentioning that the implicit m0 argument is the last one