This patch converts CUDA pointer kernel arguments with default address space to
CrossWorkGroup address space (__global in OpenCL). This is because Generic or
Function (OpenCL's private) is not supported as storage class for kernel pointer types.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
[CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels
Rephrase this? This patch is about kernel *arguments*, right?
clang/lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
10322 | I am surprised by this change. Is the language mode HIP only when compiling for device? Or are you intentionally changing the behavior in HIP mode? Same in SPIR.h |
clang/lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
10322 | We are targeting SPIRV so I think "compiling for device" is implied, I will let others comment on this to see if the assumption is correct. So this function can only be called when compiling for device, and won't be called when compiling for host. Also tried compiling for device and host separately to see where exactly does the code diverge (to make sure those two functions are not called when compiling for host):
In conclusion, those two functions won't be reached when compiling for host. |
clang/lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
10322 | LGTM. |
Thanks for the review, if it looks good, can we get this to land now? Otherwise more comments are welcome!
I'll land this for you!
At some point you should get commit access yourself, Shangwu.
commit 9de4fc0f2d3b60542956f7e5254951d049edeb1f (HEAD -> main, origin/main, origin/HEAD) Author: Shangwu Yao <shangwuyao@waymo.com> Date: Thu Feb 17 09:38:06 2022 -0800 [CUDA][SPIRV] Assign global address space to CUDA kernel arguments This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. Differential Revision: https://reviews.llvm.org/D119207
Looks like the compiled SPIR-V is slightly different for different build settings, for llvm-clang-x86_64-sie-ubuntu-fast, it is compiled to
define hidden spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %output.coerce) #0 {
so it is missing that extra hidden keyword.
And for clang-ve-ninja, it is compiled to
define spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %0) #0 {
so the kernel argument identifier is slightly different (%0 vs %output.coerce).
I could fix that, I wonder why it didn't trigger the same issue (for the hidden keyword) with this test tho, it is basically the same.
And why does those build test run only after merging? For future reference, can I try to run those myself before submitting?
For this change, should we do a rollback and then re-land it after applying the fix?
These are build bots that run builds when new commits are detected. That's how they work. There is some pre-commit testing, but I'm not sure how that works exactly.
You can run these builds yourself before submitting, just extract the cmake command from the job and then run the build/test the normal way. If you need help reproducing a bot failure, an email to the bot owner can help you if you have trouble reproducing a failure.
For this change, should we do a rollback and then re-land it after applying the fix?
If you can fix it soon, a quick fix is fine. If you need time to investigate, a revert would be best until you can fix it so that the bot does not keep failing.
I am surprised by this change. Is the language mode HIP only when compiling for device? Or are you intentionally changing the behavior in HIP mode?
Same in SPIR.h