This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][SPIRV] Assign global address space to CUDA kernel arguments
ClosedPublic

Authored by shangwuyao on Feb 7 2022, 6:20 PM.

Details

Summary

This patch converts CUDA pointer kernel arguments with default address space to
CrossWorkGroup address space (__global in OpenCL). This is because Generic or
Function (OpenCL's private) is not supported as storage class for kernel pointer types.

Diff Detail

Event Timeline

shangwuyao created this revision.Feb 7 2022, 6:20 PM
shangwuyao requested review of this revision.Feb 7 2022, 6:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2022, 6:20 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
shangwuyao edited the summary of this revision. (Show Details)Feb 7 2022, 6:24 PM
shangwuyao added reviewers: jlebar, mkuper, tra, dcastagna.

[CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels

Rephrase this? This patch is about kernel *arguments*, right?

jlebar added inline comments.Feb 8 2022, 12:09 AM
clang/lib/CodeGen/TargetInfo.cpp
10322

I am surprised by this change. Is the language mode HIP only when compiling for device? Or are you intentionally changing the behavior in HIP mode?

Same in SPIR.h

shangwuyao retitled this revision from [CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels to [CUDA][SPIRV] Assign global address space to CUDA kernel arguments.Feb 8 2022, 10:02 AM
shangwuyao added inline comments.Feb 8 2022, 5:49 PM
clang/lib/CodeGen/TargetInfo.cpp
10322

We are targeting SPIRV so I think "compiling for device" is implied, I will let others comment on this to see if the assumption is correct. So this function can only be called when compiling for device, and won't be called when compiling for host.

Also tried compiling for device and host separately to see where exactly does the code diverge (to make sure those two functions are not called when compiling for host):

  1. This classifyKernelArgumentType() function is called from here, which is only enabled when the calling convention is SPIR_KERNEL. And when compiling for host, the calling convention is C.
  1. For the SPIR.h file, the TargetInfo::adjust function is called both when compiling for host and for device, see here, while the setAddressSpaceMap function is only called when compiling for device (SPIRV).

In conclusion, those two functions won't be reached when compiling for host.

yaxunl added inline comments.Feb 15 2022, 1:00 PM
clang/lib/CodeGen/TargetInfo.cpp
10322

LGTM.

Thanks for the review, if it looks good, can we get this to land now? Otherwise more comments are welcome!

Thanks for the review, if it looks good, can we get this to land now? Otherwise more comments are welcome!

I'll land this for you!

At some point you should get commit access yourself, Shangwu.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 17 2022, 9:39 AM
This revision was automatically updated to reflect the committed changes.
commit 9de4fc0f2d3b60542956f7e5254951d049edeb1f (HEAD -> main, origin/main, origin/HEAD)
Author: Shangwu Yao <shangwuyao@waymo.com>
Date:   Thu Feb 17 09:38:06 2022 -0800

    [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

    This patch converts CUDA pointer kernel arguments with default address space to
    CrossWorkGroup address space (__global in OpenCL). This is because Generic or
    Function (OpenCL's private) is not supported as storage class for kernel pointer types.

    Differential Revision: https://reviews.llvm.org/D119207
dyung added a subscriber: dyung.Feb 17 2022, 11:18 AM

Hi, the test you added is failing on the PS4 Linux bot, can you take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/17199

Hi, the test you added is failing on the PS4 Linux bot, can you take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/17199

Looks like the compiled SPIR-V is slightly different for different build settings, for llvm-clang-x86_64-sie-ubuntu-fast, it is compiled to

define hidden spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %output.coerce) #0 {

so it is missing that extra hidden keyword.
And for clang-ve-ninja, it is compiled to

define spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %0) #0 {

so the kernel argument identifier is slightly different (%0 vs %output.coerce).

I could fix that, I wonder why it didn't trigger the same issue (for the hidden keyword) with this test tho, it is basically the same.

And why does those build test run only after merging? For future reference, can I try to run those myself before submitting?

For this change, should we do a rollback and then re-land it after applying the fix?

Hi, the test you added is failing on the PS4 Linux bot, can you take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/17199

Looks like the compiled SPIR-V is slightly different for different build settings, for llvm-clang-x86_64-sie-ubuntu-fast, it is compiled to

define hidden spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %output.coerce) #0 {

so it is missing that extra hidden keyword.
And for clang-ve-ninja, it is compiled to

define spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %0) #0 {

so the kernel argument identifier is slightly different (%0 vs %output.coerce).

I could fix that, I wonder why it didn't trigger the same issue (for the hidden keyword) with this test tho, it is basically the same.

And why does those build test run only after merging? For future reference, can I try to run those myself before submitting?

These are build bots that run builds when new commits are detected. That's how they work. There is some pre-commit testing, but I'm not sure how that works exactly.

You can run these builds yourself before submitting, just extract the cmake command from the job and then run the build/test the normal way. If you need help reproducing a bot failure, an email to the bot owner can help you if you have trouble reproducing a failure.

For this change, should we do a rollback and then re-land it after applying the fix?

If you can fix it soon, a quick fix is fine. If you need time to investigate, a revert would be best until you can fix it so that the bot does not keep failing.

ormris added a subscriber: ormris.Feb 17 2022, 2:35 PM

Hi Shangwu,

I've reverted this change to unblock the buildbots and our internal CI.

ormris removed a subscriber: ormris.Feb 22 2022, 10:10 AM