This is an archive of the discontinued LLVM Phabricator instance.

Differential D119207

[CUDA][SPIRV] Assign global address space to CUDA kernel arguments
ClosedPublic

Authored by shangwuyao on Feb 7 2022, 6:20 PM.

Details

Reviewers

jlebar
mkuper
tra
dcastagna
yaxunl

Commits

rG9de4fc0f2d3b: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

Summary

This patch converts CUDA pointer kernel arguments with default address space to
CrossWorkGroup address space (__global in OpenCL). This is because Generic or
Function (OpenCL's private) is not supported as storage class for kernel pointer types.
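
The rule in the summary can be sketched as a small standalone model. This is not the Clang implementation: `AddrSpace`, `KernelArg`, and `coerceKernelArg` are hypothetical names introduced only to illustrate that pointer arguments left in the default address space are moved to CrossWorkGroup, while everything else is untouched.

```cpp
// Hypothetical stand-ins for the address spaces named in the summary.
enum class AddrSpace { Default, CrossWorkGroup };

// Hypothetical, simplified description of one kernel argument.
struct KernelArg {
  bool IsPointer;
  AddrSpace AS;
};

// Pointer arguments still in the default address space are rewritten to
// CrossWorkGroup (__global in OpenCL); all other arguments pass through.
constexpr KernelArg coerceKernelArg(KernelArg Arg) {
  return (Arg.IsPointer && Arg.AS == AddrSpace::Default)
             ? KernelArg{true, AddrSpace::CrossWorkGroup}
             : Arg;
}

// A default-address-space pointer argument ends up in CrossWorkGroup.
static_assert(coerceKernelArg({true, AddrSpace::Default}).AS ==
                  AddrSpace::CrossWorkGroup,
              "pointer arguments are coerced");
```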

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

shangwuyao created this revision. · Feb 7 2022, 6:20 PM

Herald added subscribers: carlosgalvezp, ThomasRaoux, Anastasia, yaxunl. · Feb 7 2022, 6:20 PM

shangwuyao requested review of this revision. · Feb 7 2022, 6:20 PM

Herald added a project: Restricted Project. · Feb 7 2022, 6:20 PM

Herald added a subscriber: cfe-commits.

shangwuyao edited the summary of this revision. · Feb 7 2022, 6:24 PM

shangwuyao added reviewers: jlebar, mkuper, tra, dcastagna.

Harbormaster completed remote builds in B148145: Diff 406668. · Feb 7 2022, 9:07 PM

[CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels

Rephrase this? This patch is about kernel *arguments*, right?

jlebar added inline comments. · Feb 8 2022, 12:09 AM

clang/lib/CodeGen/TargetInfo.cpp
10323	I am surprised by this change. Is the language mode HIP only when compiling for device? Or are you intentionally changing the behavior in HIP mode? Same in SPIR.h

shangwuyao retitled this revision from [CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels to [CUDA][SPIRV] Assign global address space to CUDA kernel arguments. · Feb 8 2022, 10:02 AM

shangwuyao added inline comments. · Feb 8 2022, 5:49 PM

clang/lib/CodeGen/TargetInfo.cpp
10323	We are targeting SPIR-V, so I think "compiling for device" is implied; I will let others comment on whether that assumption is correct. If it holds, this function can only be called when compiling for device, never when compiling for host. I also compiled for device and for host separately to see exactly where the code paths diverge, and to confirm that neither function is called when compiling for host. `classifyKernelArgumentType()` is called from a site that is only reached when the calling convention is `SPIR_KERNEL`; when compiling for host, the calling convention is `C`. In SPIR.h, `TargetInfo::adjust` is called when compiling for both host and device, while `setAddressSpaceMap` is only called when compiling for device (SPIR-V). In conclusion, neither function is reached when compiling for host.
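
The dispatch described in this comment can be pictured with a toy model. It only mirrors what the comment says (the kernel-argument classifier is chosen for `SPIR_KERNEL`, and host code uses the `C` calling convention); `CallingConv`, `Classifier`, and `selectClassifier` are hypothetical names, not Clang's.

```cpp
// Hypothetical stand-ins for the two calling conventions mentioned above.
enum class CallingConv { C, SPIR_KERNEL };

// Which argument classifier a function's arguments would go through.
enum class Classifier { Argument, KernelArgument };

// Only functions with the SPIR_KERNEL calling convention reach the
// kernel-argument path; everything else takes the ordinary path.
constexpr Classifier selectClassifier(CallingConv CC) {
  return CC == CallingConv::SPIR_KERNEL ? Classifier::KernelArgument
                                        : Classifier::Argument;
}

// Host-side functions use the C calling convention in this model, so they
// never reach the kernel-argument classifier.
static_assert(selectClassifier(CallingConv::C) == Classifier::Argument,
              "host path does not classify kernel arguments");
```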

shangwuyao added a reviewer: yaxunl. · Feb 15 2022, 12:25 PM

yaxunl added inline comments. · Feb 15 2022, 1:00 PM

clang/lib/CodeGen/TargetInfo.cpp
10323	LGTM.

Thanks for the review, if it looks good, can we get this to land now? Otherwise more comments are welcome!

In D119207#3327476, @shangwuyao wrote:

Thanks for the review, if it looks good, can we get this to land now? Otherwise more comments are welcome!

I'll land this for you!

At some point you should get commit access yourself, Shangwu.

This revision was not accepted when it landed; it landed in state Needs Review. · Feb 17 2022, 9:39 AM

Closed by commit rG9de4fc0f2d3b: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments (authored by shangwuyao, committed by jlebar).

This revision was automatically updated to reflect the committed changes.

jlebar added a commit: rG9de4fc0f2d3b: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments.

commit 9de4fc0f2d3b60542956f7e5254951d049edeb1f (HEAD -> main, origin/main, origin/HEAD)
Author: Shangwu Yao <shangwuyao@waymo.com>
Date:   Thu Feb 17 09:38:06 2022 -0800

    [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

    This patch converts CUDA pointer kernel arguments with default address space to
    CrossWorkGroup address space (__global in OpenCL). This is because Generic or
    Function (OpenCL's private) is not supported as storage class for kernel pointer types.

    Differential Revision: https://reviews.llvm.org/D119207

Hi, the test you added is failing on the PS4 Linux bot, can you take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/17199

In D119207#3330385, @dyung wrote:

Hi, the test you added is failing on the PS4 Linux bot, can you take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/17199

It looks like the emitted LLVM IR differs slightly between build configurations. On llvm-clang-x86_64-sie-ubuntu-fast, the kernel is compiled to

define hidden spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %output.coerce) #0 {

so the CHECK line in the test is missing the extra hidden keyword.
On clang-ve-ninja, it is compiled to

define spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %0) #0 {

so the kernel argument identifier differs (%0 vs. %output.coerce).

I can fix that. I wonder why this test did not trigger the same issue (for the hidden keyword), though; it is basically the same.

And why do those build tests run only after merging? For future reference, can I run them myself before submitting?

For this change, should we do a rollback and then re-land it after applying the fix?
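
One way to picture a CHECK line that tolerates both differences described above (the optional `hidden` keyword and the argument name) is the sketch below. It uses `std::regex` as a stand-in for FileCheck's `{{...}}` regex blocks; the relaxed pattern and both helper names are assumptions made for illustration, and the fix actually applied in the follow-up revision is not shown in this thread.

```cpp
#include <regex>
#include <string>

// The literal CHECK text from the test, matched as a plain substring.
bool matchesOriginalCheck(const std::string &IRLine) {
  return IRLine.find("define spir_kernel void @_Z6kernelPi(i32 "
                     "addrspace(1)* noundef %output.coerce)") !=
         std::string::npos;
}

// Hypothetical relaxed pattern: anything may sit between "define" and
// "spir_kernel", and the argument may have any name. In FileCheck syntax
// this would roughly correspond to:
//   define{{.*}} spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %{{.*}})
bool matchesRelaxedCheck(const std::string &IRLine) {
  static const std::regex Pattern(
      R"re(define.* spir_kernel void @_Z6kernelPi\(i32 addrspace\(1\)\* noundef %.*\))re");
  return std::regex_search(IRLine, Pattern);
}
```

With this model, both IR lines quoted from the bots match the relaxed pattern but not the original literal check, while a signature without `addrspace(1)` still fails to match.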

In D119207#3330494, @shangwuyao wrote:
In D119207#3330385, @dyung wrote:

Hi, the test you added is failing on the PS4 Linux bot, can you take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/17199

It looks like the emitted LLVM IR differs slightly between build configurations. On llvm-clang-x86_64-sie-ubuntu-fast, the kernel is compiled to
define hidden spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %output.coerce) #0 {
so the CHECK line in the test is missing the extra hidden keyword.
On clang-ve-ninja, it is compiled to
define spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %0) #0 {
so the kernel argument identifier differs (%0 vs. %output.coerce).

I can fix that. I wonder why this test did not trigger the same issue (for the hidden keyword), though; it is basically the same.

And why do those build tests run only after merging? For future reference, can I run them myself before submitting?

These are build bots that run builds when new commits are detected. That's how they work. There is some pre-commit testing, but I'm not sure how that works exactly.

You can run these builds yourself before submitting: extract the cmake command from the job, then run the build and tests the normal way. If you have trouble reproducing a bot failure, emailing the bot owner can help.

For this change, should we do a rollback and then re-land it after applying the fix?

If you can fix it soon, a quick fix is fine. If you need time to investigate, a revert would be best until you can fix it so that the bot does not keep failing.

ormris added a reverting change: rG9ce09099bba4: Revert "[CUDA][SPIRV] Assign global address space to CUDA kernel arguments". · Feb 17 2022, 2:32 PM

Hi Shangwu,

I've reverted this change to unblock the buildbots and our internal CI.

ormris removed a subscriber: ormris. · Feb 22 2022, 10:10 AM

shangwuyao mentioned this in D120366: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments. · Feb 22 2022, 4:17 PM

jlebar mentioned this in rGc2f501f39589: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments. · Feb 24 2022, 8:52 PM

Revision Contents

Path                                              Size
clang/lib/Basic/Targets/SPIR.h                    10 lines
clang/lib/CodeGen/TargetInfo.cpp                  6 lines
clang/test/CodeGenCUDASPIRV/kernel-argument.cu    17 lines

Diff 409696

clang/lib/Basic/Targets/SPIR.h

   void setAddressSpaceMap(bool DefaultIsGeneric) {
     AddrSpaceMap = DefaultIsGeneric ? &SPIRDefIsGenMap : &SPIRDefIsPrivMap;
   }

   void adjust(DiagnosticsEngine &Diags, LangOptions &Opts) override {
     TargetInfo::adjust(Diags, Opts);
     // FIXME: SYCL specification considers unannotated pointers and references
     // to be pointing to the generic address space. See section 5.9.3 of
     // SYCL 2020 specification.
-    // Currently, there is no way of representing SYCL's and HIP's default
+    // Currently, there is no way of representing SYCL's and HIP/CUDA's default
     // address space language semantic along with the semantics of embedded C's
     // default address space in the same address space map. Hence the map needs
     // to be reset to allow mapping to the desired value of 'Default' entry for
-    // SYCL and HIP.
+    // SYCL and HIP/CUDA.
     setAddressSpaceMap(
         /*DefaultIsGeneric=*/Opts.SYCLIsDevice ||
-        // The address mapping from HIP language for device code is only defined
-        // for SPIR-V.
-        (getTriple().isSPIRV() && Opts.HIP && Opts.CUDAIsDevice));
+        // The address mapping from HIP/CUDA language for device code is only
+        // defined for SPIR-V.
+        (getTriple().isSPIRV() && Opts.CUDAIsDevice));
   }

   void setSupportedOpenCLOpts() override {
     // Assume all OpenCL extensions and optional core features are supported
     // for SPIR and SPIR-V since they are generic targets.
     supportAllOpenCLOpts();
   }
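
The effect of dropping `Opts.HIP` from the condition can be checked with a small standalone model. `LangFlags` is a hypothetical, trimmed-down stand-in for the three `LangOptions` fields used here, and the sketch assumes that a plain CUDA device compilation sets `CUDAIsDevice` but not `HIP`.

```cpp
// Hypothetical stand-in for the LangOptions fields used in the condition.
struct LangFlags {
  bool SYCLIsDevice;
  bool HIP;
  bool CUDAIsDevice;
};

// Condition passed to setAddressSpaceMap before the patch.
constexpr bool defaultIsGenericBefore(bool IsSPIRV, LangFlags Opts) {
  return Opts.SYCLIsDevice || (IsSPIRV && Opts.HIP && Opts.CUDAIsDevice);
}

// Condition after the patch: the HIP requirement is dropped.
constexpr bool defaultIsGenericAfter(bool IsSPIRV, LangFlags Opts) {
  return Opts.SYCLIsDevice || (IsSPIRV && Opts.CUDAIsDevice);
}

// Assumed flag combination for CUDA device code targeting SPIR-V.
constexpr LangFlags CudaDevice{false, false, true};
static_assert(!defaultIsGenericBefore(true, CudaDevice),
              "old condition excluded CUDA device code");
static_assert(defaultIsGenericAfter(true, CudaDevice),
              "new condition includes CUDA device code");
```

In this model, a configuration with both `HIP` and `CUDAIsDevice` set satisfies the old and the new condition alike, so only the CUDA case changes.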

clang/lib/CodeGen/TargetInfo.cpp

 } // End anonymous namespace.

 void CommonSPIRABIInfo::setCCs() {
   assert(getRuntimeCC() == llvm::CallingConv::C);
   RuntimeCC = llvm::CallingConv::SPIR_FUNC;
 }

 ABIArgInfo SPIRVABIInfo::classifyKernelArgumentType(QualType Ty) const {
-  if (getContext().getLangOpts().HIP) {
+  if (getContext().getLangOpts().CUDAIsDevice) {
     // Coerce pointer arguments with default address space to CrossWorkGroup
-    // pointers for HIPSPV. When the language mode is HIP, the SPIRTargetInfo
-    // maps cuda_device to SPIR-V's CrossWorkGroup address space.
+    // pointers for HIPSPV/CUDASPV. When the language mode is HIP/CUDA, the
+    // SPIRTargetInfo maps cuda_device to SPIR-V's CrossWorkGroup address space.
     llvm::Type *LTy = CGT.ConvertType(Ty);
     auto DefaultAS = getContext().getTargetAddressSpace(LangAS::Default);
     auto GlobalAS = getContext().getTargetAddressSpace(LangAS::cuda_device);
     auto *PtrTy = llvm::dyn_cast<llvm::PointerType>(LTy);
     if (PtrTy && PtrTy->getAddressSpace() == DefaultAS) {
       LTy = llvm::PointerType::getWithSamePointeeType(PtrTy, GlobalAS);
       return ABIArgInfo::getDirect(LTy, 0, nullptr, false);
     }

clang/test/CodeGenCUDASPIRV/kernel-argument.cu

This file was added.

// Tests CUDA kernel arguments get global address space when targeting SPIR-V.

// REQUIRES: clang-driver

// RUN: %clang -emit-llvm --cuda-device-only --offload=spirv32 \
// RUN:   -nocudalib -nocudainc %s -o %t.bc -c 2>&1
// RUN: llvm-dis %t.bc -o %t.ll
// RUN: FileCheck %s --input-file=%t.ll

// RUN: %clang -emit-llvm --cuda-device-only --offload=spirv64 \
// RUN:   -nocudalib -nocudainc %s -o %t.bc -c 2>&1
// RUN: llvm-dis %t.bc -o %t.ll
// RUN: FileCheck %s --input-file=%t.ll

// CHECK: define spir_kernel void @_Z6kernelPi(i32 addrspace(1)* noundef %output.coerce)

__attribute__((global)) void kernel(int* output) { *output = 1; }