This is an archive of the discontinued LLVM Phabricator instance.

clang/lib/Driver/ToolChains/HIP.cpp
116	We do not do it for v2/v3. Could you elaborate on what makes v4 special that it needs its own offload kind? Will you need to target different object versions simultaneously? If yes, how? AFAICT, the version specified is currently global and applies to all sub-compilations. If not, then do we really need to encode the version in the offload target name?

yaxunl marked an inline comment as done. Mar 24 2021, 10:34 AM

yaxunl added inline comments.

clang/lib/Driver/ToolChains/HIP.cpp
116	Introducing hipv4 differentiates code object version 4 from code object versions 2 and 3, which are used by HIP applications compiled by older versions of clang. The ROCm platform is required to keep binary backward compatibility, i.e., old HIP applications built by ROCm 4.0 should run on ROCm 4.1. The bundle ID is interpreted differently depending on whether the code object is version 2/3 or version 4, e.g. 'gfx906' implies xnack and sramecc off with code object v2/3 but implies xnack and sramecc ANY with v4. Since code object version 2/3 uses 'hip', code object version 4 needs a different offload kind, so it uses 'hipv4'.
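
The naming rule in the reply above can be sketched in standalone C++. This is only an illustration under stated assumptions: `offloadKindFor` and `bundleEntryId` are hypothetical helper names that do not exist in clang, and the sketch uses `std::string` instead of LLVM's string types.

```cpp
#include <string>

// Hypothetical helper (not part of clang): choose the offload kind that is
// written into the bundle ID. Code object v2/v3 keep "hip" for backward
// compatibility; v4 and later use "hipv4".
std::string offloadKindFor(unsigned codeObjectVersion) {
  return codeObjectVersion >= 4 ? "hipv4" : "hip";
}

// Hypothetical helper: build one bundle entry ID of the form
// <offload-kind>-amdgcn-amd-amdhsa--<arch>, as seen in the driver tests.
std::string bundleEntryId(unsigned codeObjectVersion, const std::string &arch) {
  return offloadKindFor(codeObjectVersion) + "-amdgcn-amd-amdhsa--" + arch;
}
```

With this sketch, the same `gfx906` architecture produces two distinct entry IDs, so a runtime can tell which interpretation of xnack and sramecc applies.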

lukebroskop added a subscriber: lukebroskop. Mar 26 2021, 11:11 AM

ping.

tra accepted this revision. Apr 6 2021, 3:11 PM

tra added inline comments.

clang/lib/Driver/ToolChains/HIP.cpp
115	Should it be an error if we pass `-mcode-object-version=99` ?

This revision is now accepted and ready to land. Apr 6 2021, 3:11 PM

yaxunl marked an inline comment as done. Apr 6 2021, 4:22 PM

yaxunl added inline comments.

clang/lib/Driver/ToolChains/HIP.cpp
115	Yes we diagnose that.
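
A minimal sketch of the kind of range check being discussed, assuming the bounds `MinCodeObjVer = 2` and `MaxCodeObjVer = 4` shown in this revision's CommonArgs.cpp diff. The function `parseCodeObjectVersion` is hypothetical and is not the driver's actual implementation; it only illustrates why a value such as 99 or 1 would be rejected.

```cpp
#include <string>

// Bounds assumed from the CommonArgs.cpp diff in this revision.
constexpr unsigned MinCodeObjVer = 2;
constexpr unsigned MaxCodeObjVer = 4;

// Hypothetical helper: parse the value of -mcode-object-version=<value>.
// Returns true and stores the version only if the value is a decimal
// integer within [MinCodeObjVer, MaxCodeObjVer].
bool parseCodeObjectVersion(const std::string &value, unsigned &version) {
  // Reject empty input and overly long input (also avoids overflow).
  if (value.empty() || value.size() > 3)
    return false;
  unsigned parsed = 0;
  for (char c : value) {
    if (c < '0' || c > '9')
      return false;
    parsed = parsed * 10 + static_cast<unsigned>(c - '0');
  }
  if (parsed < MinCodeObjVer || parsed > MaxCodeObjVer)
    return false;
  version = parsed;
  return true;
}
```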

Still LGTM.

clang/test/Driver/hip-code-object-version.hip
24–39	Nit: it would be nice to move V2 tests above the V3, so the tests are in order.

Closed by commit rG4fd05e0ad7fb: [HIP] Change to code object v4 (authored by yaxunl). Apr 6 2021, 5:23 PM

This revision was automatically updated to reflect the committed changes.

yaxunl marked an inline comment as done.

yaxunl added a commit: rG4fd05e0ad7fb: [HIP] Change to code object v4.

Herald added a project: Restricted Project. Apr 6 2021, 5:23 PM

yaxunl marked an inline comment as done. Apr 6 2021, 5:28 PM

yaxunl added inline comments.

clang/test/Driver/hip-code-object-version.hip
24–39	sorry I missed it. will do.

gregrodgers added a subscriber: gregrodgers. May 18 2021, 11:54 AM

gregrodgers added inline comments.

clang/lib/Driver/ToolChains/HIP.cpp
116	We need to start thinking in terms of the offload requirements of a compiled image vs. the capabilities of a particular active runtime on a particular GPU. This concept can eliminate the need for a new offload kind. For AMD, we would add the requirement of code object v4 (cov4) if the image is built for code object v4 or greater. This means it can only run on a system with that capability. This concept works well with the requirements xnack+, xnack-, sramecc+ and sramecc-. The bundle entry id is the offload kind, the triple, and the list of image requirements. The GPU type (offload-arch) is really an image requirement. In this model, there is no requirement for xnack-any. The lack of the xnack+ or xnack- requirement implies "any", which means the image can run on any capable machine. This is a general model that is extensible.

To make this work, a runtime must be able to detect the capabilities for any requirement that could be tagged on an image. In fact, every requirement of an embedded image must have its capability detected by the runtime for that offload image to be usable. However, a system's runtime could have more capabilities than the requirements of an image. So in the case of xnack, the lack of xnack- or xnack+ will be acceptable no matter what the xnack capability of the runtime is.

If the compiler driver puts the requirement cov4 in the requirements field of the bundle entry id, the runtime will not run that image unless the GPU loader supports v4 or greater. The clang driver can create the requirement xnack- for code object < 4 on those GPUs that support either xnack mode. This will ensure the image will gracefully fail or use an alternative image if the runtime capability is xnack+. But the cov4 requirement is mostly unrelated to xnack. It is about the capability of the GPU loader. If the code object version is >= 4, then the image will be tagged with the cov4 requirement. This would prevent an old system that does not have a newer software stack from running an image with a cov4 requirement.

This general notion of image requirements and runtime capabilities is extensible to other offload architectures. Suppose a cuda version 12 compilation REQUIRES a cuda version 12 runtime. Old runtimes would never display the cuv12 capability and would fail to run any image created with the requirement cuv12.
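
The requirements-versus-capabilities model proposed in this comment can be sketched as a simple subset test. This is an illustrative sketch only: `FeatureSet` and `imageIsRunnable` are hypothetical names, the tags (`cov4`, `xnack+`, `xnack-`, `gfx906`) are taken from the comment, and nothing here reflects an existing runtime API.

```cpp
#include <set>
#include <string>

// Hypothetical model: an image carries a set of requirement tags and a
// runtime reports a set of capability tags (e.g. "gfx906", "cov4", "xnack+").
using FeatureSet = std::set<std::string>;

// An image is usable only if every one of its requirements is among the
// runtime's capabilities. Extra runtime capabilities are acceptable, so an
// image with no xnack requirement ("any") runs under either xnack mode.
bool imageIsRunnable(const FeatureSet &requirements,
                     const FeatureSet &capabilities) {
  for (const std::string &requirement : requirements) {
    if (capabilities.count(requirement) == 0)
      return false;
  }
  return true;
}
```

Under this sketch, an image tagged `cov4` is rejected by a runtime that never reports `cov4`, and an image tagged `xnack-` is rejected by a runtime reporting only `xnack+`, which is the graceful-failure behavior the comment describes.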

Revision Contents

Path                                                         Size
clang/lib/Driver/ToolChains/CommonArgs.cpp                   2 lines
clang/lib/Driver/ToolChains/HIP.cpp                          3 lines
clang/test/Driver/hip-code-object-version.hip                6 lines
clang/test/Driver/hip-target-id.hip                          6 lines
clang/test/Driver/hip-toolchain-device-only.hip              2 lines
clang/test/Driver/hip-toolchain-no-rdc.hip                   4 lines
clang/test/Driver/hip-toolchain-rdc-separate.hip             2 lines
clang/test/Driver/hip-toolchain-rdc-static-lib.hip           2 lines
clang/test/Driver/hip-toolchain-rdc.hip                      2 lines
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp    1 line

Diff 332859

clang/lib/Driver/ToolChains/CommonArgs.cpp

[1,569 lines not shown] if (const Arg *A = Args.getLastArg(options::OPT_mpad_max_prefix_size_EQ)) {
      }
    }
  }

  unsigned tools::getOrCheckAMDGPUCodeObjectVersion(
      const Driver &D, const llvm::opt::ArgList &Args, bool Diagnose) {
    const unsigned MinCodeObjVer = 2;
    const unsigned MaxCodeObjVer = 4;
-   unsigned CodeObjVer = 3;
+   unsigned CodeObjVer = 4;

    // Emit warnings for legacy options even if they are overridden.
    if (Diagnose) {
      if (Args.hasArg(options::OPT_mno_code_object_v3_legacy))
        D.Diag(diag::warn_drv_deprecated_arg) << "-mno-code-object-v3"
                                              << "-mcode-object-version=2";

      if (Args.hasArg(options::OPT_mcode_object_v3_legacy))
[120 lines not shown]

clang/lib/Driver/ToolChains/HIP.cpp

[102 lines not shown] void AMDGCN::constructHIPFatbinCommand(Compilation &C, const JobAction &JA,
    BundlerArgs.push_back(
        Args.MakeArgString("-bundle-align=" + Twine(HIPCodeObjectAlign)));

    // ToDo: Remove the dummy host binary entry which is required by
    // clang-offload-bundler.
    std::string BundlerTargetArg = "-targets=host-x86_64-unknown-linux";
    std::string BundlerInputArg = "-inputs=" NULL_FILE;

-   // TODO: Change the bundle ID as requested by HIP runtime.
    // For code object version 2 and 3, the offload kind in bundle ID is 'hip'
    // for backward compatibility. For code object version 4 and greater, the
    // offload kind in bundle ID is 'hipv4'.
    std::string OffloadKind = "hip";
+   if (getOrCheckAMDGPUCodeObjectVersion(C.getDriver(), Args) >= 4)
+     OffloadKind = OffloadKind + "v4";
    for (const auto &II : Inputs) {
      const auto* A = II.getAction();
      BundlerTargetArg = BundlerTargetArg + "," + OffloadKind +
                         "-amdgcn-amd-amdhsa--" +
                         StringRef(A->getOffloadingArch()).str();
      BundlerInputArg = BundlerInputArg + "," + II.getFilename();
    }
    BundlerArgs.push_back(Args.MakeArgString(BundlerTargetArg));
[325 lines not shown]

clang/test/Driver/hip-code-object-version.hip

[15 lines not shown]
  // RUN: -mcode-object-version=4 -mcode-object-version=3 \
  // RUN: --offload-arch=gfx906 -nogpulib \
  // RUN: %s 2>&1 | FileCheck -check-prefix=V3 %s

  // V3-WARN: warning: argument '-mcode-object-v3' is deprecated, use '-mcode-object-version=3' instead [-Wdeprecated]
  // V3: "-mllvm" "--amdhsa-code-object-version=3"
  // V3: "-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx906"

  // Check bundle ID for code object v2.

  // RUN: %clang -### -target x86_64-linux-gnu \
  // RUN: -mno-code-object-v3 \
  // RUN: --offload-arch=gfx906 -nogpulib \
  // RUN: %s 2>&1 | FileCheck -check-prefixes=V2,V2-WARN %s

  // RUN: %clang -### -target x86_64-linux-gnu \
  // RUN: -mcode-object-version=2 \
  // RUN: --offload-arch=gfx906 -nogpulib \
  // RUN: %s 2>&1 | FileCheck -check-prefix=V2 %s

  // V2-WARN: warning: argument '-mno-code-object-v3' is deprecated, use '-mcode-object-version=2' instead [-Wdeprecated]
  // V2: "-mllvm" "--amdhsa-code-object-version=2"
  // V2: "-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx906"

  // Check bundle ID for code object version 4.

  // RUN: %clang -### -target x86_64-linux-gnu \
  // RUN: -mcode-object-version=4 \
  // RUN: --offload-arch=gfx906 -nogpulib \
  // RUN: %s 2>&1 | FileCheck -check-prefix=V4 %s

  // V4: "-mllvm" "--amdhsa-code-object-version=4"
- // V4: "-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx906"
+ // V4: "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx906"

  // Check bundle ID for code object version default

  // RUN: %clang -### -target x86_64-linux-gnu \
  // RUN: --offload-arch=gfx906 -nogpulib \
  // RUN: %s 2>&1 | FileCheck -check-prefix=VD %s

- // VD: "-mllvm" "--amdhsa-code-object-version=3"
- // VD: "-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx906"
+ // VD: "-mllvm" "--amdhsa-code-object-version=4"
+ // VD: "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx906"

  // Check invalid code object version option.

  // RUN: %clang -### -target x86_64-linux-gnu \
  // RUN: -mcode-object-version=1 \
  // RUN: --offload-arch=gfx906 -nogpulib \
  // RUN: %s 2>&1 | FileCheck -check-prefix=INVALID %s
  // INVALID: error: invalid integral value '1' in '-mcode-object-version=1'
[12 lines not shown]

clang/test/Driver/hip-target-id.hip

[41 lines not shown]
  // CHECK-SAME: "-target-cpu" "gfx908"
  // CHECK-SAME: "-target-feature" "-sramecc"
  // CHECK-SAME: "-target-feature" "+xnack"

  // CHECK: [[LLD]] {{.*}} "-plugin-opt=mcpu=gfx908"
  // CHECK-SAME: "-plugin-opt=-mattr=-sramecc,+xnack"

  // CHECK: {{"[^"]clang-offload-bundler[^"]"}}
- // CHECK-SAME: "-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+,hip-amdgcn-amd-amdhsa--gfx908:sramecc-:xnack+"
+ // CHECK-SAME: "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+,hipv4-amdgcn-amd-amdhsa--gfx908:sramecc-:xnack+"

  // Check canonicalization and repeating of target ID.

  // RUN: %clang -### -target x86_64-linux-gnu \
  // RUN: -x hip \
  // RUN: --offload-arch=fiji \
  // RUN: --offload-arch=gfx803 \
  // RUN: --offload-arch=fiji \
  // RUN: --rocm-path=%S/Inputs/rocm \
  // RUN: %s 2>&1 | FileCheck -check-prefix=FIJI %s
- // FIJI: "-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx803"
+ // FIJI: "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx803"

  // RUN: %clang -### -target x86_64-linux-gnu \
  // RUN: -x hip \
  // RUN: --offload-arch=gfx900:xnack- \
  // RUN: --offload-arch=gfx900:xnack+ \
  // RUN: --offload-arch=gfx908:sramecc+ \
  // RUN: --offload-arch=gfx908:sramecc- \
  // RUN: --offload-arch=gfx906 \
  // RUN: --rocm-path=%S/Inputs/rocm \
  // RUN: %s 2>&1 | FileCheck -check-prefix=MULTI %s
- // MULTI: "-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx900:xnack+,hip-amdgcn-amd-amdhsa--gfx900:xnack-,hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908:sramecc+,hip-amdgcn-amd-amdhsa--gfx908:sramecc-"
+ // MULTI: "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx900:xnack+,hipv4-amdgcn-amd-amdhsa--gfx900:xnack-,hipv4-amdgcn-amd-amdhsa--gfx906,hipv4-amdgcn-amd-amdhsa--gfx908:sramecc+,hipv4-amdgcn-amd-amdhsa--gfx908:sramecc-"

clang/test/Driver/hip-toolchain-device-only.hip

[19 lines not shown]
  // CHECK-SAME: "-fcuda-is-device"
  // CHECK-SAME: "-target-cpu" "gfx900"
  // CHECK-SAME: {{.}} "-o" [[OBJ_DEV_A_900:".o"]] "-x" "hip"

  // CHECK: [[LLD]] "-flavor" "gnu" "--no-undefined" "-shared"
  // CHECK-SAME: "-o" "[[IMG_DEV_A_900:.*out]]" [[OBJ_DEV_A_900]]

  // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
- // CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+ // CHECK-SAME: "-targets={{.*}},hip{{.*}}-amdgcn-amd-amdhsa--gfx803,hip{{.*}}-amdgcn-amd-amdhsa--gfx900"
  // CHECK-SAME: "-inputs={{.}},[[IMG_DEV_A_803]],[[IMG_DEV_A_900]]" "-outputs=[[BUNDLE_A:.hipfb]]"

clang/test/Driver/hip-toolchain-no-rdc.hip

[76 lines not shown]
  // CHECK-SAME: "-o" "[[IMG_DEV_A_900:.*out]]" [[OBJ_DEV_A_900]]

  //
  // Bundle and embed device code in host object for a.cu.
  //

  // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
  // CHECK-SAME: "-bundle-align=4096"
- // CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+ // CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
  // CHECK-SAME: "-inputs={{.}},[[IMG_DEV_A_803]],[[IMG_DEV_A_900]]" "-outputs=[[BUNDLE_A:.hipfb]]"

  // CHECK: [[CLANG]] "-cc1" "-triple" "x86_64-unknown-linux-gnu"
  // CHECK-SAME: "-aux-triple" "amdgcn-amd-amdhsa"
  // CHECK-SAME: "-emit-obj"
  // CHECK-SAME: {{.*}} "-main-file-name" "a.cu"
  // CHECK-SAME: {{.*}} "-fcuda-include-gpubinary" "[[BUNDLE_A]]"
  // CHECK-SAME: {{.}} "-o" [[A_OBJ_HOST:".o"]] "-x" "hip"
[46 lines not shown]
  // CHECK-SAME: "-o" "[[IMG_DEV_B_900:.*out]]" [[OBJ_DEV_B_900]]

  //
  // Bundle and embed device code in host object for b.hip.
  //

  // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
  // CHECK-SAME: "-bundle-align=4096"
- // CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+ // CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
  // CHECK-SAME: "-inputs={{.}},[[IMG_DEV_B_803]],[[IMG_DEV_B_900]]" "-outputs=[[BUNDLE_A:.hipfb]]"

  // CHECK: [[CLANG]] "-cc1" "-triple" "x86_64-unknown-linux-gnu"
  // CHECK-SAME: "-aux-triple" "amdgcn-amd-amdhsa"
  // CHECK-SAME: "-emit-obj"
  // CHECK-SAME: {{.*}} "-main-file-name" "b.hip"
  // CHECK-SAME: {{.*}} "-fcuda-include-gpubinary" "[[BUNDLE_A]]"
  // CHECK-SAME: {{.}} "-o" [[B_OBJ_HOST:".o"]] "-x" "hip"
[17 lines not shown]

clang/test/Driver/hip-toolchain-rdc-separate.hip

[119 lines not shown]
  // LINK-NOT: "*.llvm-link"
  // LINK-NOT: ".*opt"
  // LINK-NOT: ".*llc"
  // LINK: {{".lld."}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
  // LINK: "-plugin-opt=mcpu=gfx900"
  // LINK-SAME: "-o" "[[IMG_DEV2:.*.out]]" "[[A_BC2]]" "[[B_BC2]]"

  // LINK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
- // LINK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+ // LINK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
  // LINK-SAME: "-inputs={{.}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.hipfb]]"

  // LINK: {{".llvm-mc."}} "-o" "[[OBJBUNDLE:.o]]" "{{.}}.mcin" "--filetype=obj"

  // LINK: [[LD:".ld."]] {{.}} "-o" "a.out" {{.}} "[[A_OBJ_HOST]]"
  // LINK-SAME: "[[B_OBJ_HOST]]" "[[OBJBUNDLE]]"

clang/test/Driver/hip-toolchain-rdc-static-lib.hip

[79 lines not shown]
  // CHECK-NOT: ".*opt"
  // CHECK-NOT: ".*llc"
  // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
  // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
  // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]

  // combine images generated into hip fat binary object
  // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
- // CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+ // CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
  // CHECK-SAME: "-inputs={{.}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.hipfb]]"

  // CHECK: [[MC:".llvm-mc."]] "-o" [[OBJBUNDLE:".o"]] "{{.}}.mcin" "--filetype=obj"

  // CHECK: [[AR:".llvm-ar."]] "rcsD" "{{.*}}.out" [[A_OBJ_HOST]] [[B_OBJ_HOST]] [[OBJBUNDLE]]

clang/test/Driver/hip-toolchain-rdc.hip

[91 lines not shown]
  // CHECK-NOT: ".*llc"
  // CHECK: {{".lld."}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
  // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
  // CHECK-SAME: "-o" "[[IMG_DEV2:.*.out]]" [[A_BC2]] [[B_BC2]]

  // combine images generated into hip fat binary object
  // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
  // CHECK-SAME: "-bundle-align=4096"
- // CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+ // CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
  // CHECK-SAME: "-inputs={{.}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.hipfb]]"

  // CHECK: [[MC:".llvm-mc."]] "-o" [[OBJBUNDLE:".o"]] "{{.}}.mcin" "--filetype=obj"

  // output the executable
  // CHECK: [[LD:".ld."]] {{.}}"-o" "a.out" {{.}} [[A_OBJ_HOST]] [[B_OBJ_HOST]] [[OBJBUNDLE]]

clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

[1,137 lines not shown] for (StringRef Target : TargetNames) {
    StringRef Triple;
    getOffloadKindAndTriple(Target, Kind, Triple);

    bool KindIsValid = !Kind.empty();
    KindIsValid = KindIsValid && StringSwitch<bool>(Kind)
                                     .Case("host", true)
                                     .Case("openmp", true)
                                     .Case("hip", true)
+                                    .Case("hipv4", true)
                                     .Default(false);

    bool TripleIsValid = !Triple.empty();
    llvm::Triple T(Triple);
    TripleIsValid &= T.getArch() != Triple::UnknownArch;

    if (!KindIsValid || !TripleIsValid) {
      SmallVector<char, 128u> Buf;
[29 lines not shown]
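
The kind check in the hunk above can be approximated without LLVM types. The following is a rough sketch, assuming the target string is split at its first `-` into an offload kind and a remainder; `splitKindAndTriple` and `isValidOffloadKind` are hypothetical names, and the bundler's real `getOffloadKindAndTriple` may behave differently.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: split a bundle target such as
// "hipv4-amdgcn-amd-amdhsa--gfx906" at the first '-'. The part before the
// dash is treated as the offload kind, the remainder as the triple (plus
// any trailing arch suffix). This split rule is an assumption.
std::pair<std::string, std::string>
splitKindAndTriple(const std::string &target) {
  const std::string::size_type dash = target.find('-');
  if (dash == std::string::npos)
    return std::make_pair(target, std::string());
  return std::make_pair(target.substr(0, dash), target.substr(dash + 1));
}

// Mirrors the set of kinds accepted after this revision:
// "host", "openmp", "hip", and the newly added "hipv4".
bool isValidOffloadKind(const std::string &kind) {
  return kind == "host" || kind == "openmp" || kind == "hip" ||
         kind == "hipv4";
}
```

Because the check is an exact string match, any future kind (for example a hypothetical `hipv5`) would be rejected until it is added to the list, which is one of the extensibility concerns raised in the review thread.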

[HIP] Change to code object v4 (Closed, Public)
