This is an archive of the discontinued LLVM Phabricator instance.


Differential D63277

[CUDA][HIP] Don't set "comdat" attribute for CUDA device stub functions.
Accepted · Public

Authored by kpyzhov on Jun 13 2019, 8:27 AM.


Details

Reviewers

rjmccall
tra

Summary

When compiling the HOST part of CUDA programs, clang replaces device kernels with so-called "stub" functions that contain a few calls to the Runtime API functions (which set the kernel argument values and launch the kernel itself). The stub functions are very small, so different kernels with the same arguments may end up with identical generated code.
The Microsoft Linker has an optimization called "COMDAT Folding". It's able to detect functions with identical binary code and "merge" them, i.e. replace calls and pointers to those different functions with call/pointer to one of them and eliminate other copies.

Here is the description of this optimization: https://docs.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=vs-2019
That page contains a warning about "COMDAT Folding":
"Because /OPT:ICF can cause the same address to be assigned to different functions or read-only data members (that is, const variables when compiled by using /Gy), it can break a program that depends on unique addresses for functions or read-only data members."
That's exactly what happens to the CUDA stub functions.

This change stops setting the "COMDAT" attribute for CUDA stub functions in the HOST code.

Diff Detail

Event Timeline

kpyzhov created this revision. Jun 13 2019, 8:27 AM

Herald added a subscriber: cfe-commits. Jun 13 2019, 8:27 AM

kpyzhov edited the summary of this revision. Jun 13 2019, 8:33 AM

kpyzhov retitled this revision. Jun 19 2019, 8:09 AM
From: Don't set "comdat" attribute for CUDA device stub functions.
To: [CUDA][HIP] Don't set "comdat" attribute for CUDA device stub functions.

kpyzhov added a reviewer: tra.

SGTM in principle. Folding the stubs would be bad as their addresses are implicitly used to identify the kernels to launch.

clang/lib/CodeGen/CodeGenModule.cpp
Line 4294: Perhaps this should be pushed further down into `shouldBeInCOMDAT()` which seems to be the right place to decide whether something is not suitable for comdat.
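
The concern that stub addresses identify the kernels to launch can be illustrated with a small sketch. This is not the CUDA or HIP runtime API: `KernelRegistry`, `registerStub`, `lookup`, and the two stand-in objects are hypothetical names invented for illustration, and the "folded" case is simulated by reusing one address rather than by running a real linker.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a host-side table that maps the address of a
// kernel stub to the name of the device kernel it is supposed to launch.
class KernelRegistry {
public:
  // Returns false if this address is already registered for another kernel.
  bool registerStub(const void *stubAddr, const std::string &kernelName) {
    return table_.emplace(stubAddr, kernelName).second;
  }

  // Returns the kernel name for a stub address, or "" if it is unknown.
  std::string lookup(const void *stubAddr) const {
    auto it = table_.find(stubAddr);
    return it == table_.end() ? std::string() : it->second;
  }

private:
  std::map<const void *, std::string> table_;
};

// Two distinct objects model two stubs that kept separate addresses.
// (Assumption: real stubs are functions; objects keep the sketch portable.)
static char stubForKernelA;
static char stubForKernelB;
```

With separate addresses, `lookup(&stubForKernelB)` yields the second kernel. If both stubs were folded to one address, the second `registerStub` call would return false and every lookup through that address would resolve to the first kernel only.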

kpyzhov added inline comments. Jun 19 2019, 11:06 AM

clang/lib/CodeGen/CodeGenModule.cpp
Line 4294: Good idea. Let me do that.

kpyzhov updated this revision to Diff 205655. Jun 19 2019, 12:22 PM

This optimization is disabled for functions not in COMDAT sections? Is that documented somewhere?

tra accepted this revision. Jun 19 2019, 12:30 PM

This revision is now accepted and ready to land. Jun 19 2019, 12:30 PM

In D63277#1550870, @rjmccall wrote:

This optimization is disabled for functions not in COMDAT sections? Is that documented somewhere?

It is documented here: https://docs.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=vs-2019

Alright, thanks. I agree that per the documentation this should be sufficient.
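
The reasoning in this exchange, that folding only applies to functions placed in COMDAT sections, can be sketched with a toy model. This is an illustration under stated assumptions, not how the Microsoft linker is implemented: `Symbol`, `foldIdentical`, and the string used as "machine code" are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of a function symbol as a linker might see it.
struct Symbol {
  std::string name;
  std::string code;  // stand-in for the function's machine code bytes
  bool inComdat;     // whether the function was emitted into a COMDAT
};

// Toy identical-code folding: each COMDAT symbol is mapped to the first
// COMDAT symbol with the same code. Symbols outside a COMDAT are never
// folded and always map to themselves.
std::map<std::string, std::string>
foldIdentical(const std::vector<Symbol> &syms) {
  std::map<std::string, std::string> firstByCode;  // code -> first symbol name
  std::map<std::string, std::string> canonical;    // name -> surviving name
  for (const Symbol &s : syms) {
    if (!s.inComdat) {
      canonical[s.name] = s.name;
      continue;
    }
    auto inserted = firstByCode.emplace(s.code, s.name);
    canonical[s.name] = inserted.first->second;
  }
  return canonical;
}
```

In this model, two stubs with identical code collapse into one symbol when both are in a COMDAT, and stay distinct when `inComdat` is false, which is the effect the patch relies on.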

tra mentioned this in D112492: [CUDA][HIP] Allow comdat for kernels. Nov 9 2021, 11:04 AM

Revision Contents

Path                                    Size
clang/lib/CodeGen/CodeGenModule.cpp     5 lines

Diff 205655

clang/lib/CodeGen/CodeGenModule.cpp

 void CodeGenModule::MaybeHandleStaticInExternC(const SomeDecl *D,
 [...]
   if (!R.second)
     R.first->second = nullptr;
 }

 static bool shouldBeInCOMDAT(CodeGenModule &CGM, const Decl &D) {
   if (!CGM.supportsCOMDAT())
     return false;

+  // Do not set COMDAT attribute for CUDA/HIP stub functions to prevent
+  // them being "merged" by the COMDAT Folding linker optimization.
+  if (D.hasAttr<CUDAGlobalAttr>())
+    return false;
+
   if (D.hasAttr<SelectAnyAttr>())
     return true;

   GVALinkage Linkage;
   if (auto *VD = dyn_cast<VarDecl>(&D))
     Linkage = CGM.getContext().GetGVALinkageForVariable(VD);
   else
     Linkage = CGM.getContext().GetGVALinkageForFunction(cast<FunctionDecl>(&D));
 [...]
 void CodeGenModule::EmitGlobalFunctionDefinition(GlobalDecl GD,
 [...]
   // generating code for it because various parts of IR generation
   // want to propagate this information down (e.g. to local static
   // declarations).
   auto *Fn = cast<llvm::Function>(GV);
   setFunctionLinkage(GD, Fn);

   // FIXME: this is redundant with part of setFunctionDefinitionAttributes
   setGVProperties(Fn, GD);

   [Inline comment, tra] Perhaps this should be pushed further down into `shouldBeInCOMDAT()` which seems to be the right place to decide whether something is not suitable for comdat.
   [Inline comment, kpyzhov] Good idea. Let me do that.
   MaybeHandleStaticInExternC(D, Fn);


   maybeSetTrivialComdat(D, Fn);

   CodeGenFunction(*this).GenerateCode(D, Fn, FI);

   setNonAliasAttributes(GD, Fn);
 [...]