This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][HIP] Don't set "comdat" attribute for CUDA device stub functions.
AcceptedPublic

Authored by kpyzhov on Jun 13 2019, 8:27 AM.

Details

Reviewers
rjmccall
tra
Summary

When compiling the HOST part of CUDA programs, clang replaces device kernels with so-called "stub" functions that contains a few calls to the Runtime API functions (which set the kernel argument values and launch the kernel itself). The stub functions are very small, so they may have identical generated code for different kernels with same arguments.
The Microsoft Linker has an optimization called "COMDAT Folding". It's able to detect functions with identical binary code and "merge" them, i.e. replace calls and pointers to those different functions with call/pointer to one of them and eliminate other copies.

Here is the description of this optimization: https://docs.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=vs-2019
That page contains a warning about "COMDAT Folding":
"Because /OPT:ICF can cause the same address to be assigned to different functions or read-only data members (that is, const variables when compiled by using /Gy), it can break a program that depends on unique addresses for functions or read-only data members."
That's exactly what happens to the CUDA stub functions.

This change disables setting "COMDAT" attribute for CUDA stub functions in the HOST code.

Diff Detail

Repository
rC Clang

Event Timeline

kpyzhov created this revision.Jun 13 2019, 8:27 AM
kpyzhov edited the summary of this revision. (Show Details)Jun 13 2019, 8:33 AM
kpyzhov retitled this revision from Don't set "comdat" attribute for CUDA device stub functions. to [CUDA][HIP] Don't set "comdat" attribute for CUDA device stub functions..Jun 19 2019, 8:09 AM
kpyzhov added a reviewer: tra.
tra added a comment.Jun 19 2019, 10:38 AM

SGTM in principle. Folding the stubs would be bad as their addresses are implicitly used to identify the kernels to launch.

clang/lib/CodeGen/CodeGenModule.cpp
4294

Perhaps this should be pushed further down into shouldBeInCOMDAT() which seems to be the right place to decide whether something is not suitable for comdat.

kpyzhov added inline comments.Jun 19 2019, 11:06 AM
clang/lib/CodeGen/CodeGenModule.cpp
4294

Good idea. Let me do that.

kpyzhov updated this revision to Diff 205655.Jun 19 2019, 12:22 PM

This optimization is disabled for functions not in COMDAT sections? Is that documented somewhere?

tra accepted this revision.Jun 19 2019, 12:30 PM
This revision is now accepted and ready to land.Jun 19 2019, 12:30 PM

This optimization is disabled for functions not in COMDAT sections? Is that documented somewhere?

It is documented here: https://docs.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=vs-2019

rjmccall accepted this revision.Jun 19 2019, 6:23 PM

Alright, thanks. I agree that per the documentation this should be sufficient.