This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][HIP] Fix device variable linkage
ClosedPublic

Authored by yaxunl on Feb 2 2021, 3:19 PM.

Details

Summary

For -fgpu-rdc, shadow variables should not be internalized, otherwise
they cannot be accessed by other TUs. This is necessary because
the shadow variable of external device variables are always
emitted as undefined symbols, which need to resolve to a global
symbols.

Managed variables need to be emitted as undefined symbols
in device compilations.

Diff Detail

Event Timeline

yaxunl requested review of this revision.Feb 2 2021, 3:19 PM
yaxunl created this revision.
tra added a comment.Feb 3 2021, 9:58 AM

For -fgpu-rdc, shadow variables should not be internalized, otherwise they cannot be accessed by other TUs.
This is necessary because the shadow variable of external device variables are always emitted as undefined symbols, which need to resolve to a global symbols.

Does it mean that we always internalize shadows *now*, before this patch? That would indeed be wrong. Shadow's visibility should be the same as that of a variable without a __device__.

Next question is whether the behavior should be different for -fgpu-rdc vs non-RDC. I think shadows should behave the same regardless of RDC mode and make things work closer to what users would expect from regular C++ code. We already have mechanisms to uniquify the variables, if necessary, so we don't need to use visibility to avoid cross-TU linking conflicts.

clang/lib/AST/ASTContext.cpp
11437–11443

I can't parse it. :-( Perhaps we can split it.

In D95901#2539663, @tra wrote:

For -fgpu-rdc, shadow variables should not be internalized, otherwise they cannot be accessed by other TUs.
This is necessary because the shadow variable of external device variables are always emitted as undefined symbols, which need to resolve to a global symbols.

Does it mean that we always internalize shadows *now*, before this patch? That would indeed be wrong. Shadow's visibility should be the same as that of a variable without a __device__.

Yes before this patch we always internalize shadow vars.

Next question is whether the behavior should be different for -fgpu-rdc vs non-RDC. I think shadows should behave the same regardless of RDC mode and make things work closer to what users would expect from regular C++ code. We already have mechanisms to uniquify the variables, if necessary, so we don't need to use visibility to avoid cross-TU linking conflicts.

For -fno-gpu-rdc, two TU's can have global device variables with the same name, therefore the shadow variables need to be internalized, which is fine since -fno-gpu-rdc is not supposed to access device variables across TU's. So far users did not report issues about shadow var linkage for -fno-gpu-rdc (they reported shadow var linkage issues for -fgpu-rdc only). Let the shadow vars keep original linkage and use CUID to make them unique would be unnecessary and clutter the IR.

tra accepted this revision.Feb 3 2021, 12:14 PM

For -fno-gpu-rdc, two TU's can have global device variables with the same name, therefore the shadow variables need to be internalized, which is fine since -fno-gpu-rdc is not supposed to access device variables across TU's. So far users did not report issues about shadow var linkage for -fno-gpu-rdc (they reported shadow var linkage issues for -fgpu-rdc only). Let the shadow vars keep original linkage and use CUID to make them unique would be unnecessary and clutter the IR.

I've checked what NVCC does in this case and their behavior appears to match your approach. Shadows are internalized w/o RDC and become visible when it's on.

This revision is now accepted and ready to land.Feb 3 2021, 12:14 PM
yaxunl marked an inline comment as done.Feb 3 2021, 12:18 PM
yaxunl added inline comments.
clang/lib/AST/ASTContext.cpp
11437–11443

revised

tra added inline comments.Feb 3 2021, 12:47 PM
clang/lib/AST/ASTContext.cpp
11437–11443

I still can't read it. way too many nested operands to logical operators...

I was thinking of extracting parts of the equation into early returns or intermediate variables.

yaxunl updated this revision to Diff 321198.Feb 3 2021, 12:56 PM
yaxunl marked an inline comment as done.

Revised by Artem's comments.

yaxunl updated this revision to Diff 321203.Feb 3 2021, 1:13 PM

clang-format the patch

tra added inline comments.Feb 3 2021, 1:50 PM
clang/lib/AST/ASTContext.cpp
11437–11443

Thank you for refactoring this. Now it's soo much easier to see what's going on.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2021, 12:12 PM