This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][HIP] Fix linkage for -fgpu-rdc
ClosedPublic

Authored by yaxunl on Oct 28 2020, 7:58 AM.

Details

Summary

Currently, for explicit instantiation of function templates in CUDA/HIP device
compilation, clang emits an instantiated kernel with external linkage
and an instantiated device function with internal linkage.

This is fine for -fno-gpu-rdc since there is only one TU.

However, for -fgpu-rdc this causes duplicate kernel symbols if
the same instantiation happens in multiple TUs, or missing symbols
if a device function calls an explicitly instantiated template function
defined in a different TU.

To make explicit instantiation of function templates work for
-fgpu-rdc, we need to follow the C++ linkage paradigm, i.e.
use weak_odr linkage.
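As a rough illustration of the code pattern the summary describes, here is a minimal sketch of a TU that explicitly instantiates a device function template and a kernel template. The names `scale` and `scale_kernel` are hypothetical and do not come from the patch. The `#if` guard is an assumption made so the sketch also builds as plain host C++, where the target attributes are defined away and the kernel is an ordinary function.

```cpp
// Sketch only: a host-compilable stand-in for a CUDA/HIP translation unit.
// Assumption: when no CUDA/HIP compiler is in use, the target attributes
// are defined away so the file builds as ordinary C++.
#if !defined(__CUDACC__) && !defined(__HIP__)
#define __device__
#define __global__
#endif

// Hypothetical device function template.
template <typename T>
__device__ T scale(T x, T factor) {
  return x * factor;
}

// Hypothetical kernel template. A serial loop keeps the sketch valid on the
// host; a real kernel would normally index by thread instead.
template <typename T>
__global__ void scale_kernel(T *data, T factor, int n) {
  for (int i = 0; i < n; ++i)
    data[i] = scale(data[i], factor);
}

// Explicit instantiation definitions. Per the summary, device compilation
// previously gave the kernel instantiation external linkage and the device
// function instantiation internal linkage. With -fgpu-rdc, the first clashes
// if another TU repeats it, and the second is invisible to other TUs.
// The change makes both weak_odr, as in ordinary C++.
template __device__ int scale<int>(int, int);
template __global__ void scale_kernel<int>(int *, int, int);
```

In an -fgpu-rdc build, a second TU could repeat the same two explicit instantiations, or declare `scale<int>` and call it from its own device code. Those are the two situations the summary says fail without weak_odr linkage.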

Event Timeline

yaxunl requested review of this revision. Oct 28 2020, 7:58 AM
yaxunl created this revision.
ashi1 added a subscriber: ashi1. Oct 28 2020, 8:09 AM
tra accepted this revision. Nov 2 2020, 2:25 PM

LGTM.

This revision is now accepted and ready to land. Nov 2 2020, 2:25 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. Nov 3 2020, 5:08 AM