Templated kernels that are instantiated only from host code would normally be eliminated during device-side compilation, because nothing on the device side references them.
Add an implicit 'used' attribute to all __global__ functions, which prevents their elimination.
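To make the scenario concrete, here is a minimal sketch of a templated kernel whose only reference comes from host code. The names `fill` and `launch_fill` are hypothetical and do not come from the patch. The `#ifndef __CUDACC__` fallback is an addition so that the snippet also builds as plain C++.

```cpp
#ifndef __CUDACC__
// Assumption: plain C++ fallback so the sketch builds without a CUDA compiler.
#define __global__
#endif

// Templated kernel (hypothetical). No device-side code names an instantiation.
template <typename T>
__global__ void fill(T *out, T value, int n) {
#ifdef __CUDACC__
  // One thread per element when compiled as CUDA.
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = value;
#else
  // Sequential stand-in for the plain C++ build.
  for (int i = 0; i < n; ++i) out[i] = value;
#endif
}

// Host-side helper: the only place that instantiates fill<float>.
// In a CUDA build, `out` must point to device memory.
void launch_fill(float *out, float value, int n) {
#ifdef __CUDACC__
  fill<float><<<(n + 255) / 256, 256>>>(out, value, n);
#else
  fill<float>(out, value, n);
#endif
}
```

In a CUDA build, `launch_fill` is host-only code, so during device-side compilation nothing refers to `fill<float>`. Marking `__global__` functions as used is meant to keep such an instantiation in the device output.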
Details
- Reviewers
  eliben, echristo
- Commits
  rGc3fa25def761: [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
  rGb7e4aab40cd4: [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
  rC248293: [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
  rC244501: [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
  rL248293: [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
  rL244501: [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
Diff Detail
- Repository
  rL LLVM
Event Timeline
Couldn't you just add an implicit UsedAttr when processing the CUDAGlobalAttr,
provided LangOpts.CUDAIsDevice is set to true?
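The control flow behind this suggestion can be sketched with a toy model. `ToyLangOpts`, `ToyFunctionDecl` and `handleGlobalAttr` are hypothetical stand-ins invented for illustration. They are not Clang's real classes or functions, and the actual patch may place the check elsewhere.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the language options; NOT Clang's LangOptions.
struct ToyLangOpts {
  bool CUDAIsDevice = false;
};

// Hypothetical stand-in for a function declaration and its attributes.
struct ToyFunctionDecl {
  std::string name;
  std::set<std::string> attrs;
  bool hasAttr(const std::string &a) const { return attrs.count(a) != 0; }
};

// Toy handler invoked when __global__ is seen on a declaration.
// It records the global attribute and, for device-side compilation only,
// also adds an implicit "Used" attribute.
void handleGlobalAttr(ToyFunctionDecl &fn, const ToyLangOpts &opts) {
  fn.attrs.insert("CUDAGlobal");
  if (opts.CUDAIsDevice)
    fn.attrs.insert("Used");
}
```

In this model the host-side compilation leaves the declaration without the extra attribute, so only the device side is affected.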