This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Give templated device functions internal linkage, templated kernels external linkage.
Closed, Public

Authored by jlebar on Jun 14 2016, 12:55 PM.

Details

Summary

This lets LLVM perform interprocedural optimization (IPO) over these
functions. In particular, it allows LLVM to emit ld.global.nc for loads
through __restrict pointers that a kernel never writes to.
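
To make the summary concrete, here is a minimal sketch of the kind of code it describes: a templated device helper called from a templated kernel, with a read-only __restrict__ input. It is not taken from the patch. The names scale, scale_kernel and run_scale_example are hypothetical, and the spelling __restrict__ is the one used in the CUDA documentation (the summary writes __restrict). Whether the load actually becomes ld.global.nc depends on the compiler version, optimization level and target GPU, so treat that as something to check in the generated PTX rather than a guarantee.

```cuda
#include <cuda_runtime.h>

// Hypothetical templated __device__ helper. Per the revision title, functions
// like this one get internal linkage, so LLVM can see every caller.
template <typename T>
__device__ T scale(const T* __restrict__ in, int i, T factor) {
  return in[i] * factor;  // read-only access through a __restrict__ pointer
}

// Hypothetical templated kernel. Per the revision title, templated kernels
// keep external linkage.
template <typename T>
__global__ void scale_kernel(const T* __restrict__ in, T* __restrict__ out,
                             int n, T factor) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = scale(in, i, factor);  // `in` is never written to
}

// Hypothetical host-side driver: runs the kernel and checks the result.
// Returns false on any CUDA error or mismatch.
bool run_scale_example() {
  const int n = 256;
  const size_t bytes = n * sizeof(float);
  float host_in[n], host_out[n];
  for (int i = 0; i < n; ++i) host_in[i] = static_cast<float>(i);

  float* dev_in = nullptr;
  float* dev_out = nullptr;
  if (cudaMalloc(&dev_in, bytes) != cudaSuccess) return false;
  if (cudaMalloc(&dev_out, bytes) != cudaSuccess) {
    cudaFree(dev_in);
    return false;
  }

  cudaError_t err = cudaMemcpy(dev_in, host_in, bytes, cudaMemcpyHostToDevice);
  if (err == cudaSuccess) {
    scale_kernel<float><<<(n + 127) / 128, 128>>>(dev_in, dev_out, n, 2.0f);
    // This copy also waits for the kernel on the default stream to finish.
    err = cudaMemcpy(host_out, dev_out, bytes, cudaMemcpyDeviceToHost);
  }
  cudaFree(dev_in);
  cudaFree(dev_out);
  if (err != cudaSuccess) return false;

  for (int i = 0; i < n; ++i) {
    if (host_out[i] != 2.0f * host_in[i]) return false;
  }
  return true;
}
```

Calling run_scale_example() from main on a machine with a CUDA-capable GPU exercises the kernel. Compiling the device side to PTX and searching for ld.global.nc is one way to see whether the optimization fired for a given toolchain.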

Diff Detail

Repository
rL LLVM

Event Timeline

jlebar updated this revision to Diff 60728. Jun 14 2016, 12:55 PM
jlebar retitled this revision to [CUDA] Give templated device functions internal linkage, templated kernels external linkage.
jlebar updated this object.
jlebar added a reviewer: rsmith.
jlebar added subscribers: tra, cfe-commits.

tra makes the good point that maybe this should be done in ASTContext, where we already have a special case for __global__. (I think I gravitated to doing it this way because the GVA* enums have zero documentation -- at least I have a vague idea of what the LLVM attributes do. :)

Suggestions welcome, Richard; this patch is just a guess.

jlebar added a reviewer: rnk. Jun 20 2016, 4:42 PM

Friendly ping.

rnk accepted this revision. Jun 30 2016, 8:24 AM
rnk edited edge metadata.

LGTM. Doing this in CodeGen seems right.

This revision is now accepted and ready to land. Jun 30 2016, 8:24 AM
This revision was automatically updated to reflect the committed changes.