To support CUDA, we need a way to link in a subset of functions from a bitcode library.
The current approach, suggested in the NVPTX guide (http://llvm.org/docs/NVPTXUsage.html), requires linking in the complete library, internalizing all symbols except those that were present in the TU before linking, and then running the GlobalDCE (GDCE) pass to eliminate the bitcode we don't need.
Since we only need a fairly small subset of functions from the library, a better approach is to link in only the symbols needed by the destination module and internalize them in the process, if required.
This patch adds two new linker flags to do exactly that:
- -only-needed -- links in only the symbols needed by the destination module.
- -internalize -- internalizes the linked symbols.
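
To illustrate the difference between the two approaches, here is a toy model in Python. It is only a sketch and not the linker's implementation: a "module" is modeled as a dict mapping each defined symbol to the set of symbols it references, all symbol names (kernel, lib_sin, lib_helper, lib_cos, lib_exp) are hypothetical, and it assumes that symbols referenced by a linked-in function are pulled in as well.

```python
# Toy model only: a module is {defined symbol: set of referenced symbols}.
# Nothing here is LLVM API; all names are hypothetical.

def reachable(defs, roots):
    """Return every defined symbol reachable from `roots`."""
    seen, stack = set(), list(roots)
    while stack:
        sym = stack.pop()
        if sym in seen or sym not in defs:
            continue
        seen.add(sym)
        stack.extend(defs[sym])
    return seen

def link_all_then_prune(dest, lib):
    """Old approach: link the whole library, internalize everything that
    was not in the destination beforehand, then drop unreachable symbols
    (a stand-in for the GlobalDCE step)."""
    original = set(dest)
    merged = {**lib, **dest}           # destination definitions win
    internal = set(merged) - original  # everything new becomes internal
    live = reachable(merged, original)
    return {s: merged[s] for s in merged if s in live}, internal & live

def link_only_needed(dest, lib, internalize=False):
    """New approach: start from what the destination references but does
    not define, and link in only those symbols (plus, by assumption,
    whatever they reference in turn)."""
    linked, internal = dict(dest), set()
    worklist = [r for refs in dest.values() for r in refs if r not in dest]
    while worklist:
        sym = worklist.pop()
        if sym in linked or sym not in lib:
            continue
        linked[sym] = lib[sym]
        if internalize:
            internal.add(sym)
        worklist.extend(lib[sym])
    return linked, internal

# Hypothetical destination module and library.
dest = {"kernel": {"lib_sin"}}
lib = {
    "lib_sin": {"lib_helper"},
    "lib_helper": set(),
    "lib_cos": {"lib_helper"},
    "lib_exp": set(),
}

old_module, old_internal = link_all_then_prune(dest, lib)
new_module, new_internal = link_only_needed(dest, lib, internalize=True)
```

In this model both approaches end up with the same module and the same set of internalized symbols, but the second one never touches lib_cos or lib_exp, which is the point of linking only what is needed.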