Add device variables to llvm.compiler.used if they are
ODR-used by either host or device functions.
This is necessary to prevent them from being
eliminated by whole-program optimization,
where the compiler has no way to know that a device
variable is used by host code.
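
A minimal sketch of the case described above, assuming the CUDA runtime API. The variable name `scale_factor` is hypothetical and is not taken from the patch or its tests. The variable is referenced only from host code, so nothing on the device side keeps it alive:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical device variable: no kernel or device function references it.
__device__ int scale_factor;

int main() {
  int value = 42;

  // Host-side ODR-use: the runtime must find the symbol in the GPU binary.
  cudaError_t err = cudaMemcpyToSymbol(scale_factor, &value, sizeof(value));
  if (err != cudaSuccess) {
    std::fprintf(stderr, "copy to symbol failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // Read the value back through the same symbol.
  int readback = 0;
  err = cudaMemcpyFromSymbol(&readback, scale_factor, sizeof(readback));
  if (err != cudaSuccess) {
    std::fprintf(stderr, "copy from symbol failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  std::printf("scale_factor = %d\n", readback);
  return 0;
}
```

If device-side whole-program optimization removed `scale_factor` because no device code uses it, the host-side symbol lookup would presumably fail at run time. Listing the variable in llvm.compiler.used is meant to prevent that.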
Do we want to limit it further to only externally-visible variables?
I think we already externalize the variables we want to be visible across the host/device boundary.
If the variable is not visible, there's no point in keeping it around, as the runtime will not be able to find it in the GPU binary.