[libomptarget][nfc] Make function scope shared variables explicitly static
__shared__ in CUDA implies static in function scope. See e.g. D.2.1.1
in CUDA_C_Programming_Guide.pdf,
http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/
This is surprising for non-CUDA developers; see e.g. D73239, where I thought
local variables would be thread-local.
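A minimal sketch of the behaviour described above, not code from this patch.
The kernel name count_threads and the host helper count_threads_per_block are
hypothetical, and error checking is omitted for brevity:

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each block counts its own threads.
__global__ void count_threads(int *out) {
  // Same meaning as `static __shared__ int counter;`:
  // one copy per thread block, not one copy per thread.
  __shared__ int counter;
  if (threadIdx.x == 0)
    counter = 0;
  __syncthreads();
  atomicAdd(&counter, 1); // every thread in the block updates the same variable
  __syncthreads();
  if (threadIdx.x == 0)
    out[blockIdx.x] = counter;
}

// Hypothetical host helper: launches the kernel and copies the counts back.
std::vector<int> count_threads_per_block(int blocks, int threads) {
  int *dev = nullptr;
  cudaMalloc(&dev, blocks * sizeof(int));
  count_threads<<<blocks, threads>>>(dev);
  std::vector<int> host(blocks);
  // cudaMemcpy on the default stream waits for the kernel to finish.
  cudaMemcpy(host.data(), dev, blocks * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(dev);
  return host;
}
```

On a working device each entry should equal the number of threads per block.
If counter were an ordinary thread-local variable, each entry would be 1.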
Tested by IR diff of libomptarget.bc (no change), running in-tree tests,
and binary diff of the nvcc static archives (no significant change).