This header defines the macros _DEVICE_ARCH and _DEVICE_GPU with values. It
exists because compiler-defined macros are inconsistent in indicating whether
a compilation is a device pass or a host pass. They are also inconsistent in
how they specify the device architecture and type during a device pass. These
inconsistencies span OpenMP, CUDA, HIP, and OpenCL. The macro logic in this
header accounts for them and sets useful values for _DEVICE_ARCH and
_DEVICE_GPU during a device compilation. Neither macro is defined during a
host compilation pass, so users can test "#ifndef _DEVICE_ARCH" to detect a
host compilation. This header must remain preprocessor-only because it is
intended to be included from different languages.
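As a rough illustration of the "#ifndef _DEVICE_ARCH" check described above, a user could branch on it as sketched below. The function names compilation_pass and report_pass are hypothetical and not part of the header, and the include of the header itself is omitted because its final name is not settled. Built as plain host C, the sketch takes the host branch.

```c
#include <stdio.h>

/* The header described above would be included at this point. Its file name
   is not fixed yet, so this sketch leaves the include out (assumption). */

/* Hypothetical helper: returns a label for the current compilation pass. */
const char *compilation_pass(void) {
#ifndef _DEVICE_ARCH
  /* Host pass: _DEVICE_ARCH is not defined. */
  return "host";
#else
  /* Device pass: the header has defined _DEVICE_ARCH. */
  return "device";
#endif
}

/* Hypothetical helper: prints the label for the current pass. */
void report_pass(void) {
  printf("compiling for: %s\n", compilation_pass());
}
```

Because the check relies only on the preprocessor, the same pattern should carry over to any language that runs the C preprocessor, which matches the requirement that the header stay preprocessor-only.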
Originally authored by Greg Rodgers (@gregrodgers).
Following @MaskRay's observation, I think this header should be named __offload_macros.h to make it clear that it is an internal header.