GPUs do not have actual FP128 support, but we do need to be able to compile
host-side headers which use __float128. On the GPU side we'll downgrade __float128
to double, similarly to how we handle long double. On the device, both types
have a different in-memory representation from their host counterparts and are
not expected to be interchangeable across the host/device boundary.
See also https://reviews.llvm.org/D78513, which applied the equivalent change to
HIP/AMDGPU.
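
Since the device-side representation differs, data shared between host and
device should avoid __float128 and long double fields. A minimal sketch of one
way to do that is shown below; SharedParams, HostWide and make_shared_params
are hypothetical names used only for illustration and are not part of this
change.

```cpp
// Hypothetical struct shared between host and device code. It stores a
// plain double, whose layout is the same on both sides.
struct SharedParams {
  double scale;
};

#if defined(__SIZEOF_FLOAT128__)
using HostWide = __float128;   // host-only wide type, when the target has it
#else
using HostWide = long double;  // fallback so the sketch builds elsewhere
#endif

// Narrow explicitly on the host before the value crosses the boundary,
// rather than relying on the two sides agreeing on a wide layout.
inline SharedParams make_shared_params(HostWide scale) {
  return SharedParams{static_cast<double>(scale)};
}

static_assert(sizeof(SharedParams) == sizeof(double),
              "shared struct should contain only the double field");
```

The explicit narrowing makes the loss of precision visible at the call site
instead of hiding it in a layout mismatch.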