I didn't realize HIP was a distinct offloading kind, so the subtarget
was looking for -march, which isn't correct for HIP. We also have the
possibility of different denormal defaults in the case of multiple
offload targets, so we need to thread the JobAction through the target
hook.
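As a rough illustration of why the hook needs the JobAction, here is a minimal sketch. It is not the clang driver API: `JobActionSketch`, `flushesDenormalsByDefault`, and `denormalArgsFor` are hypothetical names, and both the arch-to-default rule and the flag spelling are assumptions made only for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the driver concepts discussed in this review.
// None of these are the real clang types or signatures.
enum class OffloadKind { None, Cuda, Hip };

struct JobActionSketch {
  OffloadKind Kind;
  std::string OffloadingArch; // one arch per action, never a list
};

// Placeholder rule, for illustration only: pretend a single arch flushes
// f32 denormals by default. The real subtarget table is not modeled here.
bool flushesDenormalsByDefault(const std::string &Arch) {
  return Arch == "gfx803";
}

// Sketch of a target hook that receives the action, so it can consult the
// arch of that specific offload job instead of a global -march value.
std::vector<std::string> denormalArgsFor(const JobActionSketch &JA) {
  std::vector<std::string> Args;
  if (JA.Kind == OffloadKind::None)
    return Args; // host compile: nothing to add in this sketch
  if (flushesDenormalsByDefault(JA.OffloadingArch)) {
    // Flag spelling is an assumption for this sketch.
    Args.push_back("-fdenormal-fp-math-f32=preserve-sign,preserve-sign");
  }
  return Args; // default case: no flag is emitted
}
```

Calling `denormalArgsFor` once per action gives each offload target its own default, which is the behavior a single global lookup cannot provide.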
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
286 | If there are multiple --cuda-gpu-arch options, the driver creates a separate JobAction for each arch, and each one launches its own clang -cc1 command. This function is called once per JobAction, and getOffloadingArch gives that single arch. There is therefore no issue with multiple --cuda-gpu-arch options, and this comment can be removed. | |
clang/test/Driver/cuda-flush-denormals-to-zero.cu | ||
---|---|---|
27 | This will result in multiple clang -cc1 commands, one per arch. You need to check each arch. | |
clang/test/Driver/cuda-flush-denormals-to-zero.cu | ||
---|---|---|
27 | Since the flag is not printed in the default case, a second per-arch check line would interfere with the -NOT check, because there is no CHECK-SAME-NOT. | |
CUDA currently doesn't change the default FTZ mode based on the subtarget, which, according to the documentation, differs from nvcc.