This patch detects the system GPUs and uses them
in --offload-arch if 'native' is specified. If no system GPU
can be detected, clang falls back to the default GPU arch.
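The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `resolveOffloadArchs` is a hypothetical helper, `detect` is a stand-in for whatever queries the system (the review mentions the amdgpu-arch tool), and the default arch is passed in as a parameter because the summary does not name it.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not the clang driver's code.
// `detect` stands in for a system query (e.g. running amdgpu-arch) and is
// assumed to return an empty vector when no GPU can be detected.
std::vector<std::string> resolveOffloadArchs(
    const std::vector<std::string> &requested,
    const std::function<std::vector<std::string>()> &detect,
    const std::string &defaultArch) {
  std::vector<std::string> resolved;
  for (const std::string &arch : requested) {
    if (arch != "native") {
      // An explicitly named arch is passed through unchanged.
      resolved.push_back(arch);
      continue;
    }
    // Probe the system only because the user asked for 'native'.
    std::vector<std::string> detected = detect();
    if (detected.empty())
      resolved.push_back(defaultArch); // detection failed: fall back
    else
      resolved.insert(resolved.end(), detected.begin(), detected.end());
  }
  return resolved;
}
```

In this sketch the probe runs only when 'native' appears in the request, so a command line that names its architectures explicitly produces the same output on every build machine.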
Details
- Reviewers: tra
- Commits: rGe8fd998e6194: [HIP] support --offload-arch=native
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
This patch detects the system GPUs and uses them in --offload-arch if none is specified. If no system GPU can be detected, clang falls back to gfx803.
I don't think auto-probing is something we want to do by default.
IMO, we should be following -march=native behavior here and check the available hardware variant only when the user requested it. Otherwise, the same compiler command line will produce different outputs, without the user requesting or even being aware of that.
I understand your point. However, when users use gcc or clang and do not specify a target or CPU, the compiled program will execute on their system. This is because gcc or clang has a default target and CPU that works for the system. This is not the case for HIP, since the default gfx803 will not work unless the system happens to have a gfx803. To give users a similar experience with the default target/CPU as gcc/clang, HIP needs to assume the system GPU as the default. Users could specify --offload-arch explicitly if they want to compile for particular GPUs.
This is not a new problem. I had the same problem with the choice of defaults for CUDA compilation and came to the conclusion that there's no such thing for GPUs and that the best we can do is to pick the most 'conservative' target. It may have worked for CUDA as PTX for an old GPU could in theory be JIT-compiled if the executable ran on a newer GPU, but the user is generally expected to specify the architectures they want to target. The default is, essentially, equally bad for everyone who happens to use it. In that sense gfx803 or any other architecture would do that job just fine.
The auto-detection assumes that the host where the build is done is also the target the executable is going to run on. This is not true in many cases.
I'd go as far as to say that, in absolute terms, most builds for GPUs out there are done on machines without GPUs.
Given the lack of a universally sensible default, I would prefer to require the user to explicitly specify the targets they want, with auto/native as an option.
On the other hand, OpenMP already appears to do some automagic detection of GPUs, so perhaps it is useful enough to do for HIP, too.
In any case, I think this is something that may need a wider forum. Ask on LLVM discourse?
clang/lib/Driver/Driver.cpp
- Line 3058: will do
- Line 3075: Yes. amdgpu-arch calls the ROCm runtime to get the GPU arch, which uses canonical gfx names: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/adae6c61e10d371f7cbc3d0e94ae2c070cab18a4/src/core/runtime/isa.cpp#L245
Nit: It's used only once. May as well compare with "native" directly.