This is an archive of the discontinued LLVM Phabricator instance.

[HIP] support --offload-arch=native
ClosedPublic

Authored by yaxunl on Nov 30 2022, 2:41 PM.

Details

Summary

This patch detects the system GPUs and uses them
for --offload-arch when 'native' is specified. If no system GPU
can be detected, clang falls back to the default GPU arch.
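
The behavior described above can be sketched roughly as follows. This is a standalone illustration, not the code from the patch: `resolveOffloadArchs` and its parameters are hypothetical names, and the `detected` list stands in for whatever the driver learns about the GPUs in the machine.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, NOT the implementation in clang/lib/Driver/Driver.cpp.
//   requested:   values given to --offload-arch on the command line
//   detected:    architectures reported for the GPUs in this machine
//                (empty if detection failed or no GPU is present)
//   defaultArch: architecture used when nothing else can be determined
std::vector<std::string>
resolveOffloadArchs(const std::vector<std::string> &requested,
                    const std::vector<std::string> &detected,
                    const std::string &defaultArch) {
  assert(!defaultArch.empty() && "a fallback architecture is required");

  std::vector<std::string> result;
  // Keep first-seen order and drop duplicates (e.g. two identical GPUs).
  auto addUnique = [&result](const std::string &arch) {
    if (std::find(result.begin(), result.end(), arch) == result.end())
      result.push_back(arch);
  };

  for (const std::string &arch : requested) {
    if (arch == "native") {
      // Probe results are consulted only when the user asked for "native".
      for (const std::string &gpu : detected)
        addUnique(gpu);
    } else {
      addUnique(arch);
    }
  }

  // Nothing requested, or "native" found no GPU: use the default.
  if (result.empty())
    result.push_back(defaultArch);
  return result;
}
```

In this sketch, a request of `{"native"}` with a detected list of `{"gfx906"}` yields `gfx906`, while an empty detected list yields the default that was passed in. Explicit architectures are passed through unchanged, so a command line without 'native' never depends on the build machine.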

Event Timeline

yaxunl created this revision. Nov 30 2022, 2:41 PM
Herald added a project: Restricted Project. Nov 30 2022, 2:41 PM
yaxunl requested review of this revision. Nov 30 2022, 2:41 PM
yaxunl edited the summary of this revision. Nov 30 2022, 2:42 PM
tra added a comment. Nov 30 2022, 4:18 PM

This patch detects system GPU and use them in --offload-arch if not specified. If system GPU cannot be detected clang will fall back to gfx803.

I don't think auto-probing is something we want to do by default.

IMO, we should be following -march=native behavior here and check the available hardware variant only when the user requested it. Otherwise, the same compiler command line will produce different outputs, without the user requesting or even being aware of that.

yaxunl added a comment.

This patch detects system GPU and use them in --offload-arch if not specified. If system GPU cannot be detected clang will fall back to gfx803.

I don't think auto-probing is something we want to do by default.

IMO, we should be following -march=native behavior here and check the available hardware variant only when the user requested it. Otherwise, the same compiler command line will produce different outputs, without the user requesting or even being aware of that.

I understand your point. However, when users use gcc or clang and do not specify a target or CPU, the compiled program will execute on their system. This is because gcc or clang has a default target and CPU that works for the system. This is not the case for HIP, since the default gfx803 will not work for the system unless the system happens to have a gfx803. To let users have a similar experience with the default target/CPU as with gcc/clang, HIP needs to assume the system GPU as the default. Users could specify --offload-arch explicitly if they want to compile for certain GPUs.

yaxunl updated this revision to Diff 479365. Dec 1 2022, 11:09 AM

fix error handling

tra added a comment. Dec 1 2022, 1:49 PM

I understand your point. However, when users use gcc or clang and do not specify target or CPU, the compiled program will execute on their system.
This is because gcc or clang has a default target and CPU that works for the system.
This is not the case for HIP since the default gfx803 will not work for the system unless the system happens to have a gfx803.
To let the users have a similar experience with the default target/CPU as gcc/clang, HIP needs to assume the system GPU as the default. Users could specify --offload-arch explicitly if they want to compile for certain GPUs.

This is not a new problem. I had the same problem with the choice of defaults for CUDA compilation and came to the conclusion that there is no such thing as a sensible default for GPUs, and that the best we can do is to pick the most 'conservative' target. That may have worked for CUDA, since PTX for an old GPU could in theory be JIT-compiled if the executable ran on a newer GPU, but the user is generally expected to specify the architectures they want to target. The default is, essentially, equally bad for everyone who happens to use it. In that sense gfx803 or any other architecture would do that job just fine.

The auto-detection assumes that the host where the build is done is also the target the executable is going to run on. This is not true in many cases.
I'd go as far as to say that, in absolute terms, most builds for GPUs out there are done on machines without GPUs.
Given the lack of a universally sensible default, I would prefer to require the user to explicitly specify the targets they want, with auto/native as an option.

On the other hand, OpenMP already appears to do some automagic detection of GPUs, so perhaps it is useful enough to do for HIP, too.

In any case, I think this is something that may need a wider forum. Ask on LLVM discourse?

yaxunl added a comment. Dec 1 2022, 3:18 PM

In any case, I think this is something that may need a wider forum. Ask on LLVM discourse?

RFC opened on Discourse: https://discourse.llvm.org/t/rfc-let-clang-use-system-gpu-as-default-offload-arch-for-hip/66950

yaxunl updated this revision to Diff 482234. Dec 12 2022, 12:54 PM
yaxunl retitled this revision from [HIP] use detected GPU in --offload-arch to [HIP] support --offload-arch=native.
yaxunl edited the summary of this revision.

use detected GPU when --offload-arch=native is specified based on RFC discussion

yaxunl updated this revision to Diff 482240. Dec 12 2022, 1:04 PM

fix error handling

tra accepted this revision. Dec 12 2022, 2:37 PM
tra added inline comments.
clang/lib/Driver/Driver.cpp
Line 3058

Nit: It's used only once. May as well compare with "native" directly.

Line 3075

Does it guarantee that a returned list of architectures contains only canonical names?

This revision is now accepted and ready to land. Dec 12 2022, 2:37 PM
yaxunl marked 2 inline comments as done. Dec 12 2022, 6:25 PM
yaxunl added inline comments.
clang/lib/Driver/Driver.cpp
Line 3058

will do

Line 3075

Yes. amdgpu-arch calls the ROCm runtime to get the GPU arch, and the runtime uses canonical gfx names: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/adae6c61e10d371f7cbc3d0e94ae2c070cab18a4/src/core/runtime/isa.cpp#L245
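
To make the point about the returned list concrete, here is a minimal sketch of how a caller might consume such output defensively. It assumes the detection tool prints one architecture name per line, which is an assumption for illustration and not something this review states. `parseDetectedArchs` is a hypothetical name, and the "starts with gfx" check is a loose sanity filter, not a real canonicalization step.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Illustrative only, NOT clang's parser. Assumes one architecture name per
// line of tool output, e.g. "gfx906\ngfx906\n" for two identical GPUs.
std::vector<std::string> parseDetectedArchs(const std::string &toolOutput) {
  std::vector<std::string> archs;
  std::istringstream in(toolOutput);
  std::string line;
  const char *whitespace = " \t\r\n";

  while (std::getline(in, line)) {
    // Trim surrounding whitespace, including a trailing '\r'.
    const auto first = line.find_first_not_of(whitespace);
    if (first == std::string::npos)
      continue; // blank line
    const auto last = line.find_last_not_of(whitespace);
    const std::string name = line.substr(first, last - first + 1);

    // Loose sanity check (assumption): skip anything that does not look
    // like a gfx name, such as a diagnostic message.
    if (name.rfind("gfx", 0) != 0)
      continue;

    // Report each architecture once, in first-seen order.
    if (std::find(archs.begin(), archs.end(), name) == archs.end())
      archs.push_back(name);
  }
  return archs;
}
```

An empty result from this sketch corresponds to the "no GPU detected" case, where the caller would fall back to the default arch.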

This revision was landed with ongoing or failed builds. Dec 13 2022, 7:10 AM
This revision was automatically updated to reflect the committed changes.
yaxunl marked 2 inline comments as done.
Herald added a project: Restricted Project. Dec 13 2022, 7:10 AM