Details

Reviewers

Commits

rGe8fd998e6194: [HIP] support --offload-arch=native

Summary

This patch detects the system GPUs and uses them
for --offload-arch if 'native' is specified. If the system GPUs
cannot be detected, clang falls back to the default GPU arch.
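
The behavior described in the summary can be sketched roughly as follows. This is a minimal illustration and not the code in clang/lib/Driver/Driver.cpp: `resolveOffloadArchs`, `kDefaultArch`, and the `detected` parameter are hypothetical names, and gfx803 is used as the fallback only because the discussion in this review names it as the default HIP arch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption: gfx803 is the default HIP arch, as stated in the review thread.
const std::string kDefaultArch = "gfx803";

// Expand the values given to --offload-arch. `detected` stands in for the
// result of the system GPU query and is empty when detection failed.
std::vector<std::string> resolveOffloadArchs(
    const std::vector<std::string> &requested,
    const std::vector<std::string> &detected) {
  std::vector<std::string> result;
  for (const std::string &arch : requested) {
    assert(!arch.empty() && "offload arch names must be non-empty");
    if (arch != "native") {
      // Explicit arch: pass through without consulting the host.
      result.push_back(arch);
    } else if (!detected.empty()) {
      // 'native' with a successful query: use every detected GPU arch.
      result.insert(result.end(), detected.begin(), detected.end());
    } else {
      // 'native' but nothing detected: fall back to the default arch.
      result.push_back(kDefaultArch);
    }
  }
  return result;
}
```

In this sketch an explicit value such as gfx90a never consults the host, so the same command line gives the same result on every machine; only 'native' depends on the build host.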

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

yaxunl created this revision. Nov 30 2022, 2:41 PM

Herald added a project: Restricted Project. Nov 30 2022, 2:41 PM

Herald added subscribers: kosarev, kerbowa, jvesely.

yaxunl requested review of this revision. Nov 30 2022, 2:41 PM

Herald added a subscriber: MaskRay. Nov 30 2022, 2:41 PM

yaxunl edited the summary of this revision. Nov 30 2022, 2:42 PM

Harbormaster completed remote builds in B200361: Diff 479071. Nov 30 2022, 3:19 PM

This patch detects the system GPUs and uses them for --offload-arch if it is not specified. If the system GPUs cannot be detected, clang will fall back to gfx803.

I don't think auto-probing is something we want to do by default.

IMO, we should be following -march=native behavior here and check the available hardware variant only when the user requested it. Otherwise, the same compiler command line will produce different outputs, without the user requesting or even being aware of that.

In D139045#3961931, @tra wrote:

This patch detects the system GPUs and uses them for --offload-arch if it is not specified. If the system GPUs cannot be detected, clang will fall back to gfx803.

I don't think auto-probing is something we want to do by default.

IMO, we should be following -march=native behavior here and check the available hardware variant only when the user requested it. Otherwise, the same compiler command line will produce different outputs, without the user requesting or even being aware of that.

I understand your point. However, when users use gcc or clang and do not specify a target or CPU, the compiled program will execute on their system. This is because gcc or clang has a default target and CPU that works for the system. This is not the case for HIP, since the default gfx803 will not work unless the system happens to have a gfx803. To give users a similar experience with the default target/CPU as gcc/clang, HIP needs to assume the system GPU as the default. Users could specify --offload-arch explicitly if they want to compile for specific GPUs.

fix error handling

In D139045#3962354, @yaxunl wrote:

I understand your point. However, when users use gcc or clang and do not specify target or CPU, the compiled program will execute on their system.
This is because gcc or clang has a default target and CPU that works for the system.
This is not the case for HIP since the default gfx803 will not work for the system unless the system happens to have a gfx803.
To give users a similar experience with the default target/CPU as gcc/clang, HIP needs to assume the system GPU as the default. Users could specify --offload-arch explicitly if they want to compile for specific GPUs.

This is not a new problem. I had the same problem with the choice of defaults for CUDA compilation and came to the conclusion that there's no such thing as a default that works for GPUs, and that the best we can do is to pick the most 'conservative' target. It may have worked for CUDA, as PTX for an old GPU could in theory be JIT-compiled if the executable ran on a newer GPU, but the user is generally expected to specify the architectures they want to target. The default is, essentially, equally bad for everyone who happens to use it. In that sense gfx803 or any other architecture would do that job just fine.

The auto-detection assumes that the host where the build is done is also the target the executable is going to run on. This is not true in many cases.
I'd go as far as to say that, in absolute terms, most builds for GPUs out there are done on machines without GPUs.
Given the lack of a universally sensible default, I would prefer to require the user to explicitly specify the targets they want, with auto/native as an option.

On the other hand, OpenMP already appears to do some automagic detection of GPUs, so perhaps it is useful enough to do for HIP, too.

In any case, I think this is something that may need a wider forum. Ask on LLVM discourse?

Harbormaster completed remote builds in B200578: Diff 479365. Dec 1 2022, 2:22 PM

In D139045#3964875, @tra wrote:

In any case, I think this is something that may need a wider forum. Ask on LLVM discourse?

RFC opened on LLVM Discourse: https://discourse.llvm.org/t/rfc-let-clang-use-system-gpu-as-default-offload-arch-for-hip/66950

use detected GPU when --offload-arch=native is specified based on RFC discussion

fix error handling

Harbormaster completed remote builds in B202669: Diff 482240. Dec 12 2022, 1:42 PM

tra accepted this revision. Dec 12 2022, 2:37 PM

tra added inline comments.

clang/lib/Driver/Driver.cpp
3058	Nit: It's used only once. May as well compare with "native" directly.
3075	Does it guarantee that a returned list of architectures contains only canonical names?

This revision is now accepted and ready to land. Dec 12 2022, 2:37 PM

yaxunl marked 2 inline comments as done. Dec 12 2022, 6:25 PM

yaxunl added inline comments.

clang/lib/Driver/Driver.cpp
3058	Will do.
3075	Yes. amdgpu-arch calls the ROCm runtime to get the GPU arch, and the runtime uses canonical gfx names: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/adae6c61e10d371f7cbc3d0e94ae2c070cab18a4/src/core/runtime/isa.cpp#L245
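
As a rough illustration of the concern behind the question at line 3075, a caller that did not want to rely on that guarantee could filter the tool's output before using it. The sketch below is an assumption-laden example, not clang's validation logic: `looksLikeGfxName` and `parseDetectedArchs` are hypothetical helpers, the "gfx plus three or four lowercase hex digits" shape is only a loose approximation of names such as gfx906, gfx90a, and gfx1030, and the one-name-per-line output format is assumed.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Loose shape check (assumption): "gfx" followed by 3 or 4 characters,
// each a decimal digit or a lowercase letter a-f.
bool looksLikeGfxName(const std::string &name) {
  if (name.size() < 6 || name.size() > 7 || name.compare(0, 3, "gfx") != 0)
    return false;
  return std::all_of(name.begin() + 3, name.end(), [](unsigned char c) {
    return std::isdigit(c) || (c >= 'a' && c <= 'f');
  });
}

// Split newline-separated output (assumed format) into unique names,
// keeping first-seen order and dropping lines that fail the shape check.
std::vector<std::string> parseDetectedArchs(const std::string &output) {
  std::vector<std::string> archs;
  std::istringstream in(output);
  std::string line;
  while (std::getline(in, line)) {
    if (!looksLikeGfxName(line))
      continue;
    if (std::find(archs.begin(), archs.end(), line) == archs.end())
      archs.push_back(line);
  }
  return archs;
}
```

Deduplicating matters in this sketch because a machine with several identical GPUs would otherwise yield the same arch more than once.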

This revision was landed with ongoing or failed builds. Dec 13 2022, 7:10 AM

Closed by commit rGe8fd998e6194: [HIP] support --offload-arch=native (authored by yaxunl).

This revision was automatically updated to reflect the committed changes.

yaxunl marked 2 inline comments as done.

yaxunl added a commit: rGe8fd998e6194: [HIP] support --offload-arch=native.

Herald added a project: Restricted Project. Dec 13 2022, 7:10 AM

This is an archive of the discontinued LLVM Phabricator instance.

[HIP] support --offload-arch=native
Closed, Public

Diff 482463

clang/lib/Driver/Driver.cpp

clang/lib/Driver/ToolChains/AMDGPU.h
