This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][HIP] Add support for `--offload-arch=native` to CUDA and refactor
ClosedPublic

Authored by jhuber6 on Jan 5 2023, 6:35 AM.

Details

Summary

This patch adds basic support for --offload-arch=native to CUDA. This
is done using the nvptx-arch tool that was introduced previously. Some
of the logic for handling executing these tools was factored into a
common helper as well. This patch does not add support for OpenMP or the
"new" driver. That will be done later.

Diff Detail

Event Timeline

jhuber6 created this revision.Jan 5 2023, 6:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2023, 6:35 AM
jhuber6 requested review of this revision.Jan 5 2023, 6:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2023, 6:35 AM
jhuber6 retitled this revision from [CUDA]HIP] Add support for `--offload-arch=native` to CUDA and refactor to [CUDA][HIP] Add support for `--offload-arch=native` to CUDA and refactor.Jan 5 2023, 7:17 AM
jhuber6 updated this revision to Diff 486739.Jan 5 2023, 7:41 PM

Change error to print canonical arch name from Triple.

yaxunl added inline comments.Jan 9 2023, 10:40 AM
clang/test/Driver/amdgpu-hip-system-arch.c
25

comment incorrect?

jhuber6 added inline comments.Jan 9 2023, 10:40 AM
clang/test/Driver/amdgpu-hip-system-arch.c
25

Yes, thanks for catching that. I'll fix it.

jhuber6 updated this revision to Diff 487503.Jan 9 2023, 11:04 AM

Fix typo.

yaxunl accepted this revision.Jan 10 2023, 10:32 AM

LGTM. Thanks.

This revision is now accepted and ready to land.Jan 10 2023, 10:32 AM
This revision was landed with ongoing or failed builds.Jan 11 2023, 8:30 AM
This revision was automatically updated to reflect the committed changes.
srj added a subscriber: srj.EditedJan 12 2023, 12:34 PM

For reasons that aren't yet clear to me, this change is failing to compile when using gcc-7 and targeting 32-bit targets; the error is of the form

AMDGPU.cpp:773:10: error: could not convert ‘GPUArchs’ from ‘llvm::SmallVector<std::__cxx11::basic_string<char>, 1>’ to ‘llvm::Expected<llvm::SmallVector<std::__cxx11::basic_string<char> > >’
   return GPUArchs;

it's not (yet) clear to me whether this is specific to gcc-7 (which I realize is fairly old -- is it still supported?) or what -- investigating further.

EDIT: I am (so far) unable to repro the failure using the gcc-12 installed on my local Linux machine, so it may be specific to that compiler version -- I am going to try to install gcc-7 on my local machine to see if there's a reasonable workaround.

For reasons that aren't yet clear to me, this change is failing to compile when using gcc-7 and targeting 32-bit targets; the error is of the form

AMDGPU.cpp:773:10: error: could not convert ‘GPUArchs’ from ‘llvm::SmallVector<std::__cxx11::basic_string<char>, 1>’ to ‘llvm::Expected<llvm::SmallVector<std::__cxx11::basic_string<char> > >’
   return GPUArchs;

Probably the older GCC doesn't handle the implicit copy elision to the expected type well and thinks that it's copied. I'll put an explicit move on it.

For reasons that aren't yet clear to me, this change is failing to compile when using gcc-7 and targeting 32-bit targets; the error is of the form

AMDGPU.cpp:773:10: error: could not convert ‘GPUArchs’ from ‘llvm::SmallVector<std::__cxx11::basic_string<char>, 1>’ to ‘llvm::Expected<llvm::SmallVector<std::__cxx11::basic_string<char> > >’
   return GPUArchs;

it's not (yet) clear to me whether this is specific to gcc-7 (which I realize is fairly old -- is it still supported?) or what -- investigating further.

Let me know if rG26d62674cf50 solved it. I can't reproduce it on my system.