This patch adds basic support for --offload-arch=native to CUDA, using
the nvptx-arch tool that was introduced previously. Some of the logic
for executing these tools was also factored into a common helper. This
patch does not add support for OpenMP or the "new" driver. That will be
done later.
Repository: rG LLVM Github Monorepo
Event Timeline
File | Line | Comment |
---|---|---|
clang/test/Driver/amdgpu-hip-system-arch.c | 25 | comment incorrect? |
clang/test/Driver/amdgpu-hip-system-arch.c | 25 | Yes, thanks for catching that. I'll fix it. |
For reasons that aren't yet clear to me, this change fails to compile with gcc-7 when building for 32-bit targets; the error is of the form:
AMDGPU.cpp:773:10: error: could not convert ‘GPUArchs’ from ‘llvm::SmallVector<std::__cxx11::basic_string<char>, 1>’ to ‘llvm::Expected<llvm::SmallVector<std::__cxx11::basic_string<char> > >’
    return GPUArchs;
It's not yet clear to me whether this is specific to gcc-7 (which I realize is fairly old; is it still supported?) or something else. Investigating further.
EDIT: So far I am unable to repro the failure with the gcc-12 installed on my local Linux machine, so it may be specific to gcc-7. I am going to try to install gcc-7 locally to see if there's a reasonable workaround.
Probably the older GCC doesn't apply the implicit move when the returned local is converted to the Expected return type, and tries to copy it instead. I'll put an explicit std::move on it.
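For illustration, here is a minimal sketch of that workaround that builds without LLVM. The types `SmallVec` and `Result` and the function `detectArchs` are hypothetical stand-ins, not the real `llvm::SmallVector`, `llvm::Expected`, or the driver code, and `"sm_70"` is just a placeholder value. The sketch assumes the relevant property is that the differently sized vector converts only from an rvalue, so the return statement needs the local to be treated as one.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vector with inline capacity N. Instances with
// a different N are assumed to convert only from rvalues (move, never copy).
template <unsigned N> struct SmallVec {
  std::vector<std::string> Items;
  SmallVec() = default;
  SmallVec(const SmallVec &) = default;
  SmallVec(SmallVec &&) = default;
  // Converting constructor: accepts only an rvalue of a different size.
  template <unsigned M, typename = typename std::enable_if<M != N>::type>
  SmallVec(SmallVec<M> &&Other) : Items(std::move(Other.Items)) {}
};

// Hypothetical stand-in for a value-or-error wrapper whose constructor
// forwards any argument that is convertible to T.
template <typename T> struct Result {
  T Value;
  template <typename U,
            typename = typename std::enable_if<
                std::is_convertible<U, T>::value>::type>
  Result(U &&V) : Value(std::forward<U>(V)) {}
};

Result<SmallVec<4>> detectArchs() {
  SmallVec<1> GPUArchs;
  GPUArchs.Items.push_back("sm_70"); // placeholder architecture name
  // A plain `return GPUArchs;` depends on the compiler treating the local
  // as an rvalue during the conversion to the return type. The explicit
  // move does not depend on that behavior.
  return std::move(GPUArchs);
}
```

Calling `detectArchs()` yields a `Result` whose `Value.Items` holds the single placeholder entry. The explicit move is harmless on compilers that already perform the implicit move here, because the return type differs from the local's type and no copy elision is lost.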