- In HIP, device-only relocatable code compilation, like regular device-only compilation, should not involve an offload bundle.
- In addition, device-only relocatable code compilation should go through the same three steps (preprocess, compile, and backend) as regular code generation with -emit-llvm.
Details
- Reviewers: yaxunl, tra
- Commits: rG8b6821a5843b: [hip] Fix device-only relocatable code compilation.
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
LGTM in general, but I'll let Sam stamp it.
| File | Line(s) | Inline comment |
|---|---|---|
| clang/lib/Driver/Driver.cpp | 4594–4607 | This is rather hard to understand. |
| clang/test/Driver/hip-device-only.hip | 6 | We appear to test -fgpu-rdc only and the test case seems to be fairly specific to the way that option is handled by HIP. |
This doesn't pass tests: http://45.33.8.238/linux/19977/step_7.txt
Please take a look, and please revert for now if fixing takes a while.
Thanks. I am building with PowerPC enabled to reproduce this issue. There should be a minor fix on that test case.
This is rather hard to understand.
Would it be simpler to specify when we shouldn't add .tmp?
At the very least I'd extract the newly added clause into a temporary variable and would add some comments explaining why -fgpu-rdc gets special treatment.
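To make the suggestion concrete, here is a minimal sketch of what "extract the clause into a temporary variable and comment it" could look like. It is not the code from Driver.cpp: `OutputRequest`, `keepsFinalName`, `outputName`, and the exact conditions tested are hypothetical stand-ins chosen only to show the shape of the refactoring, with the condition phrased as "when we shouldn't add .tmp".

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the facts a driver might consult.
// None of these names are Clang types or fields.
struct OutputRequest {
  bool isOffloadDeviceAction;  // output belongs to a device-side action
  bool isHIP;                  // offload kind is HIP
  bool deviceOnly;             // device-only compilation was requested
  bool relocatableDeviceCode;  // -fgpu-rdc is in effect
};

// Answers the question "when should we NOT add .tmp?".
bool keepsFinalName(const OutputRequest &R) {
  // Assumed rationale: a device-only HIP compile with -fgpu-rdc is not
  // followed by a bundling step, so its output is already the final file
  // and gets no special temporary name.
  const bool isHIPDeviceOnlyRDC =
      R.isHIP && R.deviceOnly && R.relocatableDeviceCode;

  // Host outputs are always final; device outputs are final only in the
  // special case named above.
  return !R.isOffloadDeviceAction || isHIPDeviceOnlyRDC;
}

// Appends ".tmp" only when the output is an intermediate file.
std::string outputName(const std::string &Base, const OutputRequest &R) {
  return keepsFinalName(R) ? Base : Base + ".tmp";
}
```

Naming the special case (`isHIPDeviceOnlyRDC` here) gives the explanatory comment a natural place to live and keeps the final condition short enough to read at a glance.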