When the -gsplit-dwarf option is used with the clang driver, the driver creates
a file name with the .dwo extension based on the input file name and passes
it to clang -cc1. This file is used for storing the debug info. Since
CUDA/HIP generate separate object files for different GPU archs,
this file should be different for each GPU arch. This patch
adds '_' and the GPU arch to the stem of the dwo file.
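A minimal sketch of the naming rule described above, not the actual driver code. The helper name `splitDwarfName` is hypothetical, and `std::filesystem` stands in for the LLVM path utilities the driver uses. An empty arch string is assumed to mean the host compilation, which keeps the original name.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: derive the .dwo file name from the input file name.
// For device compilations, '_' and the GPU arch are appended to the stem.
// For the host compilation (empty gpuArch), the stem is left unchanged.
std::string splitDwarfName(const std::string &input,
                           const std::string &gpuArch) {
  // Strip the directory and the last extension, e.g. "dir/test.hip" -> "test".
  std::string stem = std::filesystem::path(input).stem().string();
  if (!gpuArch.empty())
    stem += "_" + gpuArch;
  return stem + ".dwo";
}
```

Under these assumptions, `splitDwarfName("test.hip", "gfx906")` yields `test_gfx906.dwo`, while `splitDwarfName("test.hip", "")` yields `test.dwo`.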
Details
- Reviewers
- tra, MaskRay
- Commits
- rGe50465ecefc9: [HIP] Fix -gsplit-dwarf option
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Is this naming scheme the same as the one used for .o files? We may want to keep them in sync.
Other than that, LGTM.
clang/lib/Driver/ToolChains/CommonArgs.cpp | ||
---|---|---|
909 | I think the same approach would make sense for CUDA, too. |
The .o file is a different story.
For -fno-gpu-rdc, the .o files for device compilation are temporary files which are deleted after the device ISA is generated and embedded in the host .o file. There is only one output .o file, which is the host object file.
For -fgpu-rdc, the .o files for device compilation are also temporary files, which are bundled by clang-offload-bundler. There is only one output .o file, which is a bundle.
Therefore in either case there is no need to rename the intermediate .o files since they are temporary files which have unique names.
The .dwo files are not temporary files. They are supposed to be shipped with .o files for debugging info.
Since .dwo files are not temporary files, it is not necessary to follow the -save-temps name convention. For the host object, we keep the original .dwo file name. For the device object, we add '_' and GPU arch to the stem, which is sufficient and concise.
clang/lib/Driver/ToolChains/CommonArgs.cpp | ||
---|---|---|
909 | Will include OFK_CUDA. |
Ack.
BTW, is split-dwarf useful for AMD GPUs on the device side? I don't think we can currently utilize DWO files on the device side with CUDA at all. Come to think of it, it's probably going to break GPU-side debugging, as CUDA can only deal with DWARF info embedded in the GPU binary.
If it does not work for AMD GPUs, perhaps we should just disable it for GPUs.
Since .dwo files are not temporary files, it is not necessary to follow the -save-temps name convention. For the host object, we keep the original .dwo file name. For the device object, we add '_' and GPU arch to the stem, which is sufficient and concise.
What will happen with -save-temps? Will the dwo files match the object file names?
It is requested by our debugger team, so it should work with amdgpu.
With -save-temps, the saved temporary files will be named like test-hip-amdgcn-amd-amdhsa-gfx906.o. The dwo file will be named like test_gfx906.dwo.
Is the naming scheme for GPU-side DWO files dictated by the debugger? If that's the case, it may be worth adding a comment about that.
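To make the difference between the two naming schemes concrete, here is a small sketch. Both helper names are hypothetical, and the -save-temps pattern (stem, offload kind, target triple, and GPU arch joined by '-') is inferred only from the single example file name above, so treat it as an assumption rather than the driver's definition.

```cpp
#include <string>

// Hypothetical: name of a device object kept by -save-temps, assuming the
// pattern <stem>-<offload kind>-<triple>-<arch>.o seen in the example.
std::string saveTempsDeviceObjectName(const std::string &stem,
                                      const std::string &offloadKind,
                                      const std::string &triple,
                                      const std::string &gpuArch) {
  return stem + "-" + offloadKind + "-" + triple + "-" + gpuArch + ".o";
}

// Hypothetical: name of the device-side .dwo file, which only appends
// '_' and the GPU arch to the stem.
std::string deviceDwoName(const std::string &stem,
                          const std::string &gpuArch) {
  return stem + "_" + gpuArch + ".dwo";
}
```

With stem `test` and arch `gfx906`, the first helper produces `test-hip-amdgcn-amd-amdhsa-gfx906.o` and the second produces `test_gfx906.dwo`, so the two names share only the stem and the GPU arch.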
LGTM.
(Note that ideally -gsplit-dwarf should not imply -g2, but it currently does so. Clang and GCC have not agreed on whether we should add a new flag like -fsplit-dwarf. /For -gsplit-dwarf builds, it is best to ensure -g is also specified/.)
clang/lib/Driver/ToolChains/CommonArgs.cpp | ||
---|---|---|
920–921 | Does stem change the semantics? |
It is the preferred naming by the debugger. Will add a comment.
clang/lib/Driver/ToolChains/CommonArgs.cpp | ||
---|---|---|
920–921 | No. replace_extension will take stem first and then add new extension. |