- Do not emit following assembler directives:
- .hsa_code_object_version
- .hsa_code_object_isa
- .amd_amdgpu_isa
- .amd_amdgpu_hsa_metadata
- .amd_amdgpu_pal_metadata
- Do not emit .note entries
- Cleanup and bring in sync kernel descriptor header file
- Emit kernel descriptor into .rodata with appropriate relocations and alignments
Details
Diff Detail
Event Timeline
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp | ||
---|---|---|
403–409 | Isn't the REL64 here redundant? The way I see it, we should either have an absolute reference to (start of kernel code) - (start of kernel descriptor), or a relative reference to (start of kernel code) - offsetof(kernel_descriptor_t, kernel_code_entry_byte_offset). I'm not deep enough in MC to say for certain which of these is really preferable. |
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp | ||
---|---|---|
403–409 | The field requires an offset to the entry point so an absolute relocation cannot be used. The Rel64 relocation record is what describes what is needed. Since the kernel descriptor and entry point are now in different sections, a static relocation record is needed so that it can be fixed up when the relocatable code object is linked to make a shared object. Previously the kernel descriptor was put in the same section as the code, and the offset was "hard-wired" when it was generated. |
include/llvm/Support/AMDHSAKernelDescriptor.h | ||
---|---|---|
13 | Suggest just linking to "https://llvm.org/docs/AMDGPUUsage.html#kernel-descriptor" as this may support other targets in the future. | |
44 | Should this be: ''' | |
51 | What does "Must be kept backwards compatible." mean? Arn't these just the meaning of the values? Or is the issue that they may change value on different targets in the future? | |
78 | uint32_t to match `uint32_t compute_pgm_rsrc1;`? | |
100 | uint32_t to match `uint32_t compute_pgm_rsrc2;`? | |
126 | Should this be uint16_t since the kernel descriptor field is `uint16_t kernel_code_properties;`? |
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp | ||
---|---|---|
403–409 | Right, I agree that a REL64 relocation is ultimately needed. My point is more about how to express that fact inside LLVM using the MCExpr framework. I read the expressing that is being created here as literally (start of kernel code) - (start of kernel descriptor). The fact that it's a relative relocation really ought to be implied by that already, it's not clear to me why VK_AMDGPU_REL64 is passed to one of the constructors in addition to that. |
docs/AMDGPUUsage.rst | ||
---|---|---|
689 | Yes, it seems reasonable to me. |
Address review feedback.
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp | ||
---|---|---|
403–409 | I put a fixme comment for now. |
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp | ||
---|---|---|
403–409 | I think the definition of VK_AMDGPU_REL64 got dropped in the most recent patch? |
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp | ||
---|---|---|
403–409 |
LGTM
include/llvm/Support/AMDHSAKernelDescriptor.h | ||
---|---|---|
51 | Suggest: // Floating point rounding modes. Must match hardware definition. |
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp | ||
---|---|---|
381 | Should this switch be happening here? Shouldn't the assembly writer be able to put this descriptor in any section? |
@b-sumner does this seem to be a reasonable extension to use? Using @ conflicts with its use for symbol versioning need to change to something else.