Page MenuHomePhabricator

AMDHSA: Code object v3 updates
ClosedPublic

Authored by kzhuravl on May 30 2018, 5:06 PM.

Details

Summary
  • Do not emit following assembler directives:
    • .hsa_code_object_version
    • .hsa_code_object_isa
    • .amd_amdgpu_isa
    • .amd_amdgpu_hsa_metadata
    • .amd_amdgpu_pal_metadata
  • Do not emit .note entries
  • Cleanup and bring in sync kernel descriptor header file
  • Emit kernel descriptor into .rodata with appropriate relocations and alignments

Diff Detail

Event Timeline

kzhuravl created this revision.May 30 2018, 5:06 PM

This change also requires a change in lld, which I plan to post tomorrow.

kzhuravl updated this revision to Diff 149339.May 31 2018, 12:30 PM
kzhuravl retitled this revision from AMDGPU: Code object v3 updates to AMDHSA: Code object v3 updates.
  • This only applies to AMDHSA for now
  • Update tests
nhaehnle added inline comments.Jun 4 2018, 7:17 AM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409

Isn't the REL64 here redundant? The way I see it, we should either have an absolute reference to (start of kernel code) - (start of kernel descriptor), or a relative reference to (start of kernel code) - offsetof(kernel_descriptor_t, kernel_code_entry_byte_offset).

I'm not deep enough in MC to say for certain which of these is really preferable.

kzhuravl updated this revision to Diff 149817.Jun 4 2018, 11:40 AM

Fix symbol bindings and rebase.

t-tye added inline comments.Jun 4 2018, 11:46 AM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409

The field requires an offset to the entry point so an absolute relocation cannot be used. The Rel64 relocation record is what describes what is needed. Since the kernel descriptor and entry point are now in different sections, a static relocation record is needed so that it can be fixed up when the relocatable code object is linked to make a shared object.

Previously the kernel descriptor was put in the same section as the code, and the offset was "hard-wired" when it was generated.

t-tye added inline comments.Jun 4 2018, 12:20 PM
include/llvm/Support/AMDHSAKernelDescriptor.h
13

Suggest just linking to "https://llvm.org/docs/AMDGPUUsage.html#kernel-descriptor" as this may support other targets in the future.

44

Should this be:

'''
DST &= ~MSK;
'''

51

What does "Must be kept backwards compatible." mean? Arn't these just the meaning of the values? Or is the issue that they may change value on different targets in the future?

78

uint32_t to match `uint32_t compute_pgm_rsrc1;`?

100

uint32_t to match `uint32_t compute_pgm_rsrc2;`?

126

Should this be uint16_t since the kernel descriptor field is `uint16_t kernel_code_properties;`?

nhaehnle added inline comments.Jun 4 2018, 1:26 PM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409

Right, I agree that a REL64 relocation is ultimately needed. My point is more about how to express that fact inside LLVM using the MCExpr framework.

I read the expressing that is being created here as literally (start of kernel code) - (start of kernel descriptor). The fact that it's a relative relocation really ought to be implied by that already, it's not clear to me why VK_AMDGPU_REL64 is passed to one of the constructors in addition to that.

kzhuravl added inline comments.Jun 4 2018, 3:24 PM
include/llvm/Support/AMDHSAKernelDescriptor.h
51

May change value in future.

78

I get a compiler warning if enum is from uint32_t.

lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409

If we just put (start of kernel code) - (start of kernel descriptor), it ends up being R_AMDGPU_ABS64.

t-tye added inline comments.Jun 4 2018, 10:52 PM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409

I agree with @nhaehnle that this seems strange it is needed. Is that a limitation or bug in MCExpr handling? Perhaps it is being used in a way not seen before and so it mishandles this case, and ought to be fixed?

t-tye added a subscriber: b-sumner.Jun 4 2018, 10:55 PM
t-tye added inline comments.
docs/AMDGPUUsage.rst
689

@b-sumner does this seem to be a reasonable extension to use? Using @ conflicts with its use for symbol versioning need to change to something else.

b-sumner added inline comments.Jun 5 2018, 5:46 AM
docs/AMDGPUUsage.rst
689

Yes, it seems reasonable to me.

kzhuravl updated this revision to Diff 150060.Jun 5 2018, 5:42 PM
kzhuravl marked 2 inline comments as done.

Address review feedback.

lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409

I put a fixme comment for now.

scott.linder added inline comments.Jun 7 2018, 12:31 PM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409

I think the definition of VK_AMDGPU_REL64 got dropped in the most recent patch?

kzhuravl added inline comments.Jun 7 2018, 12:33 PM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
403–409
t-tye accepted this revision.Jun 7 2018, 2:51 PM

LGTM

include/llvm/Support/AMDHSAKernelDescriptor.h
51

Suggest:

// Floating point rounding modes. Must match hardware definition.
This revision is now accepted and ready to land.Jun 7 2018, 2:51 PM
scott.linder added inline comments.Jun 12 2018, 9:08 AM
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
381

Should this switch be happening here? Shouldn't the assembly writer be able to put this descriptor in any section?

kzhuravl closed this revision.Jun 12 2018, 11:42 AM
kzhuravl marked an inline comment as done.
lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
381