This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map
ClosedPublic

Authored by abinavpp on Jun 22 2022, 8:07 AM.

Details

Summary

This change introduces the dynamic stack boolean field to code-object-v3
and above under the code properties of the kernel descriptor and under
the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to
the is_dynamic_callstack field of amd_kernel_code_t.

Diff Detail

Event Timeline

abinavpp created this revision.Jun 22 2022, 8:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2022, 8:07 AM
abinavpp requested review of this revision.Jun 22 2022, 8:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2022, 8:07 AM
kzhuravl requested changes to this revision.Jun 22 2022, 8:48 AM

Do we want to tie this bit to a particular code object version? Code object v4 and up? Why did you pick code object v3 and up?

llvm/docs/AMDGPUUsage.rst
4006

If it is "Reserved, must be 0.", then compiler does not set it. But you clearly set it in in AsmPrinter

llvm/include/llvm/Support/AMDHSAKernelDescriptor.h
165

Why not eat up the reserved bit at index 11 and decrease number of reserved bits to 4?

This revision now requires changes to proceed.Jun 22 2022, 8:48 AM
arsenm added inline comments.Jun 22 2022, 8:50 AM
llvm/include/llvm/Support/AMDHSAKernelDescriptor.h
181–182

I thought this was already defined here, which was the reason to set it. I'm not sure it makes sense to add the field if it's not already there

arsenm added inline comments.Jun 22 2022, 11:14 AM
llvm/docs/AMDGPUUsage.rst
4003–4006

Since we are actually re-introducing this and not recycling an existing field, we should take the chance to rename it. The call part of the name is misleading since we can need this in situations that do not have calls. How about just DYNAMIC_STACK_SIZE?

llvm/include/llvm/Support/AMDHSAKernelDescriptor.h
181–182

We apparently do need to place this here since rocr doesn't want to inspect metadata

Do we want to tie this bit to a particular code object version? Code object v4 and up? Why did you pick code object v3 and up?

My intention is to add this to the kernel descriptor after seeing that rocr relies on it. My understanding is that modifying kernel descriptor affects code object v3 and above. Should we only set this for v4 and above? If so, how should we modify amdgpu-amdhsa-kernel-descriptor-v3-table in AMDGPUUsage.rst? I'm not sure about adding v4+ specific fields there.

llvm/docs/AMDGPUUsage.rst
4003–4006

If there's no requirement to keep the naming consistent with amd_kernel_code_t, we should rename this. DYNAMIC_STACK_SIZE sounds like an integer field, how about IS_DYNAMIC_STACK?

4006

Sorry, IS_DYNAMIC_CALLSTACK should be at the 468th bit since we have KERNEL_CODE_PROPERTY(IS_DYNAMIC_CALLSTACK, 20, 1). I'll fix this once we agree on the bit position for this field.

llvm/include/llvm/Support/AMDHSAKernelDescriptor.h
165

I did that initially, but changed it to the 20th bit to keep things consistent after noticing the "Must be kept backwards compatible." comment and that these fields were at the same bit position as the ones in amd_kernel_code_t's code properties field (i.e. amd_code_property_mask_t of AMDKernelCodeT.h). My understanding is limited here, but do we need to make sure we use the same bit position as that of v2's amd_kernel_code_t? If not, then we should just just use the first bit of RESEREVED1 and avoid the extension of code property to 32 bits. Also, was there any relevant reason for keeping the bit positions conform to that of amd_kernel_code_t to date?

abinavpp updated this revision to Diff 439688.Jun 24 2022, 3:25 AM
  • As per internal discussion, use the 11th bit
  • Rename the field to is_dynamic_stack
abinavpp retitled this revision from [AMDGPU] Add the "is_dynamic_callstack" of amd_kernel_code_t to the kernel descriptor to [AMDGPU] Add the is_dynamic_stack field to the kernel descriptor.Jun 24 2022, 3:28 AM
abinavpp edited the summary of this revision. (Show Details)

The "is_" prefix is not helpful. I suggest either "uses_dynamic_stack" or just "dynamic_stack".

abinavpp updated this revision to Diff 439815.Jun 24 2022, 10:20 AM

Rename the field to uses_dynamic_stack

abinavpp retitled this revision from [AMDGPU] Add the is_dynamic_stack field to the kernel descriptor to [AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor.Jun 24 2022, 10:21 AM
arsenm requested changes to this revision.Jun 28 2022, 5:47 AM

I see the kernel descriptor changes reflected in the test updates, but I don't see the metadata entries tested

This revision now requires changes to proceed.Jun 28 2022, 5:47 AM

I see the kernel descriptor changes reflected in the test updates, but I don't see the metadata entries tested

By "metadata entries", do you mean an entry in the kernel metadata map under NT_AMDGPU_METADATA? I didn't add the corresponding field there.

I see the kernel descriptor changes reflected in the test updates, but I don't see the metadata entries tested

By "metadata entries", do you mean an entry in the kernel metadata map under NT_AMDGPU_METADATA? I didn't add the corresponding field there.

Yes. It has .private_segment_fixed_size, so should also have this for consistency

abinavpp updated this revision to Diff 441054.Jun 29 2022, 9:41 AM
  • Rebase
  • Add the field to the kernel metadata map
abinavpp retitled this revision from [AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor to [AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map.Jun 29 2022, 9:43 AM
abinavpp edited the summary of this revision. (Show Details)
arsenm accepted this revision.Jun 29 2022, 10:08 AM

LGTM

kzhuravl accepted this revision.Jun 29 2022, 10:20 AM
This revision is now accepted and ready to land.Jun 29 2022, 10:20 AM
abinavpp updated this revision to Diff 442033.Jul 4 2022, 2:16 AM
  • Rebase
  • Make the field optional
arsenm added a comment.Jul 5 2022, 9:32 AM

Could use some assembler tests demonstrating the default behavior

Could use some assembler tests demonstrating the default behavior

The objdump assertions of the "minimal" symbol in hsa-v3.s and hsa-v4.s under MC/AMDGPU checks the default bits in the kernel descriptor.

arsenm accepted this revision.Jul 6 2022, 6:52 AM
This revision was landed with ongoing or failed builds.Jul 17 2022, 9:37 PM
This revision was automatically updated to reflect the committed changes.