This change introduces the dynamic stack boolean field to code-object-v3
and above under the code properties of the kernel descriptor and under
the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to
the is_dynamic_callstack field of amd_kernel_code_t.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Do we want to tie this bit to a particular code object version? Code object v4 and up? Why did you pick code object v3 and up?
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
4006 | If it is "Reserved, must be 0.", then compiler does not set it. But you clearly set it in in AsmPrinter | |
llvm/include/llvm/Support/AMDHSAKernelDescriptor.h | ||
165 | Why not eat up the reserved bit at index 11 and decrease number of reserved bits to 4? |
llvm/include/llvm/Support/AMDHSAKernelDescriptor.h | ||
---|---|---|
181–182 | I thought this was already defined here, which was the reason to set it. I'm not sure it makes sense to add the field if it's not already there |
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
4003–4006 | Since we are actually re-introducing this and not recycling an existing field, we should take the chance to rename it. The call part of the name is misleading since we can need this in situations that do not have calls. How about just DYNAMIC_STACK_SIZE? | |
llvm/include/llvm/Support/AMDHSAKernelDescriptor.h | ||
181–182 | We apparently do need to place this here since rocr doesn't want to inspect metadata |
My intention is to add this to the kernel descriptor after seeing that rocr relies on it. My understanding is that modifying kernel descriptor affects code object v3 and above. Should we only set this for v4 and above? If so, how should we modify amdgpu-amdhsa-kernel-descriptor-v3-table in AMDGPUUsage.rst? I'm not sure about adding v4+ specific fields there.
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
4003–4006 | If there's no requirement to keep the naming consistent with amd_kernel_code_t, we should rename this. DYNAMIC_STACK_SIZE sounds like an integer field, how about IS_DYNAMIC_STACK? | |
4006 | Sorry, IS_DYNAMIC_CALLSTACK should be at the 468th bit since we have KERNEL_CODE_PROPERTY(IS_DYNAMIC_CALLSTACK, 20, 1). I'll fix this once we agree on the bit position for this field. | |
llvm/include/llvm/Support/AMDHSAKernelDescriptor.h | ||
165 | I did that initially, but changed it to the 20th bit to keep things consistent after noticing the "Must be kept backwards compatible." comment and that these fields were at the same bit position as the ones in amd_kernel_code_t's code properties field (i.e. amd_code_property_mask_t of AMDKernelCodeT.h). My understanding is limited here, but do we need to make sure we use the same bit position as that of v2's amd_kernel_code_t? If not, then we should just just use the first bit of RESEREVED1 and avoid the extension of code property to 32 bits. Also, was there any relevant reason for keeping the bit positions conform to that of amd_kernel_code_t to date? |
The "is_" prefix is not helpful. I suggest either "uses_dynamic_stack" or just "dynamic_stack".
I see the kernel descriptor changes reflected in the test updates, but I don't see the metadata entries tested
By "metadata entries", do you mean an entry in the kernel metadata map under NT_AMDGPU_METADATA? I didn't add the corresponding field there.
The objdump assertions of the "minimal" symbol in hsa-v3.s and hsa-v4.s under MC/AMDGPU checks the default bits in the kernel descriptor.
If it is "Reserved, must be 0.", then compiler does not set it. But you clearly set it in in AsmPrinter