Add assembler directives for preloading kernel arguments that correspond
to new fields in the kernel descriptor for the length and offset of
arguments that will be placed in SGPRs prior to kernel launch. Alignment
of the arguments in SGPRs is equivalent to the kernarg segment when
accessed via the kernarg_segment_ptr. Kernarg SGPRs are allocated
directly after other user SGPRs.
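
As a rough illustration of the layout described above, the sketch below maps a kernarg byte offset to the SGPR that would hold it. It is not code from this patch: `PreloadSpec` and `preloadedSGPRFor` are hypothetical names, and it assumes that the length and offset fields are counted in dwords and that the preloaded window maps one dword per SGPR, starting right after the other user SGPRs.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical description of the preload window (not an LLVM type).
// Assumption: both fields are measured in dwords (4-byte units).
struct PreloadSpec {
  uint32_t OffsetDwords; // where the window starts inside the kernarg segment
  uint32_t LengthDwords; // how many dwords are placed in SGPRs
};

// Returns the SGPR index that would hold the dword containing ArgByteOffset,
// or std::nullopt if that dword lies outside the preloaded window.
// Assumption: kernarg SGPRs start directly after NumOtherUserSGPRs and keep
// the same relative placement as the kernarg segment (padding included).
std::optional<uint32_t> preloadedSGPRFor(uint32_t NumOtherUserSGPRs,
                                         PreloadSpec Spec,
                                         uint32_t ArgByteOffset) {
  uint32_t Dword = ArgByteOffset / 4;
  if (Dword < Spec.OffsetDwords ||
      Dword >= Spec.OffsetDwords + Spec.LengthDwords)
    return std::nullopt;
  return NumOtherUserSGPRs + (Dword - Spec.OffsetDwords);
}
```

Under these assumptions, with two other user SGPRs and a four-dword window starting at offset 0, an argument at byte offset 8 would land in SGPR index 4, while an argument at byte offset 16 would still have to be loaded through the kernarg_segment_ptr.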
Repository: rG LLVM Github Monorepo
Event Timeline
Posted on Phabricator because this patch depends on other patches and is split from other existing reviews.
> Alignment of the arguments in SGPRs is equivalent to the kernarg segment when
> accessed via the kernarg_segment_ptr.

The alignment should not be relevant to the registers. The registers should always be packed.
Also, should out-of-bounds offsets be diagnosed?
File | Line | Comment |
---|---|---|
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp | 1953 | I think the disassembler should proceed to print whatever is there regardless of the support. It can skip printing if it's 0. |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | 2149 | This should probably define a proper subtarget feature. |
llvm/test/MC/AMDGPU/user-sgpr-count-diag.s | 6 | Why amdhsa_accum_offset? I don't recognize this one. |
I don't think the registers are packed with the current FW/runtime. The placement of the arguments is exactly the same as in the kernarg segment. In the future, the runtime could fix the alignment so that the arguments are packed, but Jack said he didn't want to pursue this yet.
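
To make the difference between the two placements concrete, here is a small sketch that counts how many dwords (and so SGPRs) a set of arguments would occupy under each scheme. The `Arg` struct and both helper functions are hypothetical and only model the discussion above; "packed" is assumed here to mean no padding bytes between arguments.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical argument description: size and required alignment in bytes.
struct Arg {
  uint32_t Size;
  uint32_t Align;
};

// Rounds Value up to the next multiple of Alignment (Alignment must be > 0).
uint32_t roundUp(uint32_t Value, uint32_t Alignment) {
  return (Value + Alignment - 1) / Alignment * Alignment;
}

// Dwords needed when arguments keep their kernarg-segment placement,
// i.e. each argument is aligned to its own alignment, padding included.
uint32_t dwordsSegmentLayout(const std::vector<Arg> &Args) {
  uint32_t Offset = 0;
  for (const Arg &A : Args)
    Offset = roundUp(Offset, A.Align) + A.Size;
  return roundUp(Offset, 4) / 4;
}

// Dwords needed if arguments were packed back to back with no padding.
uint32_t dwordsPacked(const std::vector<Arg> &Args) {
  uint32_t Bytes = 0;
  for (const Arg &A : Args)
    Bytes += A.Size;
  return roundUp(Bytes, 4) / 4;
}
```

For an `i32`, an `i64`, and an `i16` argument in that order, the segment-style placement needs five dwords because of the padding before the `i64`, whereas the packed placement would need four.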
File | Line | Comment |
---|---|---|
llvm/test/MC/AMDGPU/user-sgpr-count-diag.s | 6 | It errors out if it's not there. |