Add a section to AMDGPUUsage.rst about calling conventions and list the
ones from the CallingConv enum. Full descriptions can come later (help
appreciated).
Details
- Reviewers: nhaehnle, foad, dstuttard, t-tye, arsenm
- Group Reviewers: Restricted Project
- Commits:
  - rG5c5eff44ce81: [AMDGPU] Start documenting calling conventions. NFC
  - rGaa7b127cb731: [AMDGPU] Start documenting calling conventions. NFC
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/docs/AMDGPUUsage.rst, line 1084:
"graphics targets?" Doesn't really say how it's not an entry point.
There could probably be a general section here to state that in the graphics calling conventions, inreg is used to denote arguments mapped to SGPRs, while other arguments are mapped to VGPRs.
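As a rough illustration of the inreg point above, here is a hand-written IR sketch. The function and argument names are hypothetical and do not come from the patch; the register assignment in the comments restates the comment above rather than verified backend output.

```llvm
; Hypothetical pixel-shader function using a graphics calling convention.
; %scale is marked inreg, so per the comment above it would be mapped to an SGPR.
; %x has no inreg attribute, so it would be mapped to a VGPR.
define amdgpu_ps float @scale_ps(float inreg %scale, float %x) {
  %r = fmul float %scale, %x
  ret float %r
}
```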
llvm/docs/AMDGPUUsage.rst, lines 1050–1054:
The logic between Mesa and AMDPAL here is actually the same.
Can you reupload with full context? If you're using the web interface I think https://www.llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface is still accurate, so something like:
git show HEAD -U999999 > mypatch.patch
I ask because I think this should be cross-referenced with (or maybe even combined with) the existing Call Convention docs around line ~13500:
    Call Convention
    ~~~~~~~~~~~~~~~

    .. note::

      This section is currently incomplete and has inaccuracies. It is WIP that will be
      updated as information is determined.

    See :ref:`amdgpu-dwarf-address-space-identifier` for information on swizzled
    addresses. Unswizzled addresses are normal linear addresses.

    .. _amdgpu-amdhsa-function-call-convention-kernel-functions:

    Kernel Functions
    ++++++++++++++++

    This section describes the call convention ABI for the outer kernel function.

    ...
llvm/docs/AMDGPUUsage.rst, line 1039:
Nit: capitalize
llvm/docs/AMDGPUUsage.rst, line 1049:
The omission of the default ccc convention seems like it might lead to confusion. The backend supports it, so I would suggest just adding it at the front of the list and making it clear that it is the default.
llvm/docs/AMDGPUUsage.rst, line 1048:
Nit: this list seems to be in a random order. Can we sort it alphabetically or organize it in a more meaningful way?
llvm/docs/AMDGPUUsage.rst, line 1048:
Ok, I alphabetized it for now, except for the ccc, which looks good at the top imo. I'm open to other suggestions.
llvm/docs/AMDGPUUsage.rst, line 1084:
...apparently :) I just copied the comments from the enum; I have no idea what this is used for (one of the reasons why I thought we should start writing these things up in more detail).
I mentioned that now. Ideally we'd describe it more thoroughly in a follow-up patch.
Sorry, I haven't used the web interface in a while so I forgot that part... Anyway, I got arc working again now, so it shouldn't happen again in the future.
I ask because I think this should be cross-referenced with (or maybe even combined with) the existing Call Convention docs around line ~13500:
    Call Convention
    ~~~~~~~~~~~~~~~

    .. note::

      This section is currently incomplete and has inaccuracies. It is WIP that will be
      updated as information is determined.

    See :ref:`amdgpu-dwarf-address-space-identifier` for information on swizzled
    addresses. Unswizzled addresses are normal linear addresses.

    .. _amdgpu-amdhsa-function-call-convention-kernel-functions:

    Kernel Functions
    ++++++++++++++++

    This section describes the call convention ABI for the outer kernel function.

    ...
I added a :ref: to this from `amdgpu_kernel`. There should probably be a reference to `amdgpu-amdhsa-function-call-convention-non-kernel-functions` somewhere, but I'm not sure where. Is this actually the ccc?
No worries!
Perfect, thank you!
And yes, AFAIU we just use the default ccc for non-kernel functions. I am actually only basing this off the observation that the IR doesn't print an explicit CC, though; I'm not really sure what else it could be?
Add ref to the description of the calling convention for AMDHSA non-kernel functions to the entry about the ccc.
llvm/docs/AMDGPUUsage.rst, line 1080:
Also fastcc, but it is currently treated identically to the default ccc.
Added the fast and cold calling conventions. AFAICT the only difference between these and the C calling convention is how hard we try to TCO.
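For readers less familiar with how these conventions appear in IR, the following hand-written sketch shows the spellings discussed so far. All function names are hypothetical; the only point is that the default C calling convention is the one with no keyword, which matches the observation above that the IR prints no explicit CC for non-kernel functions.

```llvm
; No calling-convention keyword: the default C calling convention (ccc).
define void @helper_default() {
  ret void
}

; Explicit fast and cold calling conventions.
define fastcc void @helper_fast() {
  ret void
}

define coldcc void @helper_cold() {
  ret void
}

; Kernel entry point. Each call site repeats the callee's calling convention.
define amdgpu_kernel void @entry() {
  call void @helper_default()
  call fastcc void @helper_fast()
  call coldcc void @helper_cold()
  ret void
}
```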
llvm/docs/AMDGPUUsage.rst, lines 1096–1099:
For the description: a difference between amdgpu_gfx and the C calling convention is that in amdgpu_gfx, SGPR arguments are callee-save (preserved), whereas in the C cc they are not.
llvm/docs/AMDGPUUsage.rst, lines 1096–1099:
Surely that's one difference, but I'm not 100% convinced it's the only one. I think we should leave the more detailed description for a future patch.
@arsenm does compute have this available? I've periodically wanted to be able to write a function which assumes a parameter is passed in SGPRs, and to get a compile-time error when calling it if that doesn't work out.
No, I just consider this a bug. The handling for inreg was artificially limited to amdgpu_gfx. You would also get a waterfall loop, not an error.
Semi-related, I think we should start interpreting i1 argument values as SGPR masks (i1 inreg would be extended to sreg32)
Yeah, that makes a lot of sense. I think one challenge for compute to truly leverage this is that I believe Clang emits bool as i8. Or is that only in struct types?
If it's a pure bool I know you get zeroext i1. It does look like struct elements give i8
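The two cases mentioned above might look roughly like this in IR. This is a hand-written approximation, not captured Clang output, and the type and function names are hypothetical.

```llvm
; A struct with a single bool member: the member is stored as i8.
%struct.Flags = type { i8 }

; A plain bool parameter: passed as a zero-extended i1.
define void @takes_bool(i1 zeroext %flag) {
  ret void
}
```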