The documentation for the AMDGPU assembler's examples don't show the
.args section, which, if ommitted, will cause arguments to silently
not be passed into the kernel. This commit fixes this issue.
Details
- Reviewers
scott.linder - Group Reviewers
Restricted Project - Commits
- rGbef5ed1eeac3: [AMDGPU][Docs] Update Code Object V3 example to includes args section
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thank you for improving the docs! This definitely just looks like an oversight on my part when implementing the new metadata.
I have a couple small requests, but otherwise LGTM
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
12190–12197 | The argument in this case it a pointer, which I believe we would describe with .value_kind: global_buffer. We are also using flat store, so this likely should have .address_space: generic, although I think in the frontend we coerce pointer arguments to the global address space because we know the arguments are not in e.g. scratch; with that in mind I think .address_space: global may actually be correct? The TODO in the description of .address_space reads: .. TODO Is "global_buffer" only "global" or "constant"? Is "dynamic_shared_pointer" always "local"? Can HCC allow "generic"? How can "private" or "region" ever happen? Maybe one of @kzhuravl @t-tye @rampitec know what the .address_space should be in this case? |
Also, expanding on this, would it be useful for us to include a source-language "equivalent" for the example? I.e. the following HIP program should compile to something essentially equivalent:
__global__ void hello_world(float *p) { *p = 3.14159f; }
My understanding of the documentation is that global_buffer causes the kernel launcher to put the argument into a buffer. Is that not the case?
For compute we are not using buffers in the sense of graphics. I think buffer is more in the spirit (and was the original motivation for) the runtime managing OpenCL buffers.
@kzhuravl or @scott.linder can you actually get a dump of a real HIP program to see what is being used? @yaxunl may also be able to advise.
We seem to generate the global_buffer kind for a HIP kernel with a pointer argument: https://gcc.godbolt.org/z/q71nfEs91
I also looked into the address_space question briefly, and at least our OpenCL runtime does not seem to support the generic address space on global_buffer arguments.
@scott.linder Thanks for checking up on that, I've updated the documentation (and the code that I was writing that spawned this PR) accordingly.
LGTM, thank you!
If someone notices something I missed, please just follow up with another review, or mention it here.
As a note to to myself, it would likely be a good idea to also pin down the remaining TODO's in the relevant docs for the metadata keys. We could also reword the description of the global_buffer or add some text to make it more obvious to the reader how it relates to source language semantics and runtime concepts. The relevant code in ROCclr seems to be:
- Classification of global_buffer arguments as T_POINTER (the enum value MemoryObject corresponds to the "global_buffer" argument to .value_kind): https://github.com/ROCm-Developer-Tools/ROCclr/blob/5cefcaf62893fcd86c8feed6bb1ebb84850fcd2f/device/devkernel.cpp#L1309
- ROCr handling of T_POINTER parameters: https://github.com/ROCm-Developer-Tools/ROCclr/blob/5cefcaf62893fcd86c8feed6bb1ebb84850fcd2f/device/rocm/rocvirtual.cpp#L297
- Assertion which seems to preclude use of generic address space for global_buffer arguments: https://github.com/ROCm-Developer-Tools/ROCclr/blob/5cefcaf62893fcd86c8feed6bb1ebb84850fcd2f/device/rocm/rocvirtual.cpp#L344
There also seems to be a workaround for Image/Pipe arguments at https://github.com/ROCm-Developer-Tools/ROCclr/blob/5cefcaf62893fcd86c8feed6bb1ebb84850fcd2f/device/devkernel.cpp#L1336 that should be looked into (assuming we haven't already addressed it).
Edit: it turns out ROCclr actually just considers "generic" to be "global" for the purposes of global_buffer arguments: https://github.com/ROCm-Developer-Tools/ROCclr/blob/5cefcaf62893fcd86c8feed6bb1ebb84850fcd2f/device/devkernel.hpp#L277
The argument in this case it a pointer, which I believe we would describe with .value_kind: global_buffer.
We are also using flat store, so this likely should have .address_space: generic, although I think in the frontend we coerce pointer arguments to the global address space because we know the arguments are not in e.g. scratch; with that in mind I think .address_space: global may actually be correct? The TODO in the description of .address_space reads:
Maybe one of @kzhuravl @t-tye @rampitec know what the .address_space should be in this case?