HSA headers might be under a hsa/ directory or might not.
This scheme matches the one used by the openmp amdgpu plugin.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Potentially dumb question: Should this complexity be put in one place that can be used from everywhere?
This one is useful for building with reference to the HSA source tree, where the headers are in a folder called 'inc'.
I think ROCm has sometimes installed with a leading hsa directory and sometimes not.
The dynamic_hsa setup openmp currently uses in some configurations doesn't have a leading hsa directory at present.
Debian uses /usr/include/hsa/hsa.h at present https://packages.debian.org/sid/amd64/libhsa-runtime-dev/filelist
libc/utils/gpu/loader/amdgpu/Loader.cpp | ||
---|---|---|
30 | ^ I'm not sure the defined(__has_include) check is necessary but it doesn't do any harm. This is a copy&paste from the openmp plugin where I think it's been through a few variations. amdgpu-arch used to have the same pattern before https://reviews.llvm.org/D150807. |
There's some institutional resistance / fear of adding a header file which is used by multiple projects within the monorepo. This is particularly annoying for the GPU runtimes.
So far I've been able to get a "maybe, if we absolutely must" response to pure header files that don't have any source component, but the current preference is for copy&paste everywhere. There are some hazards around people packaging parts of the monorepo separately to the whole. There used to be licensing barriers.
The HSA headers are a stable C interface which is fairly annoying to program against. We've ended up with callbacks-wrapped-in-templates-that-pass-lambdas copy&pasted between libc and the openmp-plugin. Possibly also amdgpu-arch, haven't checked recently. I'm currently leaning towards writing "hsa.hpp" as a wrapper that puts a C++ interface over it, and deals with this #include stuff as well, which gets copy&pasted between openmp and libc whenever someone notices divergence.
Interesting question is whether that should be unified with dynamic_hsa as well.
edit: it's code like this I mean
template <typename C> hsa_status_t iterate_agents(C cb) { requires_invocable_r(hsa_status_t, C, hsa_agent_t); auto L = [](hsa_agent_t agent, void* data) -> hsa_status_t { C* unwrapped = static_cast<C*>(data); return (*unwrapped)(agent); }; return hsa_iterate_agents(L, static_cast<void*>(&cb)); }
though a lot of the accessors would be better written as something that returns an optional value, or possibly a pair of hsa_status_t and value. Locally I've been using functions like:
inline uint32_t agent_get_info_queues_max(hsa_agent_t agent) { return agent_get_info<uint32_t, HSA_AGENT_INFO_QUEUES_MAX>::call(agent); }
but that doesn't do error reporting, need to make up my mind whether success/failure is sufficient information for the accessors.
Thanks for elaborating. Given that history and potential way forward, I think this patch looks good as it matches what's in the nextgen plugin. So we keep it (at least initially) consistent.
^ I'm not sure the defined(__has_include) check is necessary but it doesn't do any harm. This is a copy&paste from the openmp plugin where I think it's been through a few variations. amdgpu-arch used to have the same pattern before https://reviews.llvm.org/D150807.