This is an archive of the discontinued LLVM Phabricator instance.

[libc][amdgpu] Tolerate different install directories for hsa.h
Closed, Public

Authored by JonChesterfield on Jul 20 2023, 3:19 AM.

Details

Reviewers

jhuber6
jdoerfert
jplehr

Commits

rGd483824fc8ed: [libc][amdgpu] Tolerate different install directories for hsa.h

Summary

HSA headers might be under a hsa/ directory or might not.
This scheme matches the one used by the openmp amdgpu plugin.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

JonChesterfield created this revision. Jul 20 2023, 3:19 AM

Herald added projects: Restricted Project, Restricted Project. Jul 20 2023, 3:19 AM

Herald added subscribers: libc-commits, kerbowa, tpr and 4 others.

JonChesterfield requested review of this revision. Jul 20 2023, 3:19 AM

Herald added a reviewer: jdoerfert. Jul 20 2023, 3:19 AM

Herald added subscribers: jplehr, sstefan1, wdng.

Potentially dumb question: Should this complexity be put in one place that can be used from everywhere?

This one is useful for building with reference to the HSA source tree, where the headers are in a folder called 'inc'.

I think ROCm has sometimes installed with a leading hsa directory and sometimes not.

The dynamic_hsa setup openmp currently uses in some configurations doesn't have a leading hsa directory at present.

Debian currently installs the header as /usr/include/hsa/hsa.h: https://packages.debian.org/sid/amd64/libhsa-runtime-dev/filelist

Inline comment on libc/utils/gpu/loader/amdgpu/Loader.cpp, line 30:

I'm not sure the defined(__has_include) check is necessary, but it doesn't do any harm. This is a copy&paste from the openmp plugin, where I think it's been through a few variations. amdgpu-arch used to have the same pattern before https://reviews.llvm.org/D150807.
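For readers unfamiliar with the pattern, here is a minimal sketch of the same two-level probe. It only reports which layout the preprocessor can see and does not include any HSA header, so it compiles on machines without ROCm. The macro DEMO_HSA_LAYOUT and the function demo_hsa_layout are hypothetical names made up for this illustration; they are not part of the patch.

```cpp
// Outer guard: a preprocessor that does not provide __has_include cannot
// evaluate the inner probes, so they are only reached when it is defined.
#if defined(__has_include)
#if __has_include("hsa/hsa.h")
// Layout with a leading hsa/ directory, e.g. /usr/include/hsa/hsa.h.
#define DEMO_HSA_LAYOUT "hsa/hsa.h"
#elif __has_include("hsa.h")
// Flat layout, e.g. an include path pointing directly at the headers.
#define DEMO_HSA_LAYOUT "hsa.h"
#else
// Neither candidate is visible on the include path.
#define DEMO_HSA_LAYOUT "not found"
#endif
#else
// Probing is unavailable; fall back to assuming the hsa/ layout.
#define DEMO_HSA_LAYOUT "hsa/hsa.h (assumed, cannot probe)"
#endif

// Returns the layout selected at preprocessing time.
inline const char *demo_hsa_layout() { return DEMO_HSA_LAYOUT; }
```

Printing demo_hsa_layout() from a small test program is a quick way to check which branch a given include path selects.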

In D155812#4518152, @jplehr wrote:

Potentially dumb question: Should this complexity be put in one place that can be used from everywhere?

There's some institutional resistance / fear of adding a header file which is used by multiple projects within the monorepo. This is particularly annoying for the GPU runtimes.

So far I've been able to get a "maybe, if we absolutely must" response to pure header files that don't have any source component, but the current preference is for copy&paste everywhere. There are some hazards around people packaging parts of the monorepo separately to the whole. There used to be licensing barriers.

The HSA headers are a stable C interface which is fairly annoying to program against. We've ended up with callbacks-wrapped-in-templates-that-pass-lambdas copy&pasted between libc and the openmp-plugin. Possibly also amdgpu-arch, haven't checked recently. I'm currently leaning towards writing "hsa.hpp" as a wrapper that puts a C++ interface over it, and deals with this #include stuff as well, which gets copy&pasted between openmp and libc whenever someone notices divergence.

Interesting question is whether that should be unified with dynamic_hsa as well.

edit: it's code like this I mean

template <typename C>
hsa_status_t iterate_agents(C cb)
{
  requires_invocable_r(hsa_status_t, C, hsa_agent_t);
  auto L = [](hsa_agent_t agent, void* data) -> hsa_status_t {
    C* unwrapped = static_cast<C*>(data);
    return (*unwrapped)(agent);
  };
  return hsa_iterate_agents(L, static_cast<void*>(&cb));
}
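To make the trampoline technique in the snippet above easier to try out, here is a self-contained sketch that compiles without hsa.h. Everything prefixed with demo_ or DEMO_ is a hypothetical stand-in: demo_iterate_agents only mimics the shape of a C iteration function that takes a function pointer plus a void* payload, and its stop-on-first-error behaviour is an assumption of this sketch. The requires_invocable_r check is omitted because its definition is not shown in the thread.

```cpp
#include <vector>

// Hypothetical stand-ins for the HSA types, so the sketch needs no hsa.h.
using demo_status_t = int;
constexpr demo_status_t DEMO_STATUS_SUCCESS = 0;
struct demo_agent_t {
  unsigned long handle;
};

// Stand-in for a C-style iteration API: a plain function pointer plus an
// opaque void* that is handed back to the callback on every call.
inline demo_status_t
demo_iterate_agents(demo_status_t (*callback)(demo_agent_t, void *),
                    void *data) {
  for (unsigned long h = 1; h <= 3; ++h) {
    demo_status_t rc = callback(demo_agent_t{h}, data);
    if (rc != DEMO_STATUS_SUCCESS)
      return rc; // this stand-in stops at the first non-success status
  }
  return DEMO_STATUS_SUCCESS;
}

// The wrapper: any callable C is smuggled through the void* payload.
template <typename C> demo_status_t iterate_agents(C cb) {
  // A captureless lambda converts to a function pointer, which is what the
  // C interface expects; it casts the payload back and forwards the call.
  auto L = [](demo_agent_t agent, void *data) -> demo_status_t {
    C *unwrapped = static_cast<C *>(data);
    return (*unwrapped)(agent);
  };
  return demo_iterate_agents(L, static_cast<void *>(&cb));
}

// Usage: a capturing lambda collects every handle the iteration visits.
inline std::vector<unsigned long> demo_collect_handles() {
  std::vector<unsigned long> handles;
  iterate_agents([&](demo_agent_t agent) -> demo_status_t {
    handles.push_back(agent.handle);
    return DEMO_STATUS_SUCCESS;
  });
  return handles;
}
```

The point of the wrapper is that callers can pass capturing lambdas, which a raw function-pointer interface cannot accept directly.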

though a lot of the accessors would be better written as something that returns an optional value, or possibly a pair of hsa_status_t and value. Locally I've been using functions like:

inline uint32_t agent_get_info_queues_max(hsa_agent_t agent)
{
  return agent_get_info<uint32_t, HSA_AGENT_INFO_QUEUES_MAX>::call(agent);
}

but that doesn't do error reporting, need to make up my mind whether success/failure is sufficient information for the accessors.
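The two accessor shapes under consideration could look roughly like the following sketch, which assumes C++17 for std::optional. It is not taken from any existing code: the sketch_ types and sketch_agent_get_queues_max are hypothetical stand-ins for an HSA-style query that writes through an out-pointer and returns a status, and the value 128 and the "handle 0 is invalid" rule are invented for the demonstration.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical stand-ins so the sketch compiles without hsa.h.
using sketch_status_t = int;
constexpr sketch_status_t SKETCH_STATUS_SUCCESS = 0;
constexpr sketch_status_t SKETCH_STATUS_ERROR = 1;
struct sketch_agent_t {
  std::uint64_t handle;
};

// Stand-in for a C query: status return value, result via out-pointer.
inline sketch_status_t sketch_agent_get_queues_max(sketch_agent_t agent,
                                                   std::uint32_t *out) {
  if (agent.handle == 0)
    return SKETCH_STATUS_ERROR; // invented rule: handle 0 is invalid
  *out = 128;                   // invented value for the demonstration
  return SKETCH_STATUS_SUCCESS;
}

// Option 1: success/failure only. The status code is discarded.
inline std::optional<std::uint32_t> queues_max_optional(sketch_agent_t agent) {
  std::uint32_t value = 0;
  if (sketch_agent_get_queues_max(agent, &value) != SKETCH_STATUS_SUCCESS)
    return std::nullopt;
  return value;
}

// Option 2: return the status alongside the value, so the caller can still
// report which error occurred. The value is only meaningful on success.
inline std::pair<sketch_status_t, std::uint32_t>
queues_max_with_status(sketch_agent_t agent) {
  std::uint32_t value = 0;
  sketch_status_t rc = sketch_agent_get_queues_max(agent, &value);
  return {rc, value};
}
```

The optional form reads more cleanly at call sites, while the pair form preserves the exact status; which one is sufficient is the open question raised above.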

In D155812#4518208, @JonChesterfield wrote:

There's some institutional resistance / fear of adding a header file which is used by multiple projects within the monorepo. This is particularly annoying for the GPU runtimes.

So far I've been able to get a "maybe, if we absolutely must" response to pure header files that don't have any source component, but the current preference is for copy&paste everywhere. There are some hazards around people packaging parts of the monorepo separately to the whole. There used to be licensing barriers.

The HSA headers are a stable C interface which is fairly annoying to program against. We've ended up with callbacks-wrapped-in-templates-that-pass-lambdas copy&pasted between libc and the openmp-plugin. Possibly also amdgpu-arch, haven't checked recently. I'm currently leaning towards writing "hsa.hpp" as a wrapper that puts a C++ interface over it, and deals with this #include stuff as well, which gets copy&pasted between openmp and libc whenever someone notices divergence.

Interesting question is whether that should be unified with dynamic_hsa as well.

Thanks for elaborating. Given that history and potential way forward, I think this patch looks good as it matches what's in the nextgen plugin. So we keep it (at least initially) consistent.

This revision is now accepted and ready to land. Jul 20 2023, 3:48 AM

Harbormaster completed remote builds in B246841: Diff 542410. Jul 20 2023, 3:51 AM

jhuber6 accepted this revision. Jul 20 2023, 5:12 AM

Closed by commit rGd483824fc8ed: [libc][amdgpu] Tolerate different install directories for hsa.h (authored by JonChesterfield). Jul 20 2023, 5:43 AM

This revision was automatically updated to reflect the committed changes.

JonChesterfield added a commit: rGd483824fc8ed: [libc][amdgpu] Tolerate different install directories for hsa.h.

Revision Contents

Path: libc/utils/gpu/loader/amdgpu/Loader.cpp
Size: 14 lines

Diff 542459

libc/utils/gpu/loader/amdgpu/Loader.cpp

 // architecture. The file launches the '_start' kernel which should be provided
 // by the device application start code and call ultimately call the 'main'
 // function.
 //
 //===----------------------------------------------------------------------===//
 
 #include "Loader.h"
 
-#include <hsa/hsa.h>
-#include <hsa/hsa_ext_amd.h>
+#if defined(__has_include)
+#if __has_include("hsa/hsa.h")
+#include "hsa/hsa.h"
+#include "hsa/hsa_ext_amd.h"
+#elif __has_include("hsa.h")
+#include "hsa.h"
+#include "hsa_ext_amd.h"
+#endif
+#else
+#include "hsa/hsa.h"
+#include "hsa/hsa_ext_amd.h"
+#endif
 
 #include <cstdio>
 #include <cstdlib>
 #include <cstring>
 #include <tuple>
 #include <utility>
 
 /// Print the error code and exit if \p code indicates an error.
 static void handle_error(hsa_status_t code) {