We already use the amdgpu-arch and nvptx-arch tools to determine the
GPU architectures the user's system supports. We can provide
LIBC_GPU_ARCHITECTURES=native to allow users to easily build support
for only the one found on their system. This also cleans up the code
somewhat.
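A minimal sketch of what the native handling in prepare_libc_gpu_build.cmake could look like, assuming amdgpu-arch and nvptx-arch print one architecture per detected GPU. The helper variables (detected_gpu_architectures, LIBC_AMDGPU_ARCH, LIBC_NVPTX_ARCH) are illustrative names, not the patch's actual code:

```cmake
# Sketch only: resolve LIBC_GPU_ARCHITECTURES=native by querying the
# detection tools shipped alongside clang.
if(LIBC_GPU_ARCHITECTURES STREQUAL "native")
  set(detected_gpu_architectures "")

  # amdgpu-arch prints one architecture per detected AMD GPU.
  find_program(LIBC_AMDGPU_ARCH amdgpu-arch HINTS ${LLVM_BINARY_DIR}/bin)
  if(LIBC_AMDGPU_ARCH)
    execute_process(COMMAND ${LIBC_AMDGPU_ARCH}
                    OUTPUT_VARIABLE amd_archs
                    OUTPUT_STRIP_TRAILING_WHITESPACE ERROR_QUIET)
    string(REPLACE "\n" ";" amd_archs "${amd_archs}")
    list(APPEND detected_gpu_architectures ${amd_archs})
  endif()

  # nvptx-arch does the same for NVIDIA GPUs.
  find_program(LIBC_NVPTX_ARCH nvptx-arch HINTS ${LLVM_BINARY_DIR}/bin)
  if(LIBC_NVPTX_ARCH)
    execute_process(COMMAND ${LIBC_NVPTX_ARCH}
                    OUTPUT_VARIABLE nvptx_archs
                    OUTPUT_STRIP_TRAILING_WHITESPACE ERROR_QUIET)
    string(REPLACE "\n" ";" nvptx_archs "${nvptx_archs}")
    list(APPEND detected_gpu_architectures ${nvptx_archs})
  endif()

  if(NOT detected_gpu_architectures)
    message(FATAL_ERROR "LIBC_GPU_ARCHITECTURES=native but no GPU was found")
  endif()

  list(REMOVE_DUPLICATES detected_gpu_architectures)
  set(LIBC_GPU_ARCHITECTURES ${detected_gpu_architectures})
endif()
```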
Event Timeline
Add a check in case the tools aren't found for some reason. (This should never reasonably happen.)
libc/cmake/modules/prepare_libc_gpu_build.cmake:74
Is that a singular architecture, or can we supply a list? We may want to be able to build for a set of user-specified GPUs and, if I can dream out loud, allow specifying which GPU to run them on, so I could just do ninja check-libc-sm_70 and it would run the tests with CUDA_VISIBLE_DEVICES=<ID of the sm_70 GPU>.
libc/cmake/modules/prepare_libc_gpu_build.cmake:74
Right now it's a single architecture. The reason is that the internal objects used for testing and the ones exported via the libcgpu.a library are separate compilations. Since the existing libc test infrastructure only ever expected a single executable, it was easiest to implement it this way. Basically, the libcgpu.a library contains N architectures packed into a single file, while the internal testing implementation builds for just one. I don't think it's impossible to support what you're asking; it would just require a lot more CMake.
LGTM in general.
libc/cmake/modules/prepare_libc_gpu_build.cmake:74
Can we build a set of per-GPU test executables and then just pick one of them to run the libc tests?
libc/cmake/modules/prepare_libc_gpu_build.cmake:74
So, libc works by making a bunch of targets according to the directories and filenames. We compile some file and create the CMake target libc.src.string.memcmp, then attach the compiled files to this target via a CMake property. We could potentially attach several files to different properties and use the list of architectures to pull them out while building the tests. I didn't spend much time on changing it because I figured it would be sufficient to just test a single one at a time, but it would be interesting to be able to test an AMD and an NVIDIA GPU at the same time.
libc/cmake/modules/prepare_libc_gpu_build.cmake:74
My strawman idea was, roughly, to wrap the part of the CMake that generates test targets in a loop iterating over GPUs, which would create the same targets but with a GPU-specific suffix. In theory that should make it a largely mechanical change. But I'm also not familiar with the details of the libc CMake, so if it's more invasive than that, targeting a single arch for tests is fine.
libc/cmake/modules/prepare_libc_gpu_build.cmake:74
Yeah, it's mostly just a convenience. The test files are built differently; they're intended to be compiled directly to a GPU image and then executed via the loader. The GPU library, on the other hand, is all LLVM-IR for LTO that's packaged into a fatbinary for the "new driver". It's certainly doable, but it's not very high on my priorities right now.