Details

Reviewers

jdoerfert
AndreyChurbanov
Hahnfeld

Commits

rG676c7cb0c0d4: [OpenMP] Added the support for cache line size 256 for A64FX

Summary

The Fugaku supercomputer is built on the Fujitsu A64FX microprocessor, whose cache line size is 256 bytes. libomp currently uses a cache line size of 128 bytes for PPC64 and 64 bytes for everything else. This patch adds support for a 256-byte cache line on A64FX. Note that although A64FX is an AArch64 implementation, this cache line size is not a property of AArch64 in general. As a result, following the approach in the UCX source code (https://github.com/openucx/ucx/blob/392443ab92626412605dee1572056f79c897c6c3/src/ucs/arch/aarch64/cpu.c#L17), the only way to select it is to check whether the CPU is a FUJITSU A64FX.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tianshilei1992 created this revision. · Dec 12 2020, 1:36 PM

Herald added subscribers: s.egerton, guansong, simoncook and 3 others. · Dec 12 2020, 1:36 PM

tianshilei1992 requested review of this revision. · Dec 12 2020, 1:36 PM

Herald added a reviewer: jdoerfert. · Dec 12 2020, 1:36 PM

Herald added a project: Restricted Project.

Herald added subscribers: openmp-commits, sstefan1.

Fixed a typo in comments

tianshilei1992 added a reviewer: AndreyChurbanov. · Dec 12 2020, 1:43 PM

Harbormaster completed remote builds in B82160: Diff 311410. · Dec 12 2020, 2:03 PM

Harbormaster completed remote builds in B82161: Diff 311411.

LGTM, at least I don't see anything obviously problematic

This revision is now accepted and ready to land. · Jan 7 2021, 9:01 AM

Why is it necessary to write and compile a C program just to parse /proc/cpuinfo? Can this be done directly from CMake?

In D93169#2484725, @Hahnfeld wrote:

Why is it necessary to write and compile a C program just to parse /proc/cpuinfo? Can this be done directly from CMake?

Unfortunately we can't. CMAKE_SYSTEM_PROCESSOR just reports aarch64.

Updated to make OMPT work

In D93169#2485228, @tianshilei1992 wrote:

Unfortunately we can't. CMAKE_SYSTEM_PROCESSOR just reports aarch64.

I understand that the predefined variables don't work, but we can surely parse /proc/cpuinfo with native CMake commands instead of compiling a C program, no?
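
A minimal sketch of what parsing with native CMake commands could look like. It runs against an inline sample string rather than the real file, so it can be tried on any host with `cmake -P`. The helper name `sketch_has_cpuinfo_field` and the sample text are illustrative assumptions, not code from this revision.

```cmake
# Hypothetical helper (not part of libomp): report whether a cpuinfo-style
# text contains "<field> : <value>" followed by a newline.
# Note: field and value are pasted into a regex unescaped, so they must not
# contain regex metacharacters.
function(sketch_has_cpuinfo_field content field value return_var)
  # [ \t]* tolerates the padding between the field name and the colon.
  string(REGEX MATCH "${field}[ \t]*: ${value}\n" matched "${content}")
  if(matched)
    set(${return_var} TRUE PARENT_SCOPE)
  else()
    set(${return_var} FALSE PARENT_SCOPE)
  endif()
endfunction()

# Made-up sample text in the "name : value" layout of /proc/cpuinfo.
set(sample "processor\t: 0\nCPU implementer\t: 0x46\nCPU architecture: 8\n")

sketch_has_cpuinfo_field("${sample}" "CPU implementer" "0x46" is_fujitsu)
sketch_has_cpuinfo_field("${sample}" "CPU implementer" "0x41" is_other)
message(STATUS "implementer 0x46: ${is_fujitsu}, implementer 0x41: ${is_other}")
```

No compiler is involved here: `string(REGEX MATCH)` leaves the output variable empty when nothing matches, which `if()` treats as false.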

Harbormaster completed remote builds in B84376: Diff 315221. · Jan 7 2021, 1:58 PM

Parse /proc/cpuinfo with CMake directly

Optimized string match

Removed useless comment

Harbormaster completed remote builds in B84389: Diff 315242. · Jan 7 2021, 2:50 PM

Harbormaster completed remote builds in B84392: Diff 315245. · Jan 7 2021, 3:04 PM

Harbormaster completed remote builds in B84393: Diff 315246. · Jan 7 2021, 3:13 PM

Hahnfeld added inline comments. · Jan 7 2021, 11:30 PM

openmp/runtime/cmake/LibompGetArchitecture.cmake
78–81	Can you use `TRUE` and `FALSE` here? This also avoids the overly generic `MATCHES "1"` at call site.
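
To illustrate the concern: `MATCHES` performs a regular-expression search, so `MATCHES "1"` is true for any value that merely contains the character `1`. The sketch below uses hypothetical variable and function names and is not code from this revision.

```cmake
# Hypothetical value, for illustration only.
set(flag_as_number "10")
if(flag_as_number MATCHES "1")
  # Reached: "10" contains a "1", even though it is not exactly "1".
  message(STATUS "MATCHES \"1\" accepted ${flag_as_number}")
endif()

# Returning a boolean constant lets the call site use a plain if().
function(sketch_detect return_var)
  set(${return_var} TRUE PARENT_SCOPE)
endfunction()

sketch_detect(detected)
if(detected)
  message(STATUS "detected is ${detected}")
endif()
```

With `TRUE`/`FALSE`, the call site relies on CMake's own boolean constants instead of a pattern match.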

Used TRUE/FALSE instead of 0/1 for LIBOMP_DETECT_AARCH64_A64FX

tianshilei1992 marked an inline comment as done. · Jan 9 2021, 7:17 AM

Harbormaster completed remote builds in B84579: Diff 315600. · Jan 9 2021, 7:45 AM

LGTM, thanks for the changes!

Closed by commit rG676c7cb0c0d4: [OpenMP] Added the support for cache line size 256 for A64FX (authored by tianshilei1992). · Jan 9 2021, 8:58 AM

This revision was automatically updated to reflect the committed changes.

tianshilei1992 added a commit: rG676c7cb0c0d4: [OpenMP] Added the support for cache line size 256 for A64FX.

I'm going to revert this as it breaks CMake on systems which do not have /proc/cpuinfo such as macOS.

This may be a bit hard to see because the code isn't reached unless the architecture is aarch64, but on an ARM macOS system that path hits. It would also hit on other BSDs or other OSes running on AArch64 but without /proc/cpuinfo.

For your reference, here is the error message from CMake for me:

CMake Error at /Users/chandlerc/src/llvm/llvm-project/openmp/runtime/cmake/LibompGetArchitecture.cmake:74 (file):
  file failed to open for reading (No such file or directory):

    /proc/cpuinfo
Call Stack (most recent call first):
  /Users/chandlerc/src/llvm/llvm-project/openmp/runtime/CMakeLists.txt:73 (libomp_is_aarch64_a64fx)
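
One way to avoid this failure is to check that the file exists before reading it and to report `FALSE` otherwise. The following is a sketch under that assumption. It is not necessarily the fix that was eventually landed, and the function name `sketch_is_aarch64_a64fx` is hypothetical.

```cmake
# Hypothetical variant of the detection function that tolerates hosts
# without /proc/cpuinfo (for example macOS or the BSDs).
function(sketch_is_aarch64_a64fx return_var)
  set(is_a64fx FALSE)
  if(EXISTS "/proc/cpuinfo")
    file(READ "/proc/cpuinfo" cpu_info_content)
    # Quote the content so semicolons in the file are not treated as list separators.
    string(REGEX MATCH "CPU implementer[ \t]*: 0x46\n" cpu_implementer "${cpu_info_content}")
    string(REGEX MATCH "CPU architecture[ \t]*: 8\n" cpu_architecture "${cpu_info_content}")
    if(cpu_implementer AND cpu_architecture)
      set(is_a64fx TRUE)
    endif()
  endif()
  set(${return_var} "${is_a64fx}" PARENT_SCOPE)
endfunction()

sketch_is_aarch64_a64fx(result)
message(STATUS "A64FX detected: ${result}")
```

On a host without `/proc/cpuinfo` the `file(READ)` call is never reached, so configuration continues with the default cache line size.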

In D93169#2503946, @chandlerc wrote:
I'm going to revert this as it breaks CMake on systems which do not have /proc/cpuinfo such as macOS.

This may be a bit hard to see because the code isn't reached unless the architecture is aarch64, but on an ARM macOS system that path hits. It would also hit on other BSDs or other OSes running on AArch64 but without /proc/cpuinfo.

For your reference, here is the error message from CMake for me:
CMake Error at /Users/chandlerc/src/llvm/llvm-project/openmp/runtime/cmake/LibompGetArchitecture.cmake:74 (file):
  file failed to open for reading (No such file or directory):

    /proc/cpuinfo
Call Stack (most recent call first):
  /Users/chandlerc/src/llvm/llvm-project/openmp/runtime/CMakeLists.txt:73 (libomp_is_aarch64_a64fx)

Will fix it right away.

In D93169#2503947, @tianshilei1992 wrote:
Will fix it right away.

I didn't think my CMake fu was up to it, but I think I have a fix, WDYT: https://reviews.llvm.org/D94889

In D93169#2503960, @chandlerc wrote:
I didn't think my CMake fu was up to it, but I think I have a fix, WDYT: https://reviews.llvm.org/D94889

And thanks to the speedy review, landed fix in: https://github.com/llvm/llvm-project/commit/f855751c1284c82c1c46b98f6d1b3ca2021d6cb9

Diff 315611

openmp/runtime/CMakeLists.txt

[60 lines hidden; enclosing context: else() # Part of LLVM build]
   elseif(LIBOMP_NATIVE_ARCH MATCHES "riscv64")
     set(LIBOMP_ARCH riscv64)
   else()
     # last ditch effort
     libomp_get_architecture(LIBOMP_ARCH)
   endif ()
   set(LIBOMP_ENABLE_ASSERTIONS ${LLVM_ENABLE_ASSERTIONS})
 endif()
-libomp_check_variable(LIBOMP_ARCH 32e x86_64 32 i386 arm ppc64 ppc64le aarch64 mic mips mips64 riscv64)
+# FUJITSU A64FX is a special processor because its cache line size is 256.
+# We need to pass this information into kmp_config.h.
+if(LIBOMP_ARCH STREQUAL "aarch64")
+  libomp_is_aarch64_a64fx(LIBOMP_DETECT_AARCH64_A64FX)
+  if (LIBOMP_DETECT_AARCH64_A64FX)
+    set(LIBOMP_ARCH "aarch64_a64fx")
+    set(LIBOMP_ARCH_AARCH64_A64FX TRUE)
+  endif()
+endif()
+
+libomp_check_variable(LIBOMP_ARCH 32e x86_64 32 i386 arm ppc64 ppc64le aarch64 aarch64_a64fx mic mips mips64 riscv64)

 set(LIBOMP_LIB_TYPE normal CACHE STRING
   "Performance,Profiling,Stubs library (normal/profile/stubs)")
 libomp_check_variable(LIBOMP_LIB_TYPE normal profile stubs)
 # Set the OpenMP Year and Month associated with version
 set(LIBOMP_OMP_YEAR_MONTH 201611)
 set(LIBOMP_MIC_ARCH knc CACHE STRING
   "Intel(R) Many Integrated Core Architecture (Intel(R) MIC Architecture) (knf/knc). Ignored if not Intel(R) MIC Architecture build.")
[53 lines hidden]
 # Currently don't record any timestamps
 set(LIBOMP_BUILD_DATE "No_Timestamp")

 # Architecture
 set(IA32 FALSE)
 set(INTEL64 FALSE)
 set(ARM FALSE)
 set(AARCH64 FALSE)
+set(AARCH64_A64FX FALSE)
 set(PPC64BE FALSE)
 set(PPC64LE FALSE)
 set(PPC64 FALSE)
 set(MIC FALSE)
 set(MIPS64 FALSE)
 set(MIPS FALSE)
 set(RISCV64 FALSE)
 if("${LIBOMP_ARCH}" STREQUAL "i386" OR "${LIBOMP_ARCH}" STREQUAL "32") # IA-32 architecture
   set(IA32 TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "x86_64" OR "${LIBOMP_ARCH}" STREQUAL "32e") # Intel(R) 64 architecture
   set(INTEL64 TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "arm") # ARM architecture
   set(ARM TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "ppc64") # PPC64BE architecture
   set(PPC64BE TRUE)
   set(PPC64 TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "ppc64le") # PPC64LE architecture
   set(PPC64LE TRUE)
   set(PPC64 TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "aarch64") # AARCH64 architecture
   set(AARCH64 TRUE)
+elseif("${LIBOMP_ARCH}" STREQUAL "aarch64_a64fx") # AARCH64_A64FX architecture
+  set(AARCH64_A64FX TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "mic") # Intel(R) Many Integrated Core Architecture
   set(MIC TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "mips") # MIPS architecture
   set(MIPS TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "mips64") # MIPS64 architecture
   set(MIPS64 TRUE)
 elseif("${LIBOMP_ARCH}" STREQUAL "riscv64") # RISCV64 architecture
   set(RISCV64 TRUE)
[219 lines hidden]

openmp/runtime/cmake/LibompGetArchitecture.cmake

[63 lines hidden; enclosing context: function(libomp_get_architecture return_arch)]
   string(REPLACE "ARCHITECTURE=" "" local_architecture "${local_architecture}")

   # set the return value to the architecture detected (e.g., 32e, 32, arm, ppc64, etc.)
   set(${return_arch} "${local_architecture}" PARENT_SCOPE)

   # Remove ${detect_arch_src_txt} from cmake/ subdirectory
   file(REMOVE "${CMAKE_CURRENT_BINARY_DIR}/libomp_detect_arch.c")
 endfunction()
+
+function(libomp_is_aarch64_a64fx return_is_aarch64_a64fx)
+  file(READ "/proc/cpuinfo" cpu_info_content)
+  string(REGEX MATCH "CPU implementer[ \t]*: 0x46\n" cpu_implementer ${cpu_info_content})
+  string(REGEX MATCH "CPU architecture[ \t]*: 8\n" cpu_architecture ${cpu_info_content})
+
+  set(is_aarch64_a64fx FALSE)
+  if (cpu_architecture AND cpu_implementer)
+    set(is_aarch64_a64fx TRUE)
+  endif()
+
+  set(${return_is_aarch64_a64fx} "${is_aarch64_a64fx}" PARENT_SCOPE)
+endfunction(libomp_is_aarch64_a64fx)

openmp/runtime/cmake/LibompUtils.cmake

[95 lines hidden; enclosing context: function(libomp_get_legal_arch return_arch_string)]
   elseif(${ARM})
     set(${return_arch_string} "ARM" PARENT_SCOPE)
   elseif(${PPC64BE})
     set(${return_arch_string} "PPC64BE" PARENT_SCOPE)
   elseif(${PPC64LE})
     set(${return_arch_string} "PPC64LE" PARENT_SCOPE)
   elseif(${AARCH64})
     set(${return_arch_string} "AARCH64" PARENT_SCOPE)
+  elseif(${AARCH64_A64FX})
+    set(${return_arch_string} "AARCH64_A64FX" PARENT_SCOPE)
   elseif(${MIPS})
     set(${return_arch_string} "MIPS" PARENT_SCOPE)
   elseif(${MIPS64})
     set(${return_arch_string} "MIPS64" PARENT_SCOPE)
   elseif(${RISCV64})
     set(${return_arch_string} "RISCV64" PARENT_SCOPE)
   else()
     set(${return_arch_string} "${LIBOMP_ARCH}" PARENT_SCOPE)
[85 lines hidden]

openmp/runtime/cmake/config-ix.cmake

[285 lines hidden]
 if(NOT LIBOMP_HAVE___BUILTIN_FRAME_ADDRESS)
   set(LIBOMP_HAVE_OMPT_SUPPORT FALSE)
 else()
   if( # hardware architecture supported?
     ((LIBOMP_ARCH STREQUAL x86_64) OR
     (LIBOMP_ARCH STREQUAL i386) OR
     # (LIBOMP_ARCH STREQUAL arm) OR
     (LIBOMP_ARCH STREQUAL aarch64) OR
+    (LIBOMP_ARCH STREQUAL aarch64_a64fx) OR
     (LIBOMP_ARCH STREQUAL ppc64le) OR
     (LIBOMP_ARCH STREQUAL ppc64) OR
     (LIBOMP_ARCH STREQUAL riscv64))
     AND # OS supported?
     ((WIN32 AND LIBOMP_HAVE_PSAPI) OR APPLE OR (NOT WIN32 AND LIBOMP_HAVE_WEAK_ATTRIBUTE)))
     set(LIBOMP_HAVE_OMPT_SUPPORT TRUE)
   else()
     set(LIBOMP_HAVE_OMPT_SUPPORT FALSE)
[30 lines hidden]

openmp/runtime/src/kmp_config.h.cmake

[76 lines hidden]
 #cmakedefine01 LIBOMP_HAVE_IMMINTRIN_H
 #define KMP_HAVE_IMMINTRIN_H LIBOMP_HAVE_IMMINTRIN_H
 #cmakedefine01 LIBOMP_HAVE_INTRIN_H
 #define KMP_HAVE_INTRIN_H LIBOMP_HAVE_INTRIN_H
 #cmakedefine01 LIBOMP_HAVE_ATTRIBUTE_WAITPKG
 #define KMP_HAVE_ATTRIBUTE_WAITPKG LIBOMP_HAVE_ATTRIBUTE_WAITPKG
 #cmakedefine01 LIBOMP_HAVE_ATTRIBUTE_RTM
 #define KMP_HAVE_ATTRIBUTE_RTM LIBOMP_HAVE_ATTRIBUTE_RTM
+#cmakedefine01 LIBOMP_ARCH_AARCH64_A64FX
+#define KMP_ARCH_AARCH64_A64FX LIBOMP_ARCH_AARCH64_A64FX

 // Configured cache line based on architecture
 #if KMP_ARCH_PPC64
 # define CACHE_LINE 128
+#elif KMP_ARCH_AARCH64_A64FX
+# define CACHE_LINE 256
 #else
 # define CACHE_LINE 64
 #endif

 #if ! KMP_32_BIT_ARCH
 # define BUILD_I8 1
 #endif

[27 lines hidden]

This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Added the support for cache line size 256 for A64FX
Closed · Public
