This is an archive of the discontinued LLVM Phabricator instance.

Differential D101895

[libc] Simplifies multi implementations
Closed · Public

Authored by gchatelet on May 5 2021, 4:40 AM.

Details

Reviewers

sivachandra
avieira

Commits

rG541f107871bc: [libc] Simplifies multi implementations and benchmarks

Summary

This is a follow up on D101524 which:

simplifies cpu features detection and usage,
flattens target dependent optimizations so it's obvious which implementations are generated,
provides an implementation targeting the host (march/mtune=native) for the mem* functions,
makes sure all implementations are unittested (provided the host can run them).

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

gchatelet created this revision. May 5 2021, 4:40 AM

Herald added a project: Restricted Project. May 5 2021, 4:40 AM

Herald added subscribers: libc-commits, ecnelises, tschuett, mgorny.

use D101524 as base revision

Harbormaster completed remote builds in B102708: Diff 342999. May 5 2021, 4:42 AM

revert local change

Harbormaster completed remote builds in B102710: Diff 343001. May 5 2021, 4:44 AM

rebasing

Defer changes to the benchmarking system to a later patch

Defer changes to the benchmarking system to a later patch (for real)

Amending commit to reflect the change (hopefully)

One more time

gchatelet edited the summary of this revision. May 5 2021, 9:12 AM

LGTM otherwise.

libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake
9	I'd really use -mcpu here though, as it will also enable available features.

This is now ready for review (sorry for the many updates)

@sivachandra with this patch we now produce a memcpy_opt_host (resp. memset_opt_host, bzero_opt_host) variant that compiles to the host native architecture, in addition to the memcpy, memset, and bzero functions that compile to the compiler's default architecture.

I think we have the following scenarios:

we compile a release and we want to specify which target to compile for (right now this is what the compiler decides is the default).
we compile a release via cross compilation (ARM compiled on x86 or vice versa), so we want to explicitly specify which arch to target.
we want to run benchmarks on a specific machine and want it to leverage -march/-mtune = native.

Harbormaster completed remote builds in B102768: Diff 343076. May 5 2021, 10:11 AM

Use mcpu instead of mtune for ARM

gchatelet retitled this revision from [libc] Simplifies multi implementations and benchmarks to [libc] Simplifies multi implementations. May 6 2021, 2:41 AM

Harbormaster completed remote builds in B102940: Diff 343333. May 6 2021, 3:13 AM

avieira added inline comments. May 6 2021, 6:33 AM

libc/src/string/CMakeLists.txt
234	During some of my experiments I learned that CMake prefers that you pass such options as "SHELL:-mllvm --tail-merge-threshold=0", just in case there is another '-mllvm opt' or '-A B' like option. Having said this, I am not seeing this option being passed (with or without SHELL) when I run `ninja libc-benchmark-main -v -v`, but don't quote me on this just yet, I might need to update all the patches you've put up for review. Any chance you can verify you are seeing the same on your side?

gchatelet marked an inline comment as done. May 6 2021, 7:29 AM

gchatelet added inline comments.

libc/src/string/CMakeLists.txt
234	I confirm that it does not work :D For one, I need to remove the double quotes, and then SHELL does not seem to work. I'm looking into it and will provide an update shortly.

gchatelet added inline comments. May 6 2021, 7:45 AM

libc/src/string/CMakeLists.txt
234	So the proper way to do it is to double quote AND use `SHELL` at the same time, e.g.

add_memcpy(memcpy
  COMPILE_OPTIONS "SHELL:-mllvm --tail-merge-threshold=0"
  COMPILE_OPTIONS "SHELL:-mllvm --combiner-global-alias-analysis")

If `SHELL` is not provided we should not double quote. I'll update the patch soonish.
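
The `SHELL:` behaviour discussed in these inline comments can be sketched outside of libc. This is only a minimal illustration, not the patch itself: the project name, the `demo` target, and the generated `demo.cpp` are hypothetical, and the two `-mllvm` options assume that Clang is the C++ compiler. CMake de-duplicates the compile options of a target, so a second bare `-mllvm` would be dropped; the `SHELL:` prefix (available since CMake 3.12) keeps each quoted group together.

```cmake
cmake_minimum_required(VERSION 3.13)
project(shell_prefix_demo CXX)

# Hypothetical one-function source file, generated so the sketch is self-contained.
file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/demo.cpp" "int demo() { return 0; }\n")
add_library(demo OBJECT "${CMAKE_CURRENT_BINARY_DIR}/demo.cpp")

# Each "SHELL:" group is treated as one unit during option de-duplication and
# is split into separate arguments afterwards, so both -mllvm occurrences
# survive. The -mllvm options themselves are Clang specific.
target_compile_options(demo PRIVATE
  "SHELL:-mllvm --tail-merge-threshold=0"
  "SHELL:-mllvm --combiner-global-alias-analysis")
```

Building this with a verbose generator invocation (for example `ninja -v`) should show both `-mllvm` pairs on the compile line, which is the check avieira describes above.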

Fix CMake flag deduplication issue

Harbormaster completed remote builds in B102996: Diff 343406. May 6 2021, 8:30 AM

Overall LGTM. I have a few questions I want to get clarity on.

I think we have the following scenarios:

we compile a release and we want to specify which target to compile for (right now this is what the compiler decides is the default).

In other words, when compiling for the host, we will let the compiler decide the best options?

we compile a release via cross compilation (ARM compiled on x86 or vice versa), so we want to explicitly specify which arch to target.

In other words, when cross-compiling, we will have to specify certain CMake vars explicitly?

we want to run benchmarks on a specific machine and want it to leverage -march/-mtune = native.

Does it mean that benchmarks are always compiled for the host?

In D101895#2742165, @sivachandra wrote:

I think we have the following scenarios:

we compile a release and we want to specify which target to compile for (right now this is what the compiler decides is the default).

In other words, when compiling for the host, we will let the compiler decide the best options?

Compiling for the host means using -march=native / -mcpu=native. Here I'm just talking about compiling llvm-libc without any options.
That is: when doing a full build, what is the targeted architecture?
AFAIU right now, the compiler decides. With a new version of the compiler the default may change and the targeted architecture as well.
It may be best to explicitly say "I want to compile llvm-libc for haswell when on x86".
AFAICT compilers are being very conservative and by default you get SSE2 only for x86 (check predefined macros without specifying march).

This is the rationale behind https://reviews.llvm.org/D101991
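
The point about conservative compiler defaults can be checked from CMake itself. The following is a minimal sketch under stated assumptions: an x86 host with Clang or GCC, and hypothetical names for the project and the result variables. It relies on `__AVX2__` being predefined only when AVX2 code generation is enabled, and compares the compiler default against an explicit `-march=haswell`.

```cmake
cmake_minimum_required(VERSION 3.13)
project(default_arch_probe CXX)

include(CheckCXXSourceCompiles)

# The probe only compiles when the compiler predefines __AVX2__.
set(PROBE_SOURCE "
#ifndef __AVX2__
#error AVX2 code generation is not enabled
#endif
int main() { return 0; }")

# Compiler default: no -march flag is passed.
check_cxx_source_compiles("${PROBE_SOURCE}" DEFAULT_ARCH_HAS_AVX2)

# Explicit target, as suggested in the comment above.
set(CMAKE_REQUIRED_FLAGS "-march=haswell")
check_cxx_source_compiles("${PROBE_SOURCE}" HASWELL_ARCH_HAS_AVX2)
unset(CMAKE_REQUIRED_FLAGS)

message(STATUS "AVX2 by default: '${DEFAULT_ARCH_HAS_AVX2}'")
message(STATUS "AVX2 with -march=haswell: '${HASWELL_ARCH_HAS_AVX2}'")
```

If the defaults are as conservative as described, the first result would be expected to be empty and the second to be set, but that depends on how the compiler in use was configured.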

we compile a release via cross compilation (ARM compiled on x86 or vice versa), so we want to explicitly specify which arch to target.

In other words, when cross-compiling, we will have to specify certain CMake vars explicitly?

Yes as for the previous bullet point I believe.
You always want to specify what you are targeting.

we want to run benchmarks on a specific machine and want it to leverage -march/-mtune = native.

Does it mean that benchmarks are always compiled for the host?

We want to be able to benchmark all the implementations; presumably the host one will perform best.
I removed this part from the patch but we will provide one benchmark per function and per implementation so people can easily test and compare them.

Thanks for explaining.

This revision is now accepted and ready to land. May 6 2021, 9:27 AM

I'll wait for D101991 before submitting this one.

Closed by commit rG541f107871bc: [libc] Simplifies multi implementations and benchmarks (authored by gchatelet). May 10 2021, 1:23 AM

This revision was automatically updated to reflect the committed changes.

gchatelet added a commit: rG541f107871bc: [libc] Simplifies multi implementations and benchmarks.

sivachandra added a reverting change: rG0c64cef89435: [libc] Rever "Simplifies multi implementations and benchmarks". May 10 2021, 12:23 PM

Unfortunately, I had to revert this as the bots were failing like this: https://lab.llvm.org/buildbot/#/builders/78/builds/11551.
I looked around for a bit to see if I could fix forward, but I do not think I know enough about the various x86_64 arch names. Maybe -march=k8 is all we need, but I will let you decide what the best course of action is.

This change also triggers a failure on the aarch64 bots like this: https://lab.llvm.org/buildbot/#/builders/138/builds/4761

In D101895#2748607, @sivachandra wrote:

Unfortunately, I had to revert this as the bots were failing like this: https://lab.llvm.org/buildbot/#/builders/78/builds/11551.
I looked around for a bit to see if I could fix forward, but I do not think I know enough about the various x86_64 arch names. Maybe -march=k8 is all we need, but I will let you decide what the best course of action is.

Thx for the revert Siva. I was AFK when I saw the failure.
These pseudo archs are available from Clang 12 on. Do you happen to know which clang version is being used by the buildbot?
We can target genuine architectures instead; that said, those pseudo archs better capture the requirements.
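
One way to cope with compilers that do not know the pseudo architectures is to probe for the flag and fall back to a concrete CPU name. This is a sketch of that idea only, not what the follow-up patch does: the `AVX2_ARCH_FLAG` variable and the choice of `haswell` as the fallback are assumptions made for illustration.

```cmake
cmake_minimum_required(VERSION 3.13)
project(arch_level_fallback CXX)

include(CheckCXXCompilerFlag)

# The x86-64-v3 pseudo architecture is only known to recent compilers.
check_cxx_compiler_flag("-march=x86-64-v3" COMPILER_SUPPORTS_X86_64_V3)

if(COMPILER_SUPPORTS_X86_64_V3)
  set(AVX2_ARCH_FLAG "-march=x86-64-v3")
else()
  # Assumed fallback: haswell is a concrete CPU name with AVX2 support.
  set(AVX2_ARCH_FLAG "-march=haswell")
endif()

message(STATUS "AVX2 implementations would be built with ${AVX2_ARCH_FLAG}")
```

The trade-off is the one raised in the comment above: a concrete CPU name is accepted by older compilers, while the pseudo architecture states the feature requirement more directly.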

In D101895#2748627, @gchatelet wrote:

Do you happen to know which clang version is being used by the buildbot?

The builders use clang-7 but I am working on upgrading them to clang-11. If you think clang-12 would be ideal, I can try getting it up to clang-12. I will try now and report back here.

In D101895#2748625, @sivachandra wrote:

This change also triggers a failure on the aarch64 bots like this: https://lab.llvm.org/buildbot/#/builders/138/builds/4761

Where can I see the real error? I can't find it on the page you linked in. All I can find is the python executor error which is not helpful :-/
Also, I'm puzzled as to why the failures trigger right now; the patch was sent more than 9h ago.

In D101895#2748649, @gchatelet wrote:

Where can I see the real error? I can't find it on the page you linked in. All I can find is the python executor error which is not helpful :-/
Also, I'm puzzled as to why the failures trigger right now; the patch was sent more than 9h ago.

Yes, the timing of the x86_64 failures confused me as well. The aarch64 bot dropped off the network over the weekend so I had to bring it back up. I noticed the failure as soon as I brought it up. About the full failure, if you click on the stdio links on the build page, it will take you to the full log like this: https://lab.llvm.org/buildbot/#/builders/138/builds/4761/steps/4/logs/stdio.

In D101895#2748640, @sivachandra wrote:

The builders use clang-7 but I am working on upgrading them to clang-11. If you think clang-12 would be ideal, I can try getting it up to clang-12. I will try now and report back here.

There is no easy way to go to clang-12. Can we work with clang-11? I will try to move to clang-11 now.

In D101895#2748666, @sivachandra wrote:

In D101895#2748649, @gchatelet wrote:

Where can I see the real error? I can't find it on the page you linked in. All I can find is the python executor error which is not helpful :-/
Also, I'm puzzled as to why the failures trigger right now; the patch was sent more than 9h ago.

Yes, the timing of the x86_64 failures confused me as well. The aarch64 bot dropped off the network over the weekend so I had to bring it back up. I noticed the failure as soon as I brought it up. About the full failure, if you click on the stdio links on the build page, it will take you to the full log like this: https://lab.llvm.org/buildbot/#/builders/138/builds/4761/steps/4/logs/stdio.

Thank you! I'll create a separate patch with the roll forward and the fixes for both aarch64 and x86.

In D101895#2748691, @sivachandra wrote:

In D101895#2748640, @sivachandra wrote:

The builders use clang-7 but I am working on upgrading them to clang-11. If you think clang-12 would be ideal, I can try getting it up to clang-12. I will try now and report back here.

There is no easy way to go to clang-12. Can we work with clang-11? I will try to move to clang-11 now.

Yes it's fine, I'll use a different scheme, I think it's better to stick to what's broadly available. thx 👍

gchatelet mentioned this in D102233: [libc] Simplifies multi implementations and benchmarks. May 11 2021, 5:22 AM

gchatelet mentioned this in rG6351993da72e: [libc] Simplifies multi implementations. May 12 2021, 12:27 AM

Revision Contents

Path                                                 Size
libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake    102 lines
libc/src/string/CMakeLists.txt                       63 lines
libc/src/string/aarch64/CMakeLists.txt
libc/src/string/x86_64/CMakeLists.txt
libc/test/src/string/CMakeLists.txt                  2 lines
Diff 343406

libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake

	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------
	# Cpu features definition and flags			# Cpu features definition and flags
	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------

	if(${LIBC_TARGET_ARCHITECTURE_IS_X86})			if(${LIBC_TARGET_ARCHITECTURE_IS_X86})
	set(ALL_CPU_FEATURES SSE SSE2 AVX AVX2 AVX512F)			set(ALL_CPU_FEATURES SSE2 SSE4_2 AVX2 AVX512F)
	list(SORT ALL_CPU_FEATURES)			set(LIBC_COMPILE_OPTIONS_NATIVE -march=native)
				elseif(${LIBC_TARGET_ARCHITECTURE_IS_AARCH64})
				set(LIBC_COMPILE_OPTIONS_NATIVE -mcpu=native)
	endif()			endif()

				list(SORT ALL_CPU_FEATURES)

	# Function to check whether the target CPU supports the provided set of features.			# Function to check whether the target CPU supports the provided set of features.
	# Usage:			# Usage:
	# cpu_supports(			# cpu_supports(
	# <output variable>			# <output variable>
	# <list of cpu features>			# <list of cpu features>
	# )			# )
	function(cpu_supports output_var features)			function(cpu_supports output_var features)
	_intersection(var "${LIBC_CPU_FEATURES}" "${features}")			_intersection(var "${LIBC_CPU_FEATURES}" "${features}")
	if("${var}" STREQUAL "${features}")			if("${var}" STREQUAL "${features}")
	set(${output_var} TRUE PARENT_SCOPE)			set(${output_var} TRUE PARENT_SCOPE)
	else()			else()
	unset(${output_var} PARENT_SCOPE)			unset(${output_var} PARENT_SCOPE)
	endif()			endif()
	endfunction()			endfunction()

	# Function to compute the flags to pass down to the compiler.
	# Usage:
	# compute_flags(
	# <output variable>
	# MARCH <arch name or "native">
	# REQUIRE <list of mandatory features to enable>
	# REJECT <list of features to disable>
	# )
	function(compute_flags output_var)
	cmake_parse_arguments(
	"COMPUTE_FLAGS"
	"" # Optional arguments
	"MARCH" # Single value arguments
	"REQUIRE;REJECT" # Multi value arguments
	${ARGN})
	# Check that features are not required and rejected at the same time.
	if(COMPUTE_FLAGS_REQUIRE AND COMPUTE_FLAGS_REJECT)
	_intersection(var ${COMPUTE_FLAGS_REQUIRE} ${COMPUTE_FLAGS_REJECT})
	if(var)
	message(FATAL_ERROR "Cpu Features REQUIRE and REJECT ${var}")
	endif()
	endif()
	# Generate the compiler flags in `current`.
	if(${CMAKE_CXX_COMPILER_ID} MATCHES "Clang\|GNU")
	if(COMPUTE_FLAGS_MARCH)
	list(APPEND current "-march=${COMPUTE_FLAGS_MARCH}")
	endif()
	foreach(feature IN LISTS COMPUTE_FLAGS_REQUIRE)
	string(TOLOWER ${feature} lowercase_feature)
	list(APPEND current "-m${lowercase_feature}")
	endforeach()
	foreach(feature IN LISTS COMPUTE_FLAGS_REJECT)
	string(TOLOWER ${feature} lowercase_feature)
	list(APPEND current "-mno-${lowercase_feature}")
	endforeach()
	else()
	# In future, we can extend for other compilers.
	message(FATAL_ERROR "Unkown compiler ${CMAKE_CXX_COMPILER_ID}.")
	endif()
	# Export the list of flags.
	set(${output_var} "${current}" PARENT_SCOPE)
	endfunction()

	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------
	# Internal helpers and utilities.			# Internal helpers and utilities.
	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------

	# Computes the intersection between two lists.			# Computes the intersection between two lists.
	function(_intersection output_var list1 list2)			function(_intersection output_var list1 list2)
	foreach(element IN LISTS list1)			foreach(element IN LISTS list1)
	if("${list2}" MATCHES "(^\|;)${element}(;\|$)")			if("${list2}" MATCHES "(^\|;)${element}(;\|$)")
	[... 13 lines not shown ...]
	#endif")			#endif")
	endforeach()			endforeach()
	configure_file(			configure_file(
	"${LIBC_SOURCE_DIR}/cmake/modules/cpu_features/check_cpu_features.cpp.in"			"${LIBC_SOURCE_DIR}/cmake/modules/cpu_features/check_cpu_features.cpp.in"
	"cpu_features/check_cpu_features.cpp" @ONLY)			"cpu_features/check_cpu_features.cpp" @ONLY)
	endfunction()			endfunction()
	_generate_check_code()			_generate_check_code()

	# Compiles and runs the code generated above with the specified requirements.			set(LIBC_CPU_FEATURES "" CACHE PATH "Host supported CPU features")
	# This is helpful to infer which features a particular target supports or if
	# a specific features implies other features (e.g. BMI2 implies SSE2 and SSE).			if(CMAKE_CROSSCOMPILING)
	function(_check_defined_cpu_feature output_var)			_intersection(cpu_features "${ALL_CPU_FEATURES}" "${LIBC_CPU_FEATURES}")
	cmake_parse_arguments(			if(NOT "${cpu_features}" STREQUAL "${LIBC_CPU_FEATURES}")
	"CHECK_DEFINED"			message(FATAL_ERROR "Unsupported CPU features: ${cpu_features}")
	"" # Optional arguments			endif()
	"MARCH" # Single value arguments			set(LIBC_CPU_FEATURES "${cpu_features}")
	"REQUIRE;REJECT" # Multi value arguments			else()
	${ARGN})			# Populates the LIBC_CPU_FEATURES list from host.
	compute_flags(
	flags
	MARCH ${CHECK_DEFINED_MARCH}
	REQUIRE ${CHECK_DEFINED_REQUIRE}
	REJECT ${CHECK_DEFINED_REJECT})
	try_run(			try_run(
	run_result compile_result "${CMAKE_CURRENT_BINARY_DIR}/check_${feature}"			run_result compile_result "${CMAKE_CURRENT_BINARY_DIR}/check_${feature}"
	"${CMAKE_CURRENT_BINARY_DIR}/cpu_features/check_cpu_features.cpp"			"${CMAKE_CURRENT_BINARY_DIR}/cpu_features/check_cpu_features.cpp"
	COMPILE_DEFINITIONS ${flags}			COMPILE_DEFINITIONS ${LIBC_COMPILE_OPTIONS_NATIVE}
	COMPILE_OUTPUT_VARIABLE compile_output			COMPILE_OUTPUT_VARIABLE compile_output
	RUN_OUTPUT_VARIABLE run_output)			RUN_OUTPUT_VARIABLE run_output)
	if("${run_result}" EQUAL 0)			if("${run_result}" EQUAL 0)
	set(${output_var}			set(LIBC_CPU_FEATURES "${run_output}")
	"${run_output}"
	PARENT_SCOPE)
	elseif(NOT ${compile_result})			elseif(NOT ${compile_result})
	message(FATAL_ERROR "Failed to compile: ${compile_output}")			message(FATAL_ERROR "Failed to compile: ${compile_output}")
	else()			else()
	message(FATAL_ERROR "Failed to run: ${run_output}")			message(FATAL_ERROR "Failed to run: ${run_output}")
	endif()			endif()
	endfunction()

	set(LIBC_CPU_FEATURES "" CACHE PATH "supported CPU features")

	if(CMAKE_CROSSCOMPILING)
	_intersection(cpu_features "${ALL_CPU_FEATURES}" "${LIBC_CPU_FEATURES}")
	if(NOT "${cpu_features}" STREQUAL "${LIBC_CPU_FEATURES}")
	message(FATAL_ERROR "Unsupported CPU features: ${cpu_features}")
	endif()
	set(LIBC_CPU_FEATURES "${cpu_features}")
	else()
	# Populates the LIBC_CPU_FEATURES list.
	# Use -march=native only when the compiler supports it.
	include(CheckCXXCompilerFlag)
	CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_SUPPORTS_MARCH_NATIVE)
	if(COMPILER_SUPPORTS_MARCH_NATIVE)
	_check_defined_cpu_feature(LIBC_CPU_FEATURES MARCH native)
	else()
	_check_defined_cpu_feature(LIBC_CPU_FEATURES)
	endif()
	endif()			endif()
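
To make the feature check concrete, here is a self-contained sketch that can be run with `cmake -P`. `cpu_supports` is copied from the diff above. The body of `_intersection` is not fully visible in this archive, so the version below is a hypothetical stand-in that keeps the elements of the first list that also appear in the second, in the order of the first list. The value of `LIBC_CPU_FEATURES` is made up for the example.

```cmake
cmake_minimum_required(VERSION 3.13)

# Hypothetical stand-in for the elided _intersection helper.
function(_intersection output_var list1 list2)
  set(result "")
  foreach(element IN LISTS list1)
    if(element IN_LIST list2)
      list(APPEND result "${element}")
    endif()
  endforeach()
  set(${output_var} "${result}" PARENT_SCOPE)
endfunction()

# Same logic as in the diff: every requested feature must be present.
function(cpu_supports output_var features)
  _intersection(var "${LIBC_CPU_FEATURES}" "${features}")
  if("${var}" STREQUAL "${features}")
    set(${output_var} TRUE PARENT_SCOPE)
  else()
    unset(${output_var} PARENT_SCOPE)
  endif()
endfunction()

# Made-up host feature list, sorted like ALL_CPU_FEATURES.
set(LIBC_CPU_FEATURES "AVX2;SSE2;SSE4_2")

cpu_supports(can_run_avx2 "AVX2")
cpu_supports(can_run_avx512 "AVX512F")

if(can_run_avx2)
  message(STATUS "AVX2 implementations can be tested on this host")
endif()
if(NOT can_run_avx512)
  message(STATUS "AVX512F implementations would be skipped")
endif()
```

Because the result is compared as a string, a multi-element `features` list only matches under this stand-in if it is given in the same order as `LIBC_CPU_FEATURES`; whether the real helper behaves the same way cannot be confirmed from the lines shown here.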

libc/src/string/CMakeLists.txt

	[... 180 lines not shown ...]
	# - Computes flags to satisfy required/rejected features and arch,			# - Computes flags to satisfy required/rejected features and arch,
	# - Declares an entry point,			# - Declares an entry point,
	# - Attach the REQUIRE_CPU_FEATURES property to the target,			# - Attach the REQUIRE_CPU_FEATURES property to the target,
	# - Add the fully qualified target to `${name}_implementations` global property for tests.			# - Add the fully qualified target to `${name}_implementations` global property for tests.
	function(add_implementation name impl_name)			function(add_implementation name impl_name)
	cmake_parse_arguments(			cmake_parse_arguments(
	"ADD_IMPL"			"ADD_IMPL"
	"" # Optional arguments			"" # Optional arguments
	"MARCH" # Single value arguments			"" # Single value arguments
	"REQUIRE;REJECT;SRCS;HDRS;DEPENDS;COMPILE_OPTIONS" # Multi value arguments			"REQUIRE;SRCS;HDRS;DEPENDS;COMPILE_OPTIONS" # Multi value arguments
	${ARGN})			${ARGN})
	compute_flags(flags
	MARCH ${ADD_IMPL_MARCH}
	REQUIRE ${ADD_IMPL_REQUIRE}
	REJECT ${ADD_IMPL_REJECT}
	)
	add_entrypoint_object(${impl_name}			add_entrypoint_object(${impl_name}
	NAME ${name}			NAME ${name}
	SRCS ${ADD_IMPL_SRCS}			SRCS ${ADD_IMPL_SRCS}
	HDRS ${ADD_IMPL_HDRS}			HDRS ${ADD_IMPL_HDRS}
	DEPENDS ${ADD_IMPL_DEPENDS}			DEPENDS ${ADD_IMPL_DEPENDS}
	COMPILE_OPTIONS ${ADD_IMPL_COMPILE_OPTIONS} ${flags} -O2			COMPILE_OPTIONS ${ADD_IMPL_COMPILE_OPTIONS}
	)			)
	get_fq_target_name(${impl_name} fq_target_name)			get_fq_target_name(${impl_name} fq_target_name)
	set_target_properties(${fq_target_name} PROPERTIES REQUIRE_CPU_FEATURES "${ADD_IMPL_REQUIRE}")			set_target_properties(${fq_target_name} PROPERTIES REQUIRE_CPU_FEATURES "${ADD_IMPL_REQUIRE}")
	set_property(GLOBAL APPEND PROPERTY "${name}_implementations" "${fq_target_name}")			set_property(GLOBAL APPEND PROPERTY "${name}_implementations" "${fq_target_name}")
	endfunction()			endfunction()

	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------
	# memcpy			# memcpy
	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------

	# include the relevant architecture specific implementations
	if(${LIBC_TARGET_ARCHITECTURE_IS_X86})
	set(MEMCPY_SRC ${LIBC_SOURCE_DIR}/src/string/${LIBC_TARGET_ARCHITECTURE}/memcpy.cpp)
	elseif(${LIBC_TARGET_ARCHITECTURE_IS_AARCH64})
	set(MEMCPY_SRC ${LIBC_SOURCE_DIR}/src/string/${LIBC_TARGET_ARCHITECTURE}/memcpy.cpp)
	#Disable tail merging as it leads to lower performance
	set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mllvm --tail-merge-threshold=0")
	else()
	set(MEMCPY_SRC ${LIBC_SOURCE_DIR}/src/string/memcpy.cpp)
	endif()

	function(add_memcpy memcpy_name)			function(add_memcpy memcpy_name)
	add_implementation(memcpy ${memcpy_name}			add_implementation(memcpy ${memcpy_name}
	SRCS ${MEMCPY_SRC}			SRCS ${MEMCPY_SRC}
	HDRS ${LIBC_SOURCE_DIR}/src/string/memcpy.h			HDRS ${LIBC_SOURCE_DIR}/src/string/memcpy.h
	DEPENDS			DEPENDS
	.memory_utils.memory_utils			.memory_utils.memory_utils
	libc.include.string			libc.include.string
	COMPILE_OPTIONS			COMPILE_OPTIONS
	-fno-builtin-memcpy			-fno-builtin-memcpy
	${ARGN}			${ARGN}
	)			)
	endfunction()			endfunction()

	if(${LIBC_TARGET_ARCHITECTURE_IS_X86})			if(${LIBC_TARGET_ARCHITECTURE_IS_X86})
	add_memcpy(memcpy MARCH native)			set(MEMCPY_SRC ${LIBC_SOURCE_DIR}/src/string/x86_64/memcpy.cpp)
				add_memcpy(memcpy_x86_64_opt_sse2 COMPILE_OPTIONS -march=x86-64 REQUIRE SSE2)
				add_memcpy(memcpy_x86_64_opt_sse4 COMPILE_OPTIONS -march=x86-64-v2 REQUIRE SSE4_2)
				add_memcpy(memcpy_x86_64_opt_avx2 COMPILE_OPTIONS -march=x86-64-v3 REQUIRE AVX2)
				add_memcpy(memcpy_x86_64_opt_avx512 COMPILE_OPTIONS -march=x86-64-v4 REQUIRE AVX512F)
				add_memcpy(memcpy_opt_host COMPILE_OPTIONS ${LIBC_COMPILE_OPTIONS_NATIVE})
				add_memcpy(memcpy)
				elseif(${LIBC_TARGET_ARCHITECTURE_IS_AARCH64})
				set(MEMCPY_SRC ${LIBC_SOURCE_DIR}/src/string/aarch64/memcpy.cpp)
				# Disable tail merging as it leads to lower performance.
				# Note that '-mllvm' needs to be prefixed with 'SHELL:' to prevent CMake flag deduplication.
				add_memcpy(memcpy_opt_host COMPILE_OPTIONS ${LIBC_COMPILE_OPTIONS_NATIVE}
				COMPILE_OPTIONS "SHELL:-mllvm --tail-merge-threshold=0")
				add_memcpy(memcpy COMPILE_OPTIONS "SHELL:-mllvm --tail-merge-threshold=0")
	else()			else()
				set(MEMCPY_SRC ${LIBC_SOURCE_DIR}/src/string/memcpy.cpp)
				add_memcpy(memcpy_opt_host COMPILE_OPTIONS ${LIBC_COMPILE_OPTIONS_NATIVE})
	add_memcpy(memcpy)			add_memcpy(memcpy)
	endif()			endif()

	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------
	# memset			# memset
	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------

	function(add_memset memset_name)			function(add_memset memset_name)
	add_implementation(memset ${memset_name}			add_implementation(memset ${memset_name}
	SRCS ${LIBC_SOURCE_DIR}/src/string/memset.cpp			SRCS ${LIBC_SOURCE_DIR}/src/string/memset.cpp
	HDRS ${LIBC_SOURCE_DIR}/src/string/memset.h			HDRS ${LIBC_SOURCE_DIR}/src/string/memset.h
	DEPENDS			DEPENDS
	.memory_utils.memory_utils			.memory_utils.memory_utils
	libc.include.string			libc.include.string
	COMPILE_OPTIONS			COMPILE_OPTIONS
	-fno-builtin-memset			-fno-builtin-memset
	${ARGN}			${ARGN}
	)			)
	endfunction()			endfunction()

	if(${LIBC_TARGET_ARCHITECTURE_IS_X86})			if(${LIBC_TARGET_ARCHITECTURE_IS_X86})
	add_memset(memset MARCH native)			add_memset(memset_x86_64_opt_sse2 COMPILE_OPTIONS -march=x86-64 REQUIRE SSE2)
				add_memset(memset_x86_64_opt_sse4 COMPILE_OPTIONS -march=x86-64-v2 REQUIRE SSE4_2)
				add_memset(memset_x86_64_opt_avx2 COMPILE_OPTIONS -march=x86-64-v3 REQUIRE AVX2)
				add_memset(memset_x86_64_opt_avx512 COMPILE_OPTIONS -march=x86-64-v4 REQUIRE AVX512F)
				add_memset(memset_opt_host COMPILE_OPTIONS ${LIBC_COMPILE_OPTIONS_NATIVE})
				add_memset(memset)
	else()			else()
				add_memset(memset_opt_host COMPILE_OPTIONS ${LIBC_COMPILE_OPTIONS_NATIVE})
	add_memset(memset)			add_memset(memset)
	endif()			endif()

	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------
	# bzero			# bzero
	# ------------------------------------------------------------------------------			# ------------------------------------------------------------------------------

	function(add_bzero bzero_name)			function(add_bzero bzero_name)
	add_implementation(bzero ${bzero_name}			add_implementation(bzero ${bzero_name}
	SRCS ${LIBC_SOURCE_DIR}/src/string/bzero.cpp			SRCS ${LIBC_SOURCE_DIR}/src/string/bzero.cpp
	HDRS ${LIBC_SOURCE_DIR}/src/string/bzero.h			HDRS ${LIBC_SOURCE_DIR}/src/string/bzero.h
	DEPENDS			DEPENDS
	.memory_utils.memory_utils			.memory_utils.memory_utils
	libc.include.string			libc.include.string
	COMPILE_OPTIONS			COMPILE_OPTIONS
	-fno-builtin-memset			-fno-builtin-memset
	-fno-builtin-bzero			-fno-builtin-bzero
	${ARGN}			${ARGN}
	)			)
	endfunction()			endfunction()

	if(${LIBC_TARGET_ARCHITECTURE_IS_X86})			if(${LIBC_TARGET_ARCHITECTURE_IS_X86})
	add_bzero(bzero MARCH native)			add_bzero(bzero_x86_64_opt_sse2 COMPILE_OPTIONS -march=x86-64 REQUIRE SSE2)
				add_bzero(bzero_x86_64_opt_sse4 COMPILE_OPTIONS -march=x86-64-v2 REQUIRE SSE4_2)
				add_bzero(bzero_x86_64_opt_avx2 COMPILE_OPTIONS -march=x86-64-v3 REQUIRE AVX2)
				add_bzero(bzero_x86_64_opt_avx512 COMPILE_OPTIONS -march=x86-64-v4 REQUIRE AVX512F)
				add_bzero(bzero_opt_host COMPILE_OPTIONS ${LIBC_COMPILE_OPTIONS_NATIVE})
				add_bzero(bzero)
	else()			else()
				add_bzero(bzero_opt_host COMPILE_OPTIONS ${LIBC_COMPILE_OPTIONS_NATIVE})
	add_bzero(bzero)			add_bzero(bzero)
	endif()			endif()

	# ------------------------------------------------------------------------------
	# Add all other relevant implementations for the native target.
	# ------------------------------------------------------------------------------

	if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${LIBC_TARGET_ARCHITECTURE})
	include(${LIBC_TARGET_ARCHITECTURE}/CMakeLists.txt)
	endif()

libc/src/string/aarch64/CMakeLists.txt

This file was deleted.

add_memcpy("memcpy_${LIBC_TARGET_ARCHITECTURE}")

libc/src/string/x86_64/CMakeLists.txt

This file was deleted.

	add_memcpy("memcpy_${LIBC_TARGET_ARCHITECTURE}_opt_none" REJECT "${ALL_CPU_FEATURES}")
	add_memcpy("memcpy_${LIBC_TARGET_ARCHITECTURE}_opt_sse" REQUIRE "SSE" REJECT "SSE2")
	add_memcpy("memcpy_${LIBC_TARGET_ARCHITECTURE}_opt_avx" REQUIRE "AVX" REJECT "AVX2")
	add_memcpy("memcpy_${LIBC_TARGET_ARCHITECTURE}_opt_avx512f" REQUIRE "AVX512F")

	add_memset("memset_${LIBC_TARGET_ARCHITECTURE}_opt_none" REJECT "${ALL_CPU_FEATURES}")
	add_memset("memset_${LIBC_TARGET_ARCHITECTURE}_opt_sse" REQUIRE "SSE" REJECT "SSE2")
	add_memset("memset_${LIBC_TARGET_ARCHITECTURE}_opt_avx" REQUIRE "AVX" REJECT "AVX2")
	add_memset("memset_${LIBC_TARGET_ARCHITECTURE}_opt_avx512f" REQUIRE "AVX512F")

	add_bzero("bzero_${LIBC_TARGET_ARCHITECTURE}_opt_none" REJECT "${ALL_CPU_FEATURES}")
	add_bzero("bzero_${LIBC_TARGET_ARCHITECTURE}_opt_sse" REQUIRE "SSE" REJECT "SSE2")
	add_bzero("bzero_${LIBC_TARGET_ARCHITECTURE}_opt_avx" REQUIRE "AVX" REJECT "AVX2")
	add_bzero("bzero_${LIBC_TARGET_ARCHITECTURE}_opt_avx512f" REQUIRE "AVX512F")

libc/test/src/string/CMakeLists.txt

[... 190 lines not shown ...]
foreach(fq_config_name IN LISTS fq_implementations)
cpu_supports(can_run "${required_cpu_features}")		cpu_supports(can_run "${required_cpu_features}")
if(can_run)		if(can_run)
add_libc_unittest(		add_libc_unittest(
${fq_config_name}_test		${fq_config_name}_test
SUITE		SUITE
libc_string_unittests		libc_string_unittests
DEPENDS		DEPENDS
${fq_config_name}		${fq_config_name}
		COMPILE_OPTIONS
		${LIBC_COMPILE_OPTIONS_NATIVE}
${ARGN}		${ARGN}
)		)
else()		else()
message(STATUS "Skipping test for '${fq_config_name}' insufficient host cpu features '${required_cpu_features}'")		message(STATUS "Skipping test for '${fq_config_name}' insufficient host cpu features '${required_cpu_features}'")
endif()		endif()
endforeach()		endforeach()
endfunction()		endfunction()

add_libc_multi_impl_test(memcpy SRCS memcpy_test.cpp)		add_libc_multi_impl_test(memcpy SRCS memcpy_test.cpp)
add_libc_multi_impl_test(memset SRCS memset_test.cpp)		add_libc_multi_impl_test(memset SRCS memset_test.cpp)
add_libc_multi_impl_test(bzero SRCS bzero_test.cpp)		add_libc_multi_impl_test(bzero SRCS bzero_test.cpp)
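
The link between `add_implementation` and the test loop is the `${name}_implementations` global property. The following sketch, runnable with `cmake -P`, shows only that registry pattern. `register_implementation` is a hypothetical helper: the real `add_implementation` also declares an entrypoint object and attaches `REQUIRE_CPU_FEATURES` to it, which is left out here because script mode has no targets.

```cmake
cmake_minimum_required(VERSION 3.13)

# Hypothetical helper: records an implementation name in a global list.
function(register_implementation name impl_name)
  set_property(GLOBAL APPEND PROPERTY "${name}_implementations" "${impl_name}")
endfunction()

register_implementation(memcpy memcpy_x86_64_opt_sse2)
register_implementation(memcpy memcpy_opt_host)
register_implementation(memcpy memcpy)

# The test side reads the list back and visits every registered entry.
get_property(implementations GLOBAL PROPERTY "memcpy_implementations")
foreach(impl IN LISTS implementations)
  message(STATUS "would declare unit test ${impl}_test")
endforeach()
```

Keeping the list in a global property means the test directory does not need to know which variants exist: any implementation declared in libc/src/string is picked up by the loop.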
