Tested on MacOSX and Linux.
For robustness we can go the OpenCV way and add individual c++ files with intrinsics.
https://github.com/opencv/opencv/blob/master/cmake/checks/cpu_avx2.cpp
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 46912 Build 49599: arc lint + arc unit
Event Timeline
There are slight variations in how the cpu features are encoded for instance OSX has AVX1.0 where Linux has AVX.
So I believe we need something more robust.
For robustness we can go the OpenCV way and add individual c++ files with intrinsics.
https://github.com/opencv/opencv/blob/master/cmake/checks/cpu_avx2.cpp
I think this makes the most sense. It's also what cmake does in the initial configuration step where it creates small source files and sees if they compile. Do all of the cpu features that you are interested in have related intrinsics? Or will we need to assemble and run files to see if the cpu raises an invalid opcode exception?
Yes, this is what I was hinting at when I proposed the try_run/try_compile solution.
I am OK with this change however if you think it will help you make progress with other things but may be put this in a x86 specific location.
libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake | ||
---|---|---|
11 | If OSX requires more massaging of the results, then may be just not include it here? | |
21 | Apparently, cpuinfo on powerpc does not give the flags! |
libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake | ||
---|---|---|
2 | I had a hard time jumping through the various pieces here. WDYT of this suggestion: Populate ALL_CPU_FEATURES first up as: if(${LIBC_TARGET_MACHINE} MATCHES "x86|x86_64") set(ALL_CPU_FEATURES SSE SSE2 AVX AVX512) endif() Then, you can have just one function called check_cpu_features which populates HOST_CPU_FEATURES as a return value. You choose to use a helper function like define_compiler_options_for_feature: function(define_compiler_options_for_feature feature) if(${CMAKE_CXX_COMPILER_ID} MATCHES "Clang") string(TOLOWER ${feature} lowercase_feature) set(CPU_FEATURE_${name}_ENABLE_FLAG "-m${lowercase_feature}" PARENT_SCOPE) set(CPU_FEATURE_${name}_DISABLE_FLAG "-mno-${lowercase_feature}" PARENT_SCOPE) else() # In future, we can extend for other compilers. message(FATAL "Unkown compiler ${CMAKE_CXX_COMPILER_ID}.") endif() endfunction() | |
23 | We shouldn't need this block. | |
30 | Does it need to be set here? | |
52 | We can push this foreach block into the above function and instead have a call to the function without any args? If you want to split, then may be call the above function check_single_cpu_feature which gives a yes/no answer, and the function check_cpu_features will call this single feature function and populate AVAILABLE_CPU_FEATURES as a return value. When cross-compiling, one will have to set AVAILABLE_CPU_FEATURES manually. WDYT? |
The overall approach is solid. I left a few comments related to future extensions/scalability.
Please note that this is not the final solution since it assumes a linear hierarchy of the features. This is the case for X86 but not true for other architectures like ARM.
I need to think about it a little more but we can start from here.
This has a better top-down flow. I left a few minor comments inline but accepting as I agree with you that we can iterate as required.
libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake | ||
---|---|---|
17 | Do we really need this function? We only iterate over the ALL_CPU_FEATURES list which is a literal list defined in this file. | |
45 | If you want these flags available in global scope, then may be prefix them with CPU_FEATURE_OPT_<FEATURE>_FLAGS? | |
89 | Like above, do we need this call? |
Rebased and addressed comments.
libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake | ||
---|---|---|
17 | Removed. |
I had a hard time jumping through the various pieces here. WDYT of this suggestion:
Populate ALL_CPU_FEATURES first up as:
Then, you can have just one function called check_cpu_features which populates HOST_CPU_FEATURES as a return value. You choose to use a helper function like define_compiler_options_for_feature: