This is an archive of the discontinued LLVM Phabricator instance.

[X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute
ClosedPublic

Authored by MaskRay on Aug 18 2023, 8:56 PM.

Details

Summary

GCC 12 (https://gcc.gnu.org/PR101696) allows arch=x86-64
arch=x86-64-v2 arch=x86-64-v3 arch=x86-64-v4 in the
target_clones function attribute. This patch ports the feature.

  • Set KeyFeature to x86-64{,-v2,-v3,-v4} in Processors[], to be used by X86TargetInfo::multiVersionSortPriority
  • builtins: change __cpu_features2 to an array like libgcc. Define FEATURE_X86_64_{BASELINE,V2,V3,V4} and depended ISA feature bits.
  • CGBuiltin.cpp: update EmitX86CpuSupports to handle arch=x86-64*.

Close https://github.com/llvm/llvm-project/issues/55830

Diff Detail

Event Timeline

MaskRay created this revision.Aug 18 2023, 8:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2023, 8:56 PM
MaskRay requested review of this revision.Aug 18 2023, 8:56 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptAug 18 2023, 8:56 PM
Herald added subscribers: llvm-commits, Restricted Project, cfe-commits. · View Herald Transcript
MaskRay retitled this revision from [X86] Support arch=x86-64{,-v2,-v3,-v4} for target/target_clones attributes to [X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute.Aug 18 2023, 8:59 PM
MaskRay edited the summary of this revision. (Show Details)
MaskRay added reviewers: erichkeane, Restricted Project.Aug 18 2023, 9:01 PM
erichkeane added inline comments.Aug 18 2023, 9:15 PM
clang/lib/CodeGen/CGBuiltin.cpp
13334

Can this be a std::array instead? The C array is pretty meaningless here.

13371

Why is this in the loop? Is there a good reason to not just always initialize it?

clang/lib/CodeGen/CodeGenFunction.cpp
2688

Can you clarify the comments here on these? I'm a little confused by their meaning.

pengfei added inline comments.Aug 19 2023, 2:49 AM
clang/lib/CodeGen/CGBuiltin.cpp
13329

Use Lo_32(Mask), Hi_32(Mask)?

13375–13377

Do we have problem when linking with old version of libgcc?

compiler-rt/lib/builtins/cpu_model.c
810

Can we reuse bit defination in cpuid.h, e.g., if (ECX & bit_LAHF_LM)? The same below.

810

Can we reuse bit defination in cpuid.h, e.g., if (ECX & bit_LAHF_LM)? The same below.

MaskRay updated this revision to Diff 551773.Aug 19 2023, 10:53 AM
MaskRay marked 6 inline comments as done.

address comments

clang/lib/CodeGen/CGBuiltin.cpp
13329

Lo_32/Hi_32 are currently unused in clang/. Using this needs MathExtras.h. I am unsure it's worthwhile..

13375–13377

If we don't use __cpu_features2[1..3], no problem with older libgcc.

The way GCC upgraded __cpu_features2 to an array is compatible with scalar __cpu_features2.

compiler-rt/lib/builtins/cpu_model.c
810

Thanks for the cpuid.h idea, but I think likely no. MSVC does not provide cpuid.h. It would be nice if we don't allow MSVC to build the file (we can say that LLVM_ENABLE_PROJECTS='...compiler-rt' is not allowed and LLVM_ENABLE_RUNTIMES or Clang should be used, but I think we are not there yet).

Unfortunately I am also unsure whether anyone uses MSVC to build this file, but there are Windows related macros like FIXME: For MSVC, we should make a function pointer global in in this file...

Cant help with the RT/LLVM parts, but the clang parts are pretty much right.

clang/lib/CodeGen/CGBuiltin.cpp
13329

I was previously using Lo_32/Hi_32 here (see the removed parts of EmitX86CpuSupports). Were we using those transitively?

13334

Hmm... I guess size-wise this is on the edge of "const ref vs pass by value". I think its fine now, but 'next time' this grows we'll have to think about making this a const-ref.

13367

This won't double-create this if used more than 1x, right? There doesn't need to be something like GetOrCreate... here?

MaskRay updated this revision to Diff 551787.Aug 19 2023, 2:22 PM
MaskRay marked 4 inline comments as done.

Use Lo_32/Hi_32

clang/lib/CodeGen/CGBuiltin.cpp
13334

Yes. Right now passing 2 registers on a 64-bit target is more efficient.
'next time' may be quite a while from now:)

13367

CodeGenModule::CreateRuntimeVariable calls GetOrCreate internally. It's fine to be called multiple times.

pengfei added inline comments.Aug 20 2023, 5:08 PM
clang/lib/CodeGen/CGBuiltin.cpp
13375–13377

I assume GCC seldom links to older libgcc, but there's no guarantee Clang can link to latest libgcc.
And we cannot assume user won't use it once the method provided.
I don't have a good idea for it, but I think we should write the requirement done in release note or somewhere if we cannot find a better way.

FreddyYe added inline comments.Aug 20 2023, 9:48 PM
compiler-rt/lib/builtins/cpu_model.c
167

= 57 redundant

MaskRay added inline comments.Aug 20 2023, 10:21 PM
clang/lib/CodeGen/CGBuiltin.cpp
13375–13377

The way we treat __cpu_features2 as an array is compatible with older libgcc providing scalar __cpu_features2, as long as we don't access the high bits (FEATURE_X86_64_{BASELINE,V2,V3,V4}).

  • When arch=x86-64 is used, this patch changes a compiler crash to (if newer libgcc, work; if older libgcc, runtime crash)
  • When arch=x86-64* is unused, the same as before.
compiler-rt/lib/builtins/cpu_model.c
167

Thanks. Minor issue, I've fixed it locally and will only upload a new diff if other things need changing.

MaskRay marked an inline comment as done.Aug 20 2023, 10:23 PM
MaskRay added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
13375–13377

I think a release note isn't really necessary as the compatibility story with this patch is strictly better than the current state.

FreddyYe added inline comments.Aug 21 2023, 12:43 AM
clang/lib/CodeGen/CGBuiltin.cpp
13375

Will function multi version also be fixed in this patch? https://gcc.godbolt.org/z/cafhs9qbG If so, need to add test in clang/test/CodeGen/attr-target-mv.c

FreddyYe added inline comments.Aug 21 2023, 12:45 AM
clang/lib/CodeGen/CGBuiltin.cpp
13373

Don't quite understand why we need to change from a i32* into <4 x i32>* and the first element here to be always zero.

MaskRay updated this revision to Diff 552078.Aug 21 2023, 10:42 AM
MaskRay marked 3 inline comments as done.

<4 x i32> => <3 x i32> (though no difference at runtime)
remove a redundant = 57

clang/lib/CodeGen/CGBuiltin.cpp
13373

Sorry, this global variable (array) should be <3 x i32>. Fixed.

This (strange) scheme is for GCC/libgcc compatibility: bits 0~31 are in __cpu_model.__cpufeatures[0] while bits 32~max are in __cpu_features2[0..2]

13375

The target attribute has strange semantics for overloading:

int __attribute__((target("arch=skylake"))) foo(void) {return 0;}
int __attribute__((target("arch=x86-64"))) foo(void) {return 1;}

is not allowed in C mode of GCC.

I think such use cases are not recommended in C++.

If we don't use overloading, int __attribute__((target("arch=x86-64"))) foo(void) {return 1;} is supported by Clang today and this patch does not intend to change anything about it.

FreddyYe added inline comments.Aug 22 2023, 12:06 AM
clang/lib/CodeGen/CGBuiltin.cpp
13375

I think behavior for target attribute is not only overloading but also function multiversioning for redefined functions. And seems like C model of gcc haven't supported is due to it will target C23 standard: https://clang.llvm.org/docs/AttributeReference.html#target
Comparing to gcc, clang misses not only target attribute but also other cpuname/feature related features for "x86-64*". See https://gcc.godbolt.org/z/arhne35GG (Seems gcc defined x86-64* as not only cpu name but also feature name.) Anyway, this patch is targeting for target_clones attribute only. So no problem here.

MaskRay added inline comments.Aug 22 2023, 12:16 AM
clang/lib/CodeGen/CGBuiltin.cpp
13375

So no problem here.

Right. Not supporting C and requiring duplicate definitions makes target not really useful.
https://gcc.gnu.org/pipermail/gcc-patches/2015-October/430585.html Jeff Law said: "I wasn't aware that multi-versioning was only implemented for C++, that seems fairly lame. I hope I didn't approve that :-)"

Matt added a subscriber: Matt.Aug 22 2023, 9:35 PM

(I think there is no action item on my side and this patch is good for review)

pengfei accepted this revision.Aug 23 2023, 9:34 PM

LGTM.

This revision is now accepted and ready to land.Aug 23 2023, 9:34 PM
This revision was landed with ongoing or failed builds.Aug 23 2023, 10:09 PM
This revision was automatically updated to reflect the committed changes.