This is an archive of the discontinued LLVM Phabricator instance.

[X86] Support -march=rocketlake
ClosedPublic

Authored by FreddyYe on Apr 7 2021, 8:28 PM.

Diff Detail

Event Timeline

FreddyYe created this revision.Apr 7 2021, 8:28 PM
FreddyYe requested review of this revision.Apr 7 2021, 8:28 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptApr 7 2021, 8:28 PM
Herald added subscribers: llvm-commits, Restricted Project, cfe-commits. · View Herald Transcript
MaskRay added a subscriber: MaskRay.Apr 7 2021, 8:57 PM
MaskRay added inline comments.
clang/test/Preprocessor/predefined-arch-macros.c
1404

The file may need some refactoring first. You can let RUN lines share some common check prefixes, instead of adding a bunch of defines for every new processor.

// CHECK_X86_64_V2: ...
// CHECK_X86_64_V2: ...
// CHECK_X86_64_V3: ...
// CHECK_PROCESSOR1_M32:
// CHECK_PROCESSOR1_M64:
// CHECK_PROCESSOR2_M32:
// CHECK_PROCESSOR2_M64:
craig.topper added inline comments.Apr 7 2021, 10:00 PM
compiler-rt/lib/builtins/cpu_model.c
101

This order is defined by libgcc. We can't insert in the middle unless ZNVER3 was in the wrong place

Why this not referenced in the switch the select subtype?

llvm/lib/Target/X86/X86.td
767

Is this list this long because SKL includes SGX but RKL doesn't?

THX for review!

clang/test/Preprocessor/predefined-arch-macros.c
1404

I agree. I'll do it

compiler-rt/lib/builtins/cpu_model.c
101

This is a mistake. I'll modify. And reference is missing in two switch. I'll add.

llvm/lib/Target/X86/X86.td
767

Yes. And I don't know any simple ways to exclude SGX here, any suggestions?

FreddyYe updated this revision to Diff 336007.Apr 7 2021, 11:00 PM

Updating according to comments. Test refactoring not done.

Hi @MaskRay, I tried to refactor, but met some difficulties. Since these defines are dictionary ordered, a new #define may insert into a common CHECK. So it is difficult to let different RUN share common CHECKs.

// RUN: %clang -march=pentium-mmx -m32 -E -dM %s -o - 2>&1 \
// RUN:     -target i386-unknown-linux \
// RUN:   | FileCheck -match-full-lines %s -check-prefixes=CHECK_I386_M32,CHECK_I586_M32,CHECK_PENTIUM_MMX_M32

// CHECK_I386_M32: #define __i386 1
// CHECK_I386_M32: #define __i386__ 1
// CHECK_I386_M32: #define i386 1

// CHECK_I586_M32: #define __i586 1
// CHECK_I586_M32: #define __i586__ 1
// CHECK_I586_M32: #define __pentium 1
// CHECK_I586_M32: #define __pentium__ 1
// CHECK_I586_M32: #define __tune_i586__ 1
// CHECK_I586_M32: #define __tune_pentium__ 1

// CHECK_PENTIUM_MMX_M32: #define __MMX__ 1
// CHECK_PENTIUM_MMX_M32: #define __pentium_mmx__ 1
// CHECK_PENTIUM_MMX_M32: #define __tune_pentium_mmx__ 1

The example above will destroy the original order of the define list. Do you have some good suggestions?

Hi @MaskRay, I tried to refactor, but met some difficulties. Since these defines are dictionary ordered, a new #define may insert into a common CHECK. So it is difficult to let different RUN share common CHECKs.

Check prefixes of the same kind do not need to be contiguous.

// A:
// B:
// A:

Hi @MaskRay, I tried to refactor, but met some difficulties. Since these defines are dictionary ordered, a new #define may insert into a common CHECK. So it is difficult to let different RUN share common CHECKs.

Check prefixes of the same kind do not need to be contiguous.

// A:
// B:
// A:

I see. And It works. And I'll update. And thank you!

RKSimon added a subscriber: RKSimon.Apr 8 2021, 1:59 AM
RKSimon added inline comments.
llvm/lib/Target/X86/X86.td
790

Using ICLTuning suggests we should still be avoiding 512-bit ops (FeaturePrefer256Bit) - is this still true for RKL (or anything past CNL...)?

I posted PR48336 but never got any response, but from what others have reported (Travis Downs, Phoronix etc) its mainly a power issue these days, not a perf issue due to big freq drops.

FreddyYe updated this revision to Diff 336133.Apr 8 2021, 8:44 AM

update lit test and clang-format

Hi @MaskRay , I tried to refactor the test file by assembling the common CHECKs. But I found it will lead to too many check-prefixes ine one RUN line. For example,

// RUN: %clang -march=c3 -m32 -E -dM %s -o - 2>&1 \
// RUN:     -target i386-unknown-linux \
// RUN:   | FileCheck -match-full-lines %s -check-prefixes=CHECK_I386_M32C,CHECK_I486_M32C,CHECK_I486_M32S,CHECK_PENTIUM_MMX_M32S,CHECK_WINCHIP2_M32S

// CHECK_WINCHIP2_M32S: #define __3dNOW__ 1
// CHECK_PENTIUM_MMX_M32S:    #define __MMX__ 1
// CHECK_WINCHIP_C6_M32S: #define __MMX__ 1
// CHECK_I386_M32C:             #define __i386 1
// CHECK_I386_M32C:             #define __i386__ 1
// CHECK_I486_M32C:             #define __i486 1
// CHECK_I486_M32C:             #define __i486__ 1
// CHECK_I586_M32C:             #define __i586 1
// CHECK_I586_M32C:             #define __i586__ 1
// CHECK_I586_M32C:             #define __pentium 1
// CHECK_I586_M32C:             #define __pentium__ 1
// CHECK_PENTIUM_MMX_M32S:      #define __pentium_mmx__ 1
// CHECK_I386_M32S:         #define __tune_i386__ 1
// CHECK_I486_M32S:         #define __tune_i486__ 1
// CHECK_PENTIUM_MMX_M32S:      #define __tune_pentium_mmx__ 1
// CHECK_I386_M32C:             #define i386 1

And that CPU is still a very old CPU. It is easy to imagine how long for example skylake's RUN line is. Generally speaking, X86's ISA evolution between different chips is not very clear. So this refactoring work is beyond my ETA. I uploaded a new version to reuse the most similar CHECKS. Does that look good to you?

craig.topper added inline comments.Apr 8 2021, 10:20 AM
llvm/lib/Target/X86/X86.td
767

Nothing pretty. Guess it depends on if SGX is going to not appear in more future CPUs or if this is a one off case. If it's going to continue then we could remove it from the inheritance and just give it to SKL, ICL, CNL, etc. individually.

Or we could just not default SGX on for any CPU. It's probably not all that useful in the backend anyway. Clang will put it in the target-feature attribute anyway. Having it in the backend feature lists doesn't really do anything since I don't think we have any IR intrinsics for SGX.

FreddyYe added inline comments.Apr 8 2021, 6:58 PM
llvm/lib/Target/X86/X86.td
790

We need more tests on such as SPEC to see whether we can default enable FeaturePrefer512bit.

FreddyYe added inline comments.Apr 8 2021, 7:05 PM
llvm/lib/Target/X86/X86.td
767

Agree. Like we did in https://reviews.llvm.org/D88006. SGX is also not useful in the backend.

FreddyYe updated this revision to Diff 336295.Apr 8 2021, 7:24 PM

delete FeatureSGX in the backend since there are no IR intrinsics for SGX.

skan added inline comments.Apr 8 2021, 7:37 PM
llvm/lib/Support/X86TargetParser.cpp
414–415

It's not correct to format here in this patch and do not mix tab with space.

craig.topper added inline comments.Apr 8 2021, 7:46 PM
llvm/lib/Target/X86/X86.td
271

Clang still puts it in target-features attribute so you can’t delete this or you’ll get a warning that the feature doesn’t exist.

skan added inline comments.Apr 8 2021, 7:49 PM
llvm/lib/Target/X86/X86.td
271–273

If you delete the definition of FeatureSGX, you need to remove the related code in X86Subtarget.h too. BTW, I don't think "there are no IR intrinsics for a feature" is a good reason to remove a feature.

craig.topper added inline comments.Apr 8 2021, 7:58 PM
llvm/lib/Target/X86/X86.td
271–273

I only said to remove it from the CPUs because for llc -march=skylake it doesn’t matter if we enable SGX because there’s nothing you can test from llc.

FreddyYe updated this revision to Diff 336301.Apr 8 2021, 8:01 PM

cancel the clang-format in constexpr ProcInfo Processors[] = {}

FreddyYe updated this revision to Diff 336303.Apr 8 2021, 8:19 PM

revert clang-format and revert deleting FeatureSGX def.

Hi @craig.topper and @skan , THX for review! I tested that deleting sgx indeed leads to not generating "+sgx" in 'target-features', didn't know before:)

'+sgx' is not a recognized feature for this target (ignoring feature)
craig.topper added inline comments.Apr 9 2021, 9:48 AM
llvm/lib/Target/X86/X86.td
741–742

I'm not sure that rocketlake has CLWB. Can you double check that? It's not listed in the cpuinfo dump on the 11700K that I found with a google search here https://www.pugetsystems.com/labs/hpc/Intel-Rocket-Lake-Compute-Performance-Results-HPL-HPCG-NAMD-and-Numpy-2116/

FreddyYe added inline comments.Apr 11 2021, 2:49 AM
llvm/lib/Target/X86/X86.td
741–742

For now I have only an icelake-client machine and found that CLWB is not there, too. Guess I can do that modification in this patch? Rocketlake may probably lose CLWB. I'll double check.

craig.topper added inline comments.Apr 11 2021, 9:15 AM
llvm/lib/Target/X86/X86.td
741–742

What’s the model number for you ice lake client CPU?

FreddyYe marked an inline comment as done.Apr 11 2021, 6:50 PM
FreddyYe added inline comments.
llvm/lib/Target/X86/X86.td
741–742

It is 0x7e. And I've double checked that rkl also hasn't CLWB.

craig.topper added inline comments.Apr 11 2021, 6:58 PM
llvm/lib/Target/X86/X86.td
741–742

Sorry I meant the marketing name like "Intel® Core™ i7-1065G7"

FreddyYe marked an inline comment as done.Apr 11 2021, 7:14 PM
FreddyYe added inline comments.
llvm/lib/Target/X86/X86.td
741–742

It is Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz

Thanks. I also found this https://github.com/gcc-mirror/gcc/commit/c422e5f81f42a0fc197f0715f4fcd81f1be90bff can you create a new patch to do the same for llvm/clang and rebase this patch on top of it.

Thanks. I also found this https://github.com/gcc-mirror/gcc/commit/c422e5f81f42a0fc197f0715f4fcd81f1be90bff can you create a new patch to do the same for llvm/clang and rebase this patch on top of it.

OK, I'll create it.

Hi @MaskRay, @craig.topper, @skan, reviewers, I've addressed your comments. Any more concerns?

This revision is now accepted and ready to land.Apr 12 2021, 9:55 AM
MaskRay accepted this revision.Apr 12 2021, 3:44 PM
skan added inline comments.Apr 12 2021, 6:06 PM
llvm/lib/Support/X86TargetParser.cpp
176

Shouldn't the FeatureSGX be removed here?

180

Remove ~FeatureSGX here?

197

Remove ~FeatureSGX here?

craig.topper added inline comments.Apr 12 2021, 6:07 PM
llvm/lib/Support/X86TargetParser.cpp
176

That would change the frontend behavior which would require a wider discussion. My suggestion was only to change the backend behavior since there was nothing testable with llc anyway.

skan accepted this revision.Apr 12 2021, 6:09 PM
skan added inline comments.
llvm/lib/Support/X86TargetParser.cpp
176

Okay, it makes sense to me.

This revision was landed with ongoing or failed builds.Apr 12 2021, 6:48 PM
This revision was automatically updated to reflect the committed changes.