This is an archive of the discontinued LLVM Phabricator instance.

[cpu-detection] Substantial refactor of Host CPU detection code (x86)
ClosedPublic

Authored by asbirlea on Jun 3 2016, 2:34 PM.

Details

Summary

Following D20970 (committed as r271726).
This is a substantial refactoring of the host CPU detection code.

There is no functionality change intended, but the changes are extensive.

Definitions of architecture types and subtypes are by no means exhaustive or
perfectly defined, but a fair starting point.
Suggestions for futher improvements are welcome.

Diff Detail

Repository
rL LLVM

Event Timeline

asbirlea updated this revision to Diff 59626.Jun 3 2016, 2:34 PM
asbirlea retitled this revision from to [cpu-detection] Substantial refactor of Host CPU detection code (x86).
asbirlea updated this object.
asbirlea added a reviewer: llvm-commits.
echristo added a subscriber: echristo.

Adding Craig. In general, I'm OK for now, let's get his opinion on this.

-eric

echristo accepted this revision.Jun 3 2016, 2:42 PM
echristo added a reviewer: echristo.

I'm going to accept this on the "this is much cleaner than what we've got and going in the direction of what we want for compiler-rt as well". Let's go ahead and wait for Craig too.

Thanks!

-eric

This revision is now accepted and ready to land.Jun 3 2016, 2:42 PM

Sure. Waiting for Craig's feedback. I'll wait for additional input, at the very least until Monday.

craig.topper edited edge metadata.Jun 3 2016, 2:57 PM

Overall I think is fine, but I'm a little unsure about what the CPU type and subtype split is trying to accomplish. Particularly the COREI7 name seems confusing since i7 refers to the higher end Intel parts, but there are also i3/i5. And skylake-avx512 is actually a Xeon part and not marketed as an i7 at all.

A clarification: the end goal of this refactoring is to address Bug 25510, adding __cpu_model to compiler_rt.
The split for types and subtypes is intended to facilitate that, similar to the split libgcc uses.
That one has a pretty limited set of archs, hence new ones were added here. In practice, I'm sure I did not do the best job with the split (and it was not the primary focus either), so suggestions for moving/renaming types/subtypes are welcome.

joerg added a subscriber: joerg.Jun 3 2016, 3:29 PM

Do we really want to have this in compiler-rt? It seems to me like the kind of code that is going to be a huge maintenance hassle, especially for Operating Systems that want to support code for longer than a couple of month. Just like the corresponding feature set builtins with string arguments, this seems to me like a horrible API design for a library that is potentially embedded in every binary. The suggested use cases in the GCC documentation don't feel very promising to me either.

As I mentioned on IRC, I can understand wanting to easily test for feature bits, but that's much better provided by having a __builtin_cpuid_feature(n) helper and just using the stable and documented bitmasks directly. It would be useful to know if people are just using this kind of checks in initialization code to select dispatcher functions (or the equivalent ifunc resolvers ) or whether they are used in actual hot code. Depending on that, caching the cpuid result is import or not. Without a need to cache cpuid results, __builtin_cpu_supports can be expanded completely inline without any library support. If there is a certain desire for efficiency here, being able to merge multiple feature tests could be helpful as well. In any case, I don't believe that forcing string matching into an API for something that is using bitmasks or enumerations on the hardware side anyway is just silly. Doing the string translations completely on the LLVM/Clang side would at least stop that maintenance hassle.

Yes, we really do want this in compiler-rt. We support the builtins via clang and there's no reason not to have the runtime support.

I understand you have some reservations about using strings rather than enums in the intrinsics, but that seems like a bit of a bikeshed to the general support for querying (and caching in the compiler-rt side) cpu id features. If we need to design a new interface we can do that separately and do what we need there while still supporting the old interfaces.

Also, ultimately, this side of things is still just a nice cleanup for the existing code in Host.cpp and so I think it's a worthwhile patch for now.

Thanks.

-eric

I'll go ahead and check this in for now, as a good clean-up for the host detection code. Further improvements are welcome!

The builtins are already supported in clang; it makes sense to add the runtime support.
While I tend to agree that a new interface would be nice, I think for the time being the old interface should be supported. I'll add you on the next patch to compiler-rt to keep a log of this; perhaps it encourages those interested in designing a new interface.

This revision was automatically updated to reflect the committed changes.
sanjoy added a subscriber: sanjoy.Jun 8 2016, 4:41 PM
sanjoy added inline comments.
llvm/trunk/lib/Support/Host.cpp
659

I think there is a bug here. The break; will only break out of the inner switch, and for an unknown model of "amdfam10", we'll set the type to AMDFAM14H.

asbirlea added inline comments.Jun 8 2016, 5:08 PM
llvm/trunk/lib/Support/Host.cpp
659

Agreed.
Created: http://reviews.llvm.org/D21158
Let me know if replacing break with return is preferred.