This is an archive of the discontinued LLVM Phabricator instance.

Add runtime support for __cpu_model (__builtin_cpu_supports)
ClosedPublic

Authored by asbirlea on Jun 6 2016, 12:09 PM.

Details

Summary

This aims to add support for cpu_model and address Bug 25510. It uses the code from lib/Support/Host.cpp for cpu detection, and creates cpu_model with that info.

Tested on OSX, it builts successfully, but the current version does *not* resolve Bug 25510. The __cpu_model symbol is present in the library but it only gets loaded with -all_load. This patch will not land until this issue is clarified.

Built on Linux as well (though libgcc is the default). The use of "asm" required -std=gnu99, hence the cmake change. Corrections on better addressing this are welcome.

Note: See additional comments on D20988 (committed as r271921).

Diff Detail

Event Timeline

asbirlea updated this revision to Diff 59758.Jun 6 2016, 12:09 PM
asbirlea retitled this revision from to Add runtime support for __cpu_model (__builtin_cpu_supports).
asbirlea updated this object.
echristo edited edge metadata.Jun 6 2016, 3:52 PM

Please go ahead and clang-format the file.

Reviewing the rest now.

-eric

asbirlea updated this revision to Diff 59801.Jun 6 2016, 4:05 PM
asbirlea edited edge metadata.
  • clang-format it
mehdi_amini added inline comments.Jun 6 2016, 4:26 PM
lib/builtins/cpu_model.c
710

This definition (which should probably be documented as (with the previous declaration) *is* the public API of this file) is a "common" (tentative definition in C).
On MachO, these are not part of the symbol table (I had to pull my "Joker card" Nick Kledzik to figure) in the static archive by default.

I can see as solution:

  • make sure we run ranlib -c on the archive to include the common in the symbol table
  • make the file a C++ file (there is no "tentative definition" and as such "common" in C++).
  • build with -fno-common
  • Define the symbol with an explicit initialization: } __cpu_model = { 0, 0, 0, { 0 } };

(I may miss other possible solutions)

Huh, I learnt something new (*wishing this info was added to man nm*). Thanks, Mehdi!
Testing with the explicit initialization, since it seems like the simplest solution. If a different solution is preferred, please let me know.

Aha. Knew I was missing something... the lack of being in the symbol table was it :)

Thanks Mehdi.

I think an explicit instantiation is probably easier than the rest of them - though running ranlib -c isn't a bad idea either.

-eric

asbirlea updated this revision to Diff 59919.Jun 7 2016, 11:30 AM
  • Add default initialization. Add unit test. Note: Lit support is not set up for the compiler-rt's builtins.
joerg edited edge metadata.Jun 7 2016, 11:49 AM

I really, really strongly wants to go back to the drawing board with this, including the way the GCC builtin is lowered in first place. There are a number of very big concerns I have and I don't think the current implementation fits:

(1) No constructors in builtins. They introduce fundamental issues for a number of use cases of compiler-rt, like pretty much any self-hosting environment. There is no need either -- for __builtin_cpu_supports, only a few global words are needed and the content is idempotent. That means checking if a flag is set and replacing the content is perfectly fine and comes with minimal overhead.

(2) For the purpose of __builtin_cpu_supports, only the flags are needed. That means 90% of the code complexity here is unnecessary.

(3) No string interfaces, please. IMO the semantic contract for __builtin_cpu_supports should require a string constant as argument, variables should be rejected. The target lowering should then replace the string with the corresponding feature mask. The basic goal here is to avoid having to maintain string->bitmask tables in compiler-rt. I don't want to have to tell people that they need to update central libraries just because someone decided to add a new flag to the list. That approach doesn't scale and the problem of having multiple libgcc_s versions around should be a good indicator of why to avoid this kind of problems.

(4) For the latter work of identifying the CPU, I'm included to make similar requirements. I don't think it belongs into a low-level library like compiler-rt at all, but even then the low-level code should at most expose vendor + cpu family counter. I most explicitly don't want to have an enumeration of all known Intel and AMD CPUs as string table in compiler-rt.

I really, really strongly wants to go back to the drawing board with this, including the way the GCC builtin is lowered in first place. There are a number of very big concerns I have and I don't think the current implementation fits:

(1) No constructors in builtins. They introduce fundamental issues for a number of use cases of compiler-rt, like pretty much any self-hosting environment. There is no need either -- for __builtin_cpu_supports, only a few global words are needed and the content is idempotent. That means checking if a flag is set and replacing the content is perfectly fine and comes with minimal overhead.

Could you explain more? Also talk about how this works with ifunc resolution and uses of builtin_cpu_is and builtin_cpu_supports in ifunc resolver functions.

(2) For the purpose of __builtin_cpu_supports, only the flags are needed. That means 90% of the code complexity here is unnecessary.

This isn't just for __builtin_cpu_supports. That's just what Alina needed here.

(3) No string interfaces, please. IMO the semantic contract for __builtin_cpu_supports should require a string constant as argument, variables should be rejected. The target lowering should then replace the string with the corresponding feature mask. The basic goal here is to avoid having to maintain string->bitmask tables in compiler-rt. I don't want to have to tell people that they need to update central libraries just because someone decided to add a new flag to the list. That approach doesn't scale and the problem of having multiple libgcc_s versions around should be a good indicator of why to avoid this kind of problems.

(4) For the latter work of identifying the CPU, I'm included to make similar requirements. I don't think it belongs into a low-level library like compiler-rt at all, but even then the low-level code should at most expose vendor + cpu family counter. I most explicitly don't want to have an enumeration of all known Intel and AMD CPUs as string table in compiler-rt.

This is talking about a new interface. While I'm somewhat sympathetic to your goals, having a compiler-rt that's a drop in replacement for libgcc unfortunately needs this feature set to be implemented. Also see arguments against implementing ifunc in the first place versus just a (possibly hidden) global that's set at program start time.

-eric

If I understand correctly, your suggestion is having correct behavior
for builtin_cpu_supports
(i.e. setting the flags) and undefined behavior (or return 0 / always
generic) for
builtin_cpu_is (i.e. not setting cpu type and subtype),
because of the lack of uses for the latter, code bloat, hardship to
maintain/add new CPU generations. Is this right?

joerg added inline comments.Jun 9 2016, 8:20 AM
lib/builtins/cpu_model.c
729

This assumption will bit us, BTW. We've bug reports for OpenSSL crashing not too long ago on CPUs without CPUID support. At least for i386, presence definitely must be checked first via the corresponding eflags bit.

echristo added inline comments.Jun 9 2016, 2:39 PM
lib/builtins/cpu_model.c
729

Yeesh. You should probably update the code in Host.cpp then :)

asbirlea added inline comments.Jun 15 2016, 1:36 PM
lib/builtins/cpu_model.c
729

I'm not familiar with what flags need to be checked. Could you elaborate or update Host.cpp to resolve this?

joerg added inline comments.Jun 15 2016, 1:51 PM
lib/builtins/cpu_model.c
729

Take a look at clang's cpuid.h and __get_cpuid_max. I wonder if we can/should assume the presence of that header. What's the Windows situation here?

asbirlea added inline comments.Jun 15 2016, 2:29 PM
lib/builtins/cpu_model.c
729

Is this roughly what you had in mind?

// insert @:730; also include cpuid.h
+ MaxLeaf = __get_cpuid_max(0, &Vendor);
+ if(!MaxLeaf)
+ return -1;

Alternatively just copy the check ifdefed by i386?
Both options will work from preliminary builds.

The plan was to support Windows as well. I'm not clear if this will still be the case.

joerg added inline comments.Jun 15 2016, 2:36 PM
lib/builtins/cpu_model.c
729

Yeah, copying the ifdef is likely the best approach. Depending on the header was more a question for Host.cpp.

asbirlea updated this revision to Diff 60917.Jun 15 2016, 3:02 PM
asbirlea edited edge metadata.

Check cpuid supported for i386.

Note: I can add the same check in Host.cpp in a later patch.

Pinging patch.

Weekly ping. As clang supports builtins, I believe it's worth having the same runtime support as libgcc.

echristo accepted this revision.Jul 1 2016, 4:44 PM
echristo edited edge metadata.

I agree that we should add the support here. I understand Joerg's wish for a better API, but for drop in compatibility with libgcc we should still define these and perhaps conditionally compile them if we get a better library interface in the future (i.e. compatibility or not).

Thanks a lot for doing this!

-eric

ps One inline request before you commit, thanks. :)

lib/builtins/cpu_model.c
123

Please add a block comment above this that spells out that we took it from lib/Support/Host.cpp.

This revision is now accepted and ready to land.Jul 1 2016, 4:44 PM
asbirlea updated this revision to Diff 62571.Jul 1 2016, 4:49 PM
asbirlea edited edge metadata.

clang-format (minor)

asbirlea updated this revision to Diff 62574.Jul 1 2016, 4:56 PM

Added comments identifing code source

This revision was automatically updated to reflect the committed changes.