This is an archive of the discontinued LLVM Phabricator instance.

[Draft] Make __builtin_cpu builtins target-independent
Needs ReviewPublic

Authored by nemanjai on Jun 14 2023, 6:45 AM.

Details

Reviewers
RKSimon
pengfei
arsenm
t.p.northover
Group Reviewers
Restricted Project
Summary

This is just a proposal patch for a possible direction we can extend these builtins to other targets as well as an implementation for PowerPC.

This adds a new instruction similar to LOAD_STACK_GUARD that is meant to load things from fixed addresses. The new instruction has semantics that are a superset of what LOAD_STACK_GUARD does so the two can be folded into one.

My hope with this patch is to collect feedback from the community about the direction, so please provide any feedback you may have.

Diff Detail

Event Timeline

nemanjai created this revision.Jun 14 2023, 6:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2023, 6:45 AM
nemanjai requested review of this revision.Jun 14 2023, 6:45 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 14 2023, 6:45 AM
nemanjai added inline comments.Jun 14 2023, 6:47 AM
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1785

This probably deserves a comment along the lines of:

// Emit a reference to a symbol exported by versions of GLIBC
// that provide the HWCAP access. If a program built with this
// is run on a system with an old GLIBC, it should cause a
// load-time error rather than loading garbage.
arsenm added inline comments.Jun 14 2023, 6:33 PM
llvm/include/llvm/IR/Intrinsics.td
903–907

From this description I don't understand what this is supposed to do. What does the input mean? Why does this use an i32 immarg and not a pointer? Why is the result only i32?

905

I think a new intrinsic should be in a separate commit

llvm/include/llvm/Support/TargetOpcodes.def
148

Typo afixed

nemanjai added inline comments.Jun 16 2023, 6:43 AM
llvm/include/llvm/IR/Intrinsics.td
903–907

That is fair enough. The description is fairly vague. I can try to improve it as per below. The parameter for this is not a pointer (i.e. not an address). It is an immediate that represents the index into the enumeration of values that are provided at a fixed address. The back end is then free to produce the actual fixed address and the load itself.
The choice for the result type was admittedly arbitrary - on PPC, the values provided by GLIBC are 32-bit words.

Proposed comment describing this intrinsic:

// This intrinsic is provided to allow back ends to emit load
// instructions that load a value from a fixed address. The
// parameter to the intrinsic is not an address, but an
// immediate index into an enumeration that contains the
// union of all such values available on all back ends.
// An example is the HWCAP/HWCAP2/CPUID words
// provided by GLIBC on PowerPC to allow fast access
// to commonly used parts of AUXV. These are provided
// at a fixed offset into the TCB (accessible through the
// thread pointer).
905

Sounds good.

llvm/include/llvm/Support/TargetOpcodes.def
148

ACK

ilinpv added a subscriber: ilinpv.Jun 16 2023, 7:28 AM
ilinpv added a comment.EditedJun 16 2023, 9:45 AM

Thank you for the patch, it comes in the right time - we are also working on AArch64 __builtin_cpu_supports, and I was thinking how to make it more general.
I uploaded our RFC version for review https://reviews.llvm.org/D153153


It would be great to have in __builtin_cpu_supports argument string of plus-separated features. Just SemaBuiltinCpuSupports need to handle this.

Thank you for the patch, it comes in the right time - we are also working on AArch64 __builtin_cpu_supports, and I was thinking how to make it more general.
I uploaded our RFC version for review https://reviews.llvm.org/D153153


It would be great to have in __builtin_cpu_supports argument string of plus-separated features. Just SemaBuiltinCpuSupports need to handle this.

Hi Pavel,
I took a quick look at your patch. I think it would be preferable to make the builtins target-independent rather than implementing the builtin by the same name for multiple targets. Although I think it is very useful to support a plus-separated list for __builtin_cpu_supports(), I think that's probably something for a subsequent patch. We would need to figure out code generation for that - perhaps that part will have to be completely target specific.

@arsenm Has my description adequately addressed your question? Do you think we should move forward with [some version of] this patch or do you have any fundamental objections?

arsenm added inline comments.Jul 13 2023, 7:15 AM
llvm/include/llvm/IR/Intrinsics.td
903–907

This is baking an a very target specific implementation of device identification. Could you redefine this as something more abstract? Like returns a device ID integer, or a bool that some int input is supported?

nemanjai added inline comments.Jul 13 2023, 7:47 AM
llvm/include/llvm/IR/Intrinsics.td
903–907

The idea is for this to not be restricted to device ID at all, but to be used for any values that reside in a fixed address for the compiler. For example, STACK_GUARD can be one of the values. How about if the intrinsic returns any integer type (or maybe any type)?
My thinking is that there may be need in the future for various things to be loaded from fixed addresses. The list of possible things that can be loaded this way would be a union of what all targets want and only specific values would make sense on each target.

arsenm added inline comments.Jul 13 2023, 8:24 AM
llvm/include/llvm/IR/Intrinsics.td
903–907

But you're assuming there's a fixed address this can be loaded from, and not a read from a special register or some other mechanism

ilinpv added a comment.EditedJul 19 2023, 3:52 PM

I took a quick look at your patch. I think it would be preferable to make the builtins target-independent rather than implementing the builtin by the same name for multiple targets. Although I think it is very useful to support a plus-separated list for __builtin_cpu_supports(), I think that's probably something for a subsequent patch. We would need to figure out code generation for that - perhaps that part will have to be completely target specific.

I fully agree with you to make __builtin_cpu_supports() target-independent ( and I will update my patch on top of yours ). If plus-separated format is not supported by all targets then I would suggest to make SemaBuiltinCpuSupports target-dependent - it allows to implement plus-separated format on AArch64 keeping __builtin_cpu_supports() target-independent.

nemanjai added inline comments.Jul 20 2023, 11:45 AM
llvm/include/llvm/IR/Intrinsics.td
903–907

Oh, well sure. My intent was just for those things that are in memory at a fixed address. But I suppose you're right, there is probably not a fundamental reason to restrict it to things at fixed addresses.

What I am trying to avoid here is defining an intrinsic that takes a specific value and returns a bool (i.e. the intrinsic version of __builtin_cpu_{supports|is}. The features/cpus will be different for every target and mapping features to integers is also target specific. So this was an attempt to implement an intrinsic that gets a value (bit vector if you will) that the target can choose how to lower and what to do with the value (i.e. how to mask it).

I suppose each target can define their own intrinsics for accessing CPU identification information since each target has to provide code generation for them anyway.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6787

This should check useLoadFixedAddr().

ilinpv added inline comments.Jul 24 2023, 7:14 AM
llvm/include/llvm/IR/Intrinsics.td
903–907

I would not restrict targets to specific instrinsics for feature detection. Please notice that currently CPU identification is more complicated than HWCAPs, including reads from system registers. It is done in compiler-rt library (cpu_model.c), in init_cpu_features_resolver for AArch64 and getAvailableFeatures for X86.

Friendly ping, are there any questions remained to proceed with target-independent __builtin_cpu_supports ?

Friendly ping, are there any questions remained to proceed with target-independent __builtin_cpu_supports ?

Sorry, I have not gotten back to this review in quite some time as I have been relocating to a different country. I'll try to post an updated version of this on GitHub this weekend.

lei added a subscriber: lei.Oct 10 2023, 2:11 PM

HI @nemanjai, Did you get a chance to post this as a github PR?

HI @nemanjai, Did you get a chance to post this as a github PR?

Long overdue, but I finally have it up at https://github.com/llvm/llvm-project/pull/68919