This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Make ACLE intrinsics always available part1
ClosedPublic

Authored by danielkiss on Sep 6 2022, 8:11 AM.

Details

Summary

A given arch feature might enabled by a pragma or a function attribute so in this cases would be nice to use intrinsics.
Today GCC offers the intrinsics without the march flag[1].
PR[2] for ACLE to clarify the intention and remove the need for -march flag for a given intrinsics.

This is going to be more useful when D127812 lands.

[1] https://godbolt.org/z/bxcMhav3z
[2] https://github.com/ARM-software/acle/pull/214

Diff Detail

Event Timeline

danielkiss created this revision.Sep 6 2022, 8:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 6 2022, 8:11 AM
danielkiss requested review of this revision.Sep 6 2022, 8:11 AM
danielkiss added a subscriber: ilinpv.
danielkiss planned changes to this revision.Sep 6 2022, 8:57 AM

So, what kind of errors will the user get now if try to use an unsupported instruction? I'm assuming that it fails during isel and that doesn't sound like a good user experience.

danielkiss added a comment.EditedSep 8 2022, 12:47 AM

So, what kind of errors will the user get now if try to use an unsupported instruction? I'm assuming that it fails during isel and that doesn't sound like a good user experience.

I'm adding target warnings. so the user will get "error: always_inline function 'foo' requires target feature 'x', but would be inlined into function 'bar' that is compiled without support for 'x'.

danielkiss retitled this revision from [Arm] Make ACLE intrinsics always available to [AArch64] Make ACLE intrinsics always available .

Armv8.3-A Javascript conversion and Armv8.5-A FP rounding don't have feature flag.
One option is to introduce feature flags or the the arch version could be used.
I'm looking comment for the arch version solution.

Matt added a subscriber: Matt.Sep 13 2022, 9:50 AM

Armv8.3-A Javascript conversion and Armv8.5-A FP rounding don't have feature flag.
One option is to introduce feature flags or the the arch version could be used.
I'm looking comment for the arch version solution.

I'm looking into a similar patch for the parts of arm_neon.h that are not already handled, and was thinking along the same lines. The idea at current is to add support for arch= target attributes, as in https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes. And then using "v8.1a" (rdma), "v8.3a" (complex) or "v8.5a" (frint) for the TARGET_BUILTIN, like you have here. I think that is fine for things internal to the compiler. The user can then use "arch=..." target attributes to specify different architecture versions. I'll try and put that part up for review.

What is you plan for AArch32? Keep it behind the same ifdef as before?

clang/lib/Basic/Targets/AArch64.cpp
529 ↗(On Diff #459745)

I had a (completely untested) version of this function doing the same thing. I was basing it on string processing, though, so we didn't have to add new code for each new version of the architecture. I'm not sure if it was a good idea or not.

clang/test/Sema/aarch64-tme-errors.c
6 ↗(On Diff #459745)

Does this get a "failure to inline" error now?

clang/test/Sema/builtins-arm64-mte.c
8 ↗(On Diff #459745)

I think these should still be testing the ACLE intrinsics, not the builtins, if that is what we expect the users to use. Are the error's now different?

dmgreen added inline comments.Sep 14 2022, 2:57 AM
clang/lib/Basic/Targets/AArch64.cpp
529 ↗(On Diff #459745)

It looked like this, by the way:

if (Enabled && Name == "v9.0a")                                              
  setFeatureEnabled(Features, "v8.5a", true);                                
else if (Enabled && Name != "v8.1a" &&                                       
         (Name.startswith("v8.") || Name.startswith("v9.")) &&               
         Name.endswith("a") &&                                               
         atoi(Name.substr(3, Name.size() - 1).data()) >= 1) {                
  SmallString<16> Storage;                                                   
  Twine NewName =                                                            
      Name.take_front(3) +                                                   
      std::to_string(atoi(Name.substr(3, Name.size() - 1).data()) - 1) + "a";
  setFeatureEnabled(Features, NewName.toStringRef(Storage), true);           
}

It is untested, apart from some very simple checks, doesn't handle errors and doesn't handle all the 9 feautres properly. It may be better not to be so recursive too if it can just enable multiple features at once.

This looks like you want an enum over all versions with string2enum, enum2string, and comparison operations on the enum.

compnerd added inline comments.Sep 22 2022, 8:51 AM
clang/lib/Basic/Targets/AArch64.cpp
535–564 ↗(On Diff #459745)

I think that we can compress some of the logic here. But I also wonder if we can write this in a different way: rather than creating a small vector and resizing it, can we actually avoid that all together and create a static constant data blob that we can iterate?

const char * const kv8Architectures[] = { "v8a", "v8.1a", "v8.2a", "v8.3a", "v8.4a", "v8.5a", "v8.6a", "v8.7a" };

unsigned major, minor;
std::tie(major, minor) = [parse_version](Name);
size_t v8Revisions = major == 8 ? minor : 4 + minor;
size_t v9Revisions = major == 9 : minor : 0;

The bit in the middle handwaves over the version parsing, but should hopefully convey the general idea.

danielkiss planned changes to this revision.Sep 23 2022, 8:44 AM

Moved not ACLE related changes to D134349 and D134353.
This to be rebased on top of those and D134127.

clang/test/Sema/builtins-arm64-mte.c
8 ↗(On Diff #459745)

These are custom errors see here e.g. that can't be triggered via the intrinsics anymore because we stop earlier.

danielkiss marked 2 inline comments as done.Oct 4 2022, 11:21 AM

Arm part will be a different patch since it is more complex to decide the support of a given feature. (e.g. ((CPUProfile != "M" && ArchVersion >= 6) || (CPUProfile == "M" && DSP))))

clang/lib/Basic/Targets/AArch64.cpp
535–564 ↗(On Diff #459745)

solved in D134353.

clang/test/Sema/aarch64-tme-errors.c
6 ↗(On Diff #459745)

Yes, but -verify does not go that far.

clang/test/Sema/builtins-arm64-mte.c
8 ↗(On Diff #459745)

added test for intrinsics besides the builtins

Can you split the "mte" and "tme" parts away from the rest? I'm not sure if there are any problems with those part, but they are more involved and I am less familiar with it. It would be good to get a lot of the simpler parts in and out of the way first.

danielkiss marked an inline comment as done.
danielkiss retitled this revision from [AArch64] Make ACLE intrinsics always available to [AArch64] Make ACLE intrinsics always available part1.

MTE/TME parts are removed, they will get their own patch.

dmgreen accepted this revision.Oct 13 2022, 5:59 AM

Thanks. That's great. LGTM

This revision is now accepted and ready to land.Oct 13 2022, 5:59 AM
This revision was landed with ongoing or failed builds.Oct 14 2022, 8:23 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2022, 8:23 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript