This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Enable libm vectorized functions via SLEEF
ClosedPublic

Authored by danielkiss on Sep 27 2022, 1:44 AM.

Details

Summary

It enables trigonometry functions vectorization via SLEEF: http://sleef.org/.

  • A new vectorization library enum is added to TargetLibraryInfo.h: SLEEF.
  • A new option is added to TargetLibraryInfoImpl - ClVectorLibrary: SLEEF.
  • A comprehensive test case is included in this changeset.
  • A new vectorization library argument is added to -fveclib: -fveclib=SLEEF.

Trigonometry functions that are vectorized by sleef:
acos
asin
atan
atanh
cos
cosh
exp
exp2
exp10
lgamma
log10
log2
log
sin
sinh
sqrt
tan
tanh
tgamma

Co-authored-by: Stefan Teleman

Diff Detail

Event Timeline

danielkiss created this revision.Sep 27 2022, 1:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 27 2022, 1:44 AM
danielkiss requested review of this revision.Sep 27 2022, 1:44 AM
mgabka added a subscriber: mgabka.Sep 27 2022, 1:57 AM
Matt added a subscriber: Matt.Sep 29 2022, 12:03 PM
paulwalker-arm added inline comments.
clang/lib/CodeGen/BackendUtil.cpp
310

What does GNUABI signify? My understanding is the SLEEF functions use the Vector ABI (which is somewhat target neutral?) to construct their name.

Just a thought but to be a bit more target neutral what about splitting TargetLibraryInfoImpl::SLEEF based on vector length. So here we'll have

TLII->addVectorizableFunctionsFromVecLib(TargetLibraryInfoImpl::SLEEF_VF2_...)
TLII->addVectorizableFunctionsFromVecLib(TargetLibraryInfoImpl::SLEEF_VF4_...)

This way there will be one version of TargetLibraryInfo for SLEEF and different targets can pick and choose based on vectorisation factors.

clang/lib/Driver/ToolChains/Clang.cpp
5194–5214

Do tests exist for these diagnostics? If not can you add them.

danielkiss added a reviewer: paulwalker-arm.

address review comments, added more tests. rebased.

danielkiss added inline comments.Nov 30 2022, 2:32 AM
clang/lib/CodeGen/BackendUtil.cpp
310

What does GNUABI signify? My understanding is the SLEEF functions use the Vector ABI (which is somewhat target neutral?) to construct their name.

GNUABI comes from the libsleefgnuabi.so. it is compatible with libmvec but has more functions. Normal sleef ABI is different.

I added triplet to addVectorizableFunctionsFromVecLib so vectorisation factors per target then can be handled there without bloating the enum.

Is there any chance of enabling this on platforms other than AArch64? Sleef has x86, AArch32, PowerPC64, System/390, CUDA, and WebAssembly support (although I am mostly interested in x86), in addition to AArch64. This makes sleef a very useful for platforms without libmvec (due either to a lack of architecture support or lack of glibc). I understand this MR is likely motivated by a lack of libmvec on anything but x86, but there are use cases on x86 particularly systems using musl libc.

danielkiss marked an inline comment as done.Jan 16 2023, 6:57 AM

Is there any chance of enabling this on platforms other than AArch64? Sleef has x86, AArch32, PowerPC64, System/390, CUDA, and WebAssembly support (although I am mostly interested in x86), in addition to AArch64.

I'll look into adding x86, x86_64, probably in new patch top of this. I would add targets where we have demand.

This makes sleef a very useful for platforms without libmvec (due either to a lack of architecture support or lack of glibc).

Correct, this is the main motivation.

I understand this MR is likely motivated by a lack of libmvec on anything but x86, but there are use cases on x86 particularly systems using musl libc.

Thanks for the use case.

paulwalker-arm added inline comments.Jan 17 2023, 4:42 AM
clang/lib/CodeGen/BackendUtil.cpp
294–298
NOTE: A following comment might make this one redundant.

Given you emit a diagnostic for -fveclib for currently unsupported targets and addVectorizableFunctionsFromVecLib takes a target triple so that it'll be a NOP for unsupported targets, can this restriction be removed? It just seems like redundant duplication.

310

The target triple doesn't convey things like SVE being available does it? The same goes for other targets I guess in that for x86 does the triple specify the target's vector length?

There's also the question of how the SLEEF library has been built. Can we guarantee the SLEEF library will always match the target the user has chosen. (i.e. just because the user specifies +sve doesn't mean the SLEEF library has functions for the scalable vector types).

Perhaps this is really a toolchain packaging problem but either way it feels like something that must be solved within clang via either a tighter TLI interface and/or a more specific set of -fveclib options?

That's really the crux of the problem. The other -fveclib options has a very specific set of functions they provide whereas SLEEF does not. So perhaps the options should be SLEEF_NEON or SLEEF64, SLEEF128 (not sure if there's a generic way to describe scalable vectors so perhaps we cannot escape SLEEF_SVE?), which might also suggests the parsing for -fveclib requires expanding if for example we want to allow -fveclib= SLEEF64,SLEEF128. NOTE: I can see for MVEC LIBMVEC_X86 exists.

For what it's worth this is why there's the idea to add pragmas to decorate the math function declarations so vector function mappings can be added automatically based on whatever math.h header is included.

Is there any chance of enabling this on platforms other than AArch64? Sleef has x86, AArch32, PowerPC64, System/390, CUDA, and WebAssembly support (although I am mostly interested in x86), in addition to AArch64.

I'll look into adding x86, x86_64, probably in new patch top of this. I would add targets where we have demand.

This makes sleef a very useful for platforms without libmvec (due either to a lack of architecture support or lack of glibc).

Correct, this is the main motivation.

I understand this MR is likely motivated by a lack of libmvec on anything but x86, but there are use cases on x86 particularly systems using musl libc.

Thanks for the use case.

Thanks for the willingness to look into x86 support! If you need anything from me (elaborate on use case, testing, etc) let me know.

danielkiss added inline comments.Jan 17 2023, 10:13 AM
clang/lib/CodeGen/BackendUtil.cpp
294–298

ack

310

Perhaps this is really a toolchain packaging problem but either way it feels like something that must be solved within clang via either a tighter TLI interface and/or a more specific set of -fveclib options?

That's really the crux of the problem. The other -fveclib options has a very specific set of functions they provide whereas SLEEF does not. So perhaps the options should be SLEEF_NEON or SLEEF64, SLEEF128 (not sure if there's a generic way to describe scalable vectors so perhaps we cannot escape SLEEF_SVE?), which might also suggests the parsing for -fveclib requires expanding if for example we want to allow -fveclib= SLEEF64,SLEEF128. NOTE: I can see for MVEC LIBMVEC_X86 exists.

IIRC there is no other platform right now for libmvec beside x86 so LIBMVEC_X86 is what it is, once others are added I'd rename it too just LIBMVEC.

Once SVE is enabled for in Sleef the SVE version of functions are additions to the current set.
Depends on how the sleef library is packed into the toolchain\platform one could add conditional logic for certain platform, version etc based on the triplet. Right now it will be useful to add x86 base versions.
-mveclib_options=<arch_specifics> or similar option could be there to override, drive such a decision.

paulwalker-arm accepted this revision.Jan 18 2023, 5:36 AM

Thanks @danielkiss this all sounds reasonable to me. So other than my comment relating to redundant code in createTLII this patch looks good to me. One extra datapoint is that I'm pretty sure Sleef has supported SVE for a few years now and there are patches in flight to improve LoopVectorize to be able to emit function calls that take a predicate so we're not too far away from having complete support in this regard.

This revision is now accepted and ready to land.Jan 18 2023, 5:36 AM

clean up createTLII since now fronted checks for the target support.

Thanks @danielkiss this all sounds reasonable to me. So other than my comment relating to redundant code in createTLII this patch looks good to me. One extra datapoint is that I'm pretty sure Sleef has supported SVE for a few years now and there are patches in flight to improve LoopVectorize to be able to emit function calls that take a predicate so we're not too far away from having complete support in this regard.

Thanks, Sound good, looking forward to support that too :)

This revision was landed with ongoing or failed builds.Jan 20 2023, 9:52 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2023, 9:52 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript