This is an archive of the discontinued LLVM Phabricator instance.

[TLI][AArch64] Add mappings to vectorized functions from ArmPL
ClosedPublic

Authored by mgabka on Jul 5 2023, 6:47 AM.

Details

Summary

Arm Performance Libraries contain math library which provides
vectorized versions of common math functions.
This patch allows to use it with clang and llvm via -fveclib=ArmPL or
-vector-library=ArmPL, so loops with such calls can be vectorized.
The executable needs to be linked with the amath library.

Arm Performance Libraries are available at:
https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries

Diff Detail

Event Timeline

mgabka created this revision.Jul 5 2023, 6:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2023, 6:47 AM
mgabka requested review of this revision.Jul 5 2023, 6:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2023, 6:47 AM
mgabka edited reviewers, added: pawosm01; removed: pawelo.Jul 5 2023, 6:48 AM
mgabka updated this revision to Diff 537349.Jul 5 2023, 6:57 AM

forgot to include some of the formatting changes

mgabka updated this revision to Diff 537609.Jul 6 2023, 12:21 AM
  • moved replace-with-veclib tests to a better destination
  • added LV tests where llvm math intrinsics, for which ArmPL has mappings, are used
  • formatting changes in tests
mgabka added a comment.Jul 6 2023, 2:10 AM

The precommit failure in libcxx seem to be not related to my change.

paulwalker-arm added inline comments.Jul 6 2023, 3:11 AM
clang/include/clang/Basic/CodeGenOptions.h
65

This should be "Arm Performance Libraries".

llvm/include/llvm/Analysis/TargetLibraryInfo.h
100

This should be "Arm Performance Libraries".

llvm/include/llvm/Analysis/VecFuncs.def
732–739

Up to you but I think it is cleaner to have cos and llvm.cos within separate blocks.

llvm/lib/Analysis/TargetLibraryInfo.cpp
38

This should be "Arm Performance Libraries".

mgabka updated this revision to Diff 537668.Jul 6 2023, 5:03 AM
mgabka marked 4 inline comments as done.
mgabka edited the summary of this revision. (Show Details)

addressed review comments.

paulwalker-arm added inline comments.Jul 7 2023, 6:57 AM
llvm/include/llvm/Analysis/VecFuncs.def
907–910

I guess this is ok but please be aware that sinpi is very new and even my reasonably up to date Fedora install doesn't have it.

llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll
2–3 ↗(On Diff #537668)

It doesn't make sense to split the testing like this because the scalable vector tests are essentially bogus when SVE is not available and the fixed length vector tests should produce the same result when SVE is enabled.

I think there should either be a single RUN line, or two test files that separate the fixed and scalable tests. Personally I'd opt for the former given there's nothing in the TLI interface that uses target features when making decisions

llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls-with-ptr.ll
12–13

The sincos functions do not return a value. It looks like you're calling them correctly so just the declarations are wrong.

17–18

This is not the signature of ArmPL's sincos functions. They all return void and thus these NOT lines will not match even when vectorisation is possible. In general it's best to keep NOT lines as minimal as possible so perhaps just NOT: @armpl_ or even NOT: vector.body?

With that said, the ilogb and ldexp issues I raise below show the dangerous side of including mappings the compiler doesn't yet know how to handle. For this reason I would much rather remove all the unsupported ones and only introduce them once the compiler is ready to use them.

llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls.ll
1853–1854

This is not the interface ArmPL exposes for the vector variants of ilogb. I think this is likely a bug in ArmPL whereby the function's return type incorrectly mirrors the operand type. It's possible this doesn't matter for SVE but for NEON the code generation for the f64 case will certainly be wrong. I think it's safest to remove the mappings until we're sure ArmPL is ready.

1904–1906

This is not the interface ArmPL implements for vector ldexp functions. ArmPL requires the integer operand to be a scalar. I think the mappings should be removed until either ArmPL matches the interface currently expected or LoopVectorize gains the smarts to support scalar operands.

mgabka updated this revision to Diff 538587.Jul 10 2023, 5:00 AM
mgabka marked 5 inline comments as done.

addressed review comments

llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls.ll
1904–1906

good catch, I must admit I wasn't checking header files from armpl, was hoping that the interface is matching scalar definitions, so:
float ldexpf( float arg, int exp );
double ldexp( double arg, int exp );

I will remove the mappings together with the tests

Matt added a subscriber: Matt.Jul 10 2023, 8:39 AM
paulwalker-arm accepted this revision.Jul 11 2023, 2:42 AM
This revision is now accepted and ready to land.Jul 11 2023, 2:42 AM
This revision was landed with ongoing or failed builds.Jul 12 2023, 5:54 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2023, 5:54 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript