This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Implement partial vector ld/st builtins for XL compatibility
ClosedPublic

Authored by nemanjai on Jul 24 2021, 12:45 PM.

Details

Summary

XL provides functions __vec_ldrmb/__vec_strmb for loading/storing a sequence of 1 to 16 bytes in big endian order, right justified in the vector register (regardless of target endianness).
This is equivalent to vec_xl_len_r/vec_xst_len_r which are only available on Power9.

This patch simply uses the Power9 functions when compiled for Power9, but provides a more general implementation for Power8.

Diff Detail

Event Timeline

nemanjai created this revision.Jul 24 2021, 12:45 PM
nemanjai requested review of this revision.Jul 24 2021, 12:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 24 2021, 12:45 PM
nemanjai updated this revision to Diff 361477.Jul 24 2021, 2:02 PM

Updated a couple of issues revealed during functional testing.

Conanap added inline comments.
clang/lib/Headers/altivec.h
3151

I believe the preference is to have this defined in clang/lib/Basic/Targets/PPC.cpp under defineXLCompatMacros

do we need an IR -> ASM test case as well?

do we need an IR -> ASM test case as well?

I didn't add this as the builtins do not produce any new IR - there are no new intrinsics or any other modifications to the back end. Since the code is completely contained in the front end, I feel like the tests should be as well.

clang/lib/Headers/altivec.h
3151

That is a very good point. However for the Altivec builtins, there is long standing precedent for having these defined in the header. We actually don't want the vector builtins to be defined without the inclusion of altivec.h.

clang/test/CodeGen/builtins-ppc-ld-st-rmb.c
1

The test case is quite verbose but the checks were produced by the script so it should be easy to maintain. The reason I added so many checks is that the produce code is very dependent on:

  • endianness
  • CPU
  • the number of bytes (in the store case)
Conanap accepted this revision.Jul 25 2021, 5:38 PM

Thanks for answering the Qs. LGTM.

This revision is now accepted and ready to land.Jul 25 2021, 5:38 PM
lei added a subscriber: lei.Jul 26 2021, 5:47 AM
lei added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
15133

I find the nested switch to be a bit confusing at first. Maybe it can be done a bit diff?

auto StoreSubVec ...
   if (Width==16) {
   }
   switch (Width) {
      default:  ...    
      case :
      //set the ConvTy, NumElts for non-16byte widths
   }
    // code to handle non-16 byte stores
}
amyk added a subscriber: amyk.Jul 26 2021, 7:12 AM
amyk added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
15111

Maybe we can pull out this line and do the following:

Constant *Zero = llvm::Constant::getNullValue(IsLE ? ResTy : AllElts->getType());
clang/test/CodeGen/builtins-ppc-ld-st-rmb.c
1

The test is pretty verbose already, but do you think it is necessary to add checks for AIX, or is having just the Linux ones fine?

nemanjai updated this revision to Diff 361658.Jul 26 2021, 8:00 AM

Cleaned up some of the control flow.

nemanjai updated this revision to Diff 361693.Jul 26 2021, 9:24 AM

The last update accidentally removed the newly added tests.

lei accepted this revision.Jul 26 2021, 9:39 AM

thx for the update!

amyk accepted this revision.Jul 26 2021, 9:48 AM

Thanks for addressing the comment and adding back the tests!
LGTM.

This revision was landed with ongoing or failed builds.Jul 26 2021, 11:20 AM
This revision was automatically updated to reflect the committed changes.