This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME]: Generate streaming-compatible code for bit counting/select
ClosedPublic

Authored by hassnaa-arm on Nov 24 2022, 11:38 AM.

Details

Summary

To generate code compatible to streaming mode:

  • enable custom-lowering ISD::CTLZ and ISD::CTPOP.
  • disable combining OR into BSL.
  • Testing files:
    • bit-counting.ll
    • bitselect.ll

Diff Detail

Event Timeline

hassnaa-arm created this revision.Nov 24 2022, 11:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2022, 11:38 AM
hassnaa-arm requested review of this revision.Nov 24 2022, 11:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2022, 11:38 AM
Matt added a subscriber: Matt.Nov 27 2022, 10:36 AM
david-arm added inline comments.Nov 28 2022, 3:49 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15664–15668

I think you need brackets around the first part, i.e.

if ((!VT.is64BitVector() && !VT.is128BitVector()) ||
hassnaa-arm marked an inline comment as done.Nov 28 2022, 8:42 AM

Add brackets for the && check.

This revision is now accepted and ready to land.Nov 28 2022, 8:46 AM
sdesmalen added inline comments.Nov 28 2022, 9:06 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8455

Odd indentation, have you used clang-format?

15662–15663

This comment seems out of date?

15665

Odd indentation, have you used clang-format?

Also, is this the same as (and should this be):

if (useSVEForFixedLengthVectorVT(VT, Subtarget->forceStreamingCompatibleSVE()))

?

hassnaa-arm marked 3 inline comments as done and an inline comment as not done.

code format

sdesmalen accepted this revision.Nov 29 2022, 3:29 AM
mgabka added a subscriber: mgabka.Jan 11 2023, 3:16 AM
mgabka added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15664–15668

I think this code is introducing a bug, before the combine had an early exit if the VT wasn't 64 or 128 bit vector, while now we allow scalable vectors, and the combine does not give correct results for scalable vectors, it is just not triggered because the "ISD::isBuildVectorAllZeros" and "ISD::isBuildVectorAllOnes" reject vector splat, but I think it would be better to have a clear early exit for scalable VT.
What do you think?

sdesmalen added inline comments.Jan 11 2023, 3:34 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15664–15668

From what I can see, useSVEForFixedLengthVectorVT returns false if VT is a scalable vector, so I don't think this would be a problem.

mgabka added inline comments.Jan 11 2023, 3:57 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15664–15668

exactly it returns false, so the branch is not taken and there is no early exit, while before we would have an early exit.

sdesmalen added inline comments.Jan 11 2023, 4:13 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15664–15668

Okay I see what you mean now. Yes, that does seem like a functional change. Did you find this by looking at the code, or did you come across a regression somewhere?

mgabka added inline comments.Jan 11 2023, 5:24 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15664–15668

I had a downstream regression.

Looks like in upstream LLVM there is no tests for it, but the code below relies on VT.getScalarSizeInBits() being 64 or 128, check line 15720, so although for scalable vector luckily we are safe (thanks to the fact that ISD::isBuildVectorAllZeros does not accept spats), I think we might have issues for fixed width vectors where SVE should not be used, and where VT.getScalarSizeInBits() isn't 64 nor 128.