This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Generate URHADD from (b - (~a)) >> 1
ClosedPublic

Authored by PetreTudor on Jun 26 2020, 10:41 AM.

Details

Summary

Teach LLVM to recognize the above pattern, which is usually a
transformation of (a + b + 1) >> 1, where the operands are unsigned
types.

Diff Detail

Event Timeline

PetreTudor created this revision.Jun 26 2020, 10:41 AM
dmgreen added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8810

It might be worth adding a check that this isn't a scalable vector.

8818

getConstantOperandVal

8861–8864

This debug isn't usually added, the combiner will print this kind of info already.

11122

I'm a little surprised that there is no code to do this already. I guess it doesn't usually come up. Please run clang-format on the patch.

11134–11137

Make sure you check the 0 and the 8 from the extract_subvector.

llvm/lib/Target/AArch64/AArch64ISelLowering.h
168

Is it possible to add srhadd at the same time? I guess there is also uhadd and shadd?

llvm/test/CodeGen/AArch64/arm64-vhadd.ll
332

It is worth having tests for half width too - <8 x i8>.

Addressed review comments:

  • Added extra checks
  • Added SRHADD generation from very similar pattern
  • Added tests for 64-bit vectors
PetreTudor marked 8 inline comments as done.Jun 30 2020, 10:51 AM
PetreTudor added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11134–11137

Added the checks right before returning the new node.

PetreTudor updated this revision to Diff 275033.Jul 2 2020, 3:09 AM
PetreTudor marked an inline comment as done.

Fixed some comments and simplified the extract index checks
to just look for indexes of 0 and N00VT.getVectorNumElements()

dmgreen added inline comments.Jul 2 2020, 8:13 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8817

This is always a VLSHR, as opposed to a VASHR because the type is large enough that the signed shift bits are never demanded? If so can you add a comment about that.

PetreTudor updated this revision to Diff 275335.Jul 3 2020, 2:42 AM

Added comments explaining why we are always looking for AArch64ISD::VLSHR

PetreTudor marked an inline comment as done.Jul 3 2020, 3:20 AM
dmgreen accepted this revision.Jul 3 2020, 3:33 AM

Thanks. There are always other ways to do this - in tablegen but the pattern would be giant, earlier as a dagcombine but it needn't be done like that in this case. (We probably would have to do that for MVE, as the trunc would not remain legal). In the long run if something like this is done for MVE too, we may be able to share the code between the two places. But in the meantime this looks like a nice optimization.

LGTM.

This revision is now accepted and ready to land.Jul 3 2020, 3:33 AM
This revision was automatically updated to reflect the committed changes.