(ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))) to ZERO_EXTEND((UABD a, b))
This partially solves the bug: https://bugs.llvm.org/show_bug.cgi?id=46888
Meta ticket: https://bugs.llvm.org/show_bug.cgi?id=46929
Paths
| Differential D88742
[AArch64] Identify SAD pattern ClosedPublic Authored by mivnay on Oct 2 2020, 9:04 AM.
Details
Summary (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))) to ZERO_EXTEND((UABD a, b)) This partially solves the bug: https://bugs.llvm.org/show_bug.cgi?id=46888
Diff Detail
Event TimelineHerald added subscribers: llvm-commits, danielkiss, hiraditya, kristof.beyls. · View Herald Transcript Comment Actions So this converts v16i32 abs(sub(zext(v16i8), zext(v16i8)))) to v16i32 zext(abd(v16i8, v16i8))? And doesn't start with a trunk? That sounds like it should work OK. Can you add other types and sign extends? And more tests. And a AArch64ISD node (they are not very complex to add :) ) Comment Actions
Yes.
Few patterns already exists in .td files for uabdl, etc. I tried to add the ISD node for UABD (posted it in https://reviews.llvm.org/D88897) but there are unintended consequences (ISel is generating different patterns). Can I take a look at it later (when other pending patches are done)? About the sign extend variants, I am not sure whether it is legal to do this transformation. Comment Actions
What part of sadb are you worried about? I thought they could be treated the same, given you are extended from enough extra bits. But I may be mistaken, they can be somewhat difficult. Can D88897 be folder back into here now? Comment Actions
I am worried about the semantics of sabd instruction especially about v16i32 (sub(sext(v16i8), sext(v16i8))) to v16i32 sext(sub(v16i8, v16i8)) part. Can this be looked at later once my other patches are through?
There seems to be more issues for adding SABD ISD node. Since it is a separate patch now, can this one move forward? Comment Actions
The return type of the abd is positive unsigned, so I think if turning it into v16i32 zext(sabd(v16i8, v16i8)) should be valid? But it's worth testing that, I may be mistaken.. Comment Actions
I am not sure about the validity either and need to test it. Can that be added as enhancement later and go ahead with the current pattern? Comment Actions
Currently the patterns written are for v16i8 to v16i16 and v16i8 to v16i32 types. I am not sure what are the other types needed to support along with this as there are at least 24 more patterns if we consider signed and floating points excluding the vabdl ones which are already in td files (Reference : https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics?search=vabd). There are 12 type pair possible for unsigned types alone:
I have handled the current patterns to solve the bug mentioned in the ticket. Which other types do you have in mind? mivnay added inline comments. Comment Actions
If we are adding UABD/SABD pattern matching I would expect us to generalize it to support for all the types that instruction supports. You can ignore UABDL though, I agree that is a separate issue (and seems to be handled elsewhere). The "to" type isn't very important - it just needs to be big enough to make the results equivalent. I think in this case that's only 1 or 2 bits, and it could even not be a power 2 (although that usually isn't very important to support). The "from" type would be i8, i16 or i32, handling either 64 or 128bit vectors. I think if we effectively ignore the "to" type (because it gets handled by the ZERO_EXTEND), what you have here is already pretty close.
Comment Actions Thanks. https://alive2.llvm.org/ce/z/R7TNGK suggests this should be fine for sign extends too, which would be great to add. @RKSimon As I mentioned in https://reviews.llvm.org/D88897#2316214 I'd like to make this target independent, If you don't mind waiting for me to port this over I am happy to do that later.
Comment Actions Thank you very much. With one little nitpick, LGTM.
This revision is now accepted and ready to land.Oct 13 2020, 12:21 AM This revision was landed with ongoing or failed builds.Oct 13 2020, 3:24 AM This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 297495 lib/Target/AArch64/AArch64ISelLowering.h
lib/Target/AArch64/AArch64ISelLowering.cpp
lib/Target/AArch64/AArch64InstrInfo.td
test/CodeGen/AArch64/arm64-vabs.ll
|
clang-format: please reformat the code