Teach LLVM to recognize the above pattern, which is usually a
transformation of (a + b + 1) >> 1, where the operands have unsigned
types.
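
The pattern itself appears in the revision title, which is not reproduced here. As a rough illustration only, assume it has the form (b - ~a) >> 1 evaluated in a wider type and then truncated. The sketch below checks exhaustively, for 8-bit unsigned inputs, that this form agrees with (a + b + 1) >> 1. The function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Reference: rounding halving add, computed in 16 bits so a + b + 1 cannot wrap.
std::uint8_t urhadd_reference(std::uint8_t a, std::uint8_t b) {
  const std::uint16_t sum = static_cast<std::uint16_t>(a + b + 1);
  return static_cast<std::uint8_t>(sum >> 1);
}

// ASSUMED rewritten form: (b - ~a) >> 1, evaluated in 16 bits, then truncated.
// In two's complement ~a == -a - 1, so b - ~a == a + b + 1 (mod 2^16).
std::uint8_t urhadd_rewritten(std::uint8_t a, std::uint8_t b) {
  const std::uint16_t wide_a = a;  // zero-extend
  const std::uint16_t wide_b = b;  // zero-extend
  const std::uint16_t not_a = static_cast<std::uint16_t>(~wide_a);
  const std::uint16_t diff = static_cast<std::uint16_t>(wide_b - not_a);
  return static_cast<std::uint8_t>(diff >> 1);
}

// Counts the input pairs for which the two forms disagree.
unsigned count_unsigned_mismatches() {
  unsigned mismatches = 0;
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const std::uint8_t x = static_cast<std::uint8_t>(a);
      const std::uint8_t y = static_cast<std::uint8_t>(b);
      if (urhadd_reference(x, y) != urhadd_rewritten(x, y))
        ++mismatches;
    }
  }
  return mismatches;
}
```

Calling count_unsigned_mismatches() from a small driver is enough to confirm the equivalence for every pair of 8-bit inputs.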
Repository: rG LLVM Github Monorepo

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Line | Comment
---|---
8810 | It might be worth adding a check that this isn't a scalable vector.
8818 | Use getConstantOperandVal here.
8861–8864 | This debug output isn't usually added; the combiner will print this kind of info already.
11122 | I'm a little surprised that there is no code to do this already. I guess it doesn't usually come up. Please run clang-format on the patch.
11134–11137 | Make sure you check the extract_subvector indices (the 0 and the 8).

llvm/lib/Target/AArch64/AArch64ISelLowering.h

Line | Comment
---|---
168 | Is it possible to add srhadd at the same time? I guess there are also uhadd and shadd?

llvm/test/CodeGen/AArch64/arm64-vhadd.ll

Line | Comment
---|---
332 | It is worth having tests for half-width vectors too, e.g. <8 x i8>.

Addressed review comments:
- Added extra checks
- Added SRHADD generation from very similar pattern
- Added tests for 64-bit vectors

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Line | Comment
---|---
11134–11137 | Added the checks right before returning the new node.

Fixed some comments and simplified the extract index checks
to just look for indices of 0 and N00VT.getVectorNumElements().

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Line | Comment
---|---
8817 | Is this always a VLSHR, as opposed to a VASHR, because the type is large enough that the sign bits shifted in are never demanded? If so, can you add a comment about that?

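
The reviewer's question can be explored with a scalar model. The sketch below is not the patch's code: it assumes the signed pattern is (b - ~a) >> 1 on sign-extended 16-bit values, followed by a truncate to 8 bits, and uses hypothetical function names. Under those assumptions, a logical shift and an arithmetic shift differ only in bit 15 of the 16-bit result, which the truncate discards, so the logical shift already yields the rounding halving add.

```cpp
#include <cassert>
#include <cstdint>

// Reference signed rounding halving add: floor((a + b + 1) / 2) in a wide type.
std::int8_t srhadd_reference(std::int8_t a, std::int8_t b) {
  const int sum = int(a) + int(b) + 1;  // range [-255, 255]
  // Floor division by 2 without relying on right shift of negative values.
  const int half = (sum >= 0) ? sum / 2 : -((-sum + 1) / 2);
  return static_cast<std::int8_t>(half);  // half is always in [-128, 127]
}

// ASSUMED pattern: sign-extend to 16 bits, compute b - ~a, shift right
// LOGICALLY by one, then keep only the low 8 bits.
std::int8_t srhadd_logical_shift(std::int8_t a, std::int8_t b) {
  const std::uint16_t wide_a = static_cast<std::uint16_t>(static_cast<std::int16_t>(a));
  const std::uint16_t wide_b = static_cast<std::uint16_t>(static_cast<std::int16_t>(b));
  const std::uint16_t not_a = static_cast<std::uint16_t>(~wide_a);
  const std::uint16_t diff = static_cast<std::uint16_t>(wide_b - not_a);
  const std::uint16_t shifted = static_cast<std::uint16_t>(diff >> 1);  // zero shifted into bit 15
  // Truncation drops bit 15, the only bit where a logical and an arithmetic
  // shift would differ. (The narrowing conversion to int8_t is modular since
  // C++20 and behaves the same way on mainstream compilers before that.)
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(shifted));
}

// Counts the input pairs for which the two forms disagree.
unsigned count_signed_mismatches() {
  unsigned mismatches = 0;
  for (int a = -128; a <= 127; ++a) {
    for (int b = -128; b <= 127; ++b) {
      const std::int8_t x = static_cast<std::int8_t>(a);
      const std::int8_t y = static_cast<std::int8_t>(b);
      if (srhadd_reference(x, y) != srhadd_logical_shift(x, y))
        ++mismatches;
    }
  }
  return mismatches;
}
```

If count_signed_mismatches() returns zero, the logical shift is sufficient for every pair of 8-bit signed inputs in this model, which is consistent with the reviewer's reading.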
Thanks. There are always other ways to do this: in tablegen, though the pattern would be giant, or earlier as a dagcombine, though it needn't be done like that in this case. (We would probably have to do that for MVE, as the trunc would not remain legal.) In the long run, if something like this is done for MVE too, we may be able to share the code between the two places. In the meantime, this looks like a nice optimization.
LGTM.