AArch64 supports unsigned shift right and accumulate. In case we see a
unsigned shift right followed by an OR. We could turn them into a USRA
instruction, given the operands of the OR has no common bits.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Could this be handled with a tablegen pattern? Something like the or_is_add pattern from X86 (but using haveNoCommonBitsSet). Maybe a PatFrag that accepts an add or an add-like-or?
Also should ssra be treated the same way?
llvm/test/CodeGen/AArch64/shift-accumulate.ll | ||
---|---|---|
13 | It's probably a better test to return the %5, not extract a single lane from it. |
Thanks for the comment David. I will take a look at handling in tablegen pattern as well. Will update when I have more.
Also should ssra be treated the same way?
Thanks. I like the added Known bits - they can be useful in many situations.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1844 | We know that the operands 1 and 2 will be constants, so I think we can just grab the values for them directly: uint64_t Mask = ~(Op->getConstantOperandVal(1) << Op->getConstantOperandVal(2)); The Mask then specifies which bits are known to be 0. | |
llvm/lib/Target/AArch64/AArch64InstrInfo.td | ||
8366 | Now that we have this, it may be worth using it in the existing patterns, using a PatFrags that accepts either "add" or "or_is_add". That would save the need for new patterns. | |
llvm/test/CodeGen/AArch64/shift-accumulate.ll | ||
2 | Can you run the update_llc_test_checks on the file? It's missing some expected output. It's also worth adding some simple case for each signedness / type that have tablegen patterns added / changed. | |
19 | Unfortunately this doesn't verify. I think because the BIC code is giving incorrect Known bits. |
Thanks. Looks good, as far as I can tell.
Can you make sure there are some i64 tests, especially v1i64. Those have a different tablegen class, so it would be good to make sure they have tests covering them.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1836–1840 | Is this chunk needed? I think the operands should be constants by definition. | |
llvm/test/CodeGen/AArch64/shift-accumulate.ll | ||
3 | I tend to use -mtriple=aarch64-none-eabi | |
5 | You can usually remove dso_local and local_unnamed_addr #0 align 32, to make the tests a little cleaner. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1836–1838 | We don't appear to use Known2 and Known3 | |
llvm/test/CodeGen/AArch64/shift-accumulate.ll | ||
31 | Can you add the <1 x i64> version of this test (and ssra_v2i64 too). It should then test the "scalar" instructions with "d0" and "d1" operands. |
We don't appear to use Known2 and Known3