This patch adds handling of rotation patterns with constant shift amounts. The next step is to decide how we want to support non-uniform constant vectors.
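For readers unfamiliar with the idiom, here is a minimal sketch of what a rotation pattern with a constant shift amount looks like at the source level, assuming the usual shl/lshr/or form on a 32-bit value. The names rotlIdiom and fshlRef are hypothetical and exist only for illustration; fshlRef models the documented semantics of the llvm.fshl intrinsic and is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Rotate-left written as the shift/or idiom.
// C must be in [1, 31] so that neither shift is out of range.
inline uint32_t rotlIdiom(uint32_t X, unsigned C) {
  assert(C >= 1 && C <= 31 && "shift amount out of range");
  return (X << C) | (X >> (32u - C));
}

// Reference model of fshl on i32: concatenate A:B into 64 bits,
// shift left by C modulo 32, and keep the high 32 bits.
inline uint32_t fshlRef(uint32_t A, uint32_t B, unsigned C) {
  const uint64_t Concat = (static_cast<uint64_t>(A) << 32) | B;
  return static_cast<uint32_t>((Concat << (C % 32u)) >> 32);
}
```

When both fshl operands are the same value, the funnel shift degenerates to a rotate, so rotlIdiom(X, C) and fshlRef(X, X, C) should agree for every in-range C.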
Repository: rG LLVM Github Monorepo
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

| Line | Diff | Comment |
|---|---|---|
| 2115 | | Probably need to either guard or assert against invalid shift amounts here. Don't want to combine -33 and +1. |

llvm/lib/Transforms/Utils/Local.cpp

| Line | Diff | Comment |
|---|---|---|
| 2910 ↗ | (On Diff #291597) | mount -> amount |
| 2922 ↗ | (On Diff #291597) | Don't we need to check that the fsh operands are the same, or at least that LHS->Provider and RHS->Provider are? |
| 2924 ↗ | (On Diff #291597) | I suspect this is inverted, but maybe I misunderstand the direction of the mapping. It would be good to at least add some tests where the fsh amount is not exactly half the bitwidth. For Result = fshl(LHS, RHS, 1): Result[0] = RHS[31], Result[1] = LHS[0], ..., Result[31] = LHS[30]. |
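The bit mapping described in the comment on line 2924 can be checked mechanically. The following is a sketch only, assuming a 32-bit fshl; BitSource, fshlBitSource, fshlRef32, and mappingMatches are hypothetical names and are not taken from collectBitParts.

```cpp
#include <cassert>
#include <cstdint>

struct BitSource {
  bool FromLHS;   // true if the bit comes from the first fshl operand
  unsigned Index; // bit index within that operand
};

// Source of result bit I for fshl(LHS, RHS, Amt) on i32.
inline BitSource fshlBitSource(unsigned I, unsigned Amt) {
  assert(I < 32 && "bit index out of range");
  const unsigned S = Amt % 32u;
  if (I >= S)
    return {true, I - S};     // upper result bits come from LHS
  return {false, 32u - S + I}; // low S result bits come from the top of RHS
}

// Reference model of fshl on i32 (high half of (LHS:RHS) << Amt).
inline uint32_t fshlRef32(uint32_t LHS, uint32_t RHS, unsigned Amt) {
  const uint64_t Concat = (static_cast<uint64_t>(LHS) << 32) | RHS;
  return static_cast<uint32_t>((Concat << (Amt % 32u)) >> 32);
}

// Compares the per-bit mapping against the reference model.
inline bool mappingMatches(uint32_t LHS, uint32_t RHS, unsigned Amt) {
  const uint32_t Result = fshlRef32(LHS, RHS, Amt);
  for (unsigned I = 0; I < 32; ++I) {
    const BitSource Src = fshlBitSource(I, Amt);
    const uint32_t From = Src.FromLHS ? LHS : RHS;
    if (((Result >> I) & 1u) != ((From >> Src.Index) & 1u))
      return false;
  }
  return true;
}
```

Running mappingMatches with amounts other than 16 (for example 1 or 5) is the kind of test the comment asks for, since a mapping that is inverted can still look correct when the amount is exactly half the bitwidth.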
Fix the negative shift amount issue. Still looking at better testing of fshl/fshr inside collectBitParts.
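As an illustration of the kind of guard the inline comment asked for, here is a minimal sketch in plain C++. It is not the check used in the patch, which is not shown here; matchRotateAmount is a hypothetical helper that models the constant shift amounts as signed 64-bit integers so that a value such as -33 can be represented directly.

```cpp
#include <cstdint>

// Returns the rotate-left amount if (ShlAmt, LShrAmt) form a valid rotate
// of a BitWidth-bit value, or -1 if the pair must not be combined.
inline int matchRotateAmount(int64_t ShlAmt, int64_t LShrAmt,
                             unsigned BitWidth) {
  const int64_t BW = static_cast<int64_t>(BitWidth);
  // Guard first: a shift by a negative amount or by BitWidth or more is
  // out of range (poison in LLVM IR), so reject before doing arithmetic.
  if (ShlAmt < 0 || ShlAmt >= BW || LShrAmt < 0 || LShrAmt >= BW)
    return -1;
  // The two in-range amounts must add up to exactly the bit width.
  if (ShlAmt + LShrAmt != BW)
    return -1;
  return static_cast<int>(ShlAmt);
}
```

Checking the range of each amount before looking at their sum means that a pair like -33 and +1 is rejected outright, regardless of how the later comparison is written.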
There's a 40% code size increase on CMakeFiles/7zip-benchmark.dir/CPP/7zip/Crypto/Sha1.cpp.o. It might be worth double-checking that we're not missing some optimizations on funnel shifts.
I think I've worked out what's happened: we've replaced a 3-instruction pattern (which was later matched to a single rotate instruction in the DAG) with 1 intrinsic (which matches to the same single rotate instruction). That seems to have allowed a couple of core loops to be unrolled further or completely, leading to the code size increase.