The optimization has been refactored to fix certain bugs and
limitations. The condition for lowering to S[LR]I has been changed
to reflect the manual pseudocode description of SLI and SRI operation.
The optimization can now handle more cases of operand type and order.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The problem with the original patch was that ElemMask would overflow, since it was only 32-bit wide. The patch now makes use of APInt to avoid this type of situation.
llvm/test/CodeGen/AArch64/arm64-sli-sri-opt.ll | ||
---|---|---|
1–2 | Please generate the test using update-llc-test-checks.py |
Thanks for the updates. Looks like it's picked up some new tricks since the last version.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
8017 | Can you add a comment saying this reconstructs the immediate from the BIC | |
8020 | I think if this used C1 = ~(C1nodeImm->getZExtValue() << C1nodeShift->getZExtValue()) here, then the IsAnd condition below can be removed. |
Addressed reviewers' comments.
Added comments, simplified some logic and updated llc tests via
update_llc_test_checks.py
Thanks. This looks good to me, but I remember Eli making some comments on the other patch. Please wait for him to comment again too.
LGTM
It's not obvious to me from reading the testcase that we have test coverage code for both the BICi and the isAllConstantBuildVector case; please check before merging.
Probably it would be worth looking into pattern-matching cases that don't involve precisely the AND you're looking for, using computeKnownBits. But that doesn't need to happen here.
Added a test for the isAllConstantBuildVector case. The BICi case happens
when the constant values' sizes are greater than 8 bits.
Can you add a comment saying this reconstructs the immediate from the BIC