This adds another pattern matcher to the combiner to generate the REV16 instruction for a case that we were not handling.
Details
Diff Detail
Event Timeline
llvm/test/CodeGen/Thumb2/thumb2-rev16.ll | ||
---|---|---|
2 | Might be best if you regenerate+commit this file against trunk and then rebase the patch so it shows the full codegen diff. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
5701 |
That is checked here, but I will add an assert to MatchBSwapHWordOrAndAnd |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
5655 | Use current formatting rules, so lowerCamel: "matchBSwap..." | |
5658 | Also assert that N->getOpcode() == ISD::OR ? | |
5667–5669 | This is too specific - what if the operands of the "or" are commuted? This patch should have a test for that possibility: define i32 @bswap_ror_commuted(i32 %a) { %l8 = shl i32 %a, 8 %r8 = lshr i32 %a, 8 %mask_l8 = and i32 %l8, 4278255360 %mask_r8 = and i32 %r8, 16711935 %tmp = or i32 %mask_r8, %mask_l8 ret i32 %tmp } | |
5674–5675 | These could use more descriptive names: ShiftAmt{1/2}. |
Thanks for looking! Comments addressed. I have also added more test for the cases that shouldn't trigger.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
5668–5671 | This check works, but it makes me nervous because I think it requires demanded bits to alter the code for correctness. define i32 @not_rev16(i32 %a) { %l8 = shl i32 %a, 8 %r8 = lshr i32 %a, 8 %mask_r8 = and i32 %r8, 4278255360 %mask_l8 = and i32 %l8, 16711935 %tmp = or i32 %mask_r8, %mask_l8 ret i32 %tmp } To be safe, I think we should enforce that the masks and shifts are paired correctly. // Canonicalize mask ops to ensure that shift-left operand is on the left. if (Mask2 == 0xff00ff00) { std::swap(N0, N1); std::swap(Mask1, Mask2) } or maybe better - go back to the earlier rev of this patch and call this function with the operands reversed: if (SDValue BSwap = matchBSwapHWordOrAndAnd(TLI, DAG, N, N0, N1, VT, getShiftAmountTy(VT))) return BSwap; // Try again with commuted operands. if (SDValue BSwap = matchBSwapHWordOrAndAnd(TLI, DAG, N, N1, N0, VT, getShiftAmountTy(VT))) return BSwap; |
Many thanks again for the feedback! Really liked that suggestion: so now calling the same function twice but with the operands swapped to check the commuted case. Have also added the suggested test case.
There's 1 more concern for this transform - do we need to limit it when the intermediate values have other uses? This will be interesting because the answer may be different for different targets. Ie, for ARM/Thumb/AArch64, we're able to reduce the whole sequence to a single rev16, so it's always a good transform. But for x86, we're going to need a bswap and ror.
So we need to:
- Add a test like this:
define i32 @extra_maskop_uses2(i32 %a) { %l8 = shl i32 %a, 8 %r8 = lshr i32 %a, 8 %mask_l8 = and i32 %l8, 4278255360 %mask_r8 = and i32 %r8, 16711935 %or = or i32 %mask_r8, %mask_l8 %mul = mul i32 %mask_r8, %mask_l8 ; use the mask ops for some other reason %r = mul i32 %mul, %or ; and use that result for some other reason ret i32 %r }
- Copy that test (maybe the whole test file) over to llvm/tests/CodeGen/X86 and generate CHECK lines with utils/update_llc_test_checks.py.
- Possibly add some 'hasOneUse()' logic to the code to avoid regressions.
Please push the test changes to trunk before updating in this review (no need for pre-commit review unless you have questions/concerns).
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
5664–5665 | Nit: the input variables are "0" and "1", but this shifts the indexing to "1" and "2". It would be better to make this consistent (same for the "Shift" variables below this). Most of the code in DAGCombiner uses "0"-based indexing. |
Thanks, I will first add the tests (will indeed add the whole test file). Will do that on Monday, and after that, will return and revisit this patch
Fixed the off-by-1 error in the variable naming, and added hasOneUse checks for the AND instructions.
LGTM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
5664 | This check seems over-protective. If only one of the 'and' ops has extra uses, the transform is probably still worth doing. It's ok to add a TODO comment here if you want and make that a small follow-up patch after adding more tests to exercise those cases. | |
llvm/test/CodeGen/X86/rev16.ll | ||
228 ↗ | (On Diff #244946) | Remove TODO comment - this is working as expected. |
Thanks, this has been a good and useful introduction to DAGCombine for me :-)
Before committing, I will add a TODO saying that this hasOneUse is a bit over-protective for now (will follow up later), and will remove the TODO from the test case.
Use current formatting rules, so lowerCamel: "matchBSwap..."
(can fix the related function names as a preliminary step)