Previously CCMP/FCCMP instructions were only used by the
AArch64ConditionalCompares pass for control flow. This patch uses them
for SELECT like instructions as well by matching patterns in ISelLowering.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I am currently performing llvm-testsuite runs with this change. I'm seeing a couple of swings in the positive and negative direction. I'll investigate these further.
I investigated the performance changes in depth showing the change improves things:
- Most of the changes were noise. I did not consider anything changing less than 0.3%
- 9 Benchmarks improve (between 0.3% and 1.7%)
- 2 Regressions I have seen stem from the fact that we generate code for the left operand of the or first while previously we would first produce it for the right operand of the OR, this led to a case of MachineGVN not firing and another case where the slightly more limited range of immediates on ccmp compared to cmp mattered. Changing the code to handle the right side of the AND/OR first improved out luck...
- There is a regression 464.h264ref (0.5%) and 400.perlbench(0.4%). I couldn't see any obvious reasons when comparing the generated code, in fact many places improved. So I'm blaming cache effects and/or measuring noise.
Hi Matthias,
Thanks for working on this, this looks great!
I have some comments below. The patch is also a bit big and touches different areas (ISel & tablegen) but I see there is no real way to test just the tablegen changes in isolation so I think this is OK.
I'd really like Tim to cast an eye over this before it goes in, so LGTM but hold off for his.
Cheers,
James
lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
771 ↗ | (On Diff #21639) | Have you missed FCCMP out here? |
1094 ↗ | (On Diff #21639) | Was this broken before, or are you changing something, or are you just making something more explicit? |
1140 ↗ | (On Diff #21639) | This would probably look nicer with the cast to ConstantSDNode taken out of the if: auto *C = dyn_cast<ConstantSDNode>(RHS.getOperand(0)); if (...) else if (RHS.getOpcode() == ISD::SUB && C && C->isNullValue() ... |
1154 ↗ | (On Diff #21639) | I'm wondering whether this should have a maximum recursion depth. I can imagine it entirely possible that some auto-generated code could produce a *huge* conjunction ladder, and might cause us to overflow here. |
1195 ↗ | (On Diff #21639) | Need brackets around this one-liner as the else case has brackets. |
lib/Target/AArch64/AArch64ISelLowering.h | ||
503 ↗ | (On Diff #21639) | & binds to the right. |
Thanks for the review James.
The attached version is updated to ToT and addresses your comments. I backed out the unrelated cleanup of the node type.
Hi Matthias,
Sorry this took so long. I couldn't spot any problems with the code either (after James's suggested cleanups).
Tim.