This is an archive of the discontinued LLVM Phabricator instance.

AArch64: Use CMP;CCMP sequences for and/or/setcc trees.
ClosedPublic

Authored by MatzeB on Mar 10 2015, 2:40 PM.

Details

Summary

Previously CCMP/FCCMP instructions were only used by the
AArch64ConditionalCompares pass for control flow. This patch uses them
for SELECT like instructions as well by matching patterns in ISelLowering.

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB updated this revision to Diff 21639.Mar 10 2015, 2:40 PM
MatzeB retitled this revision from to AArch64: Use CMP;CCMP sequences for and/or/setcc trees..
MatzeB updated this object.
MatzeB edited the test plan for this revision. (Show Details)
MatzeB added a reviewer: t.p.northover.
MatzeB set the repository for this revision to rL LLVM.
MatzeB added a subscriber: Unknown Object (MLST).
ab added a subscriber: ab.Mar 10 2015, 2:55 PM

I am currently performing llvm-testsuite runs with this change. I'm seeing a couple of swings in the positive and negative direction. I'll investigate these further.

I investigated the performance changes in depth showing the change improves things:

  • Most of the changes were noise. I did not consider anything changing less than 0.3%
  • 9 Benchmarks improve (between 0.3% and 1.7%)
  • 2 Regressions I have seen stem from the fact that we generate code for the left operand of the or first while previously we would first produce it for the right operand of the OR, this led to a case of MachineGVN not firing and another case where the slightly more limited range of immediates on ccmp compared to cmp mattered. Changing the code to handle the right side of the AND/OR first improved out luck...
  • There is a regression 464.h264ref (0.5%) and 400.perlbench(0.4%). I couldn't see any obvious reasons when comparing the generated code, in fact many places improved. So I'm blaming cache effects and/or measuring noise.
jmolloy added a subscriber: jmolloy.May 7 2015, 1:12 AM

Hi Matthias,

Thanks for working on this, this looks great!

I have some comments below. The patch is also a bit big and touches different areas (ISel & tablegen) but I see there is no real way to test just the tablegen changes in isolation so I think this is OK.

I'd really like Tim to cast an eye over this before it goes in, so LGTM but hold off for his.

Cheers,

James

lib/Target/AArch64/AArch64ISelLowering.cpp
771 ↗(On Diff #21639)

Have you missed FCCMP out here?

1094 ↗(On Diff #21639)

Was this broken before, or are you changing something, or are you just making something more explicit?

1140 ↗(On Diff #21639)

This would probably look nicer with the cast to ConstantSDNode taken out of the if:

auto *C = dyn_cast<ConstantSDNode>(RHS.getOperand(0));
if (...)
else if (RHS.getOpcode() == ISD::SUB && C && C->isNullValue() ...
1154 ↗(On Diff #21639)

I'm wondering whether this should have a maximum recursion depth. I can imagine it entirely possible that some auto-generated code could produce a *huge* conjunction ladder, and might cause us to overflow here.

1195 ↗(On Diff #21639)

Need brackets around this one-liner as the else case has brackets.

lib/Target/AArch64/AArch64ISelLowering.h
503 ↗(On Diff #21639)

& binds to the right.

MatzeB updated this revision to Diff 25249.May 7 2015, 3:09 PM

Thanks for the review James.

The attached version is updated to ToT and addresses your comments. I backed out the unrelated cleanup of the node type.

t.p.northover accepted this revision.Jun 1 2015, 10:47 AM
t.p.northover edited edge metadata.

Hi Matthias,

Sorry this took so long. I couldn't spot any problems with the code either (after James's suggested cleanups).

Tim.

This revision is now accepted and ready to land.Jun 1 2015, 10:47 AM
This revision was automatically updated to reflect the committed changes.