This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Peep SELECT_CC patterns that match smin(a,0) and smax(a,0).
ClosedPublic

Authored by cameron.mcinally on Apr 13 2023, 9:42 AM.

Details

Summary

GitHub Issue #61767

With a previous patch to canonicalize SPF to min/max intrinsics (a266af721153), we saw a 6% performance regression from our out-of-tree compiler on 372.smithwa.

This patch attempts to recover from the SPF canonicalization by peeping smin(a,0) and smax(a,0) SELECT_CC patterns during AArch64ISelLowering. We'll match the following patterns:

(SELECT_CC setgt, lhs, 0, lhs, 0) -> (BIC lhs, (SRA lhs, typesize-1))
(SELECT_CC setlt, lhs, 0, lhs, 0) -> (AND lhs, (SRA lhs, typesize-1))

Notice that there's some noise in the existing test cases. They haven't been run through update_llc_test_checks.py since the assembly comments for constants have been added. I will update those tests pre-commit if that will help.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2023, 9:42 AM
cameron.mcinally requested review of this revision.Apr 13 2023, 9:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2023, 9:42 AM
dmgreen added inline comments.Apr 13 2023, 1:59 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8945

Is this checking both operands for zero because they might still be different zeroes?

llvm/test/CodeGen/AArch64/min-max-peep.ll
2 ↗(On Diff #513275)

CHECK would never match both +cssc base cases, but the CSSC cases should all be the same between global and sdag, from the look of it. Can they share the same prefix and remove CHECK?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8945

I wasn't certain a Constant 0 would share the same SDValue across nodes and already had the ConstantSDNodes from the code above. That was the only motivation. What do you think?

llvm/test/CodeGen/AArch64/min-max-peep.ll
2 ↗(On Diff #513275)

I'll give that a shot.

I copied this pattern from the existing min-max.ll test, but I now see that it only differs for i8, i16, and vectors. This change is restricted to i32 and i64.

Update CSSC checks as requested.

cameron.mcinally marked an inline comment as done.Apr 13 2023, 2:54 PM
dmgreen accepted this revision.Apr 14 2023, 7:25 AM

Thanks. LGTM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8945

It's a little subtle but there is actually two different zeros, depending on whether the constant is opaque or not. I was originally thinking of suggesting the code just checked for RHS == FVal, but the way you have it sounds better.

llvm/test/CodeGen/AArch64/min-max-peep.ll
7 ↗(On Diff #513363)

I would use "peephole" or "combine", as peeps is not something I've seen elsewhere.

2 ↗(On Diff #513275)

Looks good thanks. It may be worth adding i8 and i16 sized too. They should get legalized to i32, so should work the same way,

This revision is now accepted and ready to land.Apr 14 2023, 7:25 AM
This revision was landed with ongoing or failed builds.Apr 14 2023, 9:51 AM
This revision was automatically updated to reflect the committed changes.

"peep" -> "combine" and i8 /i16 tests added. If you see any problems, I'll fix up post-commit. Thanks, David.