This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold comparison of usub.sat
ClosedPublic

Authored by 0xdc03 on Jun 30 2023, 5:11 AM.

Details

Summary

This patch introduces a fold for the operation "usub_sat(X, C) pred C2"
where "C" and "C2" are constants. The fold is:

usub_sat(X, C) pred C2
=> (X < C)  || ((X - C) pred C2) -> when (0 pred C2) is true
=> (X >= C) && ((X - C) pred C2) -> when (0 pred C2) is false

These expressions can generally be folded into a simpler expression. As
they can sometimes emit more than one instruction, they are limited to
cases where the "usub_sat" has only one user.

Fixes #58342.

Proofs: https://alive2.llvm.org/ce/z/ws_N2J

Diff Detail

Event Timeline

0xdc03 created this revision.Jun 30 2023, 5:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2023, 5:11 AM
0xdc03 requested review of this revision.Jun 30 2023, 5:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2023, 5:11 AM
nikic added a comment.Jun 30 2023, 7:16 AM

Could you please add alive2 proofs for these transforms?

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3440

m_ImmConstant

3733

Add comment explaining the transform, something like this:

//    usub.sat(X, C) pred C2
// -> (X < C ? 0 : X - C) pred C2
// -> X >= C && (X - C) pred C2 (if 0 pred C2 is false)
// -> X < C || (X - C) pred C2 (if 0 pred C2 is true)
3743

Some of these cases will end up producing 2 instructions (add + icmp), so I think we should add a one-use limitation on the intrinsic.

llvm/test/Transforms/InstCombine/icmp-usub-sat.ll
120

Add a multi-use and vector test.

Can you both add comment explaning at least the fold taking place in the code (where it takes place) and update the summary with an alive2 link + some description of the exact folds.

goldstein.w.n added inline comments.Jun 30 2023, 8:49 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3438

EqPred is kind of confusing (since its a non-equality pred), maybe NewPred?

3734

m_ImmConstant() here as well.

llvm/test/Transforms/InstCombine/icmp-usub-sat.ll
120

Can you split tests to seperate patch (make series with tests as Patch1 then impl as Patch2) so we can see the diff this patch causes.

0xdc03 updated this revision to Diff 536744.Jul 3 2023, 6:25 AM
  • Redo the broken code
  • Add comments explaining everything with derivations

Depends on D154342.

0xdc03 marked 7 inline comments as done.Jul 3 2023, 6:26 AM
0xdc03 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3438

Done!

3440

Done!

3733

Done!

3734

Done!

3743

Done!

llvm/test/Transforms/InstCombine/icmp-usub-sat.ll
120

Done! D154342

0xdc03 edited the summary of this revision. (Show Details)Jul 3 2023, 6:32 AM
0xdc03 edited the summary of this revision. (Show Details)
nikic added inline comments.Jul 3 2023, 6:36 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3362–3365

This should be if (match(Op1, m_APInt(COp1)) to match a constant int or int splat.

3377
3380

Check the return value -- does this actually always succeed?

It would have expected that you need to use the getEquivalentICmp overload that also provides an additional Offset (because that one *is* guaranteed to succeed).

3396

Do we ever actually take this fallback code patch? I searched for freeze, and i1 and or i1 in your tests without hits.

3433

Could fold this into foldICmpUSubSatWithConstant() and only impose the limitation if we actually need to produce more than one instruction.

0xdc03 updated this revision to Diff 536827.Jul 3 2023, 10:14 AM
0xdc03 marked 6 inline comments as done.
  • Address reviewer comments
  • Remove fold for non-splat constant vectors
0xdc03 marked 4 inline comments as done.Jul 3 2023, 10:18 AM
0xdc03 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3362–3365

Done!

3377

Done!

3380

Done! I am a bit concerned about the codegen this causes, maybe I have done something wrong with using Builder.CreateAdd? Looking at test cases like icmp_sle_basic_positive, there is an extra add instruction generated, which I would've expected to get folded. Could this be due to a lack of nuw/nsw?

3396

Yeah, I had not come up with a test case to exercise that code path, because I didn't realize that the fold is only called when splat vectors are given as operands to the icmp, and I have no tests to check non-splat vector in the usub.sat and a splat vector in the icmp.

3433

Done!

0xdc03 updated this revision to Diff 536831.Jul 3 2023, 10:21 AM
0xdc03 marked 4 inline comments as done.
  • Remove unnecessary header "ValueTracking.h" now that isGuaranteedNotToBeUndefOrPoison is not being used.
goldstein.w.n accepted this revision.Jul 3 2023, 11:37 AM

LGTM. Since nikic was the primary review give him a few days to potentially add more feedback before pushing.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3437

nit: extra space before "a <= b" compared to below.

This revision is now accepted and ready to land.Jul 3 2023, 11:37 AM
nikic added inline comments.Jul 3 2023, 1:09 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3366
3367

Prefer early return over nesting.

3369

This comment isn't correct.

3373
3379

Early return

3645–3657

In terms of overall code structure, I would prefer it if we added an extra switch (II->getIntrinsicID()) here, before the isEquality() handling, where we can handle folds that apply for both equality and non-equality comparisons. This is a recurring pattern, e.g. we've seen the same issue in D152677.

0xdc03 updated this revision to Diff 537043.Jul 4 2023, 4:08 AM
  • Address reviewer comments
    • Remove unnecessary nesting
    • Add an extra switch to foldICmpIntrinsicWithConstant
0xdc03 marked 7 inline comments as done and 2 inline comments as done.Jul 4 2023, 4:11 AM
0xdc03 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3369

Is this better: // Convert '(X - C) pred C2' into 'X pred C2' shifted by C.?

3380

I marked this as done, but I think its worth confirming this just in case you missed this.

3645–3657

Done!

0xdc03 marked an inline comment as done and an inline comment as not done.Jul 4 2023, 4:12 AM
nikic added inline comments.Jul 4 2023, 7:26 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3380

The codegen here is correct. This is how you do a two-sided range check (it's equivalent to something like X >= C1 && X <= C2).

3396

Redundant nullptr return...

3442

I don't think this comment adds value relative to the old one-liner...

3654

I'd move this compare() call into foldICmpUSubSatWithConstant(). Seems odd to split this operation out.

0xdc03 updated this revision to Diff 537245.Jul 5 2023, 12:49 AM
  • Address reviewer comments
    • Remove unnecessary comments
    • Move function further down for cleaner diff
    • Remove unnecessary parameter
0xdc03 marked 5 inline comments as done.Jul 5 2023, 12:53 AM
0xdc03 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3380

Ah, fair enough them.

3396

Oof ... :facepalm:

3442

Removed it!

3654

Ah, that was bad copy pasting from the older code! Sorry about that.

0xdc03 marked 4 inline comments as done.Jul 5 2023, 12:53 AM
0xdc03 updated this revision to Diff 537253.Jul 5 2023, 1:04 AM
  • Rebase on main
nikic accepted this revision.Jul 5 2023, 1:25 AM

LGTM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
3596
0xdc03 updated this revision to Diff 537283.Jul 5 2023, 3:20 AM
  • Rebase on updated test cases
  • Fix )
0xdc03 added a comment.Jul 5 2023, 5:35 AM

Test failure seems unrelated to the change:

Failed Tests (1):
  Clangd Unit Tests :: ./ClangdTests/ClangdServer/RespectsTweakFormatting
This revision was landed with ongoing or failed builds.Jul 5 2023, 6:27 AM
This revision was automatically updated to reflect the committed changes.