This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold signbit test of a pow2 or zero
ClosedPublic

Authored by junaire on Feb 24 2023, 10:28 PM.

Diff Detail

Event Timeline

junaire created this revision.Feb 24 2023, 10:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2023, 10:28 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
junaire requested review of this revision.Feb 24 2023, 10:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2023, 10:28 PM
goldstein.w.n added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1343

There are other power of 2 patterns. Maybe use isKnownPowerOf2?

goldstein.w.n added inline comments.Feb 24 2023, 11:07 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1343

disregard, misunderstood the patch.

goldstein.w.n added inline comments.Feb 24 2023, 11:09 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1343

m_Sub(m_Zero(), m_Value(X)) -> m_Neg(m_Value(X))?
Also do you need m_OneUse? This shouldn't ever create more instructions.

junaire updated this revision to Diff 500372.Feb 24 2023, 11:27 PM

Address comments.

junaire added inline comments.Feb 24 2023, 11:31 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1343

Also do you need m_OneUse? This shouldn't ever create more instructions.

I thought it was unnecessary at first, but then it cause a regression in smax_xor_pow2_neg,
so I added it.

@@ -137,7 +137,10 @@ define i8 @smax_xor_pow2_neg(i8 %x, i8 %y) {                                                                                      ; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[Y:%.*]], -128                                                                                             ; CHECK-NEXT:    br i1 [[CMP]], label [[NEG:%.*]], label [[POS:%.*]]                                                                                  ; CHECK:       neg:                                                                                                                                  -; CHECK-NEXT:    [[R:%.*]] = and i8 [[X:%.*]], 127                                                                                                   +; CHECK-NEXT:    [[NY:%.*]] = sub i8 0, [[Y]]                                                                                                        +; CHECK-NEXT:    [[YP2:%.*]] = and i8 [[NY]], [[Y]]                                                                                                  +; CHECK-NEXT:    [[X_XOR:%.*]] = xor i8 [[YP2]], [[X:%.*]]                                                                                           +; CHECK-NEXT:    [[R:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 [[X_XOR]])                                                                            ; CHECK-NEXT:    ret i8 [[R]]                                                                                                                         ; CHECK:       pos:                                                                                                                                   ; CHECK-NEXT:    call void @barrier()
junaire updated this revision to Diff 500383.Feb 25 2023, 12:18 AM

Update the comment.

Please can you add vector test coverage?

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1343

Use m_c_And (and add suitable test coverage)?

junaire updated this revision to Diff 500560.Feb 26 2023, 6:15 AM

Address comments, thanks @RKSimon

Please can you add vector test coverage?

Oops, missed this comment, will do.

spatel added inline comments.Feb 26 2023, 6:38 AM
llvm/test/Transforms/InstCombine/fold-signbit-test-power2.ll
14

Ths isn't testing what you expected because the and operands will be commuted before they reach the transform in this patch. This would be easier to see if you pre-commit the tests with baseline results (no pre-commit Phab review is needed for that NFC change).

grep for "thwart complexity-based canonicalization" in the test directory to see how to create a test that handles the commuted pattern.

36

See comment about commuting - same as above.

junaire updated this revision to Diff 500564.Feb 26 2023, 6:40 AM

Add vevtor tests.

junaire updated this revision to Diff 500587.Feb 26 2023, 8:11 AM

Hi @spatel, thanks for your comments, I updated the tests according to your
suggestions. However, everything stop folding after I use div instructions to
each oprands of and instruction.

Can you take a look? Is this because I missed something? Or my fold pattern is wrong.

Hi @spatel, thanks for your comments, I updated the tests according to your
suggestions. However, everything stop folding after I use div instructions to
each oprands of and instruction.

Can you take a look? Is this because I missed something? Or my fold pattern is wrong.

There shouldn't be any extra instruction between the sub (negate) and and. You just need one extra binary-op instruction to create "%x", so it stays operand 0 of the and. You don't need any extra instructions if you want "%x" to be operand 1.

I think it would be better to add the transform inside of InstCombinerImpl::foldICmpAndConstant().

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1341

"CI" should be "MinSignedC" or something like that in these comments and the code to be more specific.

Hi @spatel, thanks for your comments, I updated the tests according to your
suggestions. However, everything stop folding after I use div instructions to
each oprands of and instruction.

Can you take a look? Is this because I missed something? Or my fold pattern is wrong.

There shouldn't be any extra instruction between the sub (negate) and and. You just need one extra binary-op instruction to create "%x", so it stays operand 0 of the and. You don't need any extra instructions if you want "%x" to be operand 1.

I think it would be better to add the transform inside of InstCombinerImpl::foldICmpAndConstant().

Thanks for your comments, they are very helpful. I pushed an NFC change about the recommit test in https://github.com/llvm/llvm-project/commit/cf491a165f239abfa7ab9e707f5cbd1861a6cb20 and moved the fold pattern to the place you suggest.

Please take a look :)

We should have 2 more tests: (1) extra use of the sub and (2) extra use of the and.
As noted earlier, this patch should not require a "m_OneUse" limitation. I realize that it looks like a regression for the existing test, but that should be ok when viewed globally: we are reducing to an equality compare, and GVN, CVP, or some other pass will reduce that to the optimal form. It's just lucky (or unlucky) that the min/max folds added with D144606 are able to reduce it all within InstCombine.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1853

Formatting:

Value *X;
llvm/test/Transforms/InstCombine/fold-signbit-test-power2.ll
5

It's easier to follow the test progression if we make the test names slightly more specific.
This is a signbit test for "isNegative", so "pow2_or_zero_is_negative".
To provide a little more coverage, you could change some of the tests to include variations of that like:

define i1 @pow2_or_zero_is_negative(i8 %x) {
  %negx = sub i8 0, %x
  %pow2_or_zero = and i8 %negx, %x 
  %cmp = icmp ugt i8 %pow2_or_zero, 127
  ret i1 %cmp
}

That should already fold as expected without having to change anything in this patch, but it's good to show that we can handle it.

spatel added inline comments.Feb 27 2023, 5:40 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1826

Re-use this logic (put the new code inside of here)?

junaire updated this revision to Diff 500773.Feb 27 2023, 6:43 AM

Reuse existing code and add more tests.

Just a beginer to the middle-end so bear with me if I get things wrong!

spatel added inline comments.Feb 27 2023, 7:01 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1835–1836

It could go either way, but I prefer "C"-style notation (consistent with the comment above) for its compactness:

// (X & -X) <  0 --> X == MinSignedC
// (X & -X) > -1 --> X != MinSignedC
llvm/test/Transforms/InstCombine/fold-signbit-test-power2.ll
58

This is "is_not_negative"

105

Let's not duplicate everything for unsigned predicates. It's sufficient to change one or two of the above tests to confirm that we handle the various forms of signbit tests.

junaire updated this revision to Diff 500783.Feb 27 2023, 7:38 AM

Update tests.

junaire updated this revision to Diff 500785.Feb 27 2023, 7:42 AM

Update tests.

junaire updated this revision to Diff 500787.Feb 27 2023, 7:46 AM

Update tests

junaire updated this revision to Diff 500788.Feb 27 2023, 7:47 AM

Make sure the comments are properly aligned!

LGTM - thanks!
See inline comments for a small adjustment to the tests.

llvm/test/Transforms/InstCombine/fold-signbit-test-power2.ll
15

I'd rather not add this "is_not_negative / ult" instruction to this test. It is confusing given the test name.

40

I'd rather not add this "is_not_negative / ult" instruction to this test. It is confusing given the test name.

spatel accepted this revision.Feb 27 2023, 10:00 AM
This revision is now accepted and ready to land.Feb 27 2023, 10:00 AM
This revision was landed with ongoing or failed builds.Feb 27 2023, 11:53 PM
This revision was automatically updated to reflect the committed changes.

Thanks to everyone who helped me review the patch! My first contribution to the optimization world, nice!