This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Optimize test for same-sign of values
ClosedPublic

Authored by emgullufsen on Jun 15 2022, 1:30 PM.

Details

Summary

Add pattern-matching code for instructions testing for same-sign of two inputs, and reduce. e.g:

define i1 @samesign(i32 %x, i32 %y) {
  %a = and i32 %x, %y
  %lt = icmp slt i32 %a, 0
  %o = or i32 %x, %y
  %gt = icmp sgt i32 %o, -1
  %r = or i1 %lt, %gt
  ret i1 %r
}

becomes =>

define i1 @samesign(i32 %x, i32 %y) {
  %a = xor i32 %x, %y
  %r = icmp sgt i32 %a, -1
  ret i1 %r
}

alive2 example
godbolt

Also now covering the inverted form (example):

define i1 @samesign_inverted(i32 %x, i32 %y) {
  %a = and i32 %x, %y
  %gt = icmp sgt i32 %a, -1
  %o = or i32 %x, %y
  %lt = icmp slt i32 %o, 0
  %r = and i1 %gt, %lt
  ret i1 %r
}

becomes =>

define i1 @samesign_inverted(i32 %x, i32 %y) {
  %3 = xor i32 %x, %y
  %4 = icmp slt i32 %3, 0
  ret i1 %4
}

godbolt for inverted form
alive2 for inverted form
issue #55988

Diff Detail

Event Timeline

emgullufsen created this revision.Jun 15 2022, 1:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2022, 1:30 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
emgullufsen requested review of this revision.Jun 15 2022, 1:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2022, 1:30 PM

I am in the process of writing more regression tests, but I may need to ask a few questions on that - thank you for your help thus far.

emgullufsen edited the summary of this revision. (Show Details)Jun 15 2022, 1:39 PM
emgullufsen edited the summary of this revision. (Show Details)Jun 15 2022, 1:51 PM
spatel added inline comments.Jun 15 2022, 1:52 PM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2655

It's fine to ignore the inverted logic pattern that ends with an 'and' instruction, but please put a TODO comment on this, so we know it can be handled in a follow-up patch.

2660

Have a look at the m_Specific matcher - we really only have 2 variables in this pattern, so you can just call them X and Y as shown in your code comment.

To match commuted patterns, see m_c_Or / m_c_And.

Check this file for examples of existing code that uses those calls.

2668

ConstantInt::getAllOnesValue()

llvm/test/Transforms/InstCombine/same-sign-naive.ll
3 ↗(On Diff #437321)

You can tack more tests on to the end of related tests in "and-or-icmps.ll".
You can also get an idea about how to vary tests to get better coverage by looking at existing tests.
One common variation to test (and code for): what happens if the intermediate values have other uses?

emgullufsen added inline comments.Jun 15 2022, 3:04 PM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2655

Ah good call - will amend in updated patch (and will start looking at the inverted pattern (for a separate patch).

2660

ack - I ought to have thought to use the commutative m_c_ matchers - thank you, will amend in updated patch

2668

very cool thank you I knew there had to be a better way for that construction - will amend in updated patch

emgullufsen added inline comments.Jun 15 2022, 3:06 PM
llvm/test/Transforms/InstCombine/same-sign-naive.ll
3 ↗(On Diff #437321)

Ah I see - roger that - I'll have more tests in update patch - thank you!

emgullufsen marked an inline comment as not done.Jun 15 2022, 7:26 PM
emgullufsen edited the summary of this revision. (Show Details)

Updated patch to cover the inverted form as well and wrote 5 more tests.
Trying to ensure we only reduce iff intermediate values have one use only.

emgullufsen edited the summary of this revision. (Show Details)Jun 16 2022, 6:36 PM
emgullufsen edited the summary of this revision. (Show Details)
emgullufsen edited the summary of this revision. (Show Details)Jun 16 2022, 6:39 PM
emgullufsen marked 2 inline comments as done.Jun 17 2022, 6:00 AM
spatel edited the summary of this revision. (Show Details)Jun 17 2022, 8:14 AM

The uploaded diff is against the previous revision of this patch. Please rebase so it is against (and can be applied cleanly to) the main branch.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2649–2683

It doesn't make sense to use the commutative matcher on the first match - we're matching 2 arbitrary values, so it will succeed either way.

llvm/test/Transforms/InstCombine/and-or-icmps.ll
2103

We should have tests for commuted patterns like:

define i1 @samesign_commute1(i32 %x, i32 %y) {
	%a = and i32 %x, %y
	%lt = icmp slt i32 %a, 0
	%o = or i32 %x, %y
	%gt = icmp sgt i32 %o, -1
	%r = or i1 %gt, %lt  ; compares are swapped
	ret i1 %r
}

define i1 @samesign_commute2(i32 %x, i32 %y) {
	%a = and i32 %x, %y
	%lt = icmp slt i32 %a, 0
	%o = or i32 %y, %x  ; source values are commuted
	%gt = icmp sgt i32 %o, -1
	%r = or i1 %lt, %gt
	ret i1 %r
}

define i1 @samesign_commute3(i32 %x, i32 %y) {
	%a = and i32 %x, %y
	%lt = icmp slt i32 %a, 0
	%o = or i32 %y, %x  ; source values are commuted
	%gt = icmp sgt i32 %o, -1
	%r = or i1 %gt, %lt ; compares are swapped
	ret i1 %r
}
2156

We should have "negative" tests for patterns that fail to match one or more of the constraints. Example:

define i1 @samesign_inverted_wrong_cmp(i32 %x, i32 %y) {
  %a = and i32 %x, %y
  %gt = icmp sgt i32 %a, 0 ; this is not a sign-bit test
  %o = or i32 %x, %y
  %lt = icmp slt i32 %o, 0
  %r = and i1 %gt, %lt
  ret i1 %r
}

Ah of course - sorry for that - I will submit a new diff for this revision (including suggested changes) against the main branch (not previous diff).

emgullufsen added inline comments.Jun 17 2022, 11:00 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2649–2683

Good call, that makes sense - will amend in updated diff (which will be against main, not stacked on previous diff).

llvm/test/Transforms/InstCombine/and-or-icmps.ll
2103

Oh yes I see - cool thank you I will get these commuted tests included in next diff.

2156

Ah I see, gotcha - thanks for your help I really appreciate it!

emgullufsen marked 2 inline comments as not done.Jun 17 2022, 11:25 AM
emgullufsen marked an inline comment as not done.

Not using m_c matchers on first match and added "Negative" tests and tests for commutativity of inputs and compares.

A few of the tests contained incorrect syntax in the CHECK lines. They were re-capturing variables (i.e. [[x:%.*]]) instead of matching where needed [[x]].

Fixed more incorrect use of capture in CHECK line in one test.

Fixed one more incorrect use of capturing in CHECK line.

Fixed one more incorrect use of capturing in CHECK line.

Did you generate the CHECK lines manually? We don't want to do that generally.
Sorry - I should have mentioned this earlier, but we prefer to use a script that is shown on the first line of the test file :
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py

Oh dang my bad on that - I had actually read in the testing infra guide about those scripts, and really ought to have used them - I'll do so now.

Assertions in test CHECK lines now generated via utils/update_test_checks.py

Added four more negative tests that violate constraints.

Added tests to make sure other sign-bit tests are picked up as well

I committed the tests with baseline results, so we have those in place independently of the code change. I then updated the tests to provide a bit more coverage. Please rebase/update after:
feb4b336acc71

It's programmer preference, but I think it'd be easier to read with less IsAnd ? () :() constructs, so something like this:

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index cc06e2fd45db..4f8814475e12 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2647,6 +2647,38 @@ Value *InstCombinerImpl::foldAndOrOfICmps(ICmpInst *LHS, ICmpInst *RHS,
     }
   }
 
+  // Match naive pattern (and its inverted form) for checking if two values
+  // share same sign. An example of the pattern: (icmp slt (X & Y), 0) | (icmp
+  // sgt (X | Y), -1) -> (icmp sgt (X ^ Y), -1) Inverted form (example): (icmp
+  // slt (X | Y), 0) & (icmp sgt (X & Y), -1) -> (icmp slt (X ^ Y), 0)
+  bool TrueIfSignedL, TrueIfSignedR;
+  if (InstCombiner::isSignBitCheck(PredL, *LHSC, TrueIfSignedL) &&
+      InstCombiner::isSignBitCheck(PredR, *RHSC, TrueIfSignedR) &&
+      (RHS->hasOneUse() || LHS->hasOneUse())) {
+    Value *X, *Y;
+    if (IsAnd) {
+      if ((TrueIfSignedL && !TrueIfSignedR &&
+           match(LHS0, m_Or(m_Value(X), m_Value(Y))) &&
+           match(RHS0, m_c_And(m_Specific(X), m_Specific(Y)))) ||
+          (!TrueIfSignedL && TrueIfSignedR &&
+           match(LHS0, m_And(m_Value(X), m_Value(Y))) &&
+           match(RHS0, m_c_Or(m_Specific(X), m_Specific(Y))))) {
+        Value *NewXor = Builder.CreateXor(X, Y);
+        return Builder.CreateIsNeg(NewXor);
+      }
+    } else {
+      if (((TrueIfSignedL && !TrueIfSignedR &&
+            match(LHS0, m_And(m_Value(X), m_Value(Y))) &&
+            match(RHS0, m_c_Or(m_Specific(X), m_Specific(Y))))) ||
+          (!TrueIfSignedL && TrueIfSignedR &&
+           match(LHS0, m_Or(m_Value(X), m_Value(Y))) &&
+           match(RHS0, m_c_And(m_Specific(X), m_Specific(Y))))) {
+        Value *NewXor = Builder.CreateXor(X, Y);
+        return Builder.CreateIsNotNeg(NewXor);
+      }
+    }
+  }
+
   return foldAndOrOfICmpsUsingRanges(LHS, RHS, IsAnd);
 }
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2657

The use checks are too strict. This optimization should be a win as long as any one of the intermediate values has one use. We are creating 2 new values, so as long as we can remove one of the old ones + the final value, it is an improvement.

I'd make this:

(LHS->hasOneUse() || RHS->hasOneUse())

..and remove the use checks for LHS0 and RHS0 under here.

2671–2675

There's a Builder shortcut to create the new ICmp: CreateIsNeg / CreateIsNotNeg.

I committed the tests with baseline results, so we have those in place independently of the code change. I then updated the tests to provide a bit more coverage. Please rebase/update after:
feb4b336acc71

Sounds good will do, thank you!

It's programmer preference, but I think it'd be easier to read with less IsAnd ? () :() constructs, so something like this:

Sounds good, I prefer with less ternary operator as well - I had been a bit myopic about only building the returned instructions (calling Builder.CreateXor and the like) once. This is definitely nicer without so many conditionals - will amend in updated diff. Thank you for the diff (and copious help) as well!

emgullufsen added inline comments.Jun 19 2022, 7:53 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2657

Alrighty I see now - sounds good thank you - will amend in updated diff.

2671–2675

Aha very cool thank you - will amend in updated diff.

Prefer less IsAnd constructs in matching logic, update test results.

spatel accepted this revision.Jun 19 2022, 1:01 PM

LGTM

This revision is now accepted and ready to land.Jun 19 2022, 1:01 PM
This revision was landed with ongoing or failed builds.Jun 19 2022, 1:44 PM
This revision was automatically updated to reflect the committed changes.