This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] reduce test-for-overflow of shifted value
ClosedPublic

Authored by tianz on Aug 29 2022, 2:29 PM.

Details

Summary

Refer to issue #57338. The added code makes the following transformations:
For unsigned predicates / eq / ne:

icmp pred (x << 1), x --> icmp getSignedPredicate(pred) x, 0
icmp pred x, (x << 1) --> icmp getSignedPredicate(pred) 0, x

Some examples:
https://alive2.llvm.org/ce/z/ckn4cj
https://alive2.llvm.org/ce/z/h-4bAQ

Diff Detail

Event Timeline

tianz created this revision.Aug 29 2022, 2:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 29 2022, 2:29 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
tianz requested review of this revision.Aug 29 2022, 2:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 29 2022, 2:29 PM
tianz updated this revision to Diff 456728.Aug 30 2022, 11:06 AM

Rewrote with m_Specific

tianz updated this revision to Diff 456733.Aug 30 2022, 11:13 AM
This comment was removed by tianz.
tianz updated this revision to Diff 456735.Aug 30 2022, 11:18 AM

Removed unnecessary parentheses

spatel added inline comments.Aug 30 2022, 11:48 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
4556–4557

Nice! I couldn't tell if there was a simple way to express the transform for all predicates, but that looks right.
To verify, please add at least a couple of Alive2 proof links to the patch summary.

llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
13
  1. We can reduce the tests by 2x without losing much coverage: drop all of the existing vector tests, and change some of the scalar tests to vector types. You can also vary the scalar/element type from i8 to show that the transform doesn't care what the type is (can even use strange types like <3 x i7>).
  1. For vectors, you can replace a "1" element with "poison" to show that the transform works even if the vector constant is not fully defined (has "don't care" elements).
  1. We also want at least a couple of negative tests to verify that the transform does not fire when it is not intended (for example, a test with a signed predicate and a test where the shift is by something other than "1").
  1. Create an extra use of the shl value in at least one test, so we know that the transform works independent of other uses.
  1. If some other transform alters the code before it reaches the new transform, then adjustments will be needed (grep for "thwart complexity-based canonicalization" in this test dir to how to work around operands that get commuted.

Once you have the tests updated, let's pre-commit them with the baseline CHECK lines. That way we can confirm that we are testing the expected patterns.

tianz updated this revision to Diff 456871.Aug 31 2022, 12:01 AM

Updated tests according to review comments

tianz edited the summary of this revision. (Show Details)Aug 31 2022, 12:04 AM
tianz marked an inline comment as done.Aug 31 2022, 12:14 AM
tianz added inline comments.
llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
13

Thanks for all the suggestions! I've updated the tests according to #1 - #4. I don't find "thwart complexity-based canonicalization" necessary, but definitely correct me if I'm wrong.

For some of the tests, the result code is actually altered after the transform I added, so I had to modify the check to compare against the altered code (e.g. I transform uge i8 %x, %add into sge i8 0, %x, but the code is then altered by some other transform and eventually becomes slt i8 %x, 1 ) I'm sure what's the best way to handle that in the tests.

Regarding your last comment, I'm not exactly sure what you meant. Would you mind giving some examples on how to add the baseline CHECK lines?

Thanks a lot!

spatel added inline comments.Aug 31 2022, 6:17 AM
llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
13

Baseline tests: create a patch with only these tests and the current CHECK lines after running "utils/update_test_checks.py".
Then, we'll commit those to the main branch to show any transforms we already do (before the code change in this patch).
After that, rebase/update this patch after applying the code change. This patch will then show the test diffs created directly by this patch.

That helps in a few ways: it makes it clear what current output looks like, it makes the review easier by showing diffs, and we won't lose test coverage even if this patch has to be reverted for some reason.

If you don't have commit privileges, you can post the baseline tests as is its own review here on Phabricator, and I can push it for you.

The tests with commuted operands (half of the tests) are not exercising the code path that you want because the operands are changed before we reach the new transform. Example:

define <2 x i1> @icmp_shl_ugt_2(<2 x i32> %x) {
  %add = shl <2 x i32> %x, <i32 1, i32 1>
  %cmp = icmp ugt <2 x i32> %x, %add
  ret <2 x i1> %cmp
}

% opt -instcombine icmp.ll -S
...
define <2 x i1> @icmp_shl_ugt_2(<2 x i32> %x) {
  %add = shl <2 x i32> %x, <i32 1, i32 1>
  %cmp = icmp ult <2 x i32> %add, %x  ; these got switched based on operand "complexity"
  ret <2 x i1> %cmp
}
tianz added inline comments.Sep 1 2022, 8:14 PM
llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
13

Thanks for the explanation! Here are the baseline tests https://reviews.llvm.org/D133182. I updated them with "thwart complexity-based canonicalization" when %add is the second operand. Please push it for me it they look good to you, since I don't have commit privileges yet. I'll then update the tests here as well!

tianz updated this revision to Diff 457704.Sep 2 2022, 2:23 PM
tianz marked 2 inline comments as done.
tianz added inline comments.
llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
13

I have updated the tests here as well!

spatel accepted this revision.Sep 3 2022, 5:22 AM

LGTM

llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
117

You changed this test (and one below) because there's a follow-on transform that would fold the add into the equality comparison?
It's fine to demonstrate a sequence of transforms as long as we're starting from the transform in this patch.

In fact, this result shows that we missed another optimization - the result is 0 when the input is signed > 42 or signed < -42, and that can be reduced to:
https://alive2.llvm.org/ce/z/erdvwc

This revision is now accepted and ready to land.Sep 3 2022, 5:22 AM
spatel added inline comments.Sep 3 2022, 5:48 AM
llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
117
spatel added inline comments.Sep 3 2022, 5:50 AM
llvm/test/Transforms/InstCombine/icmp-shl-1-overflow.ll
117

Oops - copied the wrong link. Should be:
https://github.com/llvm/llvm-project/issues/57542

This revision was automatically updated to reflect the committed changes.