This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] C0 >>{ashr, exact} (X - C1) --> (C0 << C1) >>{ashr} X
ClosedPublic

Authored by nico-abram on Apr 25 2022, 2:52 AM.

Details

Summary

As described in https://github.com/llvm/llvm-project/issues/55016
https://alive2.llvm.org/ce/z/pax7DF

Also includes C0 >> (X - C1) --> (C0 << C1) >> X for positive C0 , which was mentioned in the gh issue as the more common case
https://alive2.llvm.org/ce/z/sRq7w9

Diff Detail

Event Timeline

nico-abram created this revision.Apr 25 2022, 2:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 2:52 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
nico-abram requested review of this revision.Apr 25 2022, 2:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 2:52 AM
  1. Please write a more general proof for these transforms. See examples in the comments here: https://github.com/llvm/llvm-project/issues/54890
  2. Does the lshr transform require "exact" or not?
  3. We need several tests - including "negative" tests that show that the transform does NOT occur when the pre-conditions are not met. You can use the existing tests for the 'shl' transform as a template (those are just below the tests added in this patch).
  4. It would be better to create a small helper function rather than duplicate code for these transforms - can this live under InstCombinerImpl::commonShiftTransforms()? Would it include the existing shl transform too?

This is an alternate proof that I came up with for the previous patch:
https://alive2.llvm.org/ce/z/pnXQYR

Would it be easier to make a common shift transform if we implemented it using count leading/trailing zeros?

nico-abram added a comment.EditedApr 25 2022, 11:21 AM
  1. Please write a more general proof for these transforms. See examples in the comments here: https://github.com/llvm/llvm-project/issues/54890
  2. Does the lshr transform require "exact" or not?

Is this enough proof for the ashr case? https://alive2.llvm.org/ce/z/iLuDyE
EDIT: This formulation is much simpler: https://alive2.llvm.org/ce/z/pax7DF

I don't think the lshr without exact is correct (It becomes more poisonous according to alive2), and it looks like we can only do it for positive numbers (which makes sense to me): https://alive2.llvm.org/ce/z/sRq7w9

nico-abram edited the summary of this revision. (Show Details)

Updating the diff with the condition change and more tests. Will see about avoiding the duplication and moving into commonShiftTransforms

I collapsed all three into InstCombinerImpl::commonShiftTransforms, right below the pre-existing pre-shift combine.

Had to add setHasNoSignedWrap(false); for the Shl case. Otherwise the @shl_nsw_add_negative test case was generating a shl nsw i32 1 instead of shl i32 1. Not entirely sure why, possibly because I'm genering the Op with BinaryOperator::Create(I.getOpcode()?

I collapsed all three into InstCombinerImpl::commonShiftTransforms, right below the pre-existing pre-shift combine.

Nice!

Had to add setHasNoSignedWrap(false); for the Shl case. Otherwise the @shl_nsw_add_negative test case was generating a shl nsw i32 1 instead of shl i32 1. Not entirely sure why, possibly because I'm genering the Op with BinaryOperator::Create(I.getOpcode()?

Hmm - I'm not sure what that is about. Let's pre-commit the tests with baseline CHECK lines to confirm that we are testing the expected patterns. First, we need to add at least 2 more tests though that do not have "exact" - we want to verify that this transform does not fire without "exact".
If you do not have commit access yet, let me know, and I'll push the tests for you.

llvm/test/Transforms/InstCombine/shift-add.ll
204

If the constant is not negative, the ashr will get converted to lshr before we hit the new transform, so this test is not providing the coverage that we want. How about something like this:

%a = add i8 %x, -4
%r = ashr exact i8 -112, %a ; 0b1001_0000

Updated the diff with the 2 tests to ensure the opt is not applied on non-exact ashr/lshr ashr_add_negative_shift_no_exact and lshr_add_negative_shift_no_exact.

To be sure I understand correctly: You mean adding just the tests, and have the same kind of CHECK-NEXT FileCheck directives, but with the expected input IR i.e the same IR the tests contain, to verify the inputs are not being affected by unrelated optimizations? I'm attaching baseline.patch

with that changeset. If needed I can open a new revision for the baseline test cases, or just update the diff of this one. I ran the tests locally on the baseline and they passed.

I do not have commit access, so it would be great if you could push for me. Please use "Nicolas Abram Lujan abramlujan@gmail.com" if a name/email is needed.

To be sure I understand correctly: You mean adding just the tests, and have the same kind of CHECK-NEXT FileCheck directives, but with the expected input IR i.e the same IR the tests contain, to verify the inputs are not being affected by unrelated optimizations?

That is correct - we want to have all of the tests in place before this patch with the current output tested. That way if this patch has to be reverted we will still have test coverage in place for the transform. It also makes it easier to see exactly what is changing with this patch (and what is intentionally not changing).

I'm attaching baseline.patch

with that changeset. If needed I can open a new revision for the baseline test cases, or just update the diff of this one. I ran the tests locally on the baseline and they passed.

I do not have commit access, so it would be great if you could push for me. Please use "Nicolas Abram Lujan abramlujan@gmail.com" if a name/email is needed.

I made several changes/additions to that set of tests and pushed that patch here:
fd9026131e6c

Please have a look, then update this patch by running update_test_checks.py on those tests and re-uploading.

We should not need to set NewShiftOp->setHasNoSignedWrap(false);. If you are seeing a test difference with that line in place, we need to investigate further.

nico-abram updated this revision to Diff 425522.EditedApr 27 2022, 8:04 AM

Updated the patch with update_test_checks.py run with the changes applied on top of https://reviews.llvm.org/rGfd9026131e6c5de6f4e582d5c85c5b5588672dfb
Looks like what I'd expect. It's optimizing it through the opaque use call, but that doesn't look like a problem to me.

I also removed the setHasNoSignedWrap(false) and it did not change the test results or the update_test_checks changes. Not sure what I was seeing yesterday.

Updated the patch with update_test_checks.py run with the changes applied on top of https://reviews.llvm.org/rGfd9026131e6c5de6f4e582d5c85c5b5588672dfb
Looks like what I'd expect. It's optimizing it through the opaque use call, but that doesn't look like a problem to me.

I also removed the setHasNoSignedWrap(false) and it did not change the test results or the upodate_test_checks changes. Not sure what I was seeing yesterday.

Please rebase this patch on latest main - that way we'll just see the diffs in the tests.

I think this is correct and sufficiently tested now. See inline comments for a couple of nits. Once those are addressed, I can push the patch on your behalf.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
426–427

This is more verbose than needed - if we're using &I, then we can just grab the opcode from it via I.getOpcode().
So we can either pass things in as params or use refs, instead of partly using both ways?

auto isSuitableForPreShift = [](Instruction &I, const APInt *AC, unsigned PosOffset) {
auto isSuitableForPreShift = [&I, AC, PosOffset]() {
458

Instead of a lambda + switch for setting the flags, I'd prefer the shorter and more direct:

if (I.getOpcode() == Instruction::Shl) {
   NewShiftOp->setHasNoUnsignedWrap(I.hasNoUnsignedWrap());
else
   NewShiftOp->setIsExact();

That might tilt the whole thing towards a lambda-less implementation, but you can decide which is easier to read.

nico-abram marked an inline comment as done.

Addressed comments, I kept the lambda.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 27 2022, 11:21 AM
This revision was automatically updated to reflect the committed changes.