This is an archive of the discontinued LLVM Phabricator instance.

[Transform] Improve fold of sadd.with.overflow
ClosedPublic

Authored by dlrobertson on Mar 2 2019, 6:18 PM.

Diff Detail

Event Timeline

dlrobertson created this revision.Mar 2 2019, 6:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2019, 6:18 PM
dlrobertson marked 2 inline comments as done.Mar 2 2019, 6:23 PM

Comments and feedback would be very much appreciated. I did my best to follow https://llvm.org/docs/Contributing.html and https://llvm.org/docs/Phabricator.html, but I might have missed something. Please let me know if I did.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2045

I was not entirely sure how to solve the last case. sadd.with.overflow returns an aggregate, while add nsw does not. I was not sure how to construct the aggregate, so I did not address the last case in my first attempt.

2064

This should be similar, but since I'm *very* new to LLVM, I'd like to keep this diff as small in scope as possible. I'd like to address uadd.with.overflow in a follow-up.

nikic added inline comments.Mar 3 2019, 1:46 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1825

nit: Unnecessary braces.

2045

You can construct the aggregate with the CreateOverflowTuple helper, with overflow set to a false constant.

It's probably better to leave it out in this patch though, as this is quite a different optimization. Generally I'd drop all the sign/range checks here, as the fold is valid in either case -- there's just an even better one possible that's not implemented yet.

2048

nit: The variables used in the matchers don't need to be initialized, and it's usually not done.

2050

It is better to use an m_APInt matcher for these, because it will also work with (splat) vector constants (and you're working on APInts below anyway).

2055

Consider something like %y = add nsw i8 %x, 127; sadd.with.overflow(%y, 127). The constants have same sign and you will transform this to sadd.with.overflow(%x, 127+127), where 127+127 = -2.

This transform can't be applied if the result of the addition overflows. You can use sadd_ov on APInt to check for that.

llvm/test/Transforms/InstCombine/call-add-with-overflow.ll
1 ↗(On Diff #189067)

To make future diffs easier, it's preferred to automatically generate checks using the utils/update_test_checks.py script.

22 ↗(On Diff #189067)

As we recently added vector support for these intrinsics, we should also have some vector tests. Good cases to cover are:

  • Splat constant <2 x i32> <i32 42, i32 42> (will work out of the box once you switch to m_APInt)
  • Splat constant with undef <2 x i32> <i32 42, i32 undef> (will also work out of the box)
  • Non-splat constant <2 x i32> <i32 42, i32 24> (will not work, and that's okay)

Additionally, the overflowing case mentioned above should be checked, as a negative example. The inverted case here with -7 and 13 rather than the other way around, would also be good.

nagisa added a subscriber: nagisa.Mar 3 2019, 3:03 AM
nagisa added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2043

This last case can also be folded into an overflow call. While not optimal, it "feels" to me that we better do that, rather than skipping this case entirely, if just for now.

dlrobertson marked 12 inline comments as done.Mar 3 2019, 4:46 PM
dlrobertson added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2043

Per the comment below I'll update this in a follow up PR if that is okay.

2045

Ah! That makes perfect sense. Thanks!

It's probably better to leave it out in this patch though, as this is quite a different optimization.

Will do.

llvm/test/Transforms/InstCombine/call-add-with-overflow.ll
1 ↗(On Diff #189067)

Thanks!

22 ↗(On Diff #189067)

Splat constant with undef <2 x i32> <i32 42, i32 undef> (will also work out of the box)

The test I wrote for this did not fold, so I may have done something wrong.

dlrobertson marked 4 inline comments as done.
  • Addressed nits
  • Use m_APInt instead of ConstantInt
  • Updated tests
  • Removed special cases (Will add them in a follow-up)
dlrobertson marked an inline comment as done.Mar 3 2019, 5:01 PM
dlrobertson added inline comments.
llvm/test/Transforms/InstCombine/call-add-with-overflow.ll
58 ↗(On Diff #189104)

Note: this did not fold.

nikic added inline comments.Mar 4 2019, 12:40 AM
llvm/test/Transforms/InstCombine/call-add-with-overflow.ll
58 ↗(On Diff #189104)

Ooops, my bad. I was under the impression that m_APInt ignores undefs, but apparently only matchers for specific constants like m_Zero() do that. Something we might want to improve.

71 ↗(On Diff #189104)

One more test that would be good to have is a negative test for add (without nsw) with sadd. Could also have add nuw (no fold) and add nuw nsw (fold) variants to be thorough.

lebedev.ri added inline comments.
llvm/test/Transforms/InstCombine/call-add-with-overflow.ll
58 ↗(On Diff #189104)

I tried to touch upon that in D47983.
It would require gradual migration.

  • Add tests
ormris removed a subscriber: ormris.Mar 4 2019, 10:34 AM
nikic added a comment.Mar 4 2019, 11:40 AM

I've committed your tests with the current output in rL355328. Can you please rebase this revision over the committed tests, so that only the changes made by the new transform remain?

  • Rebase on master

I've committed your tests with the current output in rL355328. Can you please rebase this revision over the committed tests, so that only the changes made by the new transform remain?

Is this the preferred way to add tests? E.g. for the future changes I make for uadd.with.overflow, should I add the tests in a diff and then follow the diff up with another that also contains code changes?

I've committed your tests with the current output in rL355328. Can you please rebase this revision over the committed tests, so that only the changes made by the new transform remain?

Is this the preferred way to add tests? E.g. for the future changes I make for uadd.with.overflow, should I add the tests in a diff and then follow the diff up with another that also contains code changes?

Yes, absolutely.

  1. That makes the diff readable. The changes will be visible, not just the test addition.
  2. That helps with test coverage. You want those tests regardless of whether the change itself will land or be reverted.
nikic accepted this revision.Mar 4 2019, 11:51 PM

LGTM. @spatel Do you have anything to add?

This revision is now accepted and ready to land.Mar 4 2019, 11:51 PM

Yes, absolutely.

  1. That makes the diff readable. The changes will be visible, not just the test addition.
  2. That helps with test coverage. You want those tests regardless of whether the change itself will land or be reverted.

Yeah that makes sense. Thanks for the explanation. I'll try to do this for future changes.

spatel accepted this revision.Mar 5 2019, 6:26 AM

LGTM - see inline for a couple of minor points.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2038–2040

Nit: prefer to give constant values a 'C' name and write a code comment using those names to describe the transform.
So something like:

// If we have 2 constant operands whose sum does not overflow:
// saddo (X +nsw C0), C1 --> saddo X, C0+C1
const APInt *C0, *C1;
if (match(...))
llvm/test/Transforms/InstCombine/sadd-with-overflow.ll
45–47

Nit: Ideally, the positive test would use constants at the limit, and the negative test would use constants just over that limit. So the 1st test would be something like:

%2 = add nsw i8 %0, 100
%3 = tail call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 %2, i8 27)
ret { i8, i1 } %3

and this one:

%2 = add nsw i8 %0, 100
%3 = tail call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 %2, i8 28)
ret { i8, i1 } %3

That gives us just a bit more coverage in case the underlying code breaks somehow, and makes it clearer how the transform is implemented.

  • Fix nits
  • Update tests
dlrobertson marked 3 inline comments as done.Mar 5 2019, 6:56 PM
dlrobertson added inline comments.
llvm/test/Transforms/InstCombine/sadd-with-overflow.ll
45–47

Good point. Added in the update.

This revision was automatically updated to reflect the committed changes.