Fold add nsw + sadd.with.overflow with constants in some cases.
Details
Diff Detail
Event Timeline
Comments and feedback would be very much appreciated. I did my best to follow https://llvm.org/docs/Contributing.html and https://llvm.org/docs/Phabricator.html, but I might have missed something. Please let me know if I did.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2045 | I was not entirely sure how to solve the last case. sadd.with.overflow returns an aggregate, while add nsw does not. I was not sure how to construct the aggregate, so I did not address the last case in my first attempt. | |
2063 | This should be similar, but since I'm *very* new to LLVM, I'd like to keep this diff as small in scope as possible. I'd like to address uadd.with.overflow in a follow-up. |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1825 | nit: Unnecessary braces. | |
2045 | You can construct the aggregate with the CreateOverflowTuple helper, with overflow set to a false constant. It's probably better to leave it out in this patch though, as this is quite a different optimization. Generally I'd drop all the sign/range checks here, as the fold is valid in either case -- there's just an even better one possible that's not implemented yet. | |
2048 | nit: The variables used in the matchers don't need to be initialized, and it's usually not done. | |
2050 | It is better to use an m_APInt matcher for these, because it will also work with (splat) vector constants (and you're working on APInts below anyway). | |
2055 | Consider something like %y = add nsw i8 %x, 127; sadd.with.overflow(%y, 127). The constants have same sign and you will transform this to sadd.with.overflow(%x, 127+127), where 127+127 = -2. This transform can't be applied if the result of the addition overflows. You can use sadd_ov on APInt to check for that. | |
llvm/test/Transforms/InstCombine/call-add-with-overflow.ll | ||
1 ↗ | (On Diff #189067) | To make future diffs easier, it's preferred to automatically generate checks using the utils/update_test_checks.py script. |
22 ↗ | (On Diff #189067) | As we recently added vector support for these intrinsics, we should also have some vector tests. Good cases to cover are:
Additionally, the overflowing case mentioned above should be checked, as a negative example. The inverted case here with -7 and 13 rather than the other way around, would also be good. |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2043 | This last case can also be folded into an overflow call. While not optimal, it "feels" to me that we better do that, rather than skipping this case entirely, if just for now. |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2043 | Per the comment below I'll update this in a follow up PR if that is okay. | |
2045 | Ah! That makes perfect sense. Thanks!
Will do. | |
llvm/test/Transforms/InstCombine/call-add-with-overflow.ll | ||
1 ↗ | (On Diff #189067) | Thanks! |
22 ↗ | (On Diff #189067) |
The test I wrote for this did not fold, so I may have done something wrong. |
- Addressed nits
- Use m_APInt instead of ConstantInt
- Updated tests
- Removed special cases (Will add them in a follow-up)
llvm/test/Transforms/InstCombine/call-add-with-overflow.ll | ||
---|---|---|
58 ↗ | (On Diff #189104) | Note: this did not fold. |
llvm/test/Transforms/InstCombine/call-add-with-overflow.ll | ||
---|---|---|
58 ↗ | (On Diff #189104) | Ooops, my bad. I was under the impression that m_APInt ignores undefs, but apparently only matchers for specific constants like m_Zero() do that. Something we might want to improve. |
71 ↗ | (On Diff #189104) | One more test that would be good to have is a negative test for add (without nsw) with sadd. Could also have add nuw (no fold) and add nuw nsw (fold) variants to be thorough. |
I've committed your tests with the current output in rL355328. Can you please rebase this revision over the committed tests, so that only the changes made by the new transform remain?
I've committed your tests with the current output in rL355328. Can you please rebase this revision over the committed tests, so that only the changes made by the new transform remain?
Is this the preferred way to add tests? E.g. for the future changes I make for uadd.with.overflow, should I add the tests in a diff and then follow the diff up with another that also contains code changes?
Yes, absolutely.
- That makes the diff readable. The changes will be visible, not just the test addition.
- That helps with test coverage. You want those tests regardless of whether the change itself will land or be reverted.
Yeah that makes sense. Thanks for the explanation. I'll try to do this for future changes.
LGTM - see inline for a couple of minor points.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2038–2040 | Nit: prefer to give constant values a 'C' name and write a code comment using those names to describe the transform. // If we have 2 constant operands whose sum does not overflow: // saddo (X +nsw C0), C1 --> saddo X, C0+C1 const APInt *C0, *C1; if (match(...)) | |
llvm/test/Transforms/InstCombine/sadd-with-overflow.ll | ||
36–37 ↗ | (On Diff #189228) | Nit: Ideally, the positive test would use constants at the limit, and the negative test would use constants just over that limit. So the 1st test would be something like: %2 = add nsw i8 %0, 100 %3 = tail call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 %2, i8 27) ret { i8, i1 } %3 and this one: %2 = add nsw i8 %0, 100 %3 = tail call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 %2, i8 28) ret { i8, i1 } %3 That gives us just a bit more coverage in case the underlying code breaks somehow, and makes it clearer how the transform is implemented. |
llvm/test/Transforms/InstCombine/sadd-with-overflow.ll | ||
---|---|---|
36–37 ↗ | (On Diff #189228) | Good point. Added in the update. |
nit: Unnecessary braces.