This is an archive of the discontinued LLVM Phabricator instance.

[InstructionSimplify] Add support for saturating add/sub
ClosedPublic

Authored by nikic on Nov 14 2018, 8:46 AM.

Details

Summary

Add support for saturating add/sub in InstructionSimplify. In particular, the following simplifications are supported:

  • sat(X + 0) -> X
  • sat(X + undef) -> -1
  • sat(X uadd MAX) -> MAX
  • (and commutative variants)
  • sat(X - 0) -> X
  • sat(X - X) -> 0
  • sat(X - undef) -> 0
  • sat(undef - X) -> 0
  • sat(0 usub X) -> 0
  • sat(X usub MAX) -> 0

Diff Detail

Event Timeline

nikic created this revision.Nov 14 2018, 8:46 AM
spatel added a subscriber: spatel.Nov 15 2018, 7:11 AM
spatel added inline comments.
test/Transforms/InstSimplify/saturating-add-sub.ll
62

This result doesn't match the scalar logic for undefs. I didn't step through it, but I'm guessing it's because <i8 undef, i8, undef> matched successfully with m_AllOnes(). Not sure if that should be considered a bug or not.

Generally, we do folds with undef operands before anything else, so we don't see that potential difference. If that doesn't resolve it, then there's definitely an underlying bug in m_Undef().

Nit 1: I prefer that each test is its own function. I had to step back through the CHECK lines twice to make sure I wasn't seeing things. If each test was its own function, you could interleave scalar/vector, and it would be more obvious if there was a difference between those.

Nit 2: Please commit the tests to trunk with baseline CHECKs as a preliminary commit to the actual code commit. That way, we won't lose the test coverage even if the code patch is reverted for some reason.

spatel added inline comments.
test/Transforms/InstSimplify/saturating-add-sub.ll
76

Looking at the matcher implementation, just rearranging the code in this patch isn't going to cover up the discrepancy between scalar and vector. If you want to ignore that to make progress here, just adjust this test with the standard spelling of vector undef:

%x4 = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %a, <2 x i8> undef)
nikic added inline comments.Nov 16 2018, 7:17 AM
test/Transforms/InstSimplify/saturating-add-sub.ll
62

I think that the matcher issue is a bug, and have submitted D54631 to fix it. At least this behavior should be consistent, i.e. either m_AllOnes() matches both scalar and vector undef, or it matches neither. Right now it matches vector, but not scalar, which is neither here nor there...

I also checked whether using <2 x i8> undef allows us to work around the problem for the purposes of this test, but it does not change the result. I'm assuming that <2 x i8> <i8 undef, i8 undef> gets normalized to <2 x i8> undef at some point.

Changing the check order so that the undef check comes first would work, but I'd prefer to fix the matcher implementation than have this somewhat subtle workaround.

Regarding commits, I don't have commit access so can't apply those directly. Should I open another patch that includes just the baseline checks?

spatel added inline comments.Nov 16 2018, 7:34 AM
test/Transforms/InstSimplify/saturating-add-sub.ll
62

Thanks - I wasn't sure if you were willing to fix the underlying problem, so that's the reason I suggested the work-around for this patch.

You're correct that <2 x i8> <i8 undef, i8 undef> becomes <2 x i8> undef. That should be somewhere in ConstantFolding.

I will commit the baseline test files on your behalf shortly. Then, you can rebase D54631 and this one.

spatel added inline comments.Nov 16 2018, 8:38 AM
test/Transforms/InstSimplify/saturating-add-sub.ll
170

That should be "ssub" not "usub". Have a look at:
rL347060
and make sure I didn't introduce any typos while translating the tests to separate functions.

nikic updated this revision to Diff 174391.Nov 16 2018, 9:47 AM

Rebase over baseline test.

nikic added inline comments.Nov 16 2018, 9:53 AM
test/Transforms/InstSimplify/saturating-add-sub.ll
170

Thanks a lot for splitting up the tests! I've now rebased and updated the test output. I made two minor adjustments in the tests: In usub_vector_undef I changed the RHS to undef,undef (as 0,undef is effectively 0). In usub_vector_undef_commute I moved the undef operand to the LHS.

spatel added inline comments.Nov 18 2018, 7:04 AM
lib/Analysis/InstructionSimplify.cpp
4925–4928

It's slightly better to return a constant when one of the operands is undef because that eliminates the use of a variable. That also matches the behavior of the subtracts.

For sadd_sat, we can always return 0, and for uadd_sat, we can always return -1?

nikic added inline comments.Nov 18 2018, 7:25 AM
lib/Analysis/InstructionSimplify.cpp
4925–4928

The question of undef handling is also discussed here: https://reviews.llvm.org/D54237#1294571

The problem is that we can't return 0 for sadd_sat, due to the asymmetric range for signed integers. If one operand is -128, the largest number you can reach is -128 + 127 = -1, not 0. That's why I'm returning the other operand here, so that handling is consistent for uadd/sadd.

But ... now that I think about it, shouldn't it be legal to return -1 for *both* the signed and the unsigned case?

spatel added a subscriber: nlopes.Nov 18 2018, 7:41 AM
spatel added inline comments.
lib/Analysis/InstructionSimplify.cpp
4925–4928

Return -1 for both sounds good to me. cc'ing @nlopes in case he sees an alternative.

In the earlier discussion, you said (uadd_sat X, undef) could return 0, but I don't think that's possible. Say for example that X=42. There's no undef value that we can add to get us to 0 (or any value <42). uadd_sat must go to all-ones or choose "0" as the undef and return X.

nikic updated this revision to Diff 174541.Nov 18 2018, 7:48 AM
nikic edited the summary of this revision. (Show Details)

Return -1 for both signed and unsigned saturating addition with undef operand.

spatel accepted this revision.Nov 19 2018, 10:37 AM

LGTM.

Have you requested commit privileges?

This revision is now accepted and ready to land.Nov 19 2018, 10:37 AM
nikic added a comment.Nov 20 2018, 3:06 AM

I've requested commit access, but would prefer it if you could commit these patches in the meantime, to avoid holding things up for too long.

This revision was automatically updated to reflect the committed changes.