This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] ssubo X, C -> saddo X, -C
ClosedPublic

Authored by dlrobertson on Apr 1 2019, 5:43 AM.

Details

Summary

ssubo X, C is equivalent to saddo X, -C. Make the transformation in
InstCombine and allow the logic implemented for saddo to fold prior
usages of add nsw or sub nsw with constants.

Diff Detail

Repository
rL LLVM

Event Timeline

dlrobertson created this revision.Apr 1 2019, 5:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 1 2019, 5:43 AM
lebedev.ri added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2160–2164 ↗(On Diff #193069)

Hmm, why not just:

APInt NewC;
if (match(Arg1, m_APInt(C)) &&
    !subWithOverflow(NewC, 0, *C, /*IsSigned=*/II->getIntrinsicID() == Intrinsic::ssub_with_overflow)) {

?

RKSimon added a subscriber: RKSimon.
RKSimon added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2161 ↗(On Diff #193069)

Is C->isMaxSignedValue() necessary?

nikic added inline comments.Apr 1 2019, 9:47 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2161 ↗(On Diff #193069)

Checking for max signed value is not necessary, as -SignedMax doesn't overflow, only -SignedMin does.

For parity with ssub.sat canonicalization I'd suggest to use the same implementation here, which will also handle non-splat vector constants:

Constant *C;
if (match(Arg1, m_Constant(C)) && C->isNotMinSignedValue()) {
    Value *NegVal = ConstantExpr::getNeg(C);
    // Create ssubo with NegVal here.
}

Gather some inspiration from ssub.sat

dlrobertson marked 5 inline comments as done.Apr 1 2019, 6:53 PM
dlrobertson added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2160–2164 ↗(On Diff #193069)

Thanks for the suggestion, but as mentioned above I think isNotMinSignedValue is what I need here. If I am misunderstanding something, please let me know.

2161 ↗(On Diff #193069)

Ah! Right. I updated the diff to look more like ssub.sat.

dlrobertson marked an inline comment as done.

Update comment

Looks good.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2170–2171 ↗(On Diff #193222)

Shouldn't we do this first?

dlrobertson marked 2 inline comments as done.Apr 2 2019, 1:38 PM
dlrobertson added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2170–2171 ↗(On Diff #193222)

Originally I did this so we wouldn't run this function twice (once here and once as sadd_with_overflow) , but you're correct. We need to run this first so we don't miss the optimization if the first arg is the constant.

dlrobertson marked an inline comment as done and an inline comment as not done.Apr 2 2019, 1:44 PM
dlrobertson added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2170–2171 ↗(On Diff #193222)

Actually this function doesn't call cannonicalizeConstantArg0ToArg1 so this is not the case. This function should be called when sadd_with_overflow is run, so I can't currently think of a reason why this would need to be called first.

nikic added inline comments.Apr 3 2019, 8:09 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2170–2171 ↗(On Diff #193222)

Both orders are generally okay, and both may result in more/less work depending on the case we hit. If the common transform is first, then it might remove the intrinsic entirely right away, rather than first converting to saddo and only removing it afterwards. On the other hand, if it doesn't apply, but the saddo conversion does, then the checks will be run twice, as you already said.

I think @lebedev.ri's suggestion here is mainly a coding style choice to start with the generic transforms first and specific transforms afterwards.

lebedev.ri requested changes to this revision.Apr 8 2019, 11:31 PM
lebedev.ri added a reviewer: lebedev.ri.

Some notes

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2161 ↗(On Diff #193222)

This will be yet another miscompile for vectors if there is undef element.
You need to do

if (match(Arg1, m_Constant(C)) && C->isNotMinSignedValue() && !C->containsUndefElement()) {
2170–2171 ↗(On Diff #193222)

I think @lebedev.ri's suggestion here is mainly a coding style choice to start with the generic transforms first and specific transforms afterwards.

Yes

2221 ↗(On Diff #193222)

This is also a miscompile for vectors for undef elements (please submit fix as a new patch)

This revision now requires changes to proceed.Apr 8 2019, 11:31 PM
nikic added inline comments.Apr 8 2019, 11:56 PM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2161 ↗(On Diff #193222)

isNotMinSignedValue() does not accept undef elements. Correctly handling non-splat vectors and undefs is why this function exists (rather than just negating isMinSignedValue).

2221 ↗(On Diff #193222)

See above.

lebedev.ri added inline comments.Apr 9 2019, 12:08 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2161 ↗(On Diff #193222)

Hmm, very subtle. I now can notice those tests, at least in this testfile, so this won't regress if that changes.
Ok, sorry.

2170–2171 ↗(On Diff #193222)

Then only this question remains

dlrobertson marked an inline comment as not done.

Sorry for the delay. Made the requested changes.

lebedev.ri accepted this revision.Apr 9 2019, 8:48 AM

LG.
@nikic ?

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2155 ↗(On Diff #194346)

This does not create a copy of the original constant though, it just negates it.

This revision is now accepted and ready to land.Apr 9 2019, 8:48 AM
nikic accepted this revision.Apr 9 2019, 9:02 AM

LG with nit.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2146 ↗(On Diff #194346)

This assert is a bit odd, because it checks for more than one arg, while ssubo requires exactly two. I think it can just be dropped entirely, we don't seem to have similar asserts in the surrounding code for other intrinsics.

Removed an unneeded assert.

dlrobertson marked an inline comment as done.Apr 9 2019, 4:16 PM
dlrobertson added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2146 ↗(On Diff #194346)

foldIntrinsiceWithOverflowCommon uses getArgOperand(1) too, so we'd hit that before the assert.

This revision was automatically updated to reflect the committed changes.