ssubo X, C is equivalent to saddo X, -C. Make the transformation in
InstCombine and allow the logic implemented for saddo to fold prior
usages of add nsw or sub nsw with constants.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2160–2164 ↗ | (On Diff #193069) | Hmm, why not just: APInt NewC; if (match(Arg1, m_APInt(C)) && !subWithOverflow(NewC, 0, *C, /*IsSigned=*/II->getIntrinsicID() == Intrinsic::ssub_with_overflow)) { ? |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2161 ↗ | (On Diff #193069) | Is C->isMaxSignedValue() necessary? |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2161 ↗ | (On Diff #193069) | Checking for max signed value is not necessary, as -SignedMax doesn't overflow, only -SignedMin does. For parity with ssub.sat canonicalization I'd suggest to use the same implementation here, which will also handle non-splat vector constants: Constant *C; if (match(Arg1, m_Constant(C)) && C->isNotMinSignedValue()) { Value *NegVal = ConstantExpr::getNeg(C); // Create ssubo with NegVal here. } |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2160–2164 ↗ | (On Diff #193069) | Thanks for the suggestion, but as mentioned above I think isNotMinSignedValue is what I need here. If I am misunderstanding something, please let me know. |
2161 ↗ | (On Diff #193069) | Ah! Right. I updated the diff to look more like ssub.sat. |
Looks good.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2170–2171 ↗ | (On Diff #193222) | Shouldn't we do this first? |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2170–2171 ↗ | (On Diff #193222) | Originally I did this so we wouldn't run this function twice (once here and once as sadd_with_overflow) , but you're correct. We need to run this first so we don't miss the optimization if the first arg is the constant. |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2170–2171 ↗ | (On Diff #193222) | Actually this function doesn't call cannonicalizeConstantArg0ToArg1 so this is not the case. This function should be called when sadd_with_overflow is run, so I can't currently think of a reason why this would need to be called first. |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2170–2171 ↗ | (On Diff #193222) | Both orders are generally okay, and both may result in more/less work depending on the case we hit. If the common transform is first, then it might remove the intrinsic entirely right away, rather than first converting to saddo and only removing it afterwards. On the other hand, if it doesn't apply, but the saddo conversion does, then the checks will be run twice, as you already said. I think @lebedev.ri's suggestion here is mainly a coding style choice to start with the generic transforms first and specific transforms afterwards. |
Some notes
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2161 ↗ | (On Diff #193222) | This will be yet another miscompile for vectors if there is undef element. if (match(Arg1, m_Constant(C)) && C->isNotMinSignedValue() && !C->containsUndefElement()) { |
2170–2171 ↗ | (On Diff #193222) |
Yes |
2221 ↗ | (On Diff #193222) | This is also a miscompile for vectors for undef elements (please submit fix as a new patch) |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2161 ↗ | (On Diff #193222) | Hmm, very subtle. I now can notice those tests, at least in this testfile, so this won't regress if that changes. |
2170–2171 ↗ | (On Diff #193222) | Then only this question remains |
LG with nit.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2146 ↗ | (On Diff #194346) | This assert is a bit odd, because it checks for more than one arg, while ssubo requires exactly two. I think it can just be dropped entirely, we don't seem to have similar asserts in the surrounding code for other intrinsics. |
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
2146 ↗ | (On Diff #194346) | foldIntrinsiceWithOverflowCommon uses getArgOperand(1) too, so we'd hit that before the assert. |