This is an archive of the discontinued LLVM Phabricator instance.

[instcombine] Fold overflow check using overflow intrinsic to comparison
ClosedPublic

Authored by reames on Jun 25 2021, 11:00 AM.

Details

Summary

This follows up to D104665 (which added umulo handling alongside the existing uaddo case), and generalizes for the remaining overflow intrinsics.

I went to add analogous handling to LVI, and discovered that LVI already had a more general implementation. Instead, we can port was LVI does to instcombine. :)

A couple of points for review discussion:

  • I could have just generated two compares, and let instcombine refold them into the sub form. Not sure which is cleaner here.
  • Sanjay was nice enough to improve instcombine such that the profitability of ssub and usub is no longer questionable. Thanks!

Diff Detail

Event Timeline

reames created this revision.Jun 25 2021, 11:00 AM
reames requested review of this revision.Jun 25 2021, 11:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 25 2021, 11:00 AM
spatel added inline comments.Jun 30 2021, 8:06 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3119

Why are we leaving this code in? Is there some pattern that would escape the new proposed block, but still be caught by this?

llvm/test/Transforms/InstCombine/ssubo.ll
59

I focused in on this example, and we are missing a fold:
https://alive2.llvm.org/ce/z/EfBUXP

Does that hold for all of the overflow ops? Could we produce the optimal icmp directly (no add with constant necessary)?

reames added inline comments.Jun 30 2021, 8:25 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3119

This is purely a rebase issue. This code is dead and will be removed. Oops.

llvm/test/Transforms/InstCombine/ssubo.ll
59

Hm, have to give this one some thought. Would you mind if we did this incrementally? i.e. land this, then improve?

spatel added inline comments.Jun 30 2021, 8:56 AM
llvm/test/Transforms/InstCombine/ssubo.ll
59

I had to try to generalize this, and I think I have it:
https://alive2.llvm.org/ce/z/j4TDR_

I'll work on that patch now, so we don't have to worry about regressions for this set of tests at least (not sure if that is enough on its own for all tests).

spatel added inline comments.Jun 30 2021, 10:52 AM
llvm/test/Transforms/InstCombine/ssubo.ll
59

Can you update here after c7b658aeb526 ?

reames edited the summary of this revision. (Show Details)Jun 30 2021, 11:07 AM
reames added inline comments.Jun 30 2021, 11:17 AM
llvm/test/Transforms/InstCombine/smulo.ll
38–39

There might be another lurking fold here, but I'm not quite seeing it.

This is basically implementing a lookup table for the top two bits with the original mapping being:
00 = false
01 = true
10 = true
11 = false

We could use a shift and popct here, but that's slower. Any tricks you can think of for this formulation?

One the later cases is also a region lookup test, but with 8 regions not 4.

spatel accepted this revision.Jul 1 2021, 7:35 AM

LGTM - I'm not familiar with the LVI version, so it would be good to explain it here and/or point back to the original.

I committed one more icmp fold, so that should help reduce the risk of regressions:
0c400e895306
I think we still want to find the signed predicate versions of those folds. I didn't check if the code here always manages to avoid those or if something else in instcombine already manages to squash them.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3142

Describe the trick in the code comment?

llvm/test/Transforms/InstCombine/smulo.ll
38–39

I don't see it either. I'm visualizing it as an abs() pattern, but that doesn't quite work since SMIN is not an overflow.

This revision is now accepted and ready to land.Jul 1 2021, 7:35 AM
This revision was landed with ongoing or failed builds.Jul 1 2021, 9:42 AM
This revision was automatically updated to reflect the committed changes.