This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] try to determine "exact" for sdiv
ClosedPublic

Authored by spatel on Oct 14 2022, 10:00 AM.

Details

Summary

If the divisor is a power-of-2 or negative-power-of-2 and the dividend is known to have >= trailing zeros than the divisor, the division is exact:
https://alive2.llvm.org/ce/z/UGBksM (general proof)
https://alive2.llvm.org/ce/z/D4yPS- (examples based on regression tests)

This isn't the most direct optimization (we could create ashr in these examples instead of relying on existing folds for exact divides), but it's possible that there's a more general constraint than just a pow2 divisor, so this might be extended in the future.

This should solve issue #58348.

Diff Detail

Event Timeline

spatel created this revision.Oct 14 2022, 10:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2022, 10:00 AM
spatel requested review of this revision.Oct 14 2022, 10:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2022, 10:00 AM
nikic accepted this revision.Oct 14 2022, 12:13 PM

LGTM

llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1351

Side note, as you are working on division transforms. This has an obvious generalization to any nonneg / neg (https://alive2.llvm.org/ce/z/bYVnFG), and similar for neg / nonneg and neg / neg. It does require adding two negations in the general case, but I believe we consider that worthwhile to relax sdiv to udiv -- or at least we do the same transform based on range information in CVP.

This revision is now accepted and ready to land.Oct 14 2022, 12:13 PM
spatel added inline comments.Oct 16 2022, 6:59 AM
llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1351

Thanks for pointing that out. I didn't know CVP did that transform. Looks like it was:
8d487668d09fb0e

And the instcombine change/enhancement was implemented/mentioned here:
0fdcca07ad2c0bdc2

Scanning over x86 instruction timings at least, it's not clear if unsigned div (DIV) is any faster than signed div (IDIV). So that seems questionable as an IR transform since it's not reversible in general, but if the negates are deleted/noise in most cases, then it's ok.

This revision was landed with ongoing or failed builds.Oct 16 2022, 8:14 AM
This revision was automatically updated to reflect the committed changes.