This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Teach SimplifyDemandedBits that udiv doesn't demand low dividend bits that are zero in the divisor
ClosedPublic

Authored by bkramer on May 9 2018, 10:17 AM.

Details

Summary

This is safe as long as the udiv is not exact. The pattern is not common in
C++ code, but comes up all the time in code generated by XLA's GPU backend.

Diff Detail

Repository
rL LLVM

Event Timeline

bkramer created this revision.May 9 2018, 10:17 AM
sanjoy added inline comments.May 9 2018, 10:52 AM
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
556 ↗(On Diff #145954)

Can we also propagate the demanded bits of the division result (in some form) into DemandedMaskIn?

bkramer added inline comments.May 9 2018, 11:28 AM
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
556 ↗(On Diff #145954)

Maybe. This is hard for non-power of 2 divisors though and probably not worth the effort :|

sanjoy added inline comments.May 9 2018, 1:54 PM
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
556 ↗(On Diff #145954)

I think demanded-bits should compose which means you can treat a /u 12 as (a >> 2) /u 3. Since in udiv input bit at index i can only affect output bits at index i and below (i.e. less significant) we should be able to left-smear the demanded bits demanded bits for (a >> 2) /u 3 into the demanded bits for (a >> 2) and then apply the rule for >>.

For instance, if we demand bit only 5 for an i8 a /u 12 == (a >> 2) /u 3 then we can say:

  • We only need bits 5, 6, 7 and from a >> 2
  • Thus we only need bit 7 from a

I'd also be fine with a TODO here since it does seem complex.

bkramer updated this revision to Diff 146006.May 9 2018, 2:42 PM

Add a FIXME for using the demanded mask of the udiv result.

sanjoy accepted this revision.May 9 2018, 3:24 PM

lgtm

This revision is now accepted and ready to land.May 9 2018, 3:24 PM
This revision was automatically updated to reflect the committed changes.