This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0"
ClosedPublic

Authored by EgorBo on May 4 2020, 3:00 PM.

Details

Summary

"X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two)
However, "X % Y" can also be represented as "X - (X / Y) * Y" so if I rewrite the initial expression:
"X - (X / C) * C == 0" it's not currently optimized to "X & C-1 == 0", see godbolt: https://godbolt.org/z/KzuXUj

This is my first contribution to LLVM so I hope I didn't mess things up

Diff Detail

Event Timeline

EgorBo created this revision.May 4 2020, 3:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2020, 3:00 PM
EgorBo edited the summary of this revision. (Show Details)
EgorBo updated this revision to Diff 261950.May 4 2020, 3:53 PM

Fix test name

lebedev.ri added inline comments.May 4 2020, 10:59 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1362–1363 ↗(On Diff #261950)

I think we should first instead fold X - (X / C) * C/((X / -C1) >> C2)) + X into X % C.
We have integer division/remainder either way, but with remainder it's less instructions.

But that won't help here, because your target pattern @is_rem32_pos_decomposed_i8
is currently folded into

define i1 @is_rem32_pos_decomposed_i8(i8 %x) {
  %d.neg = sdiv i8 %x, -32
  %m.neg = shl i8 %d.neg, 5
  %s = sub i8 0, %x
  %r = icmp eq i8 %m.neg, %s
  ret i1 %r
}

instead of

define i1 @tgt(i8 %x) {
  %d.neg = sdiv i8 %x, -32
  %m.neg = shl i8 %d.neg, 5
  %m.neg.add = add i8 %m.neg, %x
  %r = icmp eq i8 %m.neg.add, 0
  ret i1 %r
}

For that, i'd think we should extend foldIRemByPowerOfTwoToBitTest().

1391 ↗(On Diff #261950)

Don't use fixed-size integers, use APInt.

lebedev.ri added inline comments.May 4 2020, 11:06 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1362–1363 ↗(On Diff #261950)

Edit: err, actually, disregard foldIRemByPowerOfTwoToBitTest() comment.
We should also fold

define i1 @is_rem32_pos_decomposed_i8(i8 %x) {
  %d.neg = sdiv i8 %x, -32
  %m.neg = shl i8 %d.neg, 5
  %s = sub i8 0, %x
  %r = icmp eq i8 %m.neg, %s
  ret i1 %r
}

into

define i1 @is_rem32_pos_decomposed_i8(i8 %x) {
  %s = srem i8 %x, 32
  %r = icmp eq i8 %s, 0
  ret i1 %r
}

and then foldIRemByPowerOfTwoToBitTest() will just work.

EgorBo updated this revision to Diff 262062.May 5 2020, 4:29 AM

Moved to InstCombiner::visitAdd

EgorBo added a comment.May 5 2020, 4:31 AM

@lebedev.ri Thanks for review!
I've moved the transformation to "InstCombiner::visitAdd"

So now it converts "((X / C1) << C2) + X" to "X % C1" thus "((X / C1) << C2) + X == 0" is automatically optimized to "X & C-1 == 0" later.

lebedev.ri added inline comments.May 5 2020, 1:58 PM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1317–1318
// ((X / C1) << C2) + X => X % -C1
// where -C1 = 1 << C2
1319–1320

Do we care about non-splat vector support?
This will only work for scalars/splat vectors.

1325

Just return the new srem.

llvm/test/Transforms/InstCombine/icmp-div-constant.ll
41 ↗(On Diff #262062)

Please pull into a new file. Some comments:

  • there should be no icmp here, this isn't the pattern we are folding (although yes, that is the larger pattern)
  • there shouldn't be nsw flags since we don't use them
  • there shouldn't be mul, we are looking for shl
  • there shouldn't be sub, we are looking for add

And this needs more tests, in particular:

  • splat vector test
  • non-splat vector test
  • vector splat with undef lane test: there's two constants here, so two tests where either one of them contains undef lane, and a test where both of them contain undef lane
  • one test with extra uses (see @use in this file) on every of the values - we don't care whether there are extra uses
  • Negative tests - what are preconditions on the constants?
EgorBo updated this revision to Diff 262275.May 5 2020, 5:50 PM

Add more tests (TODO: add non-splat vectors)

EgorBo updated this revision to Diff 262276.May 5 2020, 5:53 PM

Add more tests (TODO: non-splat vectors)

EgorBo updated this revision to Diff 262685.May 7 2020, 10:12 AM
EgorBo marked an inline comment as done.

Add more tests (undef, negative, non-splat vectors)

@lebedev.ri can you please re-review? do you want me to add more tests?

Regarding the non-splat vectors, I am not super familiar with it yet, will it complicate my code significantly?

Please feel free to add weekly-ish "ping" comments, i lost track of this, sorry :/
This looks good for the time being modulo nits.

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1317–1318

Eeeh, i walked into that one myself, but let's explicitly spell out sigdness of ops:
((X s/ C1) << C2) + X => X s% -C1

1320

if (match(LHS, m_Shl(m_SDiv(m_Specific(RHS), m_APInt(C1)), m_APInt(C2)))) {

1321–1323

I'm not sure why all these preconditions are needed?
I think you only want -C1 == 1 << C2, i.e. -(*C1) == (one << *C2).
https://rise4fun.com/Alive/wslBO

1324

Why not just -(*C1)?
While there, we now do that twice, so cache it in some new variable.

Reverse-ping.

EgorBo updated this revision to Diff 269944.Jun 10 2020, 12:25 PM

@lebedev.ri I am so sorry for this huge delay, totally forgot about this pr and broke my notifications, I promise I'll respond without delays in future :-) Addressed your feedback I beleive.

lebedev.ri accepted this revision.Jun 10 2020, 12:55 PM

LGTM, thanks you.
If you need help committing this, do specify the "name <e@ma.il>" to be used.

This revision is now accepted and ready to land.Jun 10 2020, 12:55 PM

LGTM, thanks you.
If you need help committing this, do specify the "name <e@ma.il>" to be used.

@lebedev.ri I'm not sure I understand what you mean, do I need to add a header on top of that diff? something like

From: EgorBo <egorbo@gmail.com>


?

LGTM, thanks you.
If you need help committing this, do specify the "name <e@ma.il>" to be used.

@lebedev.ri I'm not sure I understand what you mean, do I need to add a header on top of that diff? something like

From: EgorBo <egorbo@gmail.com>


?

https://llvm.org/docs/DeveloperPolicy.html#attribution-of-changes
https://llvm.org/docs/DeveloperPolicy.html#commit-messages

If you’re not the original author, ensure the ‘Author’ property of the commit is set to the original author and the ‘Committer’ property is set to yourself. You can use a command similar to git commit --amend --author="John Doe <jdoe@llvm.org> to correct the author property if it is incorrect. See Attribution of Changes for more information including the method we used for attribution before the project migrated to git.

This revision was automatically updated to reflect the committed changes.