This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] PR37603: low bit mask canonicalization
ClosedPublic

Authored by lebedev.ri on May 27 2018, 11:14 AM.

Details

Summary

This is PR37603.

https://godbolt.org/g/VCMNpS
https://rise4fun.com/Alive/idM

When doing bit manipulations, it is quite common to calculate some bit mask,
and apply it to some value via and.

The typical C code looks like:

int mask_signed_add(int nbits) {
    return (1 << nbits) - 1;
}

which is translated into (with -O3)

define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 1, %0
  %3 = add nsw i32 %2, -1
  ret i32 %3
}

But there is a second, less readable variant:

int mask_signed_xor(int nbits) {
    return ~(-(1 << nbits));
}

which is translated into (with -O3)

define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 -1, %0
  %3 = xor i32 %2, -1
  ret i32 %3
}

Since we created such a mask, it is quite likely that we will use it in and next.
And then we may get rid of not op by folding into andn.

But now that i have actually looked:
https://godbolt.org/g/VTUDmU
_some_ backend changes will be needed too.
We clearly loose bzhi recognition.

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.May 27 2018, 11:14 AM
lebedev.ri edited the summary of this revision. (Show Details)May 27 2018, 11:16 AM

Actually run $ ninja check-llvm -- update test/Transforms/InstCombine/rem.ll

lebedev.ri added inline comments.May 27 2018, 2:56 PM
test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll
20–21 ↗(On Diff #148763)

Note: i'm not *sure* we want to go this way.
Maybe we want to canonicalize the other way around?

(Another unrelated question could be,
do we want to "canonicalize" all add nuw i32 %val, -1 to xor i32 %val, -1?
I'm not sure here, because we can't go from not to dec as per alive,
and the latter does not produce lea)

This is a good IR canonicalization regardless of what happens in the backend because a 'not' is better for bit-tracking analysis and other transforms than an 'add'.

I don't understand the nuw question:
https://rise4fun.com/Alive/ydD

We're missing this in instsimplify?

This is a good IR canonicalization regardless of what happens in the backend because a 'not' is better for bit-tracking analysis and other transforms than an 'add'.

Yep, i thought so too..

I don't understand the nuw question:
https://rise4fun.com/Alive/ydD

We're missing this in instsimplify?

I think we do? https://godbolt.org/g/rT3k7v
But i was only thinking about replacing and nuw with xor there.

This is a good IR canonicalization regardless of what happens in the backend because a 'not' is better for bit-tracking analysis and other transforms than an 'add'.

Yep, i thought so too..

I don't understand the nuw question:
https://rise4fun.com/Alive/ydD

We're missing this in instsimplify?

I think we do? https://godbolt.org/g/rT3k7v
But i was only thinking about replacing and nuw with xor there.

I'm still not clear on the nuw question:
https://godbolt.org/g/mTHjhc

Are you trying to transform with a constant other than -1? If -1, then we should have simplified the IR to a constant.

spatel added a comment.Jun 4 2018, 6:25 AM

We didn't resolve the 'nuw' question - am I not seeing the scenario that you asked about?

lib/Transforms/InstCombine/InstCombineAddSub.cpp
1102 ↗(On Diff #148763)

The comment should match what the code is doing:
~(-1 << NBits)

1103 ↗(On Diff #148763)

The IR motivation is what I'd prefer to state here, so something like:
Because a 'not' is better for bit-tracking analysis and other transforms than an 'add'.

1112 ↗(On Diff #148763)

Is this always safe to cast rather than dyn_cast? What happens if NBits is a constant expression?

lebedev.ri updated this revision to Diff 149752.Jun 4 2018, 6:49 AM
lebedev.ri marked 3 inline comments as done.

Address review notes.

We didn't resolve the 'nuw' question - am I not seeing the scenario that you asked about?

True.
It was just a passing-by thought.
Here i don't particularly care about that fold.
I was just thinking about profitability of transform to xor %x, -1,
and thought whether we could always do that if the input is add %x, -1.

I can submit that fold as a follow-up.

lebedev.ri added a comment.EditedJun 4 2018, 6:49 AM
This comment has been deleted.
lib/Transforms/InstCombine/InstCombineAddSub.cpp
1112 ↗(On Diff #148763)

New test with constant seems to work.
So in this case, the constant folding seems to already happen,
but i agree i guess it is better to be proactively safer here.

spatel accepted this revision.Jun 4 2018, 7:01 AM

I was just thinking about profitability of transform to xor %x, -1,
and thought whether we could always do that if the input is add %x, -1.

I can submit that fold as a follow-up.

Sounds good. This patch LGTM.

This revision is now accepted and ready to land.Jun 4 2018, 7:01 AM

I was just thinking about profitability of transform to xor %x, -1,
and thought whether we could always do that if the input is add %x, -1.

I can submit that fold as a follow-up.

Sounds good. This patch LGTM.

Thank you for the review!
To not degrade x86 backend, this can't land until after D47453.

spatel added inline comments.Jun 4 2018, 7:14 AM
test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll
204–211 ↗(On Diff #149752)

I didn't see this test initially - it's ok to add it here, but it's better to include under test/Transforms/InstSimplify instead (if there's nothing like it already there).

But this doesn't exercise the case that I was thinking of. We should always constant fold simple constants like this. It's *constant expressions* that we (or at least me...multiple times!) forget and then result in fuzzer failures some time later. So the test will need a global variable or some other magic, but now that the code is dyn_cast'ing, it's probably a moot point.