This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] don't lose nsw/nuw from add by converting to xor
ClosedPublic

Authored by spatel on Feb 8 2017, 2:45 PM.

Details

Summary

I was looking at an improvement to InstSimplify when I noticed that this add fold in InstCombine was preventing icmp simplifies.

There's a later fold that special-cases (add (xor with signbit), C), so I don't think we're losing anything by limiting this.

'test20' and 'xor_sign_bit' in add.ll verify that the later fold still works, so there are no regressions in existing tests.

Diff Detail

Event Timeline

spatel created this revision.Feb 8 2017, 2:45 PM
efriedma requested changes to this revision.Feb 8 2017, 3:05 PM

We don't want to go down the path of forbidding optimizations on add/sub because we can deduce the operation doesn't overflow. There's been a lot of activity recently surrounding the representation of predicates/assertions/etc.; the right solution probably lies in that direction.

This revision now requires changes to proceed.Feb 8 2017, 3:05 PM
spatel added a comment.Feb 8 2017, 3:25 PM

We don't want to go down the path of forbidding optimizations on add/sub because we can deduce the operation doesn't overflow. There's been a lot of activity recently surrounding the representation of predicates/assertions/etc.; the right solution probably lies in that direction.

Let me make sure I'm not completely lost. I'm working on an instcombine icmp+add patch that relies on nsw to eliminate the add:
https://godbolt.org/g/yzoxeY

Should I not pursue that because the transform relies on nsw?

instcombine is allowed to use nsw to permit transformations which would otherwise be illegal. instcombine should not avoid transformations for fear of dropping nsw or other metadata.

spatel added a comment.Feb 9 2017, 7:48 AM

instcombine is allowed to use nsw to permit transformations which would otherwise be illegal. instcombine should not avoid transformations for fear of dropping nsw or other metadata.

Thanks!

So for this patch, is instcombine behaving as expected already? Ie, it's ok to drop the nsw, some other pass has the responsibility to optimize this code:
https://godbolt.org/g/ezOlu6

Or should instcombine attempt to preserve the nsw despite changing the add into xor? The current mechanism would be to add an icmp+llvm.assume? Any better ideas?

Fundamentally, many optimizations which simplify code involve dropping information which could be derived from the unoptimized code; we don't attempt to keep around all possible information which could be used by the optimizer. For example, combining nsw "a + 1 - 1" to "a" drops the information that "a" is not INT_MAX, and that's fine.

You can't actually use llvm.assume to record nsw information because nsw doesn't act like other metadata: overflow produces poison rather than undefined behavior.

spatel updated this revision to Diff 88097.Feb 11 2017, 10:21 AM
spatel edited edge metadata.

Patch updated:
If llvm.assume can't be used to solve this, I don't know of an existing way to do it, so I'm making this an NFC documentation patch for the current behavior.

sanjoy edited edge metadata.EditedFeb 12 2017, 9:12 PM

I think we can get the best of both worlds here -- both X +nsw INT_MIN and X +nuw INT_MIN can be transformed to X | INT_MIN. http://rise4fun.com/Alive/ZpV

And then (X | INT_MIN) s< 0 can be folded to false.

[Edit: minor typo fixed]

spatel updated this revision to Diff 88197.Feb 13 2017, 6:44 AM

Patch updated:
As suggested by Sanjoy, use 'or' rather than 'xor' when we have {nuw,nsw}. This allows the cmp tests to succeed.
There's a regression in the first add test because we don't recognize the larger pattern, but I've marked that with FIXME since this is still an overall improvement IMO.

sanjoy added inline comments.Feb 15 2017, 11:43 PM
test/Transforms/InstCombine/add.ll
252

Can we add a simple peephole to not regress this case? I.e. (xor SIGN_BIT (add nsw X SIGN_BIT)) to X?

(I'm fine doing that in an immediately following change and not in this change specifically).

test/Transforms/InstCombine/icmp-add.ll
226

Why not just delete the comment?

sanjoy accepted this revision.Feb 15 2017, 11:43 PM

Other than that LGTM.

spatel added inline comments.Feb 16 2017, 6:38 AM
test/Transforms/InstCombine/add.ll
252

Sure - I didn't know if combining those 2 steps into 1 patch would be welcome, but it makes sense to not introduce a known regression.

test/Transforms/InstCombine/icmp-add.ll
226

I'd prefer to leave it here to supply the motivation. I often see tests like this that would seem to belong in InstSimplify rather than InstCombine, and I can't tell if that's intentional or just an artifact of LLVM optimization history (and therefore, the test is redundant and removeable).

spatel added inline comments.Feb 18 2017, 2:23 PM
test/Transforms/InstCombine/add.ll
252
This revision was automatically updated to reflect the committed changes.