This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] (a+b) < a && (a+b) != 0 -> (0-b) < a iff a/b != 0 (PR43259)
ClosedPublic

Authored by lebedev.ri on Sep 20 2019, 11:02 AM.

Details

Summary

This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens -fsanitize=pointer-overflow
overhead from 25% to 50%, which strongly implies missing folds.

For

#include <cassert>
char* test(char& base, signed long offset) {
  __builtin_assume(offset < 0);
  return &base + offset;
}

We produce

https://godbolt.org/z/r40U47

and again those two icmp's can be merged:

Name: 0
Pre: C != 0
  %adjusted = add i8 %base, C
  %not_null = icmp ne i8 %adjusted, 0
  %no_underflow = icmp ult i8 %adjusted, %base
  %r = and i1 %not_null, %no_underflow
=>
  %neg_offset = sub i8 0, C
  %r = icmp ugt i8 %base, %neg_offset

https://rise4fun.com/Alive/ALap
https://rise4fun.com/Alive/slnN

There are 3 other variants of this pattern,
i believe they all will go into InstSimplify.

https://bugs.llvm.org/show_bug.cgi?id=43259

Diff Detail

Event Timeline

lebedev.ri created this revision.Sep 20 2019, 11:02 AM

Rebased after changing variable name.

spatel added inline comments.Sep 23 2019, 1:06 PM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1152–1153

Does this diff mean the existing code is buggy or somehow less powerful? Is there a path where the context instruction is not the 'and' or 'or'?

lebedev.ri marked an inline comment as done.Sep 23 2019, 1:12 PM
lebedev.ri added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1152–1153

The current code is less powerful.
Without this, none of those folds happen.
In SQ, CxtI is nullptr.

lebedev.ri marked an inline comment as done.Sep 23 2019, 1:13 PM
spatel added inline comments.Sep 23 2019, 1:21 PM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1152–1153

Then we should make that change ahead of this patch? IMO, it's ok to change it without adding a special test, but it's independent of the main diff since we're already calling isKnownNonZero() from foldUnsignedUnderflowCheck().

lebedev.ri added inline comments.Sep 23 2019, 1:23 PM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1152–1153

I didn't precommit this specifically because i wasn't able to come up with a test for this,
even though i use it in foldUnsignedUnderflowCheck() already.
But okay, let me precommit it...

lebedev.ri marked 2 inline comments as done.

Precommitted SimplifyQuery changes.

spatel added inline comments.Sep 23 2019, 2:51 PM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1097–1098

I don't think this pre-condition is correct. Both values must be non-zero, not just 1?
https://rise4fun.com/Alive/U7I

Need a test like this?

define i1 @t9_nonzero_offset(i8 %base, i8 %offset) {
   %cmp = icmp slt i8 %offset, 0
   call void @llvm.assume(i1 %cmp)
   %adjusted = add i8 %base, %offset
   call void @use8(i8 %adjusted)
   %not_null = icmp ne i8 %adjusted, 0
   %no_underflow = icmp ult i8 %adjusted, %offset
   %r = and i1 %not_null, %no_underflow
   ret i1 %r
}
lebedev.ri planned changes to this revision.Sep 23 2019, 3:03 PM
lebedev.ri added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1097–1098

Uhm, we are going in circles here. :)
Yes and no, the code as-is is incorrect, but we still can fold:
https://rise4fun.com/Alive/slDe
The replacement is semi-commutative here, we just need to carefully
pick which value we will be negating - the one we know is non-zero.

lebedev.ri marked 2 inline comments as done.
lebedev.ri edited the summary of this revision. (Show Details)

We only need to know that one of the values is non-zero, and we must negate that specific value!
https://rise4fun.com/Alive/slnN

xbolva00 added inline comments.Sep 24 2019, 4:13 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1064

I think we dont need this lambda.. Just inline it.

lebedev.ri marked an inline comment as done.Sep 24 2019, 4:19 AM
lebedev.ri added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1064

I don't see why spelling all that each time is better.

xbolva00 added inline comments.Sep 24 2019, 4:55 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1064

So just add a new overload which takes

isKnownNonZero(V, Q)?

lebedev.ri added inline comments.Sep 24 2019, 5:53 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1064

ValueTracking.h (that defines internal Query struct) does not
include InstructionSimplify (that defines SimplifyQuery struct).
Going in either direction does not seem optimal, and seems out of the scope of the patch.

This isn't C, and in C++ i suppose using helpers if they improve code readability is not frowned upon..

xbolva00 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1064

Thanks for info - just wanted to say: If we can add a overload, we should.. (instead of using lambda).

For me, the 'isKnownNonZero(V, Q.DL, /*Depth=*/0, Q.AC, Q.CxtI, Q.DT);' is fine.

I am personally not very happy to look at the code where there are so many lambdas in one function (what may happen here in the future). I prefer static helper functions..

But this is just my opinion, but I would like to hear opinions of @spatel / @efriedma too.

spatel added inline comments.Sep 24 2019, 7:15 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1064

I agree that an overload of some kind would be nicer, but I also agree that it's not necessary to impede this patch with that requirement because it's not a straightforward translation (Query is not the same as SimplifyQuery).

If we add a proper overload/adapter, then we can reduce the code duplication here and within InstSimplify itself?

It's a matter of taste whether the lambda makes the code more or less readable. I'm ok with it.

xbolva00 added inline comments.Sep 24 2019, 7:24 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1064

If we add a proper overload/adapter, then we can reduce the code duplication here and within InstSimplify itself?

Yeah, and same for 'isKnownToBeAPowerOfTwo'

spatel accepted this revision.Sep 24 2019, 7:58 AM

LGTM

This revision is now accepted and ready to land.Sep 24 2019, 7:58 AM

LGTM

Thank you for the review!
Again sorry for not catching the "which one has to be non-zero" miscompile-waiting-to-happen :/

This revision was automatically updated to reflect the committed changes.