This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] simplifyUnsignedRangeCheck(): if we know that X != 0, handle more cases (PR43246)
ClosedPublic

Authored by lebedev.ri on Sep 8 2019, 11:09 AM.

Details

Summary

This is motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens -fsanitize=pointer-overflow
overhead from 25% to 50%, which strongly implies missing folds.

In this particular case, given

char* test(char& base, unsigned long offset) {
  return &base + offset;
}

it will end up producing something like
https://godbolt.org/z/LK5-iH
which after optimizations reduces down to roughly

define i1 @t0(i8* nonnull %base, i64 %offset) {
  %base_int = ptrtoint i8* %base to i64
  %adjusted = add i64 %base_int, %offset
  %non_null_after_adjustment = icmp ne i64 %adjusted, 0
  %no_overflow_during_adjustment = icmp uge i64 %adjusted, %base_int
  %res = and i1 %non_null_after_adjustment, %no_overflow_during_adjustment
  ret i1 %res
}

Without D67122 there was no %non_null_after_adjustment,
and in this particular case we can get rid of the overhead:

Here we add some offset to a non-null pointer,
and check that the result does not overflow and is not a null pointer.
But since the base pointer is already non-null, and we check for overflow,
that overflow check will already catch the null pointer,
so the separate null check is redundant and can be dropped.

Alive proofs:
https://rise4fun.com/Alive/WRzq

There are more patterns of "unsigned-add-with-overflow", they are not handled here,
but this is the main pattern, that we currently consider canonical,
so it makes sense to handle it.

https://bugs.llvm.org/show_bug.cgi?id=43246

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.Sep 8 2019, 11:09 AM
lebedev.ri retitled this revision from [InstSimplify] simplifyUnsignedRangeCheck(): if we know that X != 0, handle more cases to [InstSimplify] simplifyUnsignedRangeCheck(): if we know that X != 0, handle more cases (PR43246).Sep 8 2019, 11:31 AM
lebedev.ri edited the summary of this revision. (Show Details)
spatel accepted this revision.Sep 8 2019, 12:40 PM

LGTM

This revision is now accepted and ready to land.Sep 8 2019, 12:40 PM

LGTM

Thank you for the review!

Sorry - I just looked over the comments in:
https://bugs.llvm.org/show_bug.cgi?id=43246

Does this allow removing a more specific (offset is constant) blob of code from instcombine?

lebedev.ri marked an inline comment as done.Sep 8 2019, 12:44 PM
lebedev.ri added inline comments.
llvm/lib/Analysis/InstructionSimplify.cpp
1399–1406 ↗(On Diff #219273)

Ehh, i messed up comments here

fixup comments

Sorry - I just looked over the comments in:
https://bugs.llvm.org/show_bug.cgi?id=43246

Does this allow removing a more specific (offset is constant) blob of code from instcombine?

Can you please point me to it?
I suspect we get it via multiple folds that happen because
those instructions have no extra uses and constant on RHS.

Sorry - I just looked over the comments in:
https://bugs.llvm.org/show_bug.cgi?id=43246

Does this allow removing a more specific (offset is constant) blob of code from instcombine?

Can you please point me to it?
I suspect we get it via multiple folds that happen because
those instructions have no extra uses and constant on RHS.

declare void @use64(i64)
declare void @use1(i1)
define i1 @test(i64 %offset) {
  %1 = add i64 %offset, 100
  call void @use64(i64 %1)
  %2 = icmp ne i64 %1, 0
  %3 = icmp ugt i64 %1, 99
  %r = and i1 %2, %3
  ret i1 %r
}

gets folded by

// X < Y && Y != 0  -->  X < Y
// X < Y || Y != 0  -->  Y != 0
if (UnsignedPred == ICmpInst::ICMP_ULT && EqPred == ICmpInst::ICMP_NE)
  return IsAnd ? UnsignedICmp : ZeroICmp;

in simplifyUnsignedRangeCheck() i don't know if there is anything to delete.
Proceeding to lading the patch.

This revision was automatically updated to reflect the committed changes.
spatel added a comment.Sep 8 2019, 1:38 PM

Sorry - I just looked over the comments in:
https://bugs.llvm.org/show_bug.cgi?id=43246

Does this allow removing a more specific (offset is constant) blob of code from instcombine?

Can you please point me to it?
I suspect we get it via multiple folds that happen because
those instructions have no extra uses and constant on RHS.

That was just a guess on my part, but I tried a couple of examples, and I think you're right. If the offset is constant, we'll adjust the predicate and/or constant value and end up simplifying right above the clause you're adding.