This is an archive of the discontinued LLVM Phabricator instance.

[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc.
ClosedPublic

Authored by aqjune on Apr 22 2020, 1:11 AM.

Details

Summary

This patch makes propagatesPoison be more accurate by returning true on
more bin ops/unary ops/casts/etc.

The changed test in ScalarEvolution/nsw.ll was introduced by
https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 .
IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has
no-overflow flags even if the loop isn't in the wanted form.
It becomes more accurate with this patch, so think this is okay.

Diff Detail

Event Timeline

aqjune created this revision.Apr 22 2020, 1:11 AM
jdoerfert added inline comments.Apr 22 2020, 9:15 AM
llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp
348

I think this makes sense but we need to get input from others on this.

llvm/test/Analysis/ScalarEvolution/nsw.ll
244

Please verify as the function name suggests there is some problem here. I noted in the previous patch that I think this is correct but we should check and potentially change the function name or record our findings.

nikic added inline comments.Apr 22 2020, 9:35 AM
llvm/include/llvm/Analysis/ValueTracking.h
567

As I mentioned on the other review, I don't think this is the best choice for this API. propagatesPoison should return true as much as possible for best results. It can return true for immediate UB, so imho it should. At least I don't see in which situation it would be advantageous to return false here.

For example, right now propagatesPoison on udiv returns false, because poison op2 results in IUB. However, poison op1 will propagate poison. Unless you want to distinguish which operand is poison, it would be better to always return true.

(Though distinguishing which operand is poison may be necessary for more accurate select handling, which propagates poison only on op1.)

llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp
348

Lang ref does kind of state this by omission:

Values other than phi nodes and select instructions depend on their operands.

We have select -> and/or transforms that are known to be unsound if and/or do not block poison, but work is underway to fix those.

aqjune marked 2 inline comments as done.Apr 22 2020, 5:06 PM
aqjune added inline comments.
llvm/include/llvm/Analysis/ValueTracking.h
567

Okay, so it requires the users of propagatesPoison to specially deal with div/rem operations. We have getGuaranteedNonPoisonOp, which can be used to leave poison-propagating operands only. I think the concern makes sense.

llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp
348

The sentence of LangRef implicitly describes the semantics of and/or, as @nikic says, but I have an impression that the semantics of and/or frequently has been controversial.

I remember that every possible other candidates for and/or was problematic because it breaks many optimizations on logical <-> arithmetic ops.

I'll bring concrete examples for these by running LLVM unit tests with modified semantics of and/or with Alive2 in a few days. After this issue is resolved, it will be great if we can explicitly state in LangRef why and/or should propagate poison regardless of masking values.

aqjune added a subscriber: fhahn.Apr 22 2020, 5:26 PM
aqjune updated this revision to Diff 259465.Apr 22 2020, 8:30 PM
  • propagatesPoison returns true on div/rem
aqjune updated this revision to Diff 259466.Apr 22 2020, 8:41 PM
  • Write formal specification of propagatesPoison
aqjune marked an inline comment as done.Apr 24 2020, 11:48 AM

Hello all,
I ran experiment with an alternative semantics which defines and poison, 0 and or poison, -1 to block propagation of poison. In this semantics, and poison, 0 is 0, and or poison, -1 is -1.

Here are a few optimizations that I discovered from llvm/test/Transforms becoming incorrect in the alternative semantics:

  1. <Transforms/InstCombine/and-or-icmps.ll>
define i1 @PR2330(i32 %a, i32 %b) {
%0:
  %cmp1 = icmp ult i32 %a, 8
  %cmp2 = icmp ult i32 %b, 8
  %and = and i1 %cmp2, %cmp1
  ret i1 %and
}
=>
define i1 @PR2330(i32 %a, i32 %b) {
%0:
  %1 = or i32 %b, %a
  %2 = icmp ult i32 %1, 8
  ret i1 %2
}

If %a = poison && %b = 9, the source returns poison && 0 = 0, whereas target returns poison.

  1. <Transforms/InstCombine/apint-shift-simplify.ll>
define i41 @test0(i41 %A, i41 %B, i41 %C) {
%0:
  %X = shl i41 %A, %C
  %Y = shl i41 %B, %C
  %Z = and i41 %X, %Y
  ret i41 %Z
}
=>
define i41 @test0(i41 %A, i41 %B, i41 %C) {
%0:
  %X1 = and i41 %A, %B
  %Z = shl i41 %X1, %C
  ret i41 %Z
}

If %A = 100...00 && %B = poison && %C = 1, the source returns 0 whereas target returns poison.

  1. <Transforms/InstCombine/and-fcmp.ll>
define i1 @PR1738(double %x, double %y) {
%0:
  %cmp1 = fcmp ord %x, 0.000000
  %cmp2 = fcmp ord %y, 0.000000
  %and = and i1 %cmp1, %cmp2
  ret i1 %and 
}
=>
define i1 @PR1738(double %x, double %y) {
%0:
  %1 = fcmp ord %x, %y
  ret i1 %1
}

If %x = poison && %y = NaN, the source returns 0 whereas target returns poison.

When I run Alive2, I see that 50 more lit tests from test/Transforms/ are failing, compared with the base semantics (and/ors unconditionally propagating poison)
To fix this - we need to redefine the semantics of many existing operations. For example, fcmp ord should be redefined to determine when it returns poison / when it does not, which might be a pain. Same for shl/icmp etc. So I vote for the simpler semantics (and/or unconditionally propagating poison).

Also, LangRef states that values other than select/phi depends on their operands, and an instruction that depends on a poison value, produces a poison value, so it implies that and/or propagates poison.
propagatesPoison in this patch returns true on and/or operations for this reason.

llvm/test/Analysis/ScalarEvolution/nsw.ll
244

I investigated a bit, and the history was like this:

https://github.com/llvm/llvm-project/commit/7e4a64167d4d2e7b0b680fae1706182223047af1 fixed a bug in SCEV which was incorrectly adding no-wrap flag to a post-inc add recurrence. The patch added isAddRecNeverPoison, which checks whether it is UB if the given add recurrence is poison, by consecutively calling propagatesFullPoison on its use chain.

However, the patch wasn't calling propagatesFullPoison on the direct uses of the add recurrence, so https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 added the call & added these two tests (bad_postinc_nsw_a/b).

In bad_postinc_nsw_b, and %iv.inc, 0 was inserted to test whether the first propagateFullPoison check successfully blocks propagation of poison. Now propagatesPoison returns true, so the test has changed.
To summarize, the validity of this change depends on whether and x, 0 should propagate poison or not. I think it should propagate (as I wrote on another comment). As suggested, I'll change the function name & leave a comment.

aqjune updated this revision to Diff 259946.Apr 24 2020, 11:50 AM

Update a comment & clang-format

Added people who might be interested in this issue as subscribers.

nikic accepted this revision.Apr 30 2020, 10:19 AM

Thanks for running the alive experiment. The results look pretty compelling and are in line with what I'd expect. Giving one instruction special poison semantics has a snowball effect that will quickly lead us to undef-like reasoning.

This LG to me, but please wait for a second opinion.

llvm/lib/Analysis/ValueTracking.cpp
4944

Maybe list call/invoke explicitly here?

This revision is now accepted and ready to land.Apr 30 2020, 10:19 AM
aqjune updated this revision to Diff 261438.May 1 2020, 12:54 AM

Make call / invoke explicit

aqjune marked an inline comment as done.May 1 2020, 12:54 AM
spatel accepted this revision.May 4 2020, 5:03 AM

LGTM - see inline for some grammar nits.

llvm/include/llvm/Analysis/ValueTracking.h
565–570

raise -> raises

569

the operand -> operands

llvm/lib/Analysis/ValueTracking.cpp
4956

Add period to end of sentence.

aqjune updated this revision to Diff 262462.May 6 2020, 1:07 PM

Fix grammar errors

aqjune marked 4 inline comments as done.May 6 2020, 1:08 PM
aqjune added inline comments.
llvm/test/Analysis/ScalarEvolution/nsw.ll
244

Does this address your concern? @jdoerfert

reames added inline comments.May 6 2020, 2:23 PM
llvm/include/llvm/Analysis/ValueTracking.h
567

I really don't think that having UB operations return true is the right semantic here. It doesn't add any value. If we have a UB operation with provable poison, then we've missed the opportunity to prune that path. Having some later optimization trigger (which is what the proposed semantic does), feels like side stepping the problem.

(I'm just expressing an opinion, not blocking the patch. I'm fine with this going in, we can continue the discussion and change our minds later if my view turns out to convince others.)

nlopes added inline comments.May 8 2020, 9:58 AM
llvm/include/llvm/Analysis/ValueTracking.h
567

I guess it's still useful to know that udiv propagates poison regardless whether we know it is UB or not.
Most of the times we won't know if an operation triggers UB or not, so it feels right to at least some information. We have other APIs to check whether an instruction triggers UB. You are right in that we are pushing a bit more responsibility to the user of this analysis (for perf reasons, not correctness), but for the reasons I've stated above, I think it's a good tradeoff.
I'm in favor of the patch as-is.

aqjune marked an inline comment as done.May 12 2020, 4:51 AM
aqjune added inline comments.
llvm/test/Analysis/ScalarEvolution/nsw.ll
244
jdoerfert added inline comments.May 12 2020, 9:41 AM
llvm/test/Analysis/ScalarEvolution/nsw.ll
244

OK with me.

Thanks all, I landed this patch.
For the update in LangRef, I checked that it already has this specific example:
%still_poison = and i32 %poison, 0 ; 0, but also poison.,
Making LLVM's implementation consistent with LangRef through this patch seems enough.

This revision was automatically updated to reflect the committed changes.