This is an archive of the discontinued LLVM Phabricator instance.

Differential D78615

[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc.
Closed · Public

Authored by aqjune on Apr 22 2020, 1:11 AM.

Details

Reviewers

spatel
lebedev.ri
jdoerfert
reames
nikic
sanjoy

Commits

rGe5f602d82ca0: [ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc.

Summary

This patch makes propagatesPoison more accurate by returning true on
more bin ops/unary ops/casts/etc.

The changed test in ScalarEvolution/nsw.ll was introduced by
https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 .
IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has
no-overflow flags even if the loop isn't in the wanted form.
The test result becomes more accurate with this patch, so I think this is okay.
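
As a rough illustration of the summary, the standalone sketch below mirrors the shape of the updated predicate. The `Opcode` enum, `isBinaryUnaryOrCast`, and `propagatesPoisonModel` are hypothetical stand-ins written for this example (they are not LLVM's `Instruction` API); only the grouping of opcodes follows the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for a handful of LLVM opcodes; not the real enum.
enum class Opcode {
  Add, Sub, And, Or, UDiv,      // binary operators
  FNeg,                         // unary operator
  Trunc, SExt, BitCast,         // casts
  ICmp, FCmp, GetElementPtr,    // listed explicitly as propagating
  Freeze, Select, PHI, Call, Invoke,  // listed explicitly as not propagating
  Load                          // an opcode the patch does not mention
};

// Models the isa<BinaryOperator> / isa<UnaryOperator> / isa<CastInst> test.
bool isBinaryUnaryOrCast(Opcode Op) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Sub:
  case Opcode::And:
  case Opcode::Or:
  case Opcode::UDiv:
  case Opcode::FNeg:
  case Opcode::Trunc:
  case Opcode::SExt:
  case Opcode::BitCast:
    return true;
  default:
    return false;
  }
}

// Same three-way structure as the patched function: explicit "false" cases,
// explicit "true" cases, then a category check with a conservative fallback.
bool propagatesPoisonModel(Opcode Op) {
  switch (Op) {
  case Opcode::Freeze:
  case Opcode::Select:
  case Opcode::PHI:
  case Opcode::Call:
  case Opcode::Invoke:
    return false;
  case Opcode::ICmp:
  case Opcode::FCmp:
  case Opcode::GetElementPtr:
    return true;
  default:
    // Be conservative for anything that is not a binop, unary op, or cast.
    return isBinaryUnaryOrCast(Op);
  }
}
```

In this model `And`, `Or`, and `UDiv` return true, while `Select` and the unlisted `Load` return false.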

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aqjune created this revision. Apr 22 2020, 1:11 AM

Herald added a subscriber: javed.absar. Apr 22 2020, 1:11 AM

aqjune added a parent revision: D78503: [ValueTracking] Let analyses assume a value cannot be partially poison. Apr 22 2020, 1:12 AM

Harbormaster failed remote builds in B54205: Diff 259197! Apr 22 2020, 2:08 AM

jdoerfert added inline comments. Apr 22 2020, 9:15 AM

llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp
348	I think this makes sense but we need to get input from others on this.
llvm/test/Analysis/ScalarEvolution/nsw.ll
244	Please verify as the function name suggests there is some problem here. I noted in the previous patch that I think this is correct but we should check and potentially change the function name or record our findings.

nikic added inline comments. Apr 22 2020, 9:35 AM

llvm/include/llvm/Analysis/ValueTracking.h
567	As I mentioned on the other review, I don't think this is the best choice for this API. `propagatesPoison` should return `true` as much as possible for best results. It can return true for immediate UB, so imho it should. At least I don't see in which situation it would be advantageous to return false here. For example, right now `propagatesPoison` on `udiv` returns false, because poison op2 results in IUB. However, poison op1 will propagate poison. Unless you want to distinguish which operand is poison, it would be better to always return true. (Though distinguishing which operand is poison may be necessary for more accurate `select` handling, which propagates poison only on op1.)
llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp
348	LangRef does kind of state this by omission: "Values other than phi nodes and select instructions depend on their operands." We have select -> and/or transforms that are known to be unsound if and/or do not block poison, but work is underway to fix those.

aqjune mentioned this in D78503: [ValueTracking] Let analyses assume a value cannot be partially poison. Apr 22 2020, 3:56 PM

aqjune marked 2 inline comments as done. Apr 22 2020, 5:06 PM

aqjune added inline comments.

llvm/include/llvm/Analysis/ValueTracking.h
567	Okay, so it requires the users of propagatesPoison to specially deal with div/rem operations. We have `getGuaranteedNonPoisonOp`, which can be used to leave poison-propagating operands only. I think the concern makes sense.
llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp
348	The LangRef sentence implicitly describes the semantics of and/or, as @nikic says, but I have the impression that the semantics of and/or has frequently been controversial. I remember that every other candidate semantics for and/or was problematic because it breaks many optimizations on logical <-> arithmetic ops. I'll bring concrete examples of these in a few days by running LLVM unit tests through Alive2 with modified and/or semantics. After this issue is resolved, it would be great if we could explicitly state in LangRef why and/or should propagate poison regardless of masking values.

aqjune added a subscriber: fhahn. Apr 22 2020, 5:26 PM

propagatesPoison returns true on div/rem

Write formal specification of propagatesPoison

Harbormaster failed remote builds in B54343: Diff 259465! Apr 22 2020, 9:10 PM

Harbormaster failed remote builds in B54344: Diff 259466! Apr 22 2020, 9:42 PM

Hello all,
I ran an experiment with an alternative semantics in which `and poison, 0` and `or poison, -1` block propagation of poison. In this semantics, `and poison, 0` is 0 and `or poison, -1` is -1.

Here are a few optimizations from llvm/test/Transforms that I found become incorrect under the alternative semantics:

<Transforms/InstCombine/and-or-icmps.ll>

define i1 @PR2330(i32 %a, i32 %b) {
%0:
  %cmp1 = icmp ult i32 %a, 8
  %cmp2 = icmp ult i32 %b, 8
  %and = and i1 %cmp2, %cmp1
  ret i1 %and
}
=>
define i1 @PR2330(i32 %a, i32 %b) {
%0:
  %1 = or i32 %b, %a
  %2 = icmp ult i32 %1, 8
  ret i1 %2
}

If %a = poison && %b = 9, the source returns poison && 0 = 0, whereas the target returns poison.
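
The counterexample above can be replayed with a small model. This is only a sketch under stated assumptions: poison is modeled as `std::nullopt`, the helper names (`icmpUlt`, `andI1`, `orI32`, `pr2330Source`, `pr2330Target`, `refines`) are invented for this example, and the `Blocking` flag selects the alternative semantics from the experiment rather than anything LLVM implements.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumption of this sketch: std::nullopt stands for poison.
using I32 = std::optional<uint32_t>;
using I1 = std::optional<bool>;

// icmp ult against a constant: poison in, poison out.
I1 icmpUlt(I32 X, uint32_t C) {
  if (!X) return std::nullopt;
  return *X < C;
}

// and i1. With Blocking set, a defined false operand masks poison.
I1 andI1(I1 A, I1 B, bool Blocking) {
  if (Blocking && ((A && !*A) || (B && !*B))) return false;
  if (!A || !B) return std::nullopt;
  return *A && *B;
}

// or i32. With Blocking set, a defined all-ones operand masks poison.
I32 orI32(I32 A, I32 B, bool Blocking) {
  if (Blocking && ((A && *A == UINT32_MAX) || (B && *B == UINT32_MAX)))
    return UINT32_MAX;
  if (!A || !B) return std::nullopt;
  return *A | *B;
}

// Source: and (icmp ult %b, 8), (icmp ult %a, 8)
I1 pr2330Source(I32 A, I32 B, bool Blocking) {
  return andI1(icmpUlt(B, 8), icmpUlt(A, 8), Blocking);
}

// Target: icmp ult (or %b, %a), 8
I1 pr2330Target(I32 A, I32 B, bool Blocking) {
  return icmpUlt(orI32(B, A, Blocking), 8);
}

// The target refines the source if the source is poison (anything goes)
// or both are defined and equal.
bool refines(I1 Src, I1 Tgt) { return !Src || (Tgt && *Tgt == *Src); }
```

With %a = poison and %b = 9, the propagating semantics makes the source poison, so the transform is allowed; the blocking semantics makes the source a defined false while the target stays poison, so refinement fails.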

<Transforms/InstCombine/apint-shift-simplify.ll>

define i41 @test0(i41 %A, i41 %B, i41 %C) {
%0:
  %X = shl i41 %A, %C
  %Y = shl i41 %B, %C
  %Z = and i41 %X, %Y
  ret i41 %Z
}
=>
define i41 @test0(i41 %A, i41 %B, i41 %C) {
%0:
  %X1 = and i41 %A, %B
  %Z = shl i41 %X1, %C
  ret i41 %Z
}

If %A = 100...00 && %B = poison && %C = 1, the source returns 0, whereas the target returns poison.
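
For the shift example, a brute-force check over a tiny bit width gives a feel for what a tool like Alive2 is verifying. The sketch below is not Alive2 and makes several assumptions: it uses a 4-bit type instead of i41 so the input space can be enumerated, models poison as `std::nullopt`, treats an oversized shift amount as poison, and uses invented helper names (`shl`, `andOp`, `decode`, `countFailures`).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumption: a 4-bit integer type stands in for i41 to keep enumeration tiny.
constexpr uint32_t Width = 4;
constexpr uint32_t Mask = (1u << Width) - 1;

// Assumption of this sketch: std::nullopt stands for poison.
using Val = std::optional<uint32_t>;

// shl: poison operands or a shift amount >= Width give poison.
Val shl(Val A, Val C) {
  if (!A || !C || *C >= Width) return std::nullopt;
  return (*A << *C) & Mask;
}

// and: with Blocking set, a defined zero operand masks poison.
Val andOp(Val A, Val B, bool Blocking) {
  if (Blocking && ((A && *A == 0) || (B && *B == 0))) return 0u;
  if (!A || !B) return std::nullopt;
  return *A & *B;
}

// Target refines source if source is poison or both are defined and equal.
bool refines(Val Src, Val Tgt) { return !Src || (Tgt && *Tgt == *Src); }

// Indices 0..Mask are concrete values; index Mask + 1 encodes poison.
Val decode(uint32_t I) {
  if (I > Mask) return std::nullopt;
  return I;
}

// Counts inputs for which (A << C) & (B << C)  =>  (A & B) << C is invalid.
unsigned countFailures(bool Blocking) {
  unsigned Failures = 0;
  for (uint32_t a = 0; a <= Mask + 1; ++a)
    for (uint32_t b = 0; b <= Mask + 1; ++b)
      for (uint32_t c = 0; c <= Mask + 1; ++c) {
        Val A = decode(a), B = decode(b), C = decode(c);
        Val Src = andOp(shl(A, C), shl(B, C), Blocking);
        Val Tgt = shl(andOp(A, B, Blocking), C);
        if (!refines(Src, Tgt)) ++Failures;
      }
  return Failures;
}
```

Under the propagating semantics the enumeration finds no failing input, while under the blocking semantics it finds at least the 4-bit analogue of the case above (%A = 1000, %B = poison, %C = 1).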

<Transforms/InstCombine/and-fcmp.ll>

define i1 @PR1738(double %x, double %y) {
%0:
  %cmp1 = fcmp ord %x, 0.000000
  %cmp2 = fcmp ord %y, 0.000000
  %and = and i1 %cmp1, %cmp2
  ret i1 %and 
}
=>
define i1 @PR1738(double %x, double %y) {
%0:
  %1 = fcmp ord %x, %y
  ret i1 %1
}

If %x = poison && %y = NaN, the source returns 0, whereas the target returns poison.

When I run Alive2 with the alternative semantics, 50 more lit tests from test/Transforms/ fail than with the base semantics (and/or unconditionally propagating poison).
To fix this, we would need to redefine the semantics of many existing operations. For example, fcmp ord would have to be redefined to specify when it returns poison and when it does not, which might be a pain. The same goes for shl, icmp, etc. So I vote for the simpler semantics (and/or unconditionally propagating poison).

Also, LangRef states that values other than select/phi depend on their operands, and that an instruction that depends on a poison value produces a poison value, so it implies that and/or propagate poison.
For this reason, propagatesPoison in this patch returns true on and/or operations.

llvm/test/Analysis/ScalarEvolution/nsw.ll
244	I investigated a bit, and the history was like this: https://github.com/llvm/llvm-project/commit/7e4a64167d4d2e7b0b680fae1706182223047af1 fixed a bug in SCEV, which was incorrectly adding a no-wrap flag to a post-inc add recurrence. The patch added `isAddRecNeverPoison`, which checks whether it is UB if the given add recurrence is poison, by successively calling `propagatesFullPoison` along its use chain. However, the patch wasn't calling `propagatesFullPoison` on the direct uses of the add recurrence, so https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 added the call and added these two tests (bad_postinc_nsw_a/b). In bad_postinc_nsw_b, `and %iv.inc, 0` was inserted to test whether the first `propagatesFullPoison` check successfully blocks propagation of poison. Now `propagatesPoison` returns true for it, so the test has changed. To summarize, the validity of this change depends on whether `and x, 0` should propagate poison or not. I think it should propagate (as I wrote in another comment). As suggested, I'll change the function name and leave a comment.
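
The effect on the use-chain walk can be sketched with a toy graph. This is a heavy simplification and not the SCEV code: the real `isAddRecNeverPoison` has further conditions that are ignored here, the add -> and -> icmp -> br chain is only an assumed shape for the test, and `Node`, `poisonReachesBranch`, `oldPredicate`, and `newPredicate` are names made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Tiny opcode set for the toy graph.
enum class Op { Add, And, ICmp, Br };

struct Node {
  Op Opcode;
  std::vector<int> Users;  // indices of the instructions that use this result
};

// Follows users transitively, but only through instructions that the given
// predicate says propagate poison. Br is the only poison-sensitive sink here.
bool poisonReachesBranch(const std::vector<Node> &G, int Start,
                         const std::function<bool(Op)> &Propagates) {
  std::vector<int> Worklist{Start};
  std::vector<bool> Seen(G.size(), false);
  Seen[Start] = true;
  while (!Worklist.empty()) {
    int N = Worklist.back();
    Worklist.pop_back();
    for (int U : G[N].Users) {
      if (G[U].Opcode == Op::Br) return true;
      if (!Seen[U] && Propagates(G[U].Opcode)) {
        Seen[U] = true;
        Worklist.push_back(U);
      }
    }
  }
  return false;
}

// Before the patch: and was not on the list of poison-propagating opcodes.
bool oldPredicate(Op O) { return O == Op::Add || O == Op::ICmp; }

// After the patch: every binary operator, including and, propagates poison.
bool newPredicate(Op O) {
  return O == Op::Add || O == Op::And || O == Op::ICmp;
}

// Assumed chain: %iv.inc (add) -> and %iv.inc, 0 -> icmp -> br.
std::vector<Node> exampleChain() {
  return {{Op::Add, {1}}, {Op::And, {2}}, {Op::ICmp, {3}}, {Op::Br, {}}};
}
```

With the old predicate the walk stops at the `and`, so poison is not shown to reach the branch; with the new predicate it does, which is the kind of difference that changes the expected output of bad_postinc_nsw_b.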

aqjune added subscribers: efriedma, nlopes, regehr. Apr 24 2020, 11:49 AM

Update a comment & clang-format

Added people who might be interested in this issue as subscribers.

Harbormaster failed remote builds in B54602: Diff 259946! Apr 24 2020, 12:26 PM

Thanks for running the alive experiment. The results look pretty compelling and are in line with what I'd expect. Giving one instruction special poison semantics has a snowball effect that will quickly lead us to undef-like reasoning.

This LG to me, but please wait for a second opinion.

llvm/lib/Analysis/ValueTracking.cpp
4852	Maybe list call/invoke explicitly here?

This revision is now accepted and ready to land. Apr 30 2020, 10:19 AM

Make call / invoke explicit

aqjune marked an inline comment as done. May 1 2020, 12:54 AM

Harbormaster completed remote builds in B55417: Diff 261438. May 1 2020, 1:07 AM

LGTM - see inline for some grammar nits.

llvm/include/llvm/Analysis/ValueTracking.h
565	raise -> raises
569	the operand -> operands
llvm/lib/Analysis/ValueTracking.cpp
4864	Add period to end of sentence.

Fix grammar errors

aqjune marked 4 inline comments as done. May 6 2020, 1:08 PM

aqjune added inline comments.

llvm/test/Analysis/ScalarEvolution/nsw.ll
244	Does this address your concern? @jdoerfert

Harbormaster failed remote builds in B55974: Diff 262462! May 6 2020, 2:09 PM

reames added inline comments. May 6 2020, 2:23 PM

llvm/include/llvm/Analysis/ValueTracking.h
567	I really don't think that having UB operations return true is the right semantic here. It doesn't add any value. If we have a UB operation with provable poison, then we've missed the opportunity to prune that path. Having some later optimization trigger (which is what the proposed semantic does), feels like side stepping the problem. (I'm just expressing an opinion, not blocking the patch. I'm fine with this going in, we can continue the discussion and change our minds later if my view turns out to convince others.)

nlopes added inline comments. May 8 2020, 9:58 AM

llvm/include/llvm/Analysis/ValueTracking.h
567	I guess it's still useful to know that udiv propagates poison regardless of whether we know it is UB or not. Most of the time we won't know if an operation triggers UB or not, so it feels right to provide at least some information. We have other APIs to check whether an instruction triggers UB. You are right that we are pushing a bit more responsibility onto the user of this analysis (for perf reasons, not correctness), but for the reasons I've stated above, I think it's a good tradeoff. I'm in favor of the patch as-is.
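
The udiv case discussed in this thread can be modeled in a few lines. This is a sketch, not LLVM code: poison is modeled as `std::nullopt`, `Outcome`, `udivModel`, and `poisonOrUB` are names invented for this example, and the treatment of a poison divisor as immediate UB follows the description given in the comments above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumption of this sketch: std::nullopt stands for poison.
using Val = std::optional<uint32_t>;

enum class Kind { Value, Poison, UB };

struct Outcome {
  Kind K;
  uint32_t V;  // only meaningful when K == Kind::Value
};

// udiv op1, op2 in the model used by the discussion:
//  - a poison (or zero) divisor is immediate UB,
//  - a poison dividend with a well-defined nonzero divisor yields poison.
Outcome udivModel(Val Op1, Val Op2) {
  if (!Op2 || *Op2 == 0) return {Kind::UB, 0};
  if (!Op1) return {Kind::Poison, 0};
  return {Kind::Value, *Op1 / *Op2};
}

// The condition in the new header comment: making any single operand poison
// must give either a poison result or UB.
bool poisonOrUB(const Outcome &O) { return O.K != Kind::Value; }
```

Both `udivModel(poison, 3)` and `udivModel(6, poison)` satisfy `poisonOrUB`, which is why the patched predicate can return true for udiv while leaving it to callers to separate the UB operand with `getGuaranteedNonPoisonOp`.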

aqjune mentioned this in D79748: [ValueTracking] And & Or propagate poison. May 11 2020, 11:27 PM

aqjune marked an inline comment as done. May 12 2020, 4:51 AM

aqjune added inline comments.

llvm/test/Analysis/ScalarEvolution/nsw.ll
244	ping @jdoerfert

jdoerfert added inline comments. May 12 2020, 9:41 AM

llvm/test/Analysis/ScalarEvolution/nsw.ll
244	OK with me.

Thanks all, I landed this patch.
For the update in LangRef, I checked that it already has this specific example:
%still_poison = and i32 %poison, 0 ; 0, but also poison.
Making LLVM's implementation consistent with LangRef through this patch seems enough.

Closed by commit rGe5f602d82ca0: [ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc. (authored by aqjune). May 12 2020, 11:17 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/include/llvm/Analysis/ValueTracking.h	8 lines
llvm/lib/Analysis/ValueTracking.cpp	37 lines
llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp	25 lines
llvm/test/Analysis/ScalarEvolution/nsw.ll	8 lines
llvm/unittests/Analysis/ValueTrackingTest.cpp	49 lines

Diff 261438

llvm/include/llvm/Analysis/ValueTracking.h

//===- llvm/Analysis/ValueTracking.h - Walk computations --------- C++ --===//		//===- llvm/Analysis/ValueTracking.h - Walk computations --------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
[... 548 lines not shown ...]	class Value;

/// Return true if this function can prove that the instruction I		/// Return true if this function can prove that the instruction I
/// is executed for every iteration of the loop L.		/// is executed for every iteration of the loop L.
///		///
/// Note that this currently only considers the loop header.		/// Note that this currently only considers the loop header.
bool isGuaranteedToExecuteForEveryIteration(const Instruction *I,		bool isGuaranteedToExecuteForEveryIteration(const Instruction *I,
const Loop *L);		const Loop *L);

/// Return true if this function can prove that I is guaranteed to yield		/// Return true if I yields poison or raise UB if any of its operands is
		spatel: raise -> raises
/// poison if at least one of its operands is poison.		/// poison.
		/// Formally, given I = `r = op v1 v2 .. vN`, propagatesPoison returns true
		/// if, for all i, r is evaluated to poison or op raises UB if vi = poison.
		/// To filter out the operand that raise UB on poison, you can use
		spatel: the operand -> operands
		/// getGuaranteedNonPoisonOp.
bool propagatesPoison(const Instruction *I);		bool propagatesPoison(const Instruction *I);

/// Return either nullptr or an operand of I such that I will trigger		/// Return either nullptr or an operand of I such that I will trigger
/// undefined behavior if I is executed and that operand has a poison		/// undefined behavior if I is executed and that operand has a poison
/// value.		/// value.
const Value *getGuaranteedNonPoisonOp(const Instruction *I);		const Value *getGuaranteedNonPoisonOp(const Instruction *I);

/// Return true if the given instruction must trigger undefined behavior.		/// Return true if the given instruction must trigger undefined behavior.
[... 157 lines not shown ...]

llvm/lib/Analysis/ValueTracking.cpp

//===- ValueTracking.cpp - Walk computations to compute properties --------===//		//===- ValueTracking.cpp - Walk computations to compute properties --------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
[... 4,832 lines not shown ...]	bool llvm::isGuaranteedToExecuteForEveryIteration(const Instruction *I,
for (const Instruction &LI : *L->getHeader()) {		for (const Instruction &LI : *L->getHeader()) {
if (&LI == I) return true;		if (&LI == I) return true;
if (!isGuaranteedToTransferExecutionToSuccessor(&LI)) return false;		if (!isGuaranteedToTransferExecutionToSuccessor(&LI)) return false;
}		}
llvm_unreachable("Instruction not contained in its own parent basic block.");		llvm_unreachable("Instruction not contained in its own parent basic block.");
}		}

bool llvm::propagatesPoison(const Instruction *I) {		bool llvm::propagatesPoison(const Instruction *I) {
// TODO: This should include all instructions apart from phis, selects and
// call-like instructions.
switch (I->getOpcode()) {		switch (I->getOpcode()) {
case Instruction::Add:		case Instruction::Freeze:
case Instruction::Sub:		case Instruction::Select:
case Instruction::Xor:		case Instruction::PHI:
		nikic: Maybe list call/invoke explicitly here?
case Instruction::Trunc:		case Instruction::Call:
case Instruction::BitCast:		case Instruction::Invoke:
case Instruction::AddrSpaceCast:		return false;
case Instruction::Mul:		case Instruction::ICmp:
case Instruction::Shl:		case Instruction::FCmp:
case Instruction::GetElementPtr:		case Instruction::GetElementPtr:
// These operations all propagate poison unconditionally. Note that poison
// is not any particular value, so xor or subtraction of poison with
// itself still yields poison, not zero.
return true;

case Instruction::AShr:
case Instruction::SExt:
// For these operations, one bit of the input is replicated across
// multiple output bits. A replicated poison bit is still poison.
return true;		return true;
		default:
case Instruction::ICmp:		if (isa<BinaryOperator>(I) || isa<UnaryOperator>(I) || isa<CastInst>(I))
// Comparing poison with any value yields poison. This is why, for
// instance, x s< (x +nsw 1) can be folded to true.
return true;		return true;

default:		// Be conservative and return false
		spatel: Add period to end of sentence.
return false;		return false;
}		}
}		}

const Value *llvm::getGuaranteedNonPoisonOp(const Instruction *I) {		const Value *llvm::getGuaranteedNonPoisonOp(const Instruction *I) {
switch (I->getOpcode()) {		switch (I->getOpcode()) {
case Instruction::Store:		case Instruction::Store:
return cast<StoreInst>(I)->getPointerOperand();		return cast<StoreInst>(I)->getPointerOperand();
[... 1,535 lines not shown ...]

llvm/lib/Transforms/Instrumentation/PoisonChecking.cpp

//===- PoisonChecking.cpp - -----------------------------------------------===//		//===- PoisonChecking.cpp - -----------------------------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Implements a transform pass which instruments IR such that poison semantics		// Implements a transform pass which instruments IR such that poison semantics
// are made explicit. That is, it provides a (possibly partial) executable		// are made explicit. That is, it provides a (possibly partial) executable
// semantics for every instruction w.r.t. poison as specified in the LLVM		// semantics for every instruction w.r.t. poison as specified in the LLVM
// LangRef. There are obvious parallels to the sanitizer tools, but this pass		// LangRef. There are obvious parallels to the sanitizer tools, but this pass
// is focused purely on the semantics of LLVM IR, not any particular source		// is focused purely on the semantics of LLVM IR, not any particular source
// language. If you're looking for something to see if your C/C++ contains		// language. If you're looking for something to see if your C/C++ contains
// UB, this is not it.		// UB, this is not it.
//		//
// The rewritten semantics of each instruction will include the following		// The rewritten semantics of each instruction will include the following
// components:		// components:
//		//
// 1) The original instruction, unmodified.		// 1) The original instruction, unmodified.
// 2) A propagation rule which translates dynamic information about the poison		// 2) A propagation rule which translates dynamic information about the poison
// state of each input to whether the dynamic output of the instruction		// state of each input to whether the dynamic output of the instruction
// produces poison.		// produces poison.
// 3) A creation rule which validates any poison producing flags on the		// 3) A creation rule which validates any poison producing flags on the
// instruction itself (e.g. checks for overflow on nsw).		// instruction itself (e.g. checks for overflow on nsw).
// 4) A check rule which traps (to a handler function) if this instruction must		// 4) A check rule which traps (to a handler function) if this instruction must
// execute undefined behavior given the poison state of it's inputs.		// execute undefined behavior given the poison state of it's inputs.
//		//
// This is a must analysis based transform; that is, the resulting code may		// This is a must analysis based transform; that is, the resulting code may
// produce a false negative result (not report UB when actually exists		// produce a false negative result (not report UB when actually exists
// according to the LangRef spec), but should never produce a false positive		// according to the LangRef spec), but should never produce a false positive
// (report UB where it doesn't exist).		// (report UB where it doesn't exist).
//		//
// Use cases for this pass include:		// Use cases for this pass include:
// - Understanding (and testing!) the implications of the definition of poison		// - Understanding (and testing!) the implications of the definition of poison
// from the LangRef.		// from the LangRef.
// - Validating the output of a IR fuzzer to ensure that all programs produced		// - Validating the output of a IR fuzzer to ensure that all programs produced
// are well defined on the specific input used.		// are well defined on the specific input used.
// - Finding/confirming poison specific miscompiles by checking the poison		// - Finding/confirming poison specific miscompiles by checking the poison
// status of an input/IR pair is the same before and after an optimization		// status of an input/IR pair is the same before and after an optimization
// transform.		// transform.
// - Checking that a bugpoint reduction does not introduce UB which didn't		// - Checking that a bugpoint reduction does not introduce UB which didn't
// exist in the original program being reduced.		// exist in the original program being reduced.
//		//
// The major sources of inaccuracy are currently:		// The major sources of inaccuracy are currently:
// - Most validation rules not yet implemented for instructions with poison		// - Most validation rules not yet implemented for instructions with poison
// relavant flags. At the moment, only nsw/nuw on add/sub are supported.		// relavant flags. At the moment, only nsw/nuw on add/sub are supported.
// - UB which is control dependent on a branch on poison is not yet		// - UB which is control dependent on a branch on poison is not yet
// reported. Currently, only data flow dependence is modeled.		// reported. Currently, only data flow dependence is modeled.
// - Poison which is propagated through memory is not modeled. As such,		// - Poison which is propagated through memory is not modeled. As such,
// storing poison to memory and then reloading it will cause a false negative		// storing poison to memory and then reloading it will cause a false negative
// as we consider the reloaded value to not be poisoned.		// as we consider the reloaded value to not be poisoned.
// - Poison propagation across function boundaries is not modeled. At the		// - Poison propagation across function boundaries is not modeled. At the
// moment, all arguments and return values are assumed not to be poison.		// moment, all arguments and return values are assumed not to be poison.
// - Undef is not modeled. In particular, the optimizer's freedom to pick		// - Undef is not modeled. In particular, the optimizer's freedom to pick
// concrete values for undef bits so as to maximize potential for producing		// concrete values for undef bits so as to maximize potential for producing
// poison is not modeled.		// poison is not modeled.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Instrumentation/PoisonChecking.h"		#include "llvm/Transforms/Instrumentation/PoisonChecking.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
[... 33 lines not shown ...]	for (; i < Ops.size(); i++)
if (!isConstantFalse(Ops[i]))		if (!isConstantFalse(Ops[i]))
Accum = B.CreateOr(Accum, Ops[i]);		Accum = B.CreateOr(Accum, Ops[i]);
return Accum;		return Accum;
}		}

static void generateCreationChecksForBinOp(Instruction &I,		static void generateCreationChecksForBinOp(Instruction &I,
SmallVectorImpl<Value*> &Checks) {		SmallVectorImpl<Value*> &Checks) {
assert(isa<BinaryOperator>(I));		assert(isa<BinaryOperator>(I));

IRBuilder<> B(&I);		IRBuilder<> B(&I);
Value *LHS = I.getOperand(0);		Value *LHS = I.getOperand(0);
Value *RHS = I.getOperand(1);		Value *RHS = I.getOperand(1);
switch (I.getOpcode()) {		switch (I.getOpcode()) {
default:		default:
return;		return;
case Instruction::Add: {		case Instruction::Add: {
if (I.hasNoSignedWrap()) {		if (I.hasNoSignedWrap()) {
[... 145 lines not shown ...]
static bool rewrite(Function &F) {		static bool rewrite(Function &F) {
auto * const Int1Ty = Type::getInt1Ty(F.getContext());		auto * const Int1Ty = Type::getInt1Ty(F.getContext());

DenseMap<Value *, Value *> ValToPoison;		DenseMap<Value *, Value *> ValToPoison;

for (BasicBlock &BB : F)		for (BasicBlock &BB : F)
for (auto I = BB.begin(); isa<PHINode>(&*I); I++) {		for (auto I = BB.begin(); isa<PHINode>(&*I); I++) {
auto *OldPHI = cast<PHINode>(&*I);		auto *OldPHI = cast<PHINode>(&*I);
auto *NewPHI = PHINode::Create(Int1Ty,		auto *NewPHI = PHINode::Create(Int1Ty, OldPHI->getNumIncomingValues());
OldPHI->getNumIncomingValues());
for (unsigned i = 0; i < OldPHI->getNumIncomingValues(); i++)		for (unsigned i = 0; i < OldPHI->getNumIncomingValues(); i++)
NewPHI->addIncoming(UndefValue::get(Int1Ty),		NewPHI->addIncoming(UndefValue::get(Int1Ty),
OldPHI->getIncomingBlock(i));		OldPHI->getIncomingBlock(i));
NewPHI->insertBefore(OldPHI);		NewPHI->insertBefore(OldPHI);
ValToPoison[OldPHI] = NewPHI;		ValToPoison[OldPHI] = NewPHI;
}		}

for (BasicBlock &BB : F)		for (BasicBlock &BB : F)
for (Instruction &I : BB) {		for (Instruction &I : BB) {
if (isa<PHINode>(I)) continue;		if (isa<PHINode>(I)) continue;

IRBuilder<> B(cast<Instruction>(&I));		IRBuilder<> B(cast<Instruction>(&I));

// Note: There are many more sources of documented UB, but this pass only		// Note: There are many more sources of documented UB, but this pass only
// attempts to find UB triggered by propagation of poison.		// attempts to find UB triggered by propagation of poison.
if (Value *Op = const_cast<Value*>(getGuaranteedNonPoisonOp(&I)))		if (Value *Op = const_cast<Value*>(getGuaranteedNonPoisonOp(&I)))
CreateAssertNot(B, getPoisonFor(ValToPoison, Op));		CreateAssertNot(B, getPoisonFor(ValToPoison, Op));

if (LocalCheck)		if (LocalCheck)
if (auto *RI = dyn_cast<ReturnInst>(&I))		if (auto *RI = dyn_cast<ReturnInst>(&I))
if (RI->getNumOperands() != 0) {		if (RI->getNumOperands() != 0) {
[... 35 lines not shown ...]	PreservedAnalyses PoisonCheckingPass::run(Module &M,
  return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
}

PreservedAnalyses PoisonCheckingPass::run(Function &F,
                                          FunctionAnalysisManager &AM) {
  return rewrite(F) ? PreservedAnalyses::none() : PreservedAnalyses::all();
}


/* Major TODO Items:
   - Control dependent poison UB
   - Strict mode - (i.e. must analyze every operand)
   - Poison through memory
   - Function ABIs
   - Full coverage of intrinsics, etc.. (ouch)

   Instructions w/Unclear Semantics:
   - shufflevector - It would seem reasonable for an out of bounds mask element
     to produce poison, but the LangRef does not state.
+  - and/or - It would seem reasonable for poison to propagate from both
+    arguments, but LangRef doesn't state and propagatesPoison doesn't
+    include these two.
jdoerfert: I think this makes sense but we need to get input from others on this.
nikic: LangRef does kind of state this by omission: "Values other than phi nodes and select instructions depend on their operands." We have select -> and/or transforms that are known to be unsound if and/or do not block poison, but work is underway to fix those.
aqjune (author): That LangRef sentence implicitly describes the semantics of and/or, as @nikic says, but my impression is that the semantics of and/or have frequently been controversial. I remember that every other candidate semantics for and/or was problematic because it breaks many optimizations between logical and arithmetic ops. I'll bring concrete examples in a few days by running the LLVM unit tests with modified and/or semantics under Alive2. Once this issue is resolved, it would be great to state explicitly in LangRef why and/or should propagate poison regardless of masking values.
   - all binary ops w/vector operands - The likely interpretation would be that
     any element overflowing should produce poison for the entire result, but
     the LangRef does not state.
   - Floating point binary ops w/fmf flags other than (nnan, noinfs). It seems
     strange that only certian flags should be documented as producing poison.

   Cases of clear poison semantics not yet implemented:
   - Exact flags on ashr/lshr produce poison
   - NSW/NUW flags on shl produce poison
   - Inbounds flag on getelementptr produce poison
   - fptosi/fptoui (out of bounds input) produce poison
   - Scalable vector types for insertelement/extractelement
   - Floating point binary ops w/fmf nnan/noinfs flags produce poison
*/
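
The and/or discussion in the inline comments above turns on one difference: `select` depends only on its condition and the chosen arm, whereas `and`/`or` (under the semantics argued for in this review) depend on both operands even when one of them masks the other. The following is a minimal toy model of that difference, not LLVM code. `Val`, `andOp`, and `selectOp` are hypothetical names introduced for illustration, and poison is modelled as an empty `std::optional` (C++17).

```cpp
#include <cassert>
#include <optional>

// Toy model (assumption, not LLVM's representation): a value is either a
// concrete bool or poison, with poison encoded as std::nullopt.
using Val = std::optional<bool>;

// 'and' under the semantics discussed in the review: poison in either
// operand poisons the result, even if the other operand is false.
Val andOp(Val A, Val B) {
  if (!A || !B)
    return std::nullopt;
  return *A && *B;
}

// 'select' depends only on the condition and the arm that is chosen, so a
// poison value in the unselected arm is blocked.
Val selectOp(Val Cond, Val TrueArm, Val FalseArm) {
  if (!Cond)
    return std::nullopt;
  return *Cond ? TrueArm : FalseArm;
}
```

In this model `selectOp(false, poison, false)` yields a concrete `false`, while `andOp(false, poison)` yields poison, which is why rewriting `select c, x, false` into `and c, x` is not a refinement when `x` may be poison.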

llvm/test/Analysis/ScalarEvolution/nsw.ll

[217 preceding lines not shown]
; CHECK-NEXT: --> {7,+,7}<nuw><%loop>
  %becond = icmp ult i32 %iv, %n
  br i1 %becond, label %loop, label %leave

leave:
  ret void
}

-define void @bad_postinc_nsw_b(i32 %n) {
-; CHECK-LABEL: Classifying expressions for: @bad_postinc_nsw_b
+; Unlike @bad_postinc_nsw_a(), the SCEV expression of %iv.inc has <nsw> flag
+; because poison can be propagated through 'and %iv.inc, 0'.
+define void @postinc_poison_prop_through_and(i32 %n) {
+; CHECK-LABEL: Classifying expressions for: @postinc_poison_prop_through_and
entry:
  br label %loop

loop:
  %iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
  %iv.inc = add nsw i32 %iv, 7
  %iv.inc.and = and i32 %iv.inc, 0
; CHECK: %iv.inc = add nsw i32 %iv, 7
-; CHECK-NEXT: --> {7,+,7}<nuw><%loop>
+; CHECK-NEXT: --> {7,+,7}<nuw><nsw><%loop>
  %becond = icmp ult i32 %iv.inc.and, %n
  br i1 %becond, label %loop, label %leave

leave:
  ret void
}
jdoerfert: Please verify, as the function name suggests there is some problem here. I noted in the previous patch that I think this is correct, but we should check and potentially change the function name or record our findings.
aqjune (author): I investigated a bit, and the history is as follows. https://github.com/llvm/llvm-project/commit/7e4a64167d4d2e7b0b680fae1706182223047af1 fixed a bug in SCEV, which was incorrectly adding a no-wrap flag to a post-inc add recurrence. That patch added `isAddRecNeverPoison`, which checks whether it is UB for the given add recurrence to be poison by repeatedly calling `propagatesFullPoison` along its use chain. However, the patch did not call `propagatesFullPoison` on the direct uses of the add recurrence, so https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 added that call along with these two tests (bad_postinc_nsw_a/b). In bad_postinc_nsw_b, `and %iv.inc, 0` was inserted to test whether the first `propagatesFullPoison` check successfully blocks propagation of poison. Now `propagatesPoison` returns true, so the test has changed. To summarize, the validity of this change depends on whether `and x, 0` should propagate poison or not. I think it should (as I wrote in another comment). As suggested, I'll change the function name and leave a comment.
aqjune (author): Does this address your concern? @jdoerfert
aqjune (author): ping @jdoerfert
jdoerfert: OK with me.
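
The use-chain walk described in the comment above can be sketched with a toy instruction graph. This is a rough illustration under stated assumptions, not the actual `isAddRecNeverPoison` implementation: `Inst`, `toyPropagatesPoison`, and `poisonReachesUB` are hypothetical names, the opcode set is a small stand-in for the real `propagatesPoison`, and the sketch ignores the additional requirement that the UB-triggering user is guaranteed to execute.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Toy instruction: an opcode, its users, and whether a poison operand makes
// executing it UB (assumed here for a conditional branch).
struct Inst {
  std::string Opcode;
  std::vector<Inst *> Users;
  bool UBIfPoison = false;
};

// Stand-in for propagatesPoison: after this patch, binops such as 'and' are
// treated as propagating; 'select' and 'freeze' are not.
bool toyPropagatesPoison(const Inst &I) {
  static const std::unordered_set<std::string> Propagating = {
      "add", "and", "or", "icmp", "bitcast"};
  return Propagating.count(I.Opcode) != 0;
}

// Walk the use chain from Root, following only users that propagate poison,
// and report whether poison in Root can reach a UB-triggering user.
bool poisonReachesUB(const Inst &Root) {
  std::vector<const Inst *> Worklist{&Root};
  std::unordered_set<const Inst *> Seen{&Root};
  while (!Worklist.empty()) {
    const Inst *Cur = Worklist.back();
    Worklist.pop_back();
    for (const Inst *U : Cur->Users) {
      if (U->UBIfPoison)
        return true;
      if (toyPropagatesPoison(*U) && Seen.insert(U).second)
        Worklist.push_back(U);
    }
  }
  return false;
}
```

For the chain `%iv.inc -> and -> icmp -> br` the walk reaches the branch once `and` counts as propagating, which mirrors why the test's expected SCEV now carries `<nsw>`. Replacing the `and` with a non-propagating instruction such as `freeze` stops the walk.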

declare void @may_exit() nounwind

define void @pr28012(i32 %n) {
; CHECK-LABEL: Classifying expressions for: @pr28012
entry:
  br label %loop

[12 lines not shown]

llvm/unittests/Analysis/ValueTrackingTest.cpp

//===- ValueTrackingTest.cpp - ValueTracking tests ------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

[653 lines not shown; context: parseAssembly(]
      "define <2 x i32> @test(<2 x i1> %x) {\n"
      " %sext = sext <2 x i1> %x to <2 x i32>\n"
      " %A = shufflevector <2 x i32> %sext, <2 x i32> undef, <2 x i32> <i32 0, i32 2>\n"
      " ret <2 x i32> %A\n"
      "}\n");
  EXPECT_EQ(ComputeNumSignBits(A, M->getDataLayout()), 1u);
}

+TEST(ValueTracking, propagatesPoison) {
+  std::string AsmHead = "declare i32 @g(i32)\n"
+                        "define void @f(i32 %x, i32 %y, float %fx, float %fy, "
+                        "i1 %cond, i8* %p) {\n";
+  std::string AsmTail = " ret void\n}";
+  // (propagates poison?, IR instruction)
+  SmallVector<std::pair<bool, std::string>, 32> Data = {
+      {true, "add i32 %x, %y"},
+      {true, "add nsw nuw i32 %x, %y"},
+      {true, "ashr i32 %x, %y"},
+      {true, "lshr exact i32 %x, 31"},
+      {true, "fcmp oeq float %fx, %fy"},
+      {true, "icmp eq i32 %x, %y"},
+      {true, "getelementptr i8, i8* %p, i32 %x"},
+      {true, "getelementptr inbounds i8, i8* %p, i32 %x"},
+      {true, "bitcast float %fx to i32"},
+      {false, "select i1 %cond, i32 %x, i32 %y"},
+      {false, "freeze i32 %x"},
+      {true, "udiv i32 %x, %y"},
+      {true, "urem i32 %x, %y"},
+      {true, "sdiv exact i32 %x, %y"},
+      {true, "srem i32 %x, %y"},
+      {false, "call i32 @g(i32 %x)"}};
+
+  std::string AssemblyStr = AsmHead;
+  for (auto &Itm : Data)
+    AssemblyStr += Itm.second + "\n";
+  AssemblyStr += AsmTail;
+
+  LLVMContext Context;
+  SMDiagnostic Error;
+  auto M = parseAssemblyString(AssemblyStr, Error, Context);
+  assert(M && "Bad assembly?");
+
+  auto *F = M->getFunction("f");
+  assert(F && "Bad assembly?");
+
+  auto &BB = F->getEntryBlock();
+
+  int Index = 0;
+  for (auto &I : BB) {
+    if (isa<ReturnInst>(&I))
+      break;
+    EXPECT_EQ(propagatesPoison(&I), Data[Index].first)
+        << "Incorrect answer at instruction " << Index << " = " << I;
+    Index++;
+  }
+}

TEST(ValueTracking, canCreatePoison) {
  std::string AsmHead =
      "declare i32 @g(i32)\n"
      "define void @f(i32 %x, i32 %y, float %fx, float %fy, i1 %cond, "
      "<4 x i32> %vx, <4 x i32> %vx2, <vscale x 4 x i32> %svx, i8* %p) {\n";
  std::string AsmTail = " ret void\n}";
  // (can create poison?, IR instruction)
  SmallVector<std::pair<bool, std::string>, 32> Data = {
[468 lines not shown]

Revision Contents

Diff 261438
