This is an archive of the discontinued LLVM Phabricator instance.


Differential D64215

Add a transform pass to make the executable semantics of poison explicit in the IR
ClosedPublic

Authored by reames on Jul 4 2019, 11:46 AM.


Details

Reviewers

nikic
sanjoy
nlopes
regehr
manasij7479
aqjune

Commits

rGf47a313e717a: Add a transform pass to make the executable semantics of poison explicit in the…
rL365536: Add a transform pass to make the executable semantics of poison explicit in the…

Summary

Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language.

The target audience for this tool is developers working on or targeting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program.

At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there are a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs, which involved UB being triggered by integer induction variables with nsw/nuw flags, would be reported by the current code.

(See comment in PoisonChecking.cpp for full explanation and context)

Before this lands, it obviously needs a bunch of tests added, but for the moment, I wanted to collect feedback on the idea.
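
As a rough illustration of what "making poison semantics explicit" could mean, the following standalone C++ sketch models a single `add nsw` outside of LLVM. It is not the code in PoisonChecking.cpp: the `Tracked` struct and the `signedAddOverflows` and `addNSW` helpers are hypothetical names invented for this example, and a single poison bit per value is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical model: a 32-bit value paired with one poison bit.
struct Tracked {
  int32_t Value;
  bool Poison;
};

// Flag validation for nsw: true if the signed 32-bit add overflows.
// The sum is computed in 64 bits so the check itself cannot overflow.
inline bool signedAddOverflows(int32_t A, int32_t B) {
  const int64_t Wide = static_cast<int64_t>(A) + static_cast<int64_t>(B);
  return Wide < std::numeric_limits<int32_t>::min() ||
         Wide > std::numeric_limits<int32_t>::max();
}

// Model of `add nsw`: the result is poison if either input is poison
// (propagation) or if the nsw flag is violated (flag validation).
inline Tracked addNSW(Tracked A, Tracked B) {
  int64_t Wide = static_cast<int64_t>(A.Value) + static_cast<int64_t>(B.Value);
  // Wrap the 64-bit sum back into the 32-bit range by hand.
  const int64_t TwoTo32 = int64_t{1} << 32;
  if (Wide > std::numeric_limits<int32_t>::max())
    Wide -= TwoTo32;
  else if (Wide < std::numeric_limits<int32_t>::min())
    Wide += TwoTo32;

  Tracked Result;
  Result.Value = static_cast<int32_t>(Wide);
  Result.Poison =
      A.Poison || B.Poison || signedAddOverflows(A.Value, B.Value);
  return Result;
}
```

For instance, `addNSW({INT32_MAX, false}, {1, false})` yields a result whose `Poison` field is set, while adding 1 and 2 yields a non-poison 3. A check rule that traps on a poison input is left out of this sketch.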

Diff Detail

Repository: rL LLVM

Event Timeline

reames created this revision. Jul 4 2019, 11:46 AM

Herald added a project: Restricted Project. Jul 4 2019, 11:46 AM

Herald added subscribers: bollu, mgorny, mcrosier.

I like the idea. I browsed through the code but I failed to see where the poison is exposed to the user, e.g. through an assert or printf call. (Polly has a neat printf builder if that is of interest.)

sanjoy added reviewers: manasij7479, aqjune. Jul 4 2019, 12:51 PM

I think this is an excellent idea!!

I personally won't have time to carefully review this & following patches anytime soon (I'm happy to have high level discussions though), but I'm hoping the other reviewers you and I added will have time.

lib/Transforms/Instrumentation/PoisonChecking.cpp
1 ↗	(On Diff #208064)	Maybe just call this PoisonSanitizer?
41 ↗	(On Diff #208064)	This will have false positives as long as `undef` is a thing right? Unless the instrumentation added by this pass will produce poison if any value of `undef` produces poison (naively it seems this would require calling into a SAT solver at runtime)?

In D64215#1570640, @jdoerfert wrote:

I like the idea. I browsed through the code but I failed to see where the poison is exposed to the user, e.g. through an assert or printf call. (Polly has a neat printf builder if that is of interest.)

At the moment, I have a placeholder trap(bool) signature used, but that's one of the areas which clearly needs work.

lib/Transforms/Instrumentation/PoisonChecking.cpp
1 ↗	(On Diff #208064)	I could easily be convinced, but since the existing convention is that sanitizers are compiler user tools, not compiler developer tools, I thought it might be better to pick a different name. Weakly held opinion, happy to go with whatever reviewers prefer.
41 ↗	(On Diff #208064)	I don't actually follow what you're trying to say. Can you maybe rephrase w/an example which concerns you? For context, I'm focusing on poison for the moment, but I think a completely reasonable thing would be to have a parallel approach for undef (1 bit per bit of original value), and then cross propagate if needed. I have a much weaker understanding of the undef rules, so I started with poison since that seemed a bit more clearly specified and recently discussed.

aqjune added inline comments. Jul 4 2019, 5:08 PM

lib/Transforms/Instrumentation/PoisonChecking.cpp
105 ↗	(On Diff #208064)	This is great. Is it assumed that poison is always value-wise? In other words, if the result is 1, does it mean either: (1) the value is always fully poison (2) the value has at least one poison 'bit'? In the future, there might exist several kinds of poison semantics that people would like to test (e.g. bitwise poison vs. valuewise poison) - so I think it is a good idea to describe which poison semantics this visitAdd is assuming.
124 ↗	(On Diff #208064)	How about merging the contents of `visitSub` and `visitAdd` as they are equivalent except the intrinsics name? It can be reused for `visitMul` as well.

reames marked 3 inline comments as done. Jul 4 2019, 5:21 PM

reames added inline comments.

lib/Transforms/Instrumentation/PoisonChecking.cpp
41 ↗	(On Diff #208064)	For context, the part which has me confused was your claim of a false positive. I definitely see how undef can cause a false negative if not accounted for in the poison tracking.
105 ↗	(On Diff #208064)	All of this is assuming a single poison bit since that is our current semantics. I'll leave it up to anyone wishing to evaluate alternate semantics to implement them.
124 ↗	(On Diff #208064)	I agree the code structure needs some work here. I'm experimenting with variations on that to see what looks least ugly and will upload a revised patch later tonight or tomorrow.
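
One way the merge suggested for `visitAdd`, `visitSub`, and `visitMul` could look is sketched below as standalone C++ rather than IRBuilder code. The `Op` enum and the `violatesNSW` helper are hypothetical names used only for this example; the review comments suggest the patch relies on overflow intrinsics, whereas this sketch simply widens the operands to 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical tag for the three arithmetic operations under discussion.
enum class Op { Add, Sub, Mul };

// One shared nsw check instead of three near-identical visit functions.
// Returns true if the signed 32-bit operation would overflow.
inline bool violatesNSW(Op Kind, int32_t A, int32_t B) {
  const int64_t L = A;
  const int64_t R = B;
  int64_t Wide = 0;
  switch (Kind) {
  case Op::Add:
    Wide = L + R;
    break;
  case Op::Sub:
    Wide = L - R;
    break;
  case Op::Mul:
    Wide = L * R; // |L * R| <= 2^62, so this fits in 64 bits.
    break;
  }
  return Wide < std::numeric_limits<int32_t>::min() ||
         Wide > std::numeric_limits<int32_t>::max();
}
```

Only the operation selected by `Kind` differs between the cases, which mirrors the observation that the visit functions are equivalent apart from the intrinsic name.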

shchenz added a subscriber: shchenz. Jul 4 2019, 5:24 PM

Restructure code to be more readable.

Flesh out some test cases

Herald added a subscriber: jfb. Jul 4 2019, 9:14 PM

Add a case of imprecision to the comments, and move away from terminology around false positive/false negative since that depends heavily on the use case. (i.e. is finding more UB after a transform a false negative in the analysis of the previous IR, or a false positive in the comparison?)

fhahn added a subscriber: fhahn. Jul 5 2019, 4:05 AM

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.
This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I guess this is more a FYI. If you want to support a significant chunk of LLVM, it's going to be a lot of work. Alive2 already has ~1 year of development, and has already found a few bugs in LLVM.

sanjoy added inline comments. Jul 5 2019, 10:19 AM

lib/Transforms/Instrumentation/PoisonChecking.cpp
41 ↗	(On Diff #208064)	By "false positive" I meant "incorrectly concluding that an optimization is buggy because this pass fails to detect poison before the optimization but is able to detect poison after the optimization". Say you have this program:

    int c = INT_SMAX +nsw undef;

IIUC you'll instrument this to:

    int c, bool ov = add_with_overflow(INT_SMAX, undef)
    if (ov) trap();

which will trap for only some values of `undef`. In particular, it may be (depending on the phase of the moon) that running this program does not trap. But a pass can legitimately transform the program to:

    int c = INT_SMAX +nsw 1; // fold undef to 1

which will deterministically trap after instrumentation. So if we are unlucky it will "look like" the transformation has a bug: before the xform the instrumented IR did not trap, but after the xform it traps.
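
The scenario above can be mimicked with a small standalone C++ sketch. It assumes, purely for illustration, that `undef` is modeled as an ordinary parameter whose concrete value is picked by the caller; `instrumentedAddTraps`, `trapsBeforeTransform`, and `trapsAfterTransform` are hypothetical names, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the instrumented `A +nsw B`: returns true
// when the inserted overflow check would call trap().
inline bool instrumentedAddTraps(int32_t A, int32_t B) {
  const int64_t Wide = static_cast<int64_t>(A) + static_cast<int64_t>(B);
  return Wide < std::numeric_limits<int32_t>::min() ||
         Wide > std::numeric_limits<int32_t>::max();
}

// Before the transform the second operand is undef, so whichever value
// happens to be observed at run time decides whether the check fires.
inline bool trapsBeforeTransform(int32_t ChosenUndef) {
  return instrumentedAddTraps(std::numeric_limits<int32_t>::max(),
                              ChosenUndef);
}

// After a pass has legitimately folded undef to 1, the check always fires.
inline bool trapsAfterTransform() {
  return instrumentedAddTraps(std::numeric_limits<int32_t>::max(), 1);
}
```

If the run-time value standing in for `undef` happens to be 0, `trapsBeforeTransform(0)` is false while `trapsAfterTransform()` is true, which is the apparent regression described in the comment.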

In D64215#1571424, @nlopes wrote:

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I'd love to explore options for sharing the semantics here. What form does Alive2 express them in?

I guess this is more a FYI. If you want to support a significant chunk of LLVM, it's going to be a lot of work. Alive2 already has ~1 year of development, and has already found a few bugs in LLVM.

I'm not expecting this to ever get 100% coverage, but I do have a backlog of miscompiles I'm trying to reduce. :)

lib/Transforms/Instrumentation/PoisonChecking.cpp
41 ↗	(On Diff #208064)	Yep, completely agree. This is a false positive for the "find problematic transform" case, but a false negative for the "does this program execute UB" case. I'd added a comment about this to the source, and this is the case that made me realize the false negative/positive terminology was super confusing without being specific about the use case. In other words, I'm okay with this for the moment. :) Future work may explore this further.

In D64215#1572070, @reames wrote:

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)

I'd love to explore options for sharing the semantics here. What form does Alive2 express them in?

That's still an unsolved research problem. No one really knows how to share semantics yet.
The semantics in Alive2 are written in C++, using an embedded expression language. While it is potentially possible to reuse that somewhere else, it isn't trivial. See e.g. the ir/instr.cpp file.

I'll just add that we (my students and I) are interested in making the UB semantics easily pluggable / switchable. They'll still (almost certainly) be in C++, but we'd like to factor these out so at least they're cleanly separated and easily swapped out.

In D64215#1572285, @nlopes wrote:

I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)

I finally got it working; it required a couple of changes to the CMakeFiles and an LD_PRELOAD (for unclear reasons). However, it doesn't look like the scope of the alive-tv tool is anywhere near wide enough for my purposes.

Correct me if I'm wrong, but it looks like it can't handle loops at all, right? And use of any memory seems to trigger timeouts? (Even for trivially identical IR?) Just making sure there's no error between keyboard and chair. :)

The problems I'm looking at are definitely not single block problems, unfortunately.

Can I get an LGTM on this? It seems like folks are interested in the approach, and it would be much easier to iterate in tree if needed.

I think Alive and this are complementary. Alive is useful in proving the correctness of _new_ transformations, whereas this pass should let us quickly figure out if a large unreduced application is misbehaving because it branches on poison.

lib/Transforms/Instrumentation/PoisonChecking.cpp
91 ↗	(On Diff #208113)	This looks like it can live on `IRBuilder`.
95 ↗	(On Diff #208113)	Is all of this necessary? I'd expect one round of instsimplify to fix up this kind of stuff.
166 ↗	(On Diff #208113)	You can do a single iterator lookup here instead of two.
181 ↗	(On Diff #208113)	It seems like the trap function asserts that the condition is false? If yes then it should probably be called something else I think (something like `assert_is_false` would be great).
226 ↗	(On Diff #208113)	Can't you directly do `for (Value *V: I.operands())`?
267 ↗	(On Diff #208113)	There is a `PreservedAnalyses::none()`.

This revision is now accepted and ready to land. Jul 8 2019, 9:52 PM

reames marked 6 inline comments as done. Jul 9 2019, 11:41 AM

reames added inline comments.

lib/Transforms/Instrumentation/PoisonChecking.cpp
91 ↗	(On Diff #208113)	Agreed, plan to move it there once I reapply the split out fix which got reverted due to a couple of tests I didn't catch.
95 ↗	(On Diff #208113)	It makes the test much, much easier to read. :)

Closed by commit rL365536: Add a transform pass to make the executable semantics of poison explicit in the… (authored by reames). Jul 9 2019, 11:50 AM

This revision was automatically updated to reflect the committed changes.

In D64215#1574483, @reames wrote:

I finally got it working; it required a couple of changes to the CMakeFiles and an LD_PRELOAD (for unclear reasons). However, it doesn't look like the scope of the alive-tv tool is anywhere near wide enough for my purposes.

Correct me if I'm wrong, but it looks like it can't handle loops at all, right? And use of any memory seems to trigger timeouts? (Even for trivially identical IR?) Just making sure there's no error between keyboard and chair. :)

Cool!
It is true that Alive2 doesn't support loops yet; that's on the todo list. It can handle branches and Phi nodes well, though.
Memory is being implemented ATM. Proofs with undef are quite hard. We don't have a fast-path for identical IR yet, it always goes to the expensive proof algorithm.
You can increase the timeout. With, say, a 10-second timeout it's very unlikely there's a bug since the SMT solver would very likely find it in that time.

Please keep me in the loop for these bugs you are seeing. I'm happy to help debug and/or just be in the loop of what happened if this might be an issue with IR semantics or simply a misunderstanding in the semantics. Thank you!

In D64215#1572285, @nlopes wrote:

I'd love to explore options for sharing the semantics here. What form does Alive2 express them in?

That's still an unsolved research problem. No one really knows how to share semantics yet.
The semantics in Alive2 are written in C++, using an embedded expression language. While it is potentially possible to reuse that somewhere else, it isn't trivial. See e.g. the ir/instr.cpp file.

One thought on sharing.

From what I can tell from a quick look at the code you mentioned, it looks like you're parsing IR into an expression language, then rewriting the expressions to propagate poison - in a fairly similar manner to this code, but over your expression language - and then translating that expression language to SMT. Is that a good high level summary?

If you used this framework (not necessarily the pass, but the utilities) to rewrite the IR so as to make poison semantics explicit before converting to your expression language, you might be able to factor out a good portion of the poison reasoning from Alive2. It'd be a fairly major design change though, so not sure if that's worth it to you or not.

In D64215#1577194, @reames wrote:

One thought on sharing.

From what I can tell from a quick look at the code you mentioned, it looks like you're parsing IR into an expression language, then rewriting the expressions to propagate poison - in a fairly similar manner to this code, but over your expression language - and then translating that expression language to SMT. Is that a good high level summary?

It's like this:

LLVM IR -> Alive2 IR (close to LLVM, but smaller)
Alive2 IR -> SMT expressions. As you say, it describes the poison semantics in terms of SMT expressions. The underlying engine handles propagation of undef.
VcGen: combine SMT expressions to produce a theorem to be proved

If you used this framework (not necessarily the pass, but the utilities) to rewrite the IR so as to make poison semantics explicit before converting to your expression language, you might be able to factor out a good portion of the poison reasoning from Alive2. It'd be a fairly major design change though, so not sure if that's worth it to you or not.

We could go back from SMT expressions to LLVM IR, that's not very difficult (for arithmetic and other simple things, at least). The complication is the vcgen bit. Doable, but it's quite some work.
I think sharing semantics is not an immediate goal for us. We can share high-level semantics through LangRef and fix it where needed.

lebedev.ri mentioned this in D71799: [Attributor] AAUndefinedBehavior: Check for branches on undef value. Dec 22 2019, 1:17 AM

Revision Contents

Path	Size
llvm/trunk/include/llvm/Transforms/Instrumentation/PoisonChecking.h	25 lines
llvm/trunk/lib/IR/Instruction.cpp	2 lines
llvm/trunk/lib/Passes/PassBuilder.cpp	1 line
llvm/trunk/lib/Passes/PassRegistry.def	1 line
llvm/trunk/lib/Transforms/Instrumentation/CMakeLists.txt	1 line
llvm/trunk/lib/Transforms/Instrumentation/PoisonChecking.cpp	283 lines
llvm/trunk/test/Instrumentation/PoisonChecking/basic-flag-validation.ll	158 lines
llvm/trunk/test/Instrumentation/PoisonChecking/ub-checks.ll	137 lines

Diff 208766

llvm/trunk/include/llvm/Transforms/Instrumentation/PoisonChecking.h

//===- PoisonChecking.h - ---------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//


#ifndef LLVM_TRANSFORMS_INSTRUMENTATION_POISON_CHECKING_H
#define LLVM_TRANSFORMS_INSTRUMENTATION_POISON_CHECKING_H

#include "llvm/IR/PassManager.h"

namespace llvm {

struct PoisonCheckingPass : public PassInfoMixin<PoisonCheckingPass> {
  PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
};

}


#endif // LLVM_TRANSFORMS_INSTRUMENTATION_POISON_CHECKING_H

llvm/trunk/lib/IR/Instruction.cpp

Excerpt from void Instruction::dropPoisonGeneratingFlags() onward; surrounding lines are not shown, and the line marked + is added.

    case Instruction::LShr:
      cast<PossiblyExactOperator>(this)->setIsExact(false);
      break;

    case Instruction::GetElementPtr:
      cast<GetElementPtrInst>(this)->setIsInBounds(false);
      break;
    }
+   // TODO: FastMathFlags!
  }


  bool Instruction::isExact() const {
    return cast<PossiblyExactOperator>(this)->isExact();
  }

  void Instruction::setFast(bool B) {
    assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
    cast<FPMathOperator>(this)->setFast(B);
  }

llvm/trunk/lib/Passes/PassBuilder.cpp

Surrounding lines are not shown; the line marked + is added.

  #include "llvm/Transforms/Instrumentation/CGProfile.h"
  #include "llvm/Transforms/Instrumentation/ControlHeightReduction.h"
  #include "llvm/Transforms/Instrumentation/GCOVProfiler.h"
  #include "llvm/Transforms/Instrumentation/HWAddressSanitizer.h"
  #include "llvm/Transforms/Instrumentation/InstrOrderFile.h"
  #include "llvm/Transforms/Instrumentation/InstrProfiling.h"
  #include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
  #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
+ #include "llvm/Transforms/Instrumentation/PoisonChecking.h"
  #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
  #include "llvm/Transforms/Scalar/ADCE.h"
  #include "llvm/Transforms/Scalar/AlignmentFromAssumptions.h"
  #include "llvm/Transforms/Scalar/BDCE.h"
  #include "llvm/Transforms/Scalar/CallSiteSplitting.h"
  #include "llvm/Transforms/Scalar/ConstantHoisting.h"
  #include "llvm/Transforms/Scalar/CorrelatedValuePropagation.h"
  #include "llvm/Transforms/Scalar/DCE.h"

llvm/trunk/lib/Passes/PassRegistry.def

Surrounding lines are not shown; the line marked + is added.

  MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())
  MODULE_PASS("sample-profile", SampleProfileLoaderPass())
  MODULE_PASS("strip-dead-prototypes", StripDeadPrototypesPass())
  MODULE_PASS("synthetic-counts-propagation", SyntheticCountsPropagation())
  MODULE_PASS("wholeprogramdevirt", WholeProgramDevirtPass(nullptr, nullptr))
  MODULE_PASS("verify", VerifierPass())
  MODULE_PASS("asan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/false, false, true, false))
  MODULE_PASS("kasan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/true, false, true, false))
+ MODULE_PASS("poison-checking", PoisonCheckingPass())
  #undef MODULE_PASS

  #ifndef CGSCC_ANALYSIS
  #define CGSCC_ANALYSIS(NAME, CREATE_PASS)
  #endif
  CGSCC_ANALYSIS("no-op-cgscc", NoOpCGSCCAnalysis())
  CGSCC_ANALYSIS("fam-proxy", FunctionAnalysisManagerCGSCCProxy())
  CGSCC_ANALYSIS("pass-instrumentation", PassInstrumentationAnalysis(PIC))

llvm/trunk/lib/Transforms/Instrumentation/CMakeLists.txt

The line marked + is added.

  add_llvm_library(LLVMInstrumentation
    AddressSanitizer.cpp
    BoundsChecking.cpp
    CGProfile.cpp
    ControlHeightReduction.cpp
    DataFlowSanitizer.cpp
    GCOVProfiling.cpp
    MemorySanitizer.cpp
    IndirectCallPromotion.cpp
    Instrumentation.cpp
    InstrOrderFile.cpp
    InstrProfiling.cpp
    PGOInstrumentation.cpp
    PGOMemOPSizeOpt.cpp
+   PoisonChecking.cpp
    SanitizerCoverage.cpp
    ThreadSanitizer.cpp
    HWAddressSanitizer.cpp

    ADDITIONAL_HEADER_DIRS
    ${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms

    DEPENDS
    intrinsics_gen
    )

llvm/trunk/lib/Transforms/Instrumentation/PoisonChecking.cpp

				//===- PoisonChecking.cpp - -----------------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// Implements a transform pass which instruments IR such that poison semantics
				// are made explicit. That is, it provides a (possibly partial) executable
				// semantics for every instruction w.r.t. poison as specified in the LLVM
				// LangRef. There are obvious parallels to the sanitizer tools, but this pass
				// is focused purely on the semantics of LLVM IR, not any particular source
				// language. If you're looking for something to see if your C/C++ contains
				// UB, this is not it.
				//
				// The rewritten semantics of each instruction will include the following
				// components:
				//
				// 1) The original instruction, unmodified.
				// 2) A propagation rule which translates dynamic information about the poison
				// state of each input to whether the dynamic output of the instruction
				// produces poison.
				// 3) A flag validation rule which validates any poison producing flags on the
				// instruction itself (e.g. checks for overflow on nsw).
				// 4) A check rule which traps (to a handler function) if this instruction must
				// execute undefined behavior given the poison state of it's inputs.
				//
				// At the moment, the UB detection is done in a best effort manner; that is,
				// the resulting code may produce a false negative result (not report UB when
				// it actually exists according to the LangRef spec), but should never produce
				// a false positive (report UB where it doesn't exist). The intention is to
				// eventually support a "strict" mode which never dynamically reports a false
				// negative at the cost of rejecting some valid inputs to translation.
				//
				// Use cases for this pass include:
				// - Understanding (and testing!) the implications of the definition of poison
				// from the LangRef.
				// - Validating the output of an IR fuzzer to ensure that all programs produced
				// are well defined on the specific input used.
				// - Finding/confirming poison specific miscompiles by checking the poison
				// status of an input/IR pair is the same before and after an optimization
				// transform.
				// - Checking that a bugpoint reduction does not introduce UB which didn't
				// exist in the original program being reduced.
				//
				// The major sources of inaccuracy are currently:
				// - Most validation rules not yet implemented for instructions with poison
				// relevant flags. At the moment, only nsw/nuw on add/sub/mul are supported.
				// - UB which is control dependent on a branch on poison is not yet
				// reported. Currently, only data flow dependence is modeled.
				// - Poison which is propagated through memory is not modeled. As such,
				// storing poison to memory and then reloading it will cause a false negative
				// as we consider the reloaded value to not be poisoned.
				// - Poison propagation across function boundaries is not modeled. At the
				// moment, all arguments and return values are assumed not to be poison.
				// - Undef is not modeled. In particular, the optimizer's freedom to pick
				// concrete values for undef bits so as to maximize potential for producing
				// poison is not modeled.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Transforms/Instrumentation/PoisonChecking.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/MemoryBuiltins.h"
				#include "llvm/Analysis/ValueTracking.h"
				#include "llvm/IR/InstVisitor.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/PatternMatch.h"
				#include "llvm/Support/Debug.h"

				using namespace llvm;

				#define DEBUG_TYPE "poison-checking"

				static cl::opt<bool>
				LocalCheck("poison-checking-function-local",
				           cl::init(false),
				           cl::desc("Check that returns are non-poison (for testing)"));


				static bool isConstantFalse(Value* V) {
				  assert(V->getType()->isIntegerTy(1));
				  if (auto *CI = dyn_cast<ConstantInt>(V))
				    return CI->isZero();
				  return false;
				}

				static Value *buildOrChain(IRBuilder<> &B, ArrayRef<Value*> Ops) {
				  if (Ops.size() == 0)
				    return B.getFalse();
				  unsigned i = 0;
				  for (; i < Ops.size() && isConstantFalse(Ops[i]); i++) {}
				  if (i == Ops.size())
				    return B.getFalse();
				  Value *Accum = Ops[i++];
				  for (; i < Ops.size(); i++)
				    if (!isConstantFalse(Ops[i]))
				      Accum = B.CreateOr(Accum, Ops[i]);
				  return Accum;
				}

				static void generatePoisonChecksForBinOp(Instruction &I,
				                                         SmallVector<Value*, 2> &Checks) {
				  assert(isa<BinaryOperator>(I));

				  IRBuilder<> B(&I);
				  Value *LHS = I.getOperand(0);
				  Value *RHS = I.getOperand(1);
				  switch (I.getOpcode()) {
				  default:
				    return;
				  case Instruction::Add: {
				    if (I.hasNoSignedWrap()) {
				      auto *OverflowOp =
				        B.CreateBinaryIntrinsic(Intrinsic::sadd_with_overflow, LHS, RHS);
				      Checks.push_back(B.CreateExtractValue(OverflowOp, 1));
				    }
				    if (I.hasNoUnsignedWrap()) {
				      auto *OverflowOp =
				        B.CreateBinaryIntrinsic(Intrinsic::uadd_with_overflow, LHS, RHS);
				      Checks.push_back(B.CreateExtractValue(OverflowOp, 1));
				    }
				    break;
				  }
				  case Instruction::Sub: {
				    if (I.hasNoSignedWrap()) {
				      auto *OverflowOp =
				        B.CreateBinaryIntrinsic(Intrinsic::ssub_with_overflow, LHS, RHS);
				      Checks.push_back(B.CreateExtractValue(OverflowOp, 1));
				    }
				    if (I.hasNoUnsignedWrap()) {
				      auto *OverflowOp =
				        B.CreateBinaryIntrinsic(Intrinsic::usub_with_overflow, LHS, RHS);
				      Checks.push_back(B.CreateExtractValue(OverflowOp, 1));
				    }
				    break;
				  }
				  case Instruction::Mul: {
				    if (I.hasNoSignedWrap()) {
				      auto *OverflowOp =
				        B.CreateBinaryIntrinsic(Intrinsic::smul_with_overflow, LHS, RHS);
				      Checks.push_back(B.CreateExtractValue(OverflowOp, 1));
				    }
				    if (I.hasNoUnsignedWrap()) {
				      auto *OverflowOp =
				        B.CreateBinaryIntrinsic(Intrinsic::umul_with_overflow, LHS, RHS);
				      Checks.push_back(B.CreateExtractValue(OverflowOp, 1));
				    }
				    break;
				  }
				  };
				}

				static Value* generatePoisonChecks(Instruction &I) {
				  IRBuilder<> B(&I);
				  SmallVector<Value*, 2> Checks;
				  if (isa<BinaryOperator>(I))
				    generatePoisonChecksForBinOp(I, Checks);
				  return buildOrChain(B, Checks);
				}

				static Value *getPoisonFor(DenseMap<Value *, Value *> &ValToPoison, Value *V) {
				  auto Itr = ValToPoison.find(V);
				  if (Itr != ValToPoison.end())
				    return Itr->second;
				  if (isa<Constant>(V)) {
				    return ConstantInt::getFalse(V->getContext());
				  }
				  // Return false for unknown values - this implements a non-strict mode where
				  // unhandled IR constructs are simply considered to never produce poison. At
				  // some point in the future, we probably want a "strict mode" for testing if
				  // nothing else.
				  return ConstantInt::getFalse(V->getContext());
				}

				static void CreateAssert(IRBuilder<> &B, Value *Cond) {
				  assert(Cond->getType()->isIntegerTy(1));
				  if (auto *CI = dyn_cast<ConstantInt>(Cond))
				    if (CI->isAllOnesValue())
				      return;

				  Module *M = B.GetInsertBlock()->getModule();
				  M->getOrInsertFunction("__poison_checker_assert",
				                         Type::getVoidTy(M->getContext()),
				                         Type::getInt1Ty(M->getContext()));
				  Function *TrapFunc = M->getFunction("__poison_checker_assert");
				  B.CreateCall(TrapFunc, Cond);
				}

				static void CreateAssertNot(IRBuilder<> &B, Value *Cond) {
				  assert(Cond->getType()->isIntegerTy(1));
				  CreateAssert(B, B.CreateNot(Cond));
				}

				static bool rewrite(Function &F) {
				  auto * const Int1Ty = Type::getInt1Ty(F.getContext());

				  DenseMap<Value *, Value *> ValToPoison;

				  for (BasicBlock &BB : F)
				    for (auto I = BB.begin(); isa<PHINode>(&*I); I++) {
				      auto *OldPHI = cast<PHINode>(&*I);
				      auto *NewPHI = PHINode::Create(Int1Ty,
				                                     OldPHI->getNumIncomingValues());
				      for (unsigned i = 0; i < OldPHI->getNumIncomingValues(); i++)
				        NewPHI->addIncoming(UndefValue::get(Int1Ty),
				                            OldPHI->getIncomingBlock(i));
				      NewPHI->insertBefore(OldPHI);
				      ValToPoison[OldPHI] = NewPHI;
				    }

				  for (BasicBlock &BB : F)
				    for (Instruction &I : BB) {
				      if (isa<PHINode>(I)) continue;

				      IRBuilder<> B(cast<Instruction>(&I));
				      if (Value *Op = const_cast<Value*>(getGuaranteedNonFullPoisonOp(&I)))
				        CreateAssertNot(B, getPoisonFor(ValToPoison, Op));

				      if (LocalCheck)
				        if (auto *RI = dyn_cast<ReturnInst>(&I))
				          if (RI->getNumOperands() != 0) {
				            Value *Op = RI->getOperand(0);
				            CreateAssertNot(B, getPoisonFor(ValToPoison, Op));
				          }

				      SmallVector<Value*, 4> Checks;
				      if (propagatesFullPoison(&I))
				        for (Value *V : I.operands())
				          Checks.push_back(getPoisonFor(ValToPoison, V));

				      if (auto *Check = generatePoisonChecks(I))
				        Checks.push_back(Check);
				      ValToPoison[&I] = buildOrChain(B, Checks);
				    }

				  for (BasicBlock &BB : F)
				    for (auto I = BB.begin(); isa<PHINode>(&*I); I++) {
				      auto *OldPHI = cast<PHINode>(&*I);
				      if (!ValToPoison.count(OldPHI))
				        continue; // skip the newly inserted phis
				      auto *NewPHI = cast<PHINode>(ValToPoison[OldPHI]);
				      for (unsigned i = 0; i < OldPHI->getNumIncomingValues(); i++) {
				        auto *OldVal = OldPHI->getIncomingValue(i);
				        NewPHI->setIncomingValue(i, getPoisonFor(ValToPoison, OldVal));
				      }
				    }
				  return true;
				}


				PreservedAnalyses PoisonCheckingPass::run(Module &M,
				                                          ModuleAnalysisManager &AM) {
				  bool Changed = false;
				  for (auto &F : M)
				    Changed |= rewrite(F);

				  return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
				}

				PreservedAnalyses PoisonCheckingPass::run(Function &F,
				                                          FunctionAnalysisManager &AM) {
				  return rewrite(F) ? PreservedAnalyses::none() : PreservedAnalyses::all();
				}


				/* Major TODO Items:
				   - Control dependent poison UB
				   - Strict mode - (i.e. must analyze every operand)
				   - Poison through memory
				   - Function ABIs

				   Minor TODO items:
				   - Add propagation rules for and/or instructions
				   - Add hasPoisonFlags predicate to ValueTracking
				   - Add poison check rules for:
				     - exact flags, out of bounds operands
				     - inbounds (can't be strict due to unknown allocation sizes)
				     - fmf and fp casts
				 */
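
The four components listed in the header comment of PoisonChecking.cpp (the original operation, a propagation rule, a flag validation rule, and a check rule) can be modeled outside of LLVM in plain C++. The following is only a minimal sketch under stated assumptions, not the pass itself: `Tracked`, `orChain`, `addPlain`, `addNSW` and `modelSelfCheck` are hypothetical names that do not appear in the patch, and `__builtin_add_overflow` is a GCC/Clang builtin assumed to be available.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical model: a 32-bit value paired with a dynamic "is poison" bit,
// mirroring the i1 shadow value the pass tracks for each IR value.
struct Tracked {
  int32_t Val;
  bool Poison;
};

// Rough analogue of buildOrChain: poison if any contributing bit is set.
static bool orChain(std::initializer_list<bool> Bits) {
  bool Accum = false;
  for (bool B : Bits)
    Accum = Accum || B;
  return Accum;
}

// Plain `add`: no poison-producing flags, so only propagation applies.
static Tracked addPlain(Tracked A, Tracked B) {
  int32_t Res;
  (void)__builtin_add_overflow(A.Val, B.Val, &Res); // wrapped result
  return {Res, orChain({A.Poison, B.Poison})};
}

// `add nsw`: propagation plus flag validation (signed overflow is poison).
static Tracked addNSW(Tracked A, Tracked B) {
  int32_t Res;
  bool Overflow = __builtin_add_overflow(A.Val, B.Val, &Res);
  return {Res, orChain({A.Poison, B.Poison, Overflow})};
}

// Small self-check of the model (hypothetical helper).
static void modelSelfCheck() {
  Tracked One{1, false};
  Tracked Max{INT32_MAX, false};
  assert(!addNSW(One, One).Poison);                 // no overflow
  assert(addNSW(Max, One).Poison);                  // flag validation fires
  assert(addPlain(addNSW(Max, One), One).Poison);   // poison propagates
}
```

The check rule is deliberately left out of this model; in the pass it is the call to `__poison_checker_assert` emitted in front of instructions that would execute UB on a poison operand.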

llvm/trunk/test/Instrumentation/PoisonChecking/basic-flag-validation.ll

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -passes=poison-checking -S -poison-checking-function-local < %s | FileCheck %s

				; This file contains tests to exercise the custom flag validation rules

				define i32 @add_noflags(i32 %a, i32 %b) {
				; CHECK-LABEL: @add_noflags(
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = add i32 %a, %b
				ret i32 %res
				}

				define i32 @add_nsw(i32 %a, i32 %b) {
				; CHECK-LABEL: @add_nsw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[RES:%.*]] = add nsw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = add nsw i32 %a, %b
				ret i32 %res
				}

				define i32 @add_nuw(i32 %a, i32 %b) {
				; CHECK-LABEL: @add_nuw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[RES:%.*]] = add nuw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = add nuw i32 %a, %b
				ret i32 %res
				}

				define i32 @add_nsw_nuw(i32 %a, i32 %b) {
				; CHECK-LABEL: @add_nsw_nuw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[TMP3:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[A]], i32 [[B]])
				; CHECK-NEXT: [[TMP4:%.*]] = extractvalue { i32, i1 } [[TMP3]], 1
				; CHECK-NEXT: [[TMP5:%.*]] = or i1 [[TMP2]], [[TMP4]]
				; CHECK-NEXT: [[RES:%.*]] = add nuw nsw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP6:%.*]] = xor i1 [[TMP5]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP6]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = add nsw nuw i32 %a, %b
				ret i32 %res
				}

				define i32 @sub_noflags(i32 %a, i32 %b) {
				; CHECK-LABEL: @sub_noflags(
				; CHECK-NEXT: [[RES:%.*]] = sub i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = sub i32 %a, %b
				ret i32 %res
				}

				define i32 @sub_nsw(i32 %a, i32 %b) {
				; CHECK-LABEL: @sub_nsw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[RES:%.*]] = sub nsw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = sub nsw i32 %a, %b
				ret i32 %res
				}

				define i32 @sub_nuw(i32 %a, i32 %b) {
				; CHECK-LABEL: @sub_nuw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.usub.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[RES:%.*]] = sub nuw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = sub nuw i32 %a, %b
				ret i32 %res
				}

				define i32 @sub_nsw_nuw(i32 %a, i32 %b) {
				; CHECK-LABEL: @sub_nsw_nuw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[TMP3:%.*]] = call { i32, i1 } @llvm.usub.with.overflow.i32(i32 [[A]], i32 [[B]])
				; CHECK-NEXT: [[TMP4:%.*]] = extractvalue { i32, i1 } [[TMP3]], 1
				; CHECK-NEXT: [[TMP5:%.*]] = or i1 [[TMP2]], [[TMP4]]
				; CHECK-NEXT: [[RES:%.*]] = sub nuw nsw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP6:%.*]] = xor i1 [[TMP5]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP6]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = sub nsw nuw i32 %a, %b
				ret i32 %res
				}

				define i32 @mul_noflags(i32 %a, i32 %b) {
				; CHECK-LABEL: @mul_noflags(
				; CHECK-NEXT: [[RES:%.*]] = mul i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = mul i32 %a, %b
				ret i32 %res
				}

				define i32 @mul_nsw(i32 %a, i32 %b) {
				; CHECK-LABEL: @mul_nsw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[RES:%.*]] = mul nsw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = mul nsw i32 %a, %b
				ret i32 %res
				}

				define i32 @mul_nuw(i32 %a, i32 %b) {
				; CHECK-LABEL: @mul_nuw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[RES:%.*]] = mul nuw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = mul nuw i32 %a, %b
				ret i32 %res
				}

				define i32 @mul_nsw_nuw(i32 %a, i32 %b) {
				; CHECK-LABEL: @mul_nsw_nuw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[TMP3:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 [[A]], i32 [[B]])
				; CHECK-NEXT: [[TMP4:%.*]] = extractvalue { i32, i1 } [[TMP3]], 1
				; CHECK-NEXT: [[TMP5:%.*]] = or i1 [[TMP2]], [[TMP4]]
				; CHECK-NEXT: [[RES:%.*]] = mul nuw nsw i32 [[A]], [[B]]
				; CHECK-NEXT: [[TMP6:%.*]] = xor i1 [[TMP5]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP6]])
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%res = mul nsw nuw i32 %a, %b
				ret i32 %res
				}
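
The instrumented IR in these tests calls `__poison_checker_assert`, which the pass only declares; a test harness that actually executes the output has to supply a definition. The sketch below shows one possible handler together with a hand-written C++ equivalent of the CHECK lines for `@add_nsw`. It rests on several assumptions: the `i1` argument is taken to map to `bool` on the target ABI, the handler counts failures instead of trapping so that it is easy to exercise, `FailedPoisonChecks` and `addNSWInstrumented` are hypothetical names, and `__builtin_add_overflow` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Number of failed checks observed. A real harness would more likely abort.
static unsigned FailedPoisonChecks = 0;

// Assumed C-level signature for `void @__poison_checker_assert(i1)`.
// The condition is true when the checked value is NOT poison.
extern "C" void __poison_checker_assert(bool Cond) {
  if (Cond)
    return;
  ++FailedPoisonChecks;
  std::fprintf(stderr, "poison check failed\n");
}

// Hand-written equivalent of the instrumented @add_nsw body.
static int32_t addNSWInstrumented(int32_t A, int32_t B) {
  int32_t Res;
  // llvm.sadd.with.overflow + extractvalue ..., 1
  bool Overflow = __builtin_add_overflow(A, B, &Res);
  // xor i1 ..., true followed by the call to the handler; emitted before
  // the `ret` because the test runs with -poison-checking-function-local.
  __poison_checker_assert(!Overflow);
  return Res;
}
```

With this handler, `addNSWInstrumented(1, 2)` returns 3 without recording a failure, while `addNSWInstrumented(INT32_MAX, 1)` records one.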

llvm/trunk/test/Instrumentation/PoisonChecking/ub-checks.ll

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -passes=poison-checking -S < %s | FileCheck %s

				; This file contains tests to exercise the UB triggering instructions with
				; a potential source of UB. The UB source is kept simple; we focus on the
				; UB triggering instructions here.

				define void @store(i8* %base, i32 %a) {
				; CHECK-LABEL: @store(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[A]], 1
				; CHECK-NEXT: [[P:%.*]] = getelementptr i8, i8* [[BASE:%.*]], i32 [[ADD]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: store i8 0, i8* [[P]]
				; CHECK-NEXT: ret void
				;
				%add = add nsw i32 %a, 1
				%p = getelementptr i8, i8* %base, i32 %add
				store i8 0, i8* %p
				ret void
				}

				define void @load(i8* %base, i32 %a) {
				; CHECK-LABEL: @load(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[A]], 1
				; CHECK-NEXT: [[P:%.*]] = getelementptr i8, i8* [[BASE:%.*]], i32 [[ADD]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: [[TMP4:%.*]] = load volatile i8, i8* [[P]]
				; CHECK-NEXT: ret void
				;
				%add = add nsw i32 %a, 1
				%p = getelementptr i8, i8* %base, i32 %add
				load volatile i8, i8* %p
				ret void
				}

				define void @atomicrmw(i8* %base, i32 %a) {
				; CHECK-LABEL: @atomicrmw(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[A]], 1
				; CHECK-NEXT: [[P:%.*]] = getelementptr i8, i8* [[BASE:%.*]], i32 [[ADD]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: [[TMP4:%.*]] = atomicrmw add i8* [[P]], i8 1 seq_cst
				; CHECK-NEXT: ret void
				;
				%add = add nsw i32 %a, 1
				%p = getelementptr i8, i8* %base, i32 %add
				atomicrmw add i8* %p, i8 1 seq_cst
				ret void
				}

				define void @cmpxchg(i8* %base, i32 %a) {
				; CHECK-LABEL: @cmpxchg(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[A]], 1
				; CHECK-NEXT: [[P:%.*]] = getelementptr i8, i8* [[BASE:%.*]], i32 [[ADD]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: [[TMP4:%.*]] = cmpxchg i8* [[P]], i8 1, i8 0 seq_cst seq_cst
				; CHECK-NEXT: ret void
				;
				%add = add nsw i32 %a, 1
				%p = getelementptr i8, i8* %base, i32 %add
				cmpxchg i8* %p, i8 1, i8 0 seq_cst seq_cst
				ret void
				}

				define i32 @udiv(i8* %base, i32 %a) {
				; CHECK-LABEL: @udiv(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nuw i32 [[A]], 1
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: [[RES:%.*]] = udiv i32 2048, [[ADD]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%add = add nuw i32 %a, 1
				%res = udiv i32 2048, %add
				ret i32 %res
				}

				define i32 @sdiv(i8* %base, i32 %a) {
				; CHECK-LABEL: @sdiv(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nuw i32 [[A]], 1
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: [[RES:%.*]] = sdiv i32 2048, [[ADD]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%add = add nuw i32 %a, 1
				%res = sdiv i32 2048, %add
				ret i32 %res
				}

				define i32 @urem(i8* %base, i32 %a) {
				; CHECK-LABEL: @urem(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nuw i32 [[A]], 1
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: [[RES:%.*]] = urem i32 2048, [[ADD]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%add = add nuw i32 %a, 1
				%res = urem i32 2048, %add
				ret i32 %res
				}

				define i32 @srem(i8* %base, i32 %a) {
				; CHECK-LABEL: @srem(
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[A:%.*]], i32 1)
				; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { i32, i1 } [[TMP1]], 1
				; CHECK-NEXT: [[ADD:%.*]] = add nuw i32 [[A]], 1
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[TMP2]], true
				; CHECK-NEXT: call void @__poison_checker_assert(i1 [[TMP3]])
				; CHECK-NEXT: [[RES:%.*]] = srem i32 2048, [[ADD]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%add = add nuw i32 %a, 1
				%res = srem i32 2048, %add
				ret i32 %res
				}
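
The tests in ub-checks.ll all follow one pattern: a possibly poison value is produced by a flagged `add`, and the assert is placed immediately before the instruction that would execute UB if that value were poison (the address of a memory access, or a divisor). A rough C++ model of the `@udiv` case is sketched below. `CheckFailed`, `poisonCheckerAssertModel` and `udivModel` are hypothetical names, the model records the failure rather than trapping, and `__builtin_add_overflow` is a GCC/Clang builtin assumed to be available.

```cpp
#include <cassert>
#include <cstdint>

// Set when a check fails. The real handler is expected to trap instead.
static bool CheckFailed = false;

// Stand-in for __poison_checker_assert in this model.
static void poisonCheckerAssertModel(bool Cond) {
  if (!Cond)
    CheckFailed = true;
}

// Model of:  %add = add nuw i32 %a, 1
//            %res = udiv i32 2048, %add
static uint32_t udivModel(uint32_t A) {
  uint32_t Add;
  // Flag validation for nuw: unsigned overflow makes %add poison.
  bool AddIsPoison = __builtin_add_overflow(A, 1u, &Add);
  // Check rule: the divisor of udiv must not be poison.
  poisonCheckerAssertModel(!AddIsPoison);
  if (AddIsPoison)
    return 0; // stand-in for the trap; avoids dividing by the wrapped 0
  return 2048u / Add;
}
```

In this model the only input that makes `%add` poison is `UINT32_MAX`, which is also the only input for which the wrapped sum is zero, so the check fires exactly when the division would otherwise be by zero.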

Diff 208766