This is an archive of the discontinued LLVM Phabricator instance.

Add a transform pass to make the executable semantics of poison explicit in the IR
ClosedPublic

Authored by reames on Jul 4 2019, 11:46 AM.

Details

Summary

Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language.

The target audience for this tool is developers working on or targetting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program.

At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there's a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs involved UB being triggered by integer induction variables with nsw/nuw flags would be reported by the current code.

(See comment in PoisonChecking.cpp for full explanation and context)

Before this lands, it obviously needs a bunch of tests added, but for the moment, I wanted to collect feedback on the idea.

Diff Detail

Event Timeline

reames created this revision.Jul 4 2019, 11:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2019, 11:46 AM

I like the idea. I browsed through the code but I failed to see where the poison is exposed to the user, e.g. through an assert or printf call. (Polly has a neat printf builder if that is of interest.)

sanjoy added a comment.Jul 4 2019, 1:02 PM

I think this is an excellent idea!!

I personally won't have time to carefully review this & following patches anytime soon (I'm happy to have high level discussions though), but I'm hoping the other reviewers you and I added will have time.

lib/Transforms/Instrumentation/PoisonChecking.cpp
1

Maybe just call this PoisonSanitizer?

41

This will have false positives as long as undef is a thing right? Unless the instrumentation added by this pass will produce poison if *any* value of undef produces poison (naively it seems this would require calling into a SAT solver at runtime)?

reames marked 2 inline comments as done.Jul 4 2019, 3:33 PM

I like the idea. I browsed through the code but I failed to see where the poison is exposed to the user, e.g. through an assert or printf call. (Polly has a neat printf builder if that is of interest.)

At the moment, I have a placeholder trap(bool) signature used, but that's one of the areas which clearly needs work.

lib/Transforms/Instrumentation/PoisonChecking.cpp
1

I could easily be convinced, but since the existing convention is that sanatizers are compiler user tools, not compiler developer tools, I thought it might be better to pick a different name. Weakly held opinion, happy to go with whatever reviewers prefer.

41

I don't actually follow what you're trying to say. Can you maybe rephrase w/an example which concerns you?

For context, I'm focusing on poison for the moment, but I think a completely reasonable thing would be to have a parallel approach for undef (1 bit per bit of original value), and then cross propagate if needed. I have a much weaker understanding of the undef rules, so I started with poison since that seemed a bit more clearly specified and recently discussed.

aqjune added inline comments.Jul 4 2019, 5:08 PM
lib/Transforms/Instrumentation/PoisonChecking.cpp
105

This is great.
Is it assumed that poison is always value-wise?
In other words, if the result is 1, does it mean either:
(1) the value is always fully poison
(2) the value has at least one poison 'bit'?

In the future, there might exist several kinds of poison semantics that people would like to test (e.g. bitwise poison vs. valuewise poison) - so I think it is a good idea to describe which poison semantics this visitAdd is assuming.

124

How about merging the contents of visitSub and visitAdd as they are equivalent except the intrinsics name? It can be reused for visitMul as well.

reames marked 3 inline comments as done.Jul 4 2019, 5:21 PM
reames added inline comments.
lib/Transforms/Instrumentation/PoisonChecking.cpp
41

For context, the part which has me confused was your claim of a false positive. I definitely see how undef can cause a false negative if not accounted for in the poison tracking.

105

All of this is assuming a single poison bit since that is our current semantics. I'll leave it up to anyone wishing to evaluate alternate semantics to implement them.

124

I agree the code structure needs some work here. I'm experimenting with variations on that to see what looks least ugly and will upload a revised patch later tonight or tomorrow.

shchenz added a subscriber: shchenz.Jul 4 2019, 5:24 PM
reames updated this revision to Diff 208108.Jul 4 2019, 8:36 PM

Restructure code to be more readable.

reames updated this revision to Diff 208112.Jul 4 2019, 9:14 PM

Flesh out some test cases

reames updated this revision to Diff 208113.Jul 4 2019, 9:20 PM

Add a case of imprecision to the comments, and move away from terminology around false positive/false negative since that dependents heavily on the use case. (i.e. is finding more UB after a transform a false negative in the analysis of the previous IR, or a false positive in the comparison?)

fhahn added a subscriber: fhahn.Jul 5 2019, 4:05 AM
nlopes added a comment.Jul 5 2019, 6:49 AM

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.
This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I guess this is more a FYI. If you want to support a significant chunk of LLVM, it's going to be a lot of work. Alive2 already has ~1 year of development, and has already found a few bugs in LLVM.

sanjoy added inline comments.Jul 5 2019, 10:19 AM
lib/Transforms/Instrumentation/PoisonChecking.cpp
41

By "false positive" I meant "incorrectly concluding that an optimization is buggy because this pass fails to detect poison before the optimization but is able to detect poison after the optimization".

Say you have this program:

int c = INT_SMAX +nsw undef;

IIUC you'll instrument this to:

int c, bool ov = add_with_overflow(INT_SMAX, undef)
if (ov) trap();

which will for only some values of undef. In particular, it may be (depending on the phase of the moon) that running this program does not trap.

But a pass can legitimately transform the program to:

int c = INT_SMAX +nsw 1;  // fold undef to 1

which will deterministically trap after instrumentation.

So if we are unlucky it will "look like" the transformation has a bug -- before the xform the instrumented IR did not trap, but after the xform it traps.

reames marked an inline comment as done.Jul 5 2019, 4:28 PM

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I'd love to explore options for sharing the semantics here. What form does Alive2 express them in?

I guess this is more a FYI. If you want to support a significant chunk of LLVM, it's going to be a lot of work. Alive2 already has ~1 year of development, and has already found a few bugs in LLVM.

I'm not expecting this to ever get 100% coverage, but I do have a backlog of miscompiles I'm trying to reduce. :)

lib/Transforms/Instrumentation/PoisonChecking.cpp
41

Yep, completely agree. This is a false positive for the "find problematic transform" case, but a false negative for the "does this program execute UB" case. I'd added a comment about this to the source, and this is the case that made me realize the false negative/positive terminology was super confusing without being specific about the use case.

In other words, I'm okay with this for the moment. :) Future work may explore this further.

nlopes added a comment.Jul 6 2019, 3:35 AM

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)

This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I'd love to explore options for sharing the semantics here. What form does Alive2 express them in?

That's still a unsolved research problem. No one really knows how to share semantics still.
The semantics in Alive2 are written in C++, using an embedded expression language. While it is potentially possible to reuse that somewhere else, it isn't trivial. See e.g. the ir/instr.cpp file.

I'll just add that we (my students and I) are interested in making the UB semantics easily pluggable / switchable. They'll still (almost certainly) be in C++, but we'd like to factor these out so at least they're cleanly separated and easily swapped out.

reames added a comment.Jul 8 2019, 2:58 PM

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)

I finally got it working, required a couple changes to the CMakeFiles and an LD_PRELOAD (for unclear reasons). However, it doesn't look like the scope of the alive-tv tool is anywhere near wide enough for my purposes.

Correct me if I'm wrong, but it looks like it can't handle loops at all right? And use of any memory seams to trigger timeouts? (Even for trivially identical IR?) Just making sure there's no error between keyboard and chair. :)

The problems I'm looking at are definitely not single block problems, unfortunately.

Can I get an LGTM on this? It seems like folks are interested in the approach, and it would be much easier to iterate in tree if needed.

sanjoy accepted this revision.Jul 8 2019, 9:52 PM

I think Alive and this are complementary. Alive is useful in proving the correctness of _new_ transformations, whereas this pass should let us quickly figure out if a large unreduced application is misbehaving it branches on poison.

lib/Transforms/Instrumentation/PoisonChecking.cpp
92

This looks like it can live on IRBuilder.

96

Is all of this necessary? I'd expect one round of instsimplify to fix up this kind of stuff.

167

You can do a single iterator lookup here instead of two.

182

It seems like the trap function assert that the condition is false? If yes then it should probably be called something else I think (something like assert_is_false would be great).

227

Can't you directly do for (Value *V: I.operands())?

268

There is a PreservedAnalyses::none().

This revision is now accepted and ready to land.Jul 8 2019, 9:52 PM
reames marked 6 inline comments as done.Jul 9 2019, 11:41 AM
reames added inline comments.
lib/Transforms/Instrumentation/PoisonChecking.cpp
92

Agreed, plan to move it there once I reapply the split out fix which got reverted due to a couple of tests I didn't catch.

96

It makes the test much, much easier to read. :)

This revision was automatically updated to reflect the committed changes.

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)

I finally got it working, required a couple changes to the CMakeFiles and an LD_PRELOAD (for unclear reasons). However, it doesn't look like the scope of the alive-tv tool is anywhere near wide enough for my purposes.

Correct me if I'm wrong, but it looks like it can't handle loops at all right? And use of any memory seams to trigger timeouts? (Even for trivially identical IR?) Just making sure there's no error between keyboard and chair. :)

Cool!
Is true that Alive2 doesn't support loops yet; that's in the todo list. It can handle branches and Phi nodes well, though.
Memory is being implemented ATM. Proofs with undef are quite hard. We don't have a fast-path for identical IR yet, it always goes to the expensive proof algorithm.
You can increase the timeout. With, say, a 10 seconds timeout it's very unlikely there's a bug since the SMT solver would very likely find it in that time.

Please keep me in the loop for these bugs you are seeing. I'm happy to help debug and/or just being in the loop of what happened if this might be an issue with IR semantics or simply a misunderstanding in the semantics. Thank you!

reames added a comment.Jul 9 2019, 2:56 PM

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)

This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I'd love to explore options for sharing the semantics here. What form does Alive2 express them in?

That's still a unsolved research problem. No one really knows how to share semantics still.
The semantics in Alive2 are written in C++, using an embedded expression language. While it is potentially possible to reuse that somewhere else, it isn't trivial. See e.g. the ir/instr.cpp file.

One thought on sharing.

From what I can tell from a quick look at the code you mentioned, it looks like you're parsing IR into an expression language, then rewriting the expressions to propagate poison - in a fairly similar manner to this code, but over your expression language - and then translating that expression language to SMT. Is that a good high level summary?

If you used this framework (not necessarily the pass, but the utilities) to rewrite the IR so as to make poison semantics explicit before converting to your expression language, you might be able to factor out a good portion of the poison reasoning from Alive2. It'd be a fairly major design though, so not sure if that's worth it to you or not.

Just a shameless plug :)
We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for opt that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work. Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned? I'd very much like to play with this.

I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)

This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I'd love to explore options for sharing the semantics here. What form does Alive2 express them in?

That's still a unsolved research problem. No one really knows how to share semantics still.
The semantics in Alive2 are written in C++, using an embedded expression language. While it is potentially possible to reuse that somewhere else, it isn't trivial. See e.g. the ir/instr.cpp file.

One thought on sharing.

From what I can tell from a quick look at the code you mentioned, it looks like you're parsing IR into an expression language, then rewriting the expressions to propagate poison - in a fairly similar manner to this code, but over your expression language - and then translating that expression language to SMT. Is that a good high level summary?

It's like this:

  • LLVM IR -> Alive2 IR (close to LLVM, but smaller)
  • Alive2 IR -> SMT expressions. As you say, it describes the poison semantics in terms of SMT expressions. The underlying engine handles propagation of undef.
  • VcGen: combine SMT expressions to produce a theorem to be proved

If you used this framework (not necessarily the pass, but the utilities) to rewrite the IR so as to make poison semantics explicit before converting to your expression language, you might be able to factor out a good portion of the poison reasoning from Alive2. It'd be a fairly major design though, so not sure if that's worth it to you or not.

We could go back from SMT expressions to LLVM IR, that's not very difficult (for arithmetic and other simple things, at least). The complication is the vcgen bit. Doable, but it's quite some work.
I think sharing semantics is not an immediate goal for us. We can share high-level semantics through LangRef and fix it where needed.