This is an archive of the discontinued LLVM Phabricator instance.

Reassociate: Fix miscompile for va_arg arguments.
ClosedPublic

Authored by MatzeB on Jul 28 2015, 3:00 PM.

Details

Summary

isUnmovableInstruction() had a list of instructions hardcoded which are
considered unmovable. The list lacked (at least) an entry for the va_arg
and cmpxchg instructions.
Fix this by using !Instruction::mayRead() && isSafeToSpeculativelyExecute() instead of maintaining another
instruction list.

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB updated this revision to Diff 30860.Jul 28 2015, 3:00 PM
MatzeB retitled this revision from to Reassociate: Implement isUnmovableInstruction() without a hardcoded list.
MatzeB updated this object.
MatzeB added reviewers: majnemer, mcrosier, baldrick.
MatzeB set the repository for this revision to rL LLVM.
MatzeB added a subscriber: llvm-commits.

Note that this change takes the [USF]Div and [USF]Rem instructions out of the list, with llvms semantics they do not produce side effects (but just a poison value) so this should be fine. I'm also running the llvm-testsuite now to get some more testing for this change.

Note that this change takes the [USF]Div and [USF]Rem instructions out of the list, with llvms semantics they do not produce side effects (but just a poison value) so this should be fine. I'm also running the llvm-testsuite now to get some more testing for this change.

Integer div by zero will still trap so div and rem cannot be moved freely between basic blocks without changing behavior. I'm not sure if Reassociate performs movements that could move instructions into a different path in the CFG, but if it does we'll need something stronger like isSafeToSpeculativelyExecute.

MatzeB updated this revision to Diff 30874.Jul 28 2015, 4:33 PM
MatzeB updated this object.

Updated the patch to use a !I->mayRead() && isSafeToSpeculativelyExecute(I) to determine whether we can move an instruction around.

Note that this change takes the [USF]Div and [USF]Rem instructions out of the list, with llvms semantics they do not produce side effects (but just a poison value) so this should be fine. I'm also running the llvm-testsuite now to get some more testing for this change.

Integer div by zero will still trap so div and rem cannot be moved freely between basic blocks without changing behavior. I'm not sure if Reassociate performs movements that could move instructions into a different path in the CFG, but if it does we'll need something stronger like isSafeToSpeculativelyExecute.

From what I can see it does not move the instructions into different CFG paths but just inside a basic block, but that may already fail if you reorder an instruction potentially triggering undefined behaviour and a call I think, so switching to isSafeToSpeculativelyExecute() is the right call.

majnemer added inline comments.Jul 28 2015, 7:03 PM
lib/Transforms/Scalar/Reassociate.cpp
278

Would it make sense to extract this logic into a hypothetical mayBeControlDependent ?

MatzeB updated this revision to Diff 30932.Jul 29 2015, 11:27 AM

New revision, introducing a mayBeControlDependent() function.

MatzeB updated this revision to Diff 30942.Jul 29 2015, 1:45 PM

Let's call the function mayBeStateDependent() as classical definitions of control dependence are more concerned with control flow while this function is mostly concerned with global state and memory contents (of course the global state is *also* control dependent).

So, these are either data dependent, control dependent, or both :)
(IE that is the correct terminology)

So, these are either data dependent, control dependent, or both :)
(IE that is the correct terminology)

If you think in that dichotomy, then it is a data dependence. However usually data dependence talks about variables or registers while this function specifically is about the things not modeled as variables/ssa values. It is about the global state/memory, that's why I put the "State" in the name data dependence would be too broad.

????

I'm not thinking in that dichotomy, that is the generally dichotomy
of dependences in compilers :).

Data dependence talks about memory all the time, in compiler
schedulers, in loop data dependence, etc. Heck, even in SSA.
So I honestly do not grasp the distinction you are trying to draw here.

It happens that in LLVM, explicitly represented data dependences do
not make instructions inherently immovable. The IR explicitly
represents these things well enough that we don't pay a high cost for
figuring out safe move locations for those things.

However, memory data dependences are implicit (becuase we lack
something like memoryssa), and the cost of figuring out where the safe
move points are is not worth it.
(IE none of these instructions are actually immovable, it's just that
we don't want to pay the cost to figure out the place we could move
them to :P. For some of them, there is usually only one correct place
anyway, like phi nodes. )

Not to bikeshed this too hard, but state dependent means nothing to
me, and it doesn't actually give anyone any idea what the function
should return for what.

My previous comment was probably worded to strongly. What I meant was that a function name of "ControlDependent" or "DataDependent" would have been too broad. I just realize that your comment was probably about my "of course the global state is *also* control dependent" remark which I admit is confusing and only makes sense if you have things like the value state dependence graphs or program expression graphs in mind where all control dependences usually manifest themselfes as data dependences on the memory values.

I was going for the term state because we do not just capture the contents of the computer memory but also things like input/output or program abortion because of undefined behaviour. Anyway I'm completely fine with calling this memory dependence which is by far the more popular term.

MatzeB updated this revision to Diff 31039.Jul 30 2015, 9:33 AM
MatzeB retitled this revision from Reassociate: Implement isUnmovableInstruction() without a hardcoded list to Reassociate: Fix miscompile for va_arg arguments..

Uploaded a new version which calls the function mayBeMemoryDependent(). Change the title to make it clear that this is not just a cleanup.

mcrosier resigned from this revision.Aug 5 2015, 8:01 AM
mcrosier removed a reviewer: mcrosier.

LGTM, but I don't want to undercut Daniel or David, so I won't make my vote official.

qcolombet accepted this revision.Aug 6 2015, 11:53 AM
qcolombet added a reviewer: qcolombet.

Thanks.

Committed revision 244244.

This revision is now accepted and ready to land.Aug 6 2015, 11:53 AM
qcolombet closed this revision.Aug 6 2015, 11:53 AM