This is an archive of the discontinued LLVM Phabricator instance.

[ifcnv] Don't duplicate blocks that contain convergent instructions.
ClosedPublic

Authored by jlebar on Feb 22 2016, 11:16 AM.

Details

Summary

Also add a comment briefly explaining what ifcnv is.

It's unsafe to duplicate blocks that contain convergent instructions during ifcnv. See the patch for details.

Diff Detail

Event Timeline

jlebar updated this revision to Diff 48713.Feb 22 2016, 11:16 AM
jlebar retitled this revision from to [ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv..
jlebar updated this object.
jlebar added a reviewer: resistor.
jlebar added subscribers: echristo, tra, llvm-commits.

Friendly ping. Based on the discussion in D17430, this does seem confusing enough to be worth a comment.

jlebar added a reviewer: rnk.Mar 30 2016, 4:53 PM
rnk edited edge metadata.Mar 30 2016, 6:17 PM

I'm not sure I understand the comment. This transform is basically turning ifs into predicated instructions, right? And we're basically saying that a predicated convergent operation (threadsync or what have you) is the same as having the operation in a conditional basic block?

In D17518#387824, @rnk wrote:

I'm not sure I understand the comment. This transform is basically turning
ifs into predicated instructions, right? And we're basically saying that a
predicated convergent operation (threadsync or what have you) is the same as
having the operation in a conditional basic block?

Not quite. The simple case of converting

if (pred) __syncthreads();
==>
(predicated on pred) __syncthreads();

is trivially safe, like I think you're saying. But ifconv can also perform
transformations which duplicate the newly-predicated instruction. The question
is whether or not it's safe to duplicate a convergent instruction, in the
process of if-conversion.

A simple example is

   BB0
  /   \
 BB1  BB2
 |\_ _/
 | | |
 | TBB --> exit
 |
FBB

TBB forms ifconv's "simple" shape with its predecessors. Can we move TBB's
contents into BB1 and BB2 (as predicated instructions) if TBB contains a
convergent op?

I'd argued in D17430 that this was safe, but now I think it is not. If we
think concretely in terms of CUDA, a necessary condition for correctness is
that

  • if all threads in the warp executed TBB "together" (i.e., convergently) before the transformation,
  • then all threads in the warp must execute *the same copy* of TBB's instructions after the transformation.

But clearly that's not necessarily true in this example.

In terms of the LLVM spec, we say that you're not allowed to "add" a
control-flow dependency to a convergent op. In this case, if BB1 switches on
"cond", then the original CFG dependency expression for TBB is

(BB1 && cond) || BB2

But after predication, we have two copies of the convergent operation:

BB1 && cond
BB2

I think my mistake earlier was concluding from this that we "removed" CFG
dependencies, rather than adding them. But let me write this in an equivalent
way:

BB1 && cond && !BB2
BB2 && !BB1

Now it's suddenly clear that we added a CFG dependency to each of these new
instructions.

Put another way, there's a set of control flows which result in us reaching an
instruction. If I add a CFG dependency, I make that set *smaller*. That's
what we're not allowed to do with convergent operations, and that's what we're
doing here.

@resistor, I'm going to revert this back to what we originally had in D17430.
Reid, thank you for being confused by this. :)

jlebar updated this revision to Diff 52289.Mar 31 2016, 2:17 PM
jlebar edited edge metadata.

Go back to disallowing this conversion; see comment in code.

jlebar retitled this revision from [ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv. to [ifcnv] Don't duplicate blocks that contain convergent instructions..Mar 31 2016, 2:17 PM
jlebar updated this object.
hfinkel added inline comments.Apr 14 2016, 12:16 PM
lib/CodeGen/IfConversion.cpp
691

instr -> instruction

693

I understand (I think) what you're saying here, but everything from here down seems confusing. I think this is because 'cond' is not in the diagram. I think you can either:

  1. Put 'cond' in the diagram, and then say that you've gone from having no control dependencies on the convergent instruction to have a control dependency on 'cond' or '!cond' depending on into which block the instruction is duplicated.
  2. If I'm reading this correctly and BB1 and BB2 below indicate the control dependencies of their respective blocks, then you can do the same as above, but say that you've gone from having no relevant control dependencies to have a dependency 'BB1' or 'BB2'.

Feel free to tell me I'm missing something ;)

jlebar updated this revision to Diff 53776.Apr 14 2016, 1:16 PM
jlebar marked 2 inline comments as done.
jlebar updated this object.

Update per Hal's review.

Thank you for the review, Hal.

This is really pushing my ASCII art skills. lmk if this makes any more sense.

hfinkel accepted this revision.Apr 14 2016, 6:36 PM
hfinkel edited edge metadata.

Thank you for the review, Hal.

This is really pushing my ASCII art skills. lmk if this makes any more sense.

I like this much better, thanks! LGTM.

This revision is now accepted and ready to land.Apr 14 2016, 6:36 PM

Excellent. Thank you for the review!

This revision was automatically updated to reflect the committed changes.