This is an archive of the discontinued LLVM Phabricator instance.

Add convergent-removing bits to FunctionAttrs pass.
ClosedPublic

Authored by jlebar on Feb 8 2016, 6:16 PM.

Details

Summary

Remove the convergent attribute on any functions which provably do not
contain or invoke any convergent functions.

After this change, we'll be able to modify clang to conservatively add
'convergent' to all functions when compiling CUDA.

Diff Detail

Event Timeline

jlebar updated this revision to Diff 47282.Feb 8 2016, 6:16 PM
jlebar retitled this revision from to Add convergent-removing bits to FunctionAttrs pass..
jlebar updated this object.
jlebar added reviewers: hfinkel, resistor, chandlerc, arsenm.
jlebar added subscribers: jhen, tra, mehdi_amini and 2 others.
mehdi_amini added inline comments.Feb 8 2016, 6:39 PM
lib/Transforms/IPO/FunctionAttrs.cpp
950

could also by a call to any_of?

954

Callee->isIntrinsic() should be redundant here (I mean !Callee->isIntrinsic() && Callee->isConvergent() should be impossible after the llvm::any_of loop above right?

jingyue added inline comments.Feb 8 2016, 7:26 PM
lib/Transforms/IPO/FunctionAttrs.cpp
934

Please comment on what it returns.

936

SCC may contain multiple functions. In that case, you need to consider all of them.

949

You can iterate through the call sites with CallGraphNode::iterator (http://llvm.org/docs/doxygen/html/CallGraph_8h_source.html#l00186).

mehdi_amini added inline comments.Feb 8 2016, 9:25 PM
lib/Transforms/IPO/FunctionAttrs.cpp
936

That's a good point: any reason not to take the SCCNodeSet and iterate over like addNonNullAttrs for instance?

jlebar updated this revision to Diff 47352.Feb 9 2016, 12:50 PM
jlebar marked 5 inline comments as done.

Address jingyue, joker.eph's comments.

Thank you for the reviews. I've updated the patch, please have a look when you have a chance.

lib/Transforms/IPO/FunctionAttrs.cpp
936

SCC may contain multiple functions. In that case, you need to consider all of them.

Thanks, this was quite broken.

any reason not to take the SCCNodeSet?

SCCNodeSet is the set of strongly-connected nodes -- aiui a set of mutually-recursive functions, excluding any OptimizeNone or external functions that may be present in the set.

Two problems with using this rather than SCC itself. First, I think we do want to consider OptimizeNone functions. At least, if Foo and Bar are convergent and mutually recursive and Bar is OptimizeNone, then Foo should remain convergent. (The original code didn't handle OptimizeNone correctly either; I've fixed this.)

Second, if we use this, we don't get the actual control-flow graph, which is contained in SCC. There appears to be nontrivial logic to e.g. devirtualize calls in there, so I don't think we want to replicate all that code when we iterate over the function looking for calls to convergent intrinsics.

949

I may be misunderstanding, but I think that's what we're doing in the loop above when we do llvm::any_of(*CGN) -- we're iterating over CallGraphNode::begin()/end(). The issue is that, as the CallGraphNode is constructed in CGPassManager::RefreshCallGraph, does not contain calls to intrinsic functions.

954

You're right, it wasn't necessary. However, it now is, because now we're ignoring calls to functions in the same SCC.

jingyue accepted this revision.Feb 9 2016, 1:20 PM
jingyue added a reviewer: jingyue.
jingyue added inline comments.
lib/Transforms/IPO/FunctionAttrs.cpp
949

Looks good now.

This revision is now accepted and ready to land.Feb 9 2016, 1:20 PM
mehdi_amini accepted this revision.Feb 9 2016, 1:21 PM
mehdi_amini added a reviewer: mehdi_amini.

LGTM.

lib/Transforms/IPO/FunctionAttrs.cpp
973

Suggestion: outline the loop body in a helper function canRemoveConvergent(CallGraphNode *CGN)

The loop becomes then: if(!llvm::all_of(SCC, canRemoveConvergent)) return false;

jlebar marked an inline comment as done.Feb 9 2016, 1:34 PM
jlebar added inline comments.
lib/Transforms/IPO/FunctionAttrs.cpp
973

I'm actually not wild about this because it's a bit misleading. Consider

void foo() { bar(); }
void bar() { __syncthreads(); foo(); }

canRemoveConvergent(foo) would return true, even though, because we cannot remove convergent from bar and foo and bar are in the same SCC, we cannot in fact remove convergent from foo.

mehdi_amini added inline comments.Feb 9 2016, 1:57 PM
lib/Transforms/IPO/FunctionAttrs.cpp
973

To be clear: I was suggesting a pure "syntactic" change no functional change intended. I agree that the naming I proposed isn't well suited though (considering your example)

This revision was automatically updated to reflect the committed changes.
jlebar marked 3 inline comments as done.