Page MenuHomePhabricator

Mark all CUDA device-side function defs and decls as convergent.

Authored by jlebar on Feb 9 2016, 4:19 PM.



This is important for e.g. the following case:

void sync() { __syncthreads(); }
void foo() {

Without this change, if the optimizer does not inline sync() (which it
won't because syncthreads is also marked as noduplicate, for now
anyway), it is free to perform optimizations on sync() that it would not
be able to perform on
syncthreads(), because sync() is not marked as

This chagne is conservative; the optimizer will remove these attrs where
it can, see r260318, r260319.

Diff Detail


Event Timeline

jlebar updated this revision to Diff 47395.Feb 9 2016, 4:19 PM
jlebar retitled this revision from to Mark all CUDA device-side function defs and decls as convergent..
jlebar updated this object.
jlebar added a reviewer: majnemer.
jlebar added subscribers: tra, echristo, jhen, cfe-commits.
tra added inline comments.Feb 9 2016, 4:43 PM
1880 ↗(On Diff #47395)


4 ↗(On Diff #47395)

You may want to add -disable-llvm-passes to make sure you're checking results of front-end actions only.

jlebar updated this revision to Diff 47409.Feb 9 2016, 5:08 PM
jlebar marked 2 inline comments as done.

Update per tra's review.

jlebar added a reviewer: rnk.Feb 16 2016, 4:44 PM
majnemer edited edge metadata.Feb 17 2016, 11:02 AM
majnemer added a subscriber: majnemer.

SetLLVMFunctionAttributesForDefinition seems like the more obvious place
for this logic.

jlebar updated this revision to Diff 48255.Feb 17 2016, 4:08 PM
jlebar edited edge metadata.

Move code into SetLLVMFunctionAttributesForDefinition.

Move code into SetLLVMFunctionAttributesForDefinition.

Actually, this doesn't work -- we don't annotate

__host__ __device__ void baz();

as convergent. (I ran the tests, but of course I didn't notice it failing...)

jlebar updated this revision to Diff 48260.Feb 17 2016, 4:35 PM

Move coded into SetLLVMFunctionAttributes (not ForDefinition).

Move coded into SetLLVMFunctionAttributes (not ForDefinition).

Much better. :)

majnemer added inline comments.Feb 17 2016, 4:42 PM
817–823 ↗(On Diff #48260)

Sorry for the goose chase...

ConstructAttributeList seems better.

jlebar updated this revision to Diff 48261.Feb 17 2016, 4:53 PM

Move code into ConstructAttributeList. Now it applies to both functions and calls.

817–823 ↗(On Diff #48260)

Actually, this is much better; now I don't need D17313.

Friendly ping -- are we happy with this?

majnemer accepted this revision.Feb 24 2016, 1:54 PM
majnemer edited edge metadata.


This revision is now accepted and ready to land.Feb 24 2016, 1:54 PM
This revision was automatically updated to reflect the committed changes.