Page MenuHomePhabricator

[CUDA] Implemented __nvvm_atom_*_gen_* builtins.

Authored by tra on Jun 23 2015, 11:20 AM.



Implemented __nvvm_atom_*_gen_* builtins.

Integer variants are implmented as atomicrmw or cmpxchg instructions.

Atomic add for floating point (__nvvm_atom_add_gen_f()) is implemented as a call to an overloaded LVVM intrinsic.

Diff Detail


Event Timeline

tra updated this revision to Diff 28268.Jun 23 2015, 11:20 AM
tra retitled this revision from to [CUDA] Implemented __nvvm_atom_*_gen_* builtins..
tra updated this object.
tra edited the test plan for this revision. (Show Details)
tra added reviewers: jholewinski, eliben, echristo.
tra added a subscriber: Unknown Object (MLST).
eliben added inline comments.Jun 24 2015, 8:08 AM
112 ↗(On Diff #28268)

Don't really need the Result temporary here?

118 ↗(On Diff #28268)

You can just "return RValue::get(MakeBinary...)" -- similarly to how it's done below, for consistency

165 ↗(On Diff #28268)

Please document all function parameters in the comment (especially ReturnBool)

174 ↗(On Diff #28268)

Would universal initialization be simpler here?

Value *Args[3] = {CGF.Builder.CreateBi......, CGF.EmitScalar...}

6865 ↗(On Diff #28268)

I'm wondering if there's some unwritten rule saying that all target builtins should be crammed into a single file... It's closing in on 7KLOC now and no end in sight. Would it be very bad for NVPTX to have its own CGBuiltinNVPTX or something like that? Clang splits classes to multiple files (Sema, CodeGenFunction, etc) already...


tra updated this revision to Diff 28371.Jun 24 2015, 11:12 AM

Addressed Eli's comments.

112 ↗(On Diff #28268)


118 ↗(On Diff #28268)


165 ↗(On Diff #28268)


174 ↗(On Diff #28268)

Args[1] creation is a two-step operation which gets in a way.

eliben added inline comments.Jun 24 2015, 3:46 PM
174 ↗(On Diff #28268)

You could obtain the type from Args[1] through the cast (getSrcTy) and avoid the awkward repeated assignment to Args[1].

But this isn't super important - up to you.

eliben accepted this revision.Jun 24 2015, 3:50 PM
eliben edited edge metadata.


I'd move the NVPTX builtin emission code to its own file unless others have strong objections, but looks good otherwise.

190 ↗(On Diff #28371)

I would just "return EmitFromInt" and "return CreateZExt" inside the if() and else, and avoid this temporary. Then the first builder call in each clause can be assigned to a more meaningful name ("Result" is very generic).

This revision is now accepted and ready to land.Jun 24 2015, 3:50 PM
jholewinski accepted this revision.Jun 25 2015, 5:37 AM
jholewinski edited edge metadata.
tra updated this revision to Diff 28482.Jun 25 2015, 11:25 AM
tra edited edge metadata.

Removed temp variable.

tra added inline comments.Jun 25 2015, 11:27 AM
190 ↗(On Diff #28371)

I've removed temp variable altogether and added comments explaining what the code does.

This revision was automatically updated to reflect the committed changes.