I've completed my audit of all the code that looks at noduplicate and
added handling of convergent where appropriate, so we no longer need
noduplicate on these intrinsics.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Friendly ping here. I think we're ready to go with this one. It's needed to enable inlining of functions that contain syncthreads() and unrolling of loops that contain syncthreads(), so it has a pretty substantial performance impact.
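For anyone skimming the thread, here is a minimal sketch of the kind of code this affects. The kernel and helper names (`rotate4`, `shift_left`) are made up for illustration and are not taken from the patch. With noduplicate on the barrier, the optimizer may not duplicate the call, which rules out unrolling the loop and inlining the helper at more than one call site. With convergent, duplication is allowed as long as the call is not made control-dependent on new conditions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the patch): a barrier wrapped in a device
// function. Inlining this at several call sites duplicates the barrier call.
__device__ float shift_left(float* tile, float v) {
    tile[threadIdx.x] = v;
    __syncthreads();  // every thread has written its slot
    float next = tile[(threadIdx.x + 1) % blockDim.x];
    __syncthreads();  // every thread has read before the next overwrite
    return next;
}

// Hypothetical kernel: a fixed-trip-count loop around the helper. Unrolling
// it requires copying the barrier calls once per iteration.
__global__ void rotate4(float* data) {
    extern __shared__ float tile[];
    float v = data[threadIdx.x];
#pragma unroll
    for (int i = 0; i < 4; ++i) {
        v = shift_left(tile, v);
    }
    data[threadIdx.x] = v;
}

int main() {
    const int n = 32;  // a single block, so one barrier scope covers all threads
    float host[n];
    for (int i = 0; i < n; ++i) host[i] = static_cast<float>(i);

    float* dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    rotate4<<<1, n, n * sizeof(float)>>>(dev);  // third argument sizes the shared tile
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("data[0] after 4 rotations: %g\n", host[0]);
    return 0;
}
```

Every thread in the block reaches each barrier under the same control flow here, so duplicating the calls by unrolling or inlining does not change which threads synchronize. That is the property convergent is meant to preserve.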