- __shfl_{up,down}* use unsigned int for the third parameter.
- Added [unsigned] long overloads for non-sync shuffles. This augments r319908, which added the long overload for sync shuffles.
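For illustration only, here is a host-side C++ sketch of the forwarding idea behind a long overload: pick the fixed-width type that matches sizeof(long) and delegate to the existing overload for that type. It is not the code from the Clang wrapper header. The names Warp, kWarpSize, shfl_down_impl and shfl_down_model are hypothetical, and the lane semantics (a lane keeps its own value when the source lane falls outside its width-sized segment) are a simplified model that assumes width is a nonzero power of two no larger than 32.

```cpp
#include <array>
#include <type_traits>

// Hypothetical host-side model: one array slot per lane of a 32-lane warp.
constexpr unsigned kWarpSize = 32;

template <typename T>
using Warp = std::array<T, kWarpSize>;

// Shared helper modelling a shuffle-down. The shift amount is unsigned int,
// mirroring the third-parameter type described in the summary.
template <typename T>
Warp<T> shfl_down_impl(const Warp<T>& vals, unsigned int delta, unsigned width) {
  Warp<T> out{};
  for (unsigned lane = 0; lane < kWarpSize; ++lane) {
    unsigned pos = lane % width;  // position inside this lane's segment
    // Read from lane + delta only if that stays inside the segment;
    // written as delta < width - pos to avoid unsigned wrap-around.
    out[lane] = (delta < width - pos) ? vals[lane + delta] : vals[lane];
  }
  return out;
}

// "Existing" overloads for the fixed-width types.
Warp<int> shfl_down_model(const Warp<int>& v, unsigned int delta,
                          unsigned width = kWarpSize) {
  return shfl_down_impl(v, delta, width);
}

Warp<long long> shfl_down_model(const Warp<long long>& v, unsigned int delta,
                                unsigned width = kWarpSize) {
  return shfl_down_impl(v, delta, width);
}

// Added overload for long: forward to whichever existing overload has the
// same size as long on this platform, then convert back.
Warp<long> shfl_down_model(const Warp<long>& v, unsigned int delta,
                           unsigned width = kWarpSize) {
  static_assert(sizeof(long) == sizeof(long long) || sizeof(long) == sizeof(int),
                "long must match int or long long in size");
  using Wide = std::conditional_t<sizeof(long) == sizeof(long long), long long, int>;

  Warp<Wide> wide{};
  for (unsigned i = 0; i < kWarpSize; ++i) wide[i] = static_cast<Wide>(v[i]);

  Warp<Wide> shifted = shfl_down_model(wide, delta, width);

  Warp<long> out{};
  for (unsigned i = 0; i < kWarpSize; ++i) out[i] = static_cast<long>(shifted[i]);
  return out;
}
```

An unsigned long overload could follow the same pattern with unsigned int and unsigned long long as the forwarding targets.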
Details
Diff Detail
- Repository
- rC Clang
Event Timeline
Comment Actions
Since this is tricky and we've seen it affecting user code, do you think it's a bad idea to add tests to the test-suite?
Comment Actions
Added to my todo list. There are a few more gaps that I want to test to make sure we don't regress on compatibility with older CUDA versions while changing these wrappers.