[OPENMP][NVPTX]Use __syncwarp() to reconverge the threads.
In CUDA 9.0 it is no longer guaranteed that the threads in a warp are
convergent. We need to use the __syncwarp() function to reconverge
the threads and to guarantee the memory ordering among the threads in
the warp.
This is the first patch to fix the problem with the test
libomptarget/deviceRTLs/nvptx/src/sync.cu on CUDA 9+.
This patch just replaces calls to the __shfl_sync() function with calls
to the __syncwarp() function where we need to reconverge the threads when
we modify the value of the parallel level counter.
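A minimal sketch of the pattern, not the code from this patch: one lane
updates a per-warp counter and __syncwarp() reconverges the warp around
the update so that every lane reads the new value. The names bump_level,
level, seen and count_threads_seeing_update are hypothetical, and the
kernel assumes a launch with full warps so that every lane named in the
mask reaches each barrier.

```cuda
#include <cuda_runtime.h>

constexpr int kWarpSize = 32;            // warp width on current NVIDIA GPUs
constexpr int kThreads = 2 * kWarpSize;  // two full warps in one block
constexpr int kWarps = kThreads / kWarpSize;

// Hypothetical stand-in for the runtime's per-warp parallel level counter.
__global__ void bump_level(unsigned char *level, unsigned char *seen) {
  const unsigned full_mask = 0xffffffffu;  // every lane of the warp takes part
  const int lane = threadIdx.x % kWarpSize;
  const int warp = threadIdx.x / kWarpSize;

  __syncwarp(full_mask);  // reconverge the warp before the update
  if (lane == 0)
    level[warp] += 1;     // a single lane modifies the counter
  __syncwarp(full_mask);  // reconverge and order the write before the reads

  seen[threadIdx.x] = level[warp];
}

// Returns how many threads observed the incremented counter, or -1 if
// CUDA is unavailable or a runtime call fails.
int count_threads_seeing_update() {
  unsigned char *level = nullptr, *seen = nullptr;
  if (cudaMalloc(&level, kWarps) != cudaSuccess) return -1;
  if (cudaMalloc(&seen, kThreads) != cudaSuccess) {
    cudaFree(level);
    return -1;
  }
  cudaMemset(level, 0, kWarps);
  cudaMemset(seen, 0, kThreads);

  bump_level<<<1, kThreads>>>(level, seen);

  // cudaMemcpy waits for the kernel and reports any launch error.
  unsigned char host_seen[kThreads];
  cudaError_t err =
      cudaMemcpy(host_seen, seen, kThreads, cudaMemcpyDeviceToHost);
  cudaFree(level);
  cudaFree(seen);
  if (err != cudaSuccess) return -1;

  int count = 0;
  for (int i = 0; i < kThreads; ++i)
    count += (host_seen[i] == 1);
  return count;
}
```

__syncwarp() is available starting with CUDA 9.0. Unlike __shfl_sync(),
it exchanges no data between lanes, so it expresses only the barrier and
memory ordering that the counter update needs.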
Subscribers: guansong, jfb, jdoerfert, caomhin, kkwli0, openmp-commits
Differential Revision: https://reviews.llvm.org/D65013