Details
Reviewers: tstellarAMD, nhaehnle
LGTM
Out of curiosity, what does making v_readlane/v_writelane convergent fix? I thought they were independent of control flow...
Now I'm not really sure about them. I was just thinking that any instruction that does any kind of cross-lane interaction would be convergent.
Assuming that they do what I imagine (read the corresponding register from a neighboring thread), they need to be convergent in order to ensure that the desired value hasn't been clobbered on the neighboring thread. Consider:
r0 = readlane(r0)
if (...) {
  // r0 unused, gets reused as scratch register
} else {
  // use r0
}
If we sink the readlane into the else block, things break. Imagine the scenario where thread 0 in the wavefront takes the if branch, but thread 1 takes the else branch. Because they execute in lockstep, with the two sides of the branch run one after the other, thread 0's code will execute first and clobber the value in its r0. By the time thread 1 gets to run the sunk readlane, the proper value is no longer available in the neighbor's r0.
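To make the scenario concrete, here is a toy model of it in Python. This is only a sketch under the same assumption as above (readlane copies r0 from a neighboring thread); it is not how the hardware or the backend is implemented. The two-lane wavefront, the `simulate` function, the `neighbor` mapping, and the `SCRATCH` value are all made up for illustration, and the model assumes the if side of the branch runs before the else side.

```python
SCRATCH = -1  # hypothetical marker for "r0 was reused as a scratch register"


def simulate(sink_readlane):
    """Run the example on a toy two-lane wavefront.

    Returns the r0 value that each else-branch lane ends up using.
    """
    r0 = [100, 200]            # per-lane copy of r0 (lane 0, lane 1)
    takes_if = [True, False]   # lane 0 takes the if branch, lane 1 the else
    neighbor = [1, 0]          # assumption: each lane reads the other lane

    def readlane(active_lanes):
        # All active lanes read their neighbor's r0 at the same instant.
        snapshot = list(r0)
        for lane in active_lanes:
            r0[lane] = snapshot[neighbor[lane]]

    all_lanes = [0, 1]
    if_lanes = [lane for lane in all_lanes if takes_if[lane]]
    else_lanes = [lane for lane in all_lanes if not takes_if[lane]]

    if not sink_readlane:
        # Original placement: before the branch, every lane is active.
        readlane(all_lanes)

    # The if side runs first and reuses r0 as scratch.
    for lane in if_lanes:
        r0[lane] = SCRATCH

    # The else side runs afterwards, with only its own lanes active.
    if sink_readlane:
        readlane(else_lanes)
    return {lane: r0[lane] for lane in else_lanes}


original = simulate(sink_readlane=False)
sunk = simulate(sink_readlane=True)
```

In this model `original` is `{1: 100}`, so lane 1 sees lane 0's initial r0, while `sunk` is `{1: -1}`, so lane 1 reads the scratch value that lane 0 wrote while executing the if side.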