This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Make some instructions convergent
ClosedPublic

Authored by arsenm on May 9 2016, 9:19 AM.

Event Timeline

arsenm updated this revision to Diff 56587. May 9 2016, 9:19 AM
arsenm retitled this revision from to AMDGPU: Make some instructions convergent.
arsenm updated this object.
arsenm added a reviewer: tstellarAMD.
arsenm added a subscriber: llvm-commits.
nhaehnle accepted this revision. May 9 2016, 10:55 AM
nhaehnle added a reviewer: nhaehnle.
nhaehnle added a subscriber: nhaehnle.

LGTM

Out of curiosity, what does making v_readlane/v_writelane convergent fix? I thought they were independent of control flow...

This revision is now accepted and ready to land. May 9 2016, 10:55 AM

Now I'm not really sure about them. I was just thinking that any instruction that does any kind of cross-lane interaction would be convergent.

Assuming that they do what I imagine (read the corresponding register from a neighboring thread), they need to be convergent in order to ensure that the desired value hasn't been clobbered on the neighboring thread. Consider:

r0 = readlane(r0)
if (...) {
    // r0 unused, gets reused as scratch register
} else {
    // use r0
}

If we sink the readlane into the else block, things break. Imagine the scenario where thread 0 in the wavefront takes the if branch, but thread 1 takes the else branch. Because they execute in lockstep, thread 0's code will execute first and clobber the value in its r0. By the time thread 1 gets to run the sunk readlane, the proper value is no longer available in the neighbor's r0.
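The scenario can be sketched with a toy simulation. This is only an illustration under stated assumptions, not the real AMDGPU semantics: `read_neighbor`, `run`, and `SCRATCH` are hypothetical names, the cross-lane read is modeled as "read r0 of the next lane", and divergent control flow is modeled as the if side running to completion before the else side.

```python
# Toy model of a wavefront whose lanes execute in lockstep.
# Illustrative only: names and semantics here are assumptions, not the AMDGPU ISA.

SCRATCH = -1  # hypothetical value written when r0 is reused as a scratch register


def read_neighbor(r0, lane):
    """Hypothetical cross-lane read: return r0 of the next lane (wrapping around)."""
    return r0[(lane + 1) % len(r0)]


def run(initial_r0, takes_if, sink_readlane):
    """Return {lane: value of r0 used in the else block} for lanes taking else."""
    r0 = list(initial_r0)
    lanes = range(len(r0))

    if not sink_readlane:
        # Original placement: all lanes read their neighbor before the branch.
        snapshot = list(r0)
        for lane in lanes:
            r0[lane] = read_neighbor(snapshot, lane)

    # The if side runs first for the lanes that take it; r0 is clobbered there.
    for lane in lanes:
        if takes_if[lane]:
            r0[lane] = SCRATCH

    # The else side runs afterwards for the remaining lanes.
    if sink_readlane:
        # Sunk placement: the read now sees r0 values clobbered by the if side.
        snapshot = list(r0)
        for lane in lanes:
            if not takes_if[lane]:
                r0[lane] = read_neighbor(snapshot, lane)

    return {lane: r0[lane] for lane in lanes if not takes_if[lane]}


if __name__ == "__main__":
    # Lane 0 takes the if branch, lane 1 takes the else branch.
    print(run([10, 20], [True, False], sink_readlane=False))
    print(run([10, 20], [True, False], sink_readlane=True))
```

With the read before the branch, lane 1 sees lane 0's original value; with the read sunk into the else block, it sees the scratch value that lane 0 wrote while executing the if side.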

arsenm closed this revision. May 10 2016, 5:38 PM

r269147