This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Remove unnecessary and on conditional branch
ClosedPublic

Authored by arsenm on Nov 2 2016, 3:43 PM.

Details

Summary

The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.

Diff Detail

Event Timeline

arsenm updated this revision to Diff 76800.Nov 2 2016, 3:43 PM
arsenm retitled this revision from to AMDGPU: Remove unnecessary and on conditional branch.
arsenm updated this object.
arsenm added a subscriber: llvm-commits.
tstellarAMD accepted this revision.Nov 3 2016, 6:49 AM
tstellarAMD edited edge metadata.

We should make sure to test this, because I recall there were some regressions fixed by doing this.

This revision is now accepted and ready to land.Nov 3 2016, 6:49 AM
nhaehnle requested changes to this revision.Nov 3 2016, 7:17 AM
nhaehnle added a reviewer: nhaehnle.

I'm pretty sure this can lead to problems with more complicated conditionals, or at least it once could have done so.

The comment was definitely wrong, v_cmp returns a mask that is & EXEC, but the concern I have is when the condition is some boolean combination of conditions instead of coming straight out of v_cmp. For example, we translate a logical negation to simply XOR with ~0 instead of XOR with EXEC (for good reasons, because the condition may have been computed in a different basic block, but still...). In theory negations can always be folded away by a compination of DeMorgan and inverting the v_cmps, but I'm not sure that that always happens.

This revision now requires changes to proceed.Nov 3 2016, 7:17 AM
arsenm added a comment.Nov 3 2016, 9:12 AM

I saw no piglit regressions on Hawaii. I don't think I've seen sc add anything like this before

nhaehnle accepted this revision.Nov 7 2016, 3:13 AM
nhaehnle edited edge metadata.

Sorry for the delay. Now that I'm thinking about this more carefully it's probably okay since it's only for uniform control flow. It might randomly fix or break some of those EXEC = 0 cases, but that's not really relevant for this patch.

This revision is now accepted and ready to land.Nov 7 2016, 3:13 AM
arsenm closed this revision.Nov 7 2016, 11:19 AM

r286134