The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.
Details
- Reviewers
• tstellarAMD nhaehnle
Diff Detail
Event Timeline
We should make sure to test this, because I recall there were some regressions fixed by doing this.
I'm pretty sure this can lead to problems with more complicated conditionals, or at least it once could have done so.
The comment was definitely wrong, v_cmp returns a mask that is & EXEC, but the concern I have is when the condition is some boolean combination of conditions instead of coming straight out of v_cmp. For example, we translate a logical negation to simply XOR with ~0 instead of XOR with EXEC (for good reasons, because the condition may have been computed in a different basic block, but still...). In theory negations can always be folded away by a compination of DeMorgan and inverting the v_cmps, but I'm not sure that that always happens.
I saw no piglit regressions on Hawaii. I don't think I've seen sc add anything like this before
Sorry for the delay. Now that I'm thinking about this more carefully it's probably okay since it's only for uniform control flow. It might randomly fix or break some of those EXEC = 0 cases, but that's not really relevant for this patch.