Export instructions read from EXEC to determine what data to write if
the VM bit is on. If we inadvertently left WQM (or in the future, WWM)
on, we'd write pixels that we weren't supposed to.
Details
Diff Detail
- Build Status
Buildable 7820 Build 7820: arc lint + arc unit
Event Timeline
This isn't necessary. Exports can't enable pixels that weren't enabled in the first place.
How sure are you? I haven't found anything in the publically available documentation that would support this assertion, and experimentally it seems this isn't true. This patch (combined with D34847) fixes a fragment shader AMD_shader_ballot test I wrote. The test does addInvocationsNonUniform(1) and then compares the result with the number of active threads with ID less than or equal to the current thread in the subgroup. Due to the way addInvocationsNonUniform is implemented, inactive lanes should return the result of the closest active lane with a smaller thread ID, so the inactive lanes should have a result smaller than the test would expect (but that's ok, because they're inactive!). I wrote the test to report the expected and actual value for each thread that failed, so you would expect any inactive lanes that were accidentally written due to WWM staying enabled would result in pixels with the actual smaller than expected, which is exactly what happened without this patch. With this patch, the test was fixed.
After some more testing, it seems like my hypothesis about the issue was wrong. I think the real problem was that WWM wasn't being disabled in the face of control flow like I intended it to, and then stuff was getting messed up because inactive lanes were enabled when the code assumed they weren't. At least, that's my theory for now. I'm going to update D34717 to fix that.