Demotes may be used to explicitly create helper invocations.
These helper invocations may be intend to have observable effects
in WQM, e.g. in fragment shader subgroup operations.
Facilitate this behaviour by forcing softwqm operations to be run
in WQM when demotes are present in a shader.
Conversely this allows such operations to be marked softwqm so
helper lanes are only enabled if demotes are present.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The demote itself does not need to enable wqm. So I think soft_wqm should not depend on demote to enable wqm. If some graphics operation needs to run under wqm, it should directly use the non-soft version.
softwqm is used by our implementation of subgroup operations: if helper invocations are there, then subgroup operations must interact with them.
Helper invocations exist in one of two cases:
- There are "hardwqm" instructions (e.g. sampler).
- Invocations are explicitly demoted.
In the latter case, we need to make sure that the demoted invocations are still live for the purpose of feeding their values into those subgroup operations. Simply pretending that hardwqm instructions exists seems like a correct way to ensure that. Perhaps you have an alternative in mind @ruiling?
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.softwqm.ll | ||
---|---|---|
284–301 | Can you pre-submit this test with the incorrect code sequence (ideally, update_llc_test_checks?) and then update this review so that the diff is more clearly visible? |
There are two possible sources of helper invocations:
- A quad may have pixels not covered by current primitive
- An active invocation is demoted into helper invocation.
Normally we don't activate helper invocations to optimize for memory bandwidth/power. But if graphics specification require implementation has defined behavior that need helper invocation's participation for certain shader operations, then we have to turn on the helper invocations, we need to place a non-soft wqm. My understanding of the working logic is: it is the observer(image sample, derivative, subgroupQuad operation) to decide whether we should turn on the helper invocations for a quad with at least one active invocation. My understanding of soft_wqm (a kind of weak observer) is let other strong observer to decide whether we should turn on wqm. I guess we are making this change to fix subgroupQuad*** issue, under which I believe we need a non-soft wqm. I did not follow all the details of how demote was implemented, so please correct me if I misread the change. But I still think that if there is no strong observer (hard-wqm), don't enable wqm. If an operation really need to be executed under wqm, then it should use hard-wqm.
I think there is perhaps another way to see this.
From the specification for SPV_EXT_demote_to_helper_invocation:
Demote fragment shader invocation to a helper invocation. [...]
The implementation may terminate helper invocations before the end of the shader as an optimization, but doing so must not affect derivative calculations and does not make control flow non-uniform.
One way to read this is that demote should always create helper lanes, i.e. demote should always run WQM.
Our implementation doesn't do this and instead executes demote as a kill outside WQM regions.
In sense this is actually fixing an issue with the demote implementation, rather than with softwqm.
I agree with the description of softwqm as a "weak observer", but demote is suppose to be a strong source of observable derivatives.
Having gone over the argument with @ruiling separately, I think he has a point. Have you double-checked whether this change is still required after the related LLPC changes have been made?
Can you pre-submit this test with the incorrect code sequence (ideally, update_llc_test_checks?) and then update this review so that the diff is more clearly visible?