Add the llvm.amdgcn.softwqm intrinsic, which behaves like llvm.amdgcn.wqm only if there is other WQM computation in the shader.
Details
Diff Detail
- Repository
- rL LLVM
- Build Status
Buildable 35461 Build 35460: arc lint + arc unit
Event Timeline
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
5955–5959 ↗ | (On Diff #210619) | Is there some reason you can't just handle this with an instruction pattern? |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
5955–5959 ↗ | (On Diff #210619) | For the same reason as llvm.amdgcn.wqm: we don't specify the input and output types. |
Have you checked that this actually fixes the reported CTS failure?
IIRC the CTS failure was essentially due to a shader of the form:
derivative calculation here
subgroup operation here
The derivative calculation enables WQM, but then we may leave WQM again for the subgroup operations, which is unexpected (since helper lanes are expected to participate). So softwqm needs to seed WQM requirements, but only if there is at least one hard WQM requirement in the shader.
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
5955–5959 ↗ | (On Diff #210619) | It's easier to directly select than to enumerate all the possible types. I would still expect all of these direct-to-machine-node intrinsics to be handled in AMDGPUISelDAGToDAG. |
Yes, with the associated (minimal) frontend changes this fixes the CTS failure.
I understand "seed requirements" to mean "for the whole shader"; on that reading, this code does what you expect.
If there are any hard WQM requirements for the shader, then all softwqm instructions (and their dependencies) are marked WQM.
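The rule described above can be sketched as a small standalone model. This is only an illustration of the stated behaviour, not the code in the patch: `Req`, `Inst`, and `markWQM` are hypothetical names, and instructions are modelled as a flat list with dependency indices rather than as real machine instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: these types are hypothetical and are not LLVM classes.
enum class Req { None, SoftWQM, HardWQM };

struct Inst {
  Req req;                       // what the instruction itself asks for
  std::vector<std::size_t> deps; // indices of the instructions it reads from
};

// Returns, for each instruction, whether it ends up marked WQM.
std::vector<bool> markWQM(const std::vector<Inst> &insts) {
  std::vector<bool> inWQM(insts.size(), false);

  // Step 1: is there at least one hard WQM requirement in the shader?
  bool anyHard = false;
  for (const Inst &inst : insts)
    anyHard = anyHard || inst.req == Req::HardWQM;

  // Step 2: hard requirements always seed the worklist; soft requirements
  // seed it only when a hard requirement exists somewhere.
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < insts.size(); ++i) {
    bool seed = insts[i].req == Req::HardWQM ||
                (anyHard && insts[i].req == Req::SoftWQM);
    if (seed)
      worklist.push_back(i);
  }

  // Step 3: propagate the marking to everything the seeds depend on.
  while (!worklist.empty()) {
    std::size_t i = worklist.back();
    worklist.pop_back();
    if (inWQM[i])
      continue; // already visited
    inWQM[i] = true;
    for (std::size_t d : insts[i].deps) {
      assert(d < insts.size() && "dependency index out of range");
      worklist.push_back(d);
    }
  }
  return inWQM;
}
```

In this model, a shader containing only soft requirements comes back with nothing marked, while adding a single hard requirement anywhere causes every soft-requirement instruction and its dependencies to be marked as well.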
Okay thanks, I see the logic now.
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
5955–5959 ↗ | (On Diff #210619) | You mean adding an AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN and lowering the softwqm intrinsic there? That does make sense to me. |
lib/Target/AMDGPU/SIInstructions.td | ||
114 | s/wcm/wqm/ |
I've moved the selection to AMDGPUISelDAGToDAG.
If this code is appropriate, I will submit a follow-up change to move the selection for llvm.amdgcn.wqm and llvm.amdgcn.wwm as well.
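As a rough illustration of why direct selection avoids enumerating types, the sketch below dispatches on the intrinsic ID alone and never inspects the value type. It is a standalone toy (C++17, for `std::optional`), not the LLVM API: `IntrinsicID`, `MachineOpcode`, `SoftWqmPseudo`, `Node`, and `selectIntrinsicWithoutChain` are all hypothetical names chosen for this example.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration; these are not LLVM's enums.
enum class IntrinsicID { Wqm, SoftWqm, Wwm, Other };
enum class MachineOpcode { SoftWqmPseudo };

struct Node {
  IntrinsicID id;
  std::string valueType; // carried along, but never consulted by selection
};

// Picks a machine opcode from the intrinsic ID only. Returning nullopt means
// "not handled here", so the node falls through to whatever handled it before.
std::optional<MachineOpcode> selectIntrinsicWithoutChain(const Node &n) {
  switch (n.id) {
  case IntrinsicID::SoftWqm:
    return MachineOpcode::SoftWqmPseudo;
  default:
    // wqm and wwm are deliberately left out, mirroring the plan to move
    // them in a follow-up change.
    return std::nullopt;
  }
}
```

Because the switch ignores `valueType`, one case covers every type the intrinsic can be instantiated with, whereas a pattern-based approach would need an entry per type.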