This is still broken on VI, like the other cases handled here.
These cases can't use the workaround of using flat instructions for global memory.
Details
- Reviewers
• tstellarAMD
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2531–2539 | The legalization is more complicated because we need to check the stride and tid_enable bits on the resource and compute: ptr = ptr + (stride * (index + tid)). Instead of lowering this, I think we should try harder to put the base pointer in SGPRs, though I'm not sure of a good way to do that. |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2531–2539 | Figuring out the TID here is a problem. If it were just the VGPR0 input, a copy of the input argument could be inserted in the entry block to use here, but then we don't know whether the y and z components need to be added, and those might not even be enabled as inputs yet at this point. |
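As a rough illustration of the address computation from the first inline comment, here is a minimal C++ sketch. The DecodedResource struct and the effectiveAddress function are hypothetical names used only for illustration and are not part of LLVM. The sketch assumes the stride and tid_enable fields have already been extracted from the resource, and that tid contributes only when tid_enable is set; it does not model the resource's bit layout or how tid would be obtained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, already-decoded view of the resource fields mentioned in
// the review comment. This is not an LLVM type and does not model the
// hardware bit layout.
struct DecodedResource {
  uint64_t Base;   // base pointer held in the resource
  uint32_t Stride; // stride field, assumed to be in bytes
  bool TidEnable;  // tid_enable bit
};

// Computes ptr + (stride * (index + tid)).
// Assumption: tid is added only when tid_enable is set.
uint64_t effectiveAddress(const DecodedResource &R, uint32_t Index,
                          uint32_t Tid) {
  uint64_t Element = static_cast<uint64_t>(Index) + (R.TidEnable ? Tid : 0u);
  return R.Base + static_cast<uint64_t>(R.Stride) * Element;
}
```

With a zero stride the result is just the base pointer, which is why the stride and tid_enable bits both have to be checked before any simpler lowering can be assumed.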