Diff Detail
Event Timeline
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
278 | Is this needed only if the the instruction is a VMEM or FLAT instruction, not if a DS? It is only ensuring that the load has completed before doing the VMEM invalidate. |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
362 | Is this required if a DS memory operation? Seems it is only required if a VMEM or FLAT instruction to ensure it has completed before invalidating the cache. | |
409 | Is this required if a DS memory operation? Seems it is only required if a VMEM or FLAT instruction to ensure it has completed before invalidating the cache. |
test/CodeGen/AMDGPU/global_atomics.ll | ||
---|---|---|
1026–1027 | Why does this lose the glc bit? |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
515 ↗ | (On Diff #72013) | There's an SPseduoInst or something like that which will avoid needing to set any of these bits |
519 ↗ | (On Diff #72013) | I don't think this is necessary, only mayLoad and mayStore |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
195–196 | If there are no memory operands, this should still work and be handled as conservatively as possible |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
234 | I believe a waitcnt vmem(0) is required before the InsertBufferWbinvl1Vol to ensure any previous atomic load has completed that the fence will pair with to create a synchronizes-with relation. |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
515 ↗ | (On Diff #72013) | Already committed separately with comments taken care of. |
519 ↗ | (On Diff #72013) | Already committed separately with comments taken care of. |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
195–196 | Do you see an issue with this approach? | |
278 | As discussed, this will be done in separate change. | |
348 | As discussed, this will be done in separate change. | |
362 | As discussed, this will be done in separate change. | |
398 | As discussed, this will be done in separate change. | |
409 | As discussed, this will be done in separate change. | |
420 | As discussed, this will be done in separate change. | |
test/CodeGen/AMDGPU/global_atomics.ll | ||
1026–1027 | I think it does not need to be set in this case. |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
195–196 | Yes. If there are no memory operands, this could skip an atomic load. |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
327 | May be worth asserting that FailureOrdering is not AtomicOrdering::Release or AtomicOrdering::AcquireRelease as these are not allowed, and following code relies on that fact. | |
test/CodeGen/AMDGPU/flat_atomics.ll | ||
1001 | Curious why glc is no longer being checked for? |
One question, otherwise LGTM.
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
102 ↗ | (On Diff #107151) | Since a fence itself does not modify memory, and has no memory address operand, should it be marked maybeatomic? |
If there are no memory operands, this should still work and be handled as conservatively as possible