Diff Detail
Event Timeline
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
277 | Is this needed only if the the instruction is a VMEM or FLAT instruction, not if a DS? It is only ensuring that the load has completed before doing the VMEM invalidate. |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
361 | Is this required if a DS memory operation? Seems it is only required if a VMEM or FLAT instruction to ensure it has completed before invalidating the cache. | |
408 | Is this required if a DS memory operation? Seems it is only required if a VMEM or FLAT instruction to ensure it has completed before invalidating the cache. |
test/CodeGen/AMDGPU/global_atomics.ll | ||
---|---|---|
1026 | Why does this lose the glc bit? |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
515 | There's an SPseduoInst or something like that which will avoid needing to set any of these bits | |
519 | I don't think this is necessary, only mayLoad and mayStore | |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
194–195 | If there are no memory operands, this should still work and be handled as conservatively as possible |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
233 | I believe a waitcnt vmem(0) is required before the InsertBufferWbinvl1Vol to ensure any previous atomic load has completed that the fence will pair with to create a synchronizes-with relation. |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
515 | Already committed separately with comments taken care of. | |
519 | Already committed separately with comments taken care of. | |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
194–195 | Do you see an issue with this approach? | |
277 | As discussed, this will be done in separate change. | |
347 | As discussed, this will be done in separate change. | |
361 | As discussed, this will be done in separate change. | |
397 | As discussed, this will be done in separate change. | |
408 | As discussed, this will be done in separate change. | |
419 | As discussed, this will be done in separate change. | |
test/CodeGen/AMDGPU/global_atomics.ll | ||
1026 | I think it does not need to be set in this case. |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
194–195 | Yes. If there are no memory operands, this could skip an atomic load. |
lib/Target/AMDGPU/SIMemoryLegalizer.cpp | ||
---|---|---|
328 | May be worth asserting that FailureOrdering is not AtomicOrdering::Release or AtomicOrdering::AcquireRelease as these are not allowed, and following code relies on that fact. | |
test/CodeGen/AMDGPU/flat_atomics.ll | ||
934–935 | Curious why glc is no longer being checked for? |
One question, otherwise LGTM.
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
323 | Since a fence itself does not modify memory, and has no memory address operand, should it be marked maybeatomic? |
Since a fence itself does not modify memory, and has no memory address operand, should it be marked maybeatomic?