MemorySSA considers any atomic a def to any operation it dominates
just like a barrier or fence. That is correct from memory state
perspective, but not required for the no-clobber metadata since
we are not using it for reordering. Skip such atomics during the
scan just like a barrier if it does not alias with the load.
Details
Diff Detail
Event Timeline
llvm/test/CodeGen/AMDGPU/noclobber-barrier.ll | ||
---|---|---|
601 | Yes, clobber_by_atomic_load in this file. |
llvm/lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp | ||
---|---|---|
125 | Typo unversal |
Fixed not calling getClobberingMemoryAccess, it was gathering more defs for inspection than needed. Added test for this case.
llvm/test/CodeGen/AMDGPU/noclobber-barrier.ll | ||
---|---|---|
623 | No volatile |
llvm/lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp | ||
---|---|---|
51 | Should add an addUsedIfAvailable call for AA here. But I think MemorySSA already requires it, so you could make it a hard requirement here too? |
MemorySSA considers any atomic a def to any operation it dominates
I guess this is because of the problem described by the TODO in MemorySSA::createNewAccess:
// The isOrdered check is used to ensure that volatiles end up as defs // (atomics end up as ModRef right now anyway). Until we separate the // ordering chain from the memory chain, this enables people to see at least // some relative ordering to volatiles. Note that getClobberingMemoryAccess // will still give an answer that bypasses other volatile loads. TODO: // Separate memory aliasing and ordering into two different chains so that // we can precisely represent both "what memory will this read/write/is // clobbered by" and "what instructions can I move this past".
?
Maybe a cleaner way to handle this would be to implement a new MemorySSAWalker subclass where getClobberingMemoryAccess(I) ignores ordering, and only returns a def that mayAlias I. But I suppose that's more work.
Should add an addUsedIfAvailable call for AA here. But I think MemorySSA already requires it, so you could make it a hard requirement here too?