This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: If a store defines (alias) a load, it clobbers the load.
ClosedPublic

Authored by cfang on Dec 9 2020, 11:00 AM.

Details

Summary

If a store defines (must alias) a load, it clobbers the load.

Fixes: SWDEV-258915

Diff Detail

Event Timeline

cfang created this revision.Dec 9 2020, 11:00 AM
cfang requested review of this revision.Dec 9 2020, 11:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2020, 11:00 AM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm added inline comments.Dec 9 2020, 11:09 AM
llvm/lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp
115

Specifically checking for store instructions is going to be wrong since many other instruction types can also write to memory

cfang updated this revision to Diff 310598.Dec 9 2020, 11:51 AM

Change store to any instructions that may write to memory.

cfang marked an inline comment as done.Dec 9 2020, 9:55 PM
cfang added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp
115

Changed to mayWriteToMemory.

arsenm added inline comments.Dec 10 2020, 3:03 PM
llvm/test/CodeGen/AMDGPU/store-clobbers-load.ll
24

Should add some tests with exotic memory writes since this was bug before

cfang marked an inline comment as done.Dec 11 2020, 12:34 PM
cfang added inline comments.
llvm/test/CodeGen/AMDGPU/store-clobbers-load.ll
24

The memory dependence analysis will return isClobber (other than isDef) dependence for the "exotic memory writes".
While such tests maybe useful, they are beyond this patch.

cfang updated this revision to Diff 311319.Dec 11 2020, 2:27 PM

Add a new test case.

arsenm added inline comments.Dec 12 2020, 6:45 AM
llvm/lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp
115

Why does it need to check the instruction at all? Why isn't Q.isDef() sufficient?

cfang marked an inline comment as done.Dec 14 2020, 10:57 AM
cfang added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp
115

Q.isDef() does not mean there must be a dependence at all.
One example is AllocaInst, in which case isDef() is returned, but the load could be optimized to undef.
I think instructional check of mayWriteToMemory is necessary here.

arsenm accepted this revision.Dec 14 2020, 2:25 PM
arsenm added inline comments.
llvm/test/CodeGen/AMDGPU/store-clobbers-load.ll
16

Why are we bothering to annotate a private memory access in the first place?

This revision is now accepted and ready to land.Dec 14 2020, 2:25 PM
cfang marked 2 inline comments as done.Dec 14 2020, 3:40 PM
cfang added inline comments.
llvm/test/CodeGen/AMDGPU/store-clobbers-load.ll
16

This is a good question. Just found that the only place that uses amdgpu.noclobber metadata is
in SITargetLowering::LowerLOAD for global address space.

Should we limit the analysis only to global address space (I am not clear whether there will be future
usage of amdgpu.noclobber)?

This revision was automatically updated to reflect the committed changes.
cfang marked an inline comment as done.