This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Skip waiting on lgkmcnt for global flat loads
Abandoned, Public

Authored by arsenm on May 23 2016, 9:18 PM.

Details

Reviewers
tstellarAMD
Summary

If we know the access isn't to a flat address,
the wait for LDS is not necessary.

Event Timeline

arsenm updated this revision to Diff 58192. May 23 2016, 9:18 PM
arsenm retitled this revision to AMDGPU: Skip waiting on lgkmcnt for global flat loads.
arsenm updated this object.
arsenm added a reviewer: tstellarAMD.
arsenm added a subscriber: llvm-commits.
lib/Target/AMDGPU/SIInsertWaits.cpp
221–225

I'm not really sure exactly what this is doing, but as long as this accounts for the fact that the hw LGKM counter is always incremented, even if the operation accesses global memory, then this is fine.

Though, I think you should add some tests that have lds operations before and after a flat instruction that accesses global memory.

arsenm added inline comments. May 24 2016, 5:39 PM
lib/Target/AMDGPU/SIInsertWaits.cpp
221–225

I don't think this is accounting for the hardware increase.

arsenm abandoned this revision. Feb 18 2020, 7:29 AM

This pass was replaced.

t-tye added inline comments. Feb 18 2020, 9:38 AM
lib/Target/AMDGPU/SIInsertWaits.cpp
221–225

But does the hardware increasing the LGKM counter matter? The hardware will increase it, then decrease it once it determines the FLAT address is targeting LDS. So all that can affect is another memory operation waiting on LGKM, causing it to wait a bit longer. It cannot make any other memory operation satisfy its WAITCNT early, so it cannot break correctness.

The completion of a FLAT operation that is known to only target VMEM only needs to wait on the vmem counter.
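
The rule in the comment above can be sketched as a small decision function. This is only an illustrative model, not code from SIInsertWaits.cpp or any LLVM pass: the names KnownTarget, CountersToWait, and countersForFlatAccess are hypothetical, and the conservative "wait on both counters" fallback for an unknown target is an assumption based on the hardware incrementing both counters for a FLAT operation.

```cpp
#include <cassert>

// Hypothetical model, not LLVM code: what the compiler knows about where a
// FLAT access can land.
enum class KnownTarget {
  Unknown,    // may resolve to LDS or to global memory
  GlobalOnly  // statically known to target global memory (VMEM) only
};

// Which hardware counters a wait for the access's completion must cover.
struct CountersToWait {
  bool VmCnt;
  bool LgkmCnt;
};

// Assumption: with an unknown target, wait on both counters to be safe.
// When the target is known to be global only, waiting on vmcnt suffices,
// as argued in the review comment.
CountersToWait countersForFlatAccess(KnownTarget Target) {
  if (Target == KnownTarget::GlobalOnly)
    return {true, false};
  return {true, true};
}
```

For a known-global access, countersForFlatAccess(KnownTarget::GlobalOnly) leaves LgkmCnt unset, so later LDS operations are not held up by the FLAT access.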