This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Also consider global and scratch instructions when flushing vmcnt counter in loop preheader
ClosedPublic

Authored by rochauha on Apr 27 2023, 4:07 AM.

Diff Detail

Event Timeline

rochauha created this revision.Apr 27 2023, 4:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2023, 4:07 AM
rochauha requested review of this revision.Apr 27 2023, 4:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2023, 4:07 AM
foad added inline comments.Apr 27 2023, 4:53 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
1724–1725

My only slight concern is whether we should also accept FLAT instructions here? They update vmcnt but not only vmcnt. I'm not sure what the answer is.

bsaleil added inline comments.Apr 27 2023, 7:24 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
1724–1725

I think it is still better to flush vmcnt in this case.
With a flat load, we would have:

v0 = flat_load(...)
s_waitcnt vmcnt(0)
loop {
  ...
  s_waitcnt lgkmcnt(0)
  use(v0)
  ...
  store(...)
  ...
}

Which is better than having a s_waitcnt vmcnt in the loop. If the store is also a flat store, it may be worth flushing lgkmcnt too, but I don't know if this case is common or not.
Anyway, we should add tests cases for that. @rochauha, can you extend waitcnt-vmcnt-loop.mir with a minimal test case for global and flat instructions ?

foad added inline comments.Apr 27 2023, 7:30 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
1724–1725

I think it is still better to flush vmcnt in this case.

OK, so maybe we should test isVMEM || isFLAT here?

kerbowa added inline comments.Apr 27 2023, 8:14 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
1724–1725

@rochauha You could also try isFLAT+mayAccessVMEMThroughFlat

rochauha updated this revision to Diff 518137.Apr 29 2023, 12:33 AM
  • using SIInstrInfo::isFLAT(MI) && mayAccessVMEMThroughFlat(MI)
  • updated waitcnt-vmcnt-loop.mir with 2 tests -- 1 for global and 1 for flat memory instructions
  • updated tests affected by considering flat memory instructions
foad accepted this revision.May 2 2023, 7:56 AM

LGTM.

This revision is now accepted and ready to land.May 2 2023, 7:56 AM
rochauha updated this revision to Diff 519439.May 4 2023, 4:40 AM

Added a test case for scratch memory instructions too