This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Revise handling of preexisting waitcnt
ClosedPublic

Authored by kerbowa on Apr 11 2021, 9:28 PM.

Details

Summary

Preexisting waitcnt may not update the scoreboard if the instruction
being examined needed to wait on fewer counters than what was encoded in
the old waitcnt instruction. Fixing this results in the elimination of
some redudnat waitcnt.

These changes also enable combining consecutive waitcnt into a single
S_WAITCNT or S_WAITCNT_VSCNT instruction.

Additionally move the code earlier that inserts waitcnt at function
entry. This allows combining more waticnt in some tests.

Diff Detail

Event Timeline

kerbowa created this revision.Apr 11 2021, 9:28 PM
kerbowa requested review of this revision.Apr 11 2021, 9:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2021, 9:28 PM
t-tye added inline comments.Apr 12 2021, 12:17 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
803

It seems it could delete them if it can prove they have no effect on execution due to no memory operations existing between them and a previous waitcnt. But if did that it must respect the WB/INV instructions that also effect waitcnts.

kerbowa added inline comments.Apr 12 2021, 8:12 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
803

I'm planning to do this in a follow-up change.

rampitec added inline comments.Apr 12 2021, 11:23 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
814

isMetaInstruction() probably?

1158

We probably still want to support forced waitcounts for debugging purposes. A debug trap for memory violation usually happens at a wait count, so it is easier to debug with explicit waits for each memory access.

1601

Isn't it a separate change?

kerbowa added inline comments.Apr 12 2021, 11:54 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
814

thanks.

1158

I tried to make sure forced waitcnt still works.

1601

Sure, I can separate the changes.

kerbowa updated this revision to Diff 340876.Apr 27 2021, 9:22 AM

Use isMetaInstruction; move func entry waitcnt change to separate patch.

arsenm added inline comments.Apr 27 2021, 9:36 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
814

Can you add a test for kill being ignored? Those keep showing up as a problem

foad added a subscriber: foad.Apr 27 2021, 10:01 AM
rampitec accepted this revision.Apr 27 2021, 11:36 AM

LGTM. Update the commit message though to remove this: "Additionally move the code earlier that inserts waitcnt at function entry. This allows combining more waticnt in some tests."

This revision is now accepted and ready to land.Apr 27 2021, 11:36 AM
foad added inline comments.Apr 28 2021, 6:35 AM
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
725–726

This condition was already pretty obscure and inverting it has made it even more obscure. Instead, how about:

assert(LB <= UB); // does this even need an assert, or is it blatantly obvious?
if (Count >= UB - LB)
  ...
kerbowa updated this revision to Diff 342269.May 2 2021, 1:34 PM

Simplify conditional and add comment. Add test with KILL.

This revision was landed with ongoing or failed builds.May 5 2021, 5:21 PM
This revision was automatically updated to reflect the committed changes.