This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][Waitcnt] Add debug options
Closed, Public

Authored by msearles on Apr 20 2018, 9:54 AM.

Details

Summary
  • Add "amdgpu-waitcnt-forcezero" to force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
  • Add debug counters to control forced emission of s_waitcnt instrs. Debug counters:
      • si-insert-waitcnts-forceexp: force emit s_waitcnt expcnt(0) instrs
      • si-insert-waitcnts-forcevm: force emit s_waitcnt vmcnt(0) instrs
      • si-insert-waitcnts-forcelgkm: force emit s_waitcnt lgkmcnt(0) instrs
  • Add some debug statements
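
As a rough illustration of what these options are meant to do, here is a minimal standalone sketch. It is not the code from the patch and does not use any LLVM API: Waitcnt, DebugOptions, applyDebugOptions and formatWaitcnt are hypothetical names. It assumes each force counter zeroes the wait named in its suffix, and that forcezero only rewrites a waitcnt that would be emitted anyway.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of one s_waitcnt: -1 means "no wait needed" on that counter.
struct Waitcnt {
  int vm = -1;
  int exp = -1;
  int lgkm = -1;
};

// Hypothetical stand-ins for the debug options listed in the summary.
struct DebugOptions {
  bool forceZero = false; // models amdgpu-waitcnt-forcezero
  bool forceVm = false;   // models si-insert-waitcnts-forcevm
  bool forceExp = false;  // models si-insert-waitcnts-forceexp
  bool forceLgkm = false; // models si-insert-waitcnts-forcelgkm
};

// Apply the force options to the wait the pass would otherwise emit.
Waitcnt applyDebugOptions(Waitcnt w, const DebugOptions &opts) {
  // Each force counter requests a wait-for-zero on its own hardware counter.
  if (opts.forceVm)
    w.vm = 0;
  if (opts.forceExp)
    w.exp = 0;
  if (opts.forceLgkm)
    w.lgkm = 0;

  // Assumption: forcezero only affects a waitcnt that is emitted at all.
  const bool emitsAnything = w.vm >= 0 || w.exp >= 0 || w.lgkm >= 0;
  if (opts.forceZero && emitsAnything)
    w.vm = w.exp = w.lgkm = 0;
  return w;
}

// Render the wait as assembly-like text; an empty string means "emit nothing".
std::string formatWaitcnt(const Waitcnt &w) {
  std::string out;
  auto append = [&out](const char *name, int value) {
    if (value < 0)
      return; // no wait requested on this counter
    out += out.empty() ? "s_waitcnt " : " ";
    out += name;
    out += "(" + std::to_string(value) + ")";
  };
  append("vmcnt", w.vm);
  append("expcnt", w.exp);
  append("lgkmcnt", w.lgkm);
  return out;
}
```

In this model, a wait on vmcnt(3) with forceZero set is rendered as a wait for zero on all three counters, while a single force counter only adds a wait for zero on its own counter.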

Note that a variant of this patch was previously attempted but never landed successfully (it was committed, then reverted). I'm creating a new review to shake off the bad mojo of the prior attempt :)

Event Timeline

msearles created this revision. Apr 20 2018, 9:54 AM

This is hard to review because it contains a whole bunch of unrelated spelling changes. Do you think you could separate those out?

msearles updated this revision to Diff 143578. Apr 23 2018, 9:19 AM
msearles edited the summary of this revision.

To simplify review, removed the unrelated code cleanup (naming consistency: s/SWaitcnt/Waitcnt, s/WaitCnt/Waitcnt).

nhaehnle accepted this revision. Apr 24 2018, 12:37 AM

Thanks, LGTM.

This revision is now accepted and ready to land. Apr 24 2018, 12:37 AM
This revision was automatically updated to reflect the committed changes.