This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Handle waitcnt overflow
ClosedPublic

Authored by kerbowa on Nov 18 2019, 8:11 PM.

Details

Summary

The waitcnt pass can overflow the counters when the number of outstanding events
for a type exceeds the capacity of the counter. This can lead to inefficient
insertion of waitcnts, or to waitcnt instructions with max values for each type.
In the latter case the resulting instruction, when disassembled, appears to be
an illegal waitcnt without an operand.

In these cases we should add a wait for the 'counter maximum' - 1, and update the
waitcnt brackets accordingly.
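
The clamping described above can be illustrated with a minimal sketch. This is not the code from the patch: `CounterBracket`, its fields, and `clampedWait` are hypothetical names, and the way the lower bound is advanced is an assumption about what "update the waitcnt brackets accordingly" means.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of one counter's score bracket (not the LLVM data structure).
struct CounterBracket {
  unsigned LB;  // score below which all events are known to have completed
  unsigned UB;  // score of the most recently issued event
  unsigned Max; // largest value the hardware counter field can encode
};

// Returns the wait value to emit so that the event with score ScoreToWait
// has completed. The value is capped at Max - 1 so the emitted waitcnt never
// uses the maximum encoding, which would mean "no wait" for this counter.
unsigned clampedWait(CounterBracket &B, unsigned ScoreToWait) {
  assert(B.Max > 0 && ScoreToWait <= B.UB);
  unsigned Needed = std::min(B.UB - ScoreToWait, B.Max - 1);
  // Assumption: after waiting until at most Needed events remain outstanding,
  // everything with a score up to UB - Needed is complete.
  B.LB = std::max(B.LB, B.UB - Needed);
  return Needed;
}
```

With a 4-bit counter (`Max` of 15), 20 issued events, and a wait on the first one, the sketch returns 14 instead of 19 and advances the lower bound to 6.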

Event Timeline

kerbowa created this revision. Nov 18 2019, 8:11 PM
Herald added a project: Restricted Project. Nov 18 2019, 8:11 PM
This revision is now accepted and ready to land. Nov 18 2019, 8:25 PM
arsenm added inline comments. Nov 18 2019, 8:26 PM
llvm/test/CodeGen/AMDGPU/waitcnt-overflow.mir
1

These look like generated checks, but the update_mir_test_checks comment is missing?

8

Should also add an exp test

24

gfx9 and 10 should both be using the _gfx9 variant without the implicit m0

kerbowa updated this revision to Diff 230114. Nov 19 2019, 11:32 AM

Address comments.

kerbowa marked 2 inline comments as done. Nov 19 2019, 11:33 AM
kerbowa added inline comments.
llvm/test/CodeGen/AMDGPU/waitcnt-overflow.mir
8

Do you have a suggestion on adding an exp test?

arsenm added inline comments. Nov 21 2019, 5:36 AM
llvm/test/CodeGen/AMDGPU/waitcnt-overflow.mir
8

just use a lot of EXPs?

kerbowa added inline comments. Nov 21 2019, 10:44 AM
llvm/test/CodeGen/AMDGPU/waitcnt-overflow.mir
8

Okay. But for expcnt we already reset the LB when we have an overflow, and just don't insert any wait. This change should never affect expcnt.

kerbowa updated this revision to Diff 230493. Nov 21 2019, 10:55 AM

Add exp test.

This revision was automatically updated to reflect the committed changes.