Running this pass after the scheduler lets the waits be
placed later, so other ALU instructions can run during
what would otherwise be wait time.
When combined with enabling the post-RA scheduler, this
gives roughly a 20% improvement on sgemm.
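
To illustrate why a later wait can help, here is a minimal toy model. It is not the SIInsertWaits pass or any LLVM API: the `Kind` enum, the `simulate` function, and the one-cycle issue cost are all hypothetical and chosen only for illustration. It counts cycles on an imaginary in-order machine where a load completes some cycles after it issues and a wait stalls until every outstanding load has completed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical instruction kinds for a toy in-order machine.
enum class Kind { Load, Alu, Wait };

// Returns the cycle count for `prog` under these assumptions:
//  - Load and Alu each take one cycle to issue,
//  - a Load's result is ready `loadLatency` cycles after it issues,
//  - a Wait issues for free but stalls until all earlier loads are ready.
int simulate(const std::vector<Kind>& prog, int loadLatency) {
    assert(loadLatency >= 0);
    int cycle = 0;        // current cycle
    int loadsReadyAt = 0; // cycle at which every issued load has completed
    for (Kind k : prog) {
        switch (k) {
        case Kind::Load:
            cycle += 1;
            loadsReadyAt = std::max(loadsReadyAt, cycle + loadLatency);
            break;
        case Kind::Alu:
            cycle += 1;
            break;
        case Kind::Wait:
            // Stall only if the loads have not completed yet.
            cycle = std::max(cycle, loadsReadyAt);
            break;
        }
    }
    return cycle;
}
```

With a load latency of 4, the sequence Load, Wait, Alu, Alu, Alu, Alu takes 9 cycles in this model, while Load, Alu, Alu, Alu, Wait, Alu takes 6, because three independent ALU instructions overlap with the load before the wait is reached.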
Differential D10793
AMDGPU: Run SIInsertWaits as pre-emit pass
Authored by arsenm on Jun 27 2015, 2:40 PM.