This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Update VMEM scalar write hazard mitigation sequence
ClosedPublic

Authored by critson on Jul 15 2020, 7:10 AM.

Details

Summary

Using s_waitcnt_depctr 0xffe3 to mitigate the VMEM scalar write hazard is potentially faster than using v_nop.

Event Timeline

critson created this revision. Jul 15 2020, 7:10 AM
Herald added a project: Restricted Project. Jul 15 2020, 7:10 AM
foad accepted this revision. Jul 15 2020, 7:45 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
935–939

I wonder if there is a way to generalise this to spot any waitcnt that would mitigate the hazard, rather than just these two specific cases. But I guess this is fine in practice.
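
One way to picture the generalisation raised in this comment is a mask test on the s_waitcnt_depctr immediate instead of an exact comparison against 0xffe3. The sketch below is not taken from the patch or from GCNHazardRecognizer.cpp. It assumes that the bits cleared in 0xffe3 (mask 0x001c) are exactly the counter field relevant to this hazard, and that a zero in that field is what mitigates it. The names kMitigatingImm, kHazardFieldMask and mitigatesHazard are hypothetical.

```cpp
#include <cstdint>

// Immediate used by the patch's mitigation sequence (from the summary).
constexpr std::uint16_t kMitigatingImm = 0xffe3;

// ASSUMPTION: the bits that are zero in 0xffe3 form the counter field that
// matters for this hazard. Complementing within 16 bits yields 0x001c.
constexpr std::uint16_t kHazardFieldMask =
    static_cast<std::uint16_t>(~kMitigatingImm);

// Hypothetical helper: returns true when a depctr immediate requests a wait
// for zero on every bit of the assumed hazard field, regardless of what the
// other fields contain.
constexpr bool mitigatesHazard(std::uint16_t Imm) {
  return (Imm & kHazardFieldMask) == 0;
}

// The exact immediate from the patch passes, and so does any stricter wait
// that also zeroes other fields. An immediate that leaves the assumed field
// non-zero does not.
static_assert(kHazardFieldMask == 0x001c, "mask derived from 0xffe3");
static_assert(mitigatesHazard(0xffe3), "patch immediate mitigates");
static_assert(mitigatesHazard(0x0000), "waiting on everything mitigates");
static_assert(!mitigatesHazard(0xffff), "no wait requested");
```

Under these assumptions, a check of this shape would accept any depctr wait at least as strict as 0xffe3 on the relevant field, which is the kind of broader match the comment asks about. Whether the hardware semantics justify it would need to be confirmed against the ISA documentation.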

This revision is now accepted and ready to land. Jul 15 2020, 7:45 AM
This revision was automatically updated to reflect the committed changes.