Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp | |
---|---|
935–939 | I wonder if there is a way to generalise this to spot any waitcnt that would mitigate the hazard, rather than just these two specific cases. But I guess this is fine in practice. |
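One possible shape for the generalisation raised in the inline comment is to test the relevant counter field of the immediate instead of comparing against the exact value 0xffe3. The sketch below is illustrative only and is not code from this diff. The helper name `depctrWaitsForVmVsrcZero` is hypothetical, and the field position (a 3-bit `vm_vsrc` field at bits [4:2]) is an assumption that should be checked against the ISA documentation and the DepCtr definitions in the AMDGPU backend. Whether other counters or non-zero field values would also mitigate the hazard is not established here.

```cpp
#include <cstdint>

// ASSUMPTION: the vm_vsrc counter occupies bits [4:2] of the
// s_waitcnt_depctr immediate. Verify against the ISA documentation.
constexpr unsigned VmVsrcShift = 2;
constexpr unsigned VmVsrcWidth = 3;

// Hypothetical helper: returns true when the immediate requests a wait
// until the (assumed) vm_vsrc counter reaches zero, regardless of what
// the other fields of the immediate contain.
constexpr bool depctrWaitsForVmVsrcZero(uint16_t Imm) {
  constexpr uint16_t Mask =
      static_cast<uint16_t>(((1u << VmVsrcWidth) - 1u) << VmVsrcShift);
  return (Imm & Mask) == 0;
}

// The value used in this diff clears exactly the assumed field.
static_assert(depctrWaitsForVmVsrcZero(0xffe3), "0xffe3 waits on vm_vsrc");
// An immediate with every field at its maximum waits on nothing.
static_assert(!depctrWaitsForVmVsrcZero(0xffff), "0xffff is a no-op wait");
```

Under these assumptions, a check of this form would also accept immediates that wait on additional counters, since any such instruction still waits for the same field to reach zero.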