There must be one wait state between an M0 write and its use by LDS DMA or LDS_DIRECT.
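As a rough illustration of the rule above, the required stall can be computed from how many wait states have already elapsed since the last M0 write. This is a minimal sketch, not code from the patch: the names `M0ToLdsWaitStates` and `requiredStall` are hypothetical, and the only value taken from the summary is the single required wait state.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical constant: the one wait state required between an M0 write
// and an LDS DMA / LDS_DIRECT use, as stated in the summary.
constexpr int M0ToLdsWaitStates = 1;

// Hypothetical helper: given the number of wait states that have already
// elapsed since the most recent M0 write, return how many more are needed.
int requiredStall(int WaitStatesSinceM0Write) {
  return std::max(0, M0ToLdsWaitStates - WaitStatesSinceM0Write);
}
```

With this model, a use that immediately follows the M0 write needs one extra wait state, and any use that is already at least one wait state away needs none.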
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp | ||
---|---|---|
358–366 | The coding style is strange here because it looks like it could call checkReadM0Hazards four times. But I guess in practice at most one of the conditionals will be true? |
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp | ||
---|---|---|
358–366 | It tests for different types of instructions, so the function will be called at most once. Moreover, these are not common instructions. In fact, scanning the operands to see whether an instruction uses LDS_DIRECT is more expensive. |
Collapsed all conditions around checkReadM0Hazards(). To me it is less readable, but since there is a concern that we may call checkReadM0Hazards() more often than needed, I combined them.
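The difference between separate conditionals and one combined condition can be sketched with a toy model. This is not the code from GCNHazardRecognizer.cpp: the `Inst` flags, `checkReadM0`, `separateChecks`, and `combinedCheck` are hypothetical stand-ins, and the separate version is written without early returns to show the worst case the reviewer was asking about.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical instruction model: each flag marks one category of
// instruction that reads M0.
struct Inst {
  bool IsInterp;
  bool IsMovRel;
  bool IsLdsDma;
  bool UsesLdsDirect;
};

// Counts how often the (stand-in) hazard check runs.
int CheckCalls = 0;

// Stand-in for the M0 hazard check; always reports one wait state.
int checkReadM0(const Inst &) {
  ++CheckCalls;
  return 1;
}

// One conditional per category, no early return: the check runs once for
// every flag that is set.
int separateChecks(const Inst &I) {
  int W = 0;
  if (I.IsInterp)      W = std::max(W, checkReadM0(I));
  if (I.IsMovRel)      W = std::max(W, checkReadM0(I));
  if (I.IsLdsDma)      W = std::max(W, checkReadM0(I));
  if (I.UsesLdsDirect) W = std::max(W, checkReadM0(I));
  return W;
}

// One combined condition: short-circuit evaluation guarantees a single call.
int combinedCheck(const Inst &I) {
  if (I.IsInterp || I.IsMovRel || I.IsLdsDma || I.UsesLdsDirect)
    return checkReadM0(I);
  return 0;
}
```

When the categories are mutually exclusive, as argued in the reply above, both versions call the check at most once; they only differ for an instruction that would match more than one category.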
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp | ||
---|---|---|
358–359 | I don't see why you merged in these cases, which returned early before. |
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp | ||
---|---|---|
358–366 | I am generally concerned by these early returns. We may skip checking other hazards that require more wait states. I think we may need to remove all early returns here completely as the complexity of the hazard recognizer grows. |
Restored the early returns, at least for this patch. I have convinced myself that hazards checked later cannot interfere with these instructions.
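The general concern about early returns can be illustrated with a simplified model in which an instruction is subject to two hazards at once. Everything here is hypothetical: the `Inst` flags, the helper names, and the four wait states returned by `checkOther` are made up for illustration and do not describe any real hazard in the recognizer.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical instruction model with two independent hazard categories.
struct Inst {
  bool ReadsM0;
  bool HasOtherHazard;
};

// Stand-in checks: the M0 hazard needs 1 wait state, the other hazard
// needs an assumed 4 wait states.
int checkReadM0(const Inst &) { return 1; }
int checkOther(const Inst &) { return 4; }

// Returns as soon as the M0 category matches, so later checks are skipped.
int withEarlyReturn(const Inst &I) {
  int W = 0;
  if (I.ReadsM0)
    return std::max(W, checkReadM0(I));
  if (I.HasOtherHazard)
    W = std::max(W, checkOther(I));
  return W;
}

// Runs every applicable check and keeps the largest requirement.
int withoutEarlyReturn(const Inst &I) {
  int W = 0;
  if (I.ReadsM0)
    W = std::max(W, checkReadM0(I));
  if (I.HasOtherHazard)
    W = std::max(W, checkOther(I));
  return W;
}
```

In this model the two versions agree whenever at most one category applies, which is the condition the author says holds for the instructions in this patch. They diverge only when an instruction falls into both categories, in which case the early-return version under-reports the required wait states.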