This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] run post-RA hazard recognizer pass late
ClosedPublic

Authored by msearles on Jul 13 2018, 4:51 AM.

Details

Summary

Memory legalizer, waitcnt, and shrink passes can perturb the instructions, which means that the post-RA hazard recognizer pass should run after them. Otherwise, one of those passes may invalidate the work done by the hazard recognizer. Note that this has adverse side-effect that any consecutive S_NOP 0's, emitted by the hazard recognizer, will not be shrunk into a single S_NOP <N>. This should be addressed in a follow-on patch.

Diff Detail

Event Timeline

msearles created this revision.Jul 13 2018, 4:51 AM
arsenm added inline comments.Jul 13 2018, 5:00 AM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
902

I'm surprised we run shrink after this point.

I think this can plausibly be an issue.
Suppose you had something like
%vgpr0 = V_FOO
<some hazard>
%vgpr1 = V_MOV_B32 5
%vgpr2 = V_BAR %vgpr1

The immediate materialize would eliminate a wait state that wasn't necessary, but now is since the copy is eliminated. I think in practice the shrink pass isn't very good at folding immediates post-RA, if it even tries

msearles added inline comments.Jul 13 2018, 5:32 AM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
902

I will move the hazard recognizer after shrink instructions, unless you're suggesting that the shrink pass be moved somewhere else? I think that moving the hazard recognizer after the shrink pass has the side-effect that some S_NOP 0's won't be collapsed into, say S_NOP <N>. The hazard recognizer just creates individual S_NOP 0 instrs and is relying on the shrink pass to turn them into S_NOP <N>. I suppose the hazard recognizer could be enhanced to generate the S_NOP <N> instrs on its own.

arsenm added inline comments.Jul 13 2018, 5:33 AM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
902

Yes, the hazard recognizer ideally would do that. The shrink pass deals with this as a workaround for it seeming to assume all nops are the same.

It's probably OK to leave as-is for now

rampitec added inline comments.Jul 13 2018, 9:45 AM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
902

I would move shrink before hazard recognizer.

msearles updated this revision to Diff 155439.Jul 13 2018, 11:40 AM
msearles retitled this revision from [AMDGPU] run post-RA hazard recognizer pass after memory legalizer, waitcnt passes to [AMDGPU] run post-RA hazard recognizer pass late.
msearles edited the summary of this revision. (Show Details)
  • Move post-RA hazard recognizer pass after shrink instr as well.
  • Update tests.
msearles marked 3 inline comments as done.Jul 13 2018, 11:41 AM
This revision is now accepted and ready to land.Jul 13 2018, 12:14 PM
This revision was automatically updated to reflect the committed changes.