
[AMDGPU] Avoid copying dead subregisters in copyPhysReg
Needs Review · Public

Authored by foad on Nov 2 2021, 6:56 AM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

When SIInstrInfo::copyPhysReg splits a multi-dword (superregister) copy
into individual dword (subregister) copies, use LivePhysRegs info to
avoid copying dead subregisters.

This fixes a liveness problem where the superregister copy source may
have been only partially defined (which is allowed) but one of the
resulting subregister copy sources would be completely undefined (which
is not allowed by the machine verifier).

This replaces the previous workaround, which was to add implicit
superregister use/def operands to all the subregister copy instructions,
which caused false dependencies between them and restricted the freedom
of post-RA scheduling and other late codegen passes.
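
The idea can be sketched with a toy model that does not use any LLVM API. Here a register is just a dword index, and a bitmask stands in for the liveness information that the patch obtains from LivePhysRegs. The names `DwordCopy`, `splitCopy` and `LiveMask` are hypothetical and exist only for this illustration; the real copyPhysReg also has to deal with details such as copy ordering, which this sketch ignores.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model: a "register" is a dword index in a flat register file.
// This is an illustration only, not LLVM code.
struct DwordCopy {
  unsigned Dst;
  unsigned Src;
};

// Split a NumDwords-wide copy from SrcBase to DstBase into per-dword
// copies. Bit I of LiveMask is set when source dword SrcBase + I is live
// (defined and still needed); dwords with a clear bit are skipped.
std::vector<DwordCopy> splitCopy(unsigned DstBase, unsigned SrcBase,
                                 unsigned NumDwords, std::uint64_t LiveMask) {
  assert(NumDwords <= 64 && "toy model tracks at most 64 dwords");
  std::vector<DwordCopy> Copies;
  for (unsigned I = 0; I != NumDwords; ++I) {
    if (!((LiveMask >> I) & 1))
      continue; // dead source dword: emit no copy, so no undefined read
    Copies.push_back({DstBase + I, SrcBase + I});
  }
  return Copies;
}
```

For a 4-dword copy where only dwords 0 and 2 of the source are live, `splitCopy(8, 0, 4, 0x5)` yields two copies instead of four. No emitted copy reads an undefined source, and nothing ties the remaining copies to each other through implicit superregister operands.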

Event Timeline

foad created this revision. Nov 2 2021, 6:56 AM
foad requested review of this revision. Nov 2 2021, 6:56 AM
foad added a comment. Nov 5 2021, 8:49 AM

Some stats from compiling a corpus of 10,000 graphics shaders:

  • 57 of them (mostly compute shaders) have some redundant mov instructions removed
  • 2,000 of them have other codegen differences because post-RA scheduling is less constrained
  • average compile time increases by 0.7%