This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] WIP: Use implicit operands instead of RegScavenger for AGPR copy lowering on gfx908
Changes Planned · Public

Authored by jrbyrnes on Jun 2 2023, 1:30 PM.

Details

Reviewers
None
Summary

By attaching an implicit VGPR operand to each of these copies, the VGPR is assigned by the register allocator and is already available when the copy is lowered, so we no longer need register scavenging. The price is that in kernels with high register pressure (RP), this may cause additional spilling.
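As a rough illustration of the idea above, here is a toy model (plain C++, not LLVM's `MachineInstr` API) of lowering an AGPR-to-AGPR copy through a VGPR temporary with `v_accvgpr_read_b32` followed by `v_accvgpr_write_b32`. The `AGPRCopy` struct and `lowerAGPRCopy` function are hypothetical names invented for this sketch; the actual patch works on machine instructions and is not reproduced here.

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy model (assumption, not LLVM code): a copy between two AGPRs that may
// carry an implicit VGPR operand assigned by the register allocator.
struct AGPRCopy {
  std::string Dst;                         // destination AGPR, e.g. "a1"
  std::string Src;                         // source AGPR, e.g. "a0"
  std::optional<std::string> ImplicitVGPR; // attached temporary, e.g. "v5"
};

// Expand the copy into a read/write pair that uses the implicit VGPR as the
// temporary. Returns an empty vector when no implicit operand is attached;
// a real lowering would then need another source for the temporary.
std::vector<std::string> lowerAGPRCopy(const AGPRCopy &C) {
  if (!C.ImplicitVGPR)
    return {};
  const std::string &Tmp = *C.ImplicitVGPR;
  return {"v_accvgpr_read_b32 " + Tmp + ", " + C.Src,
          "v_accvgpr_write_b32 " + C.Dst + ", " + Tmp};
}
```

Because the temporary is an operand of the copy itself, nothing has to be searched for at lowering time; the cost is that the allocator must keep one more VGPR live across the copy.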

This is a WIP while I investigate what is needed to do the same for AGPR copy "spills" (and potentially remove VGPRForAGPRCopy).

Event Timeline

jrbyrnes created this revision. Jun 2 2023, 1:30 PM
Herald added a project: Restricted Project. Jun 2 2023, 1:30 PM
jrbyrnes requested review of this revision. Jun 2 2023, 1:30 PM
Herald added a project: Restricted Project. Jun 2 2023, 1:31 PM

@arsenm, do you have any high level disagreements with the general approach here?

jrbyrnes planned changes to this revision. Jun 5 2023, 10:55 AM

Per offline discussion: we need to avoid adding new virtual registers during live-range splitting.

Tentative plan involves a two-part approach:

  1. Rework register allocation (RA) prioritization so that AGPR live-range splitting (and the resulting copy emission) is reasonably rare on gfx908.
  2. Attach as many implicit VGPRs as reasonably possible (i.e. the PreRAOptimizations functionality here), and rely on VGPRForAGPRCopy for the rare copies for which no implicit operand is available.
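
The fallback in step 2 can be sketched as a simple selection rule. This is a toy illustration only: `pickCopyTemp` is a hypothetical helper, and `ReservedVGPR` merely stands in for whatever register VGPRForAGPRCopy designates; neither reflects the actual implementation.

```cpp
#include <optional>
#include <string>

// Toy selection rule (assumption, not LLVM code): prefer the implicit VGPR
// attached to the copy, and fall back to the reserved register only when the
// copy carries no implicit operand.
std::string pickCopyTemp(const std::optional<std::string> &ImplicitVGPR,
                         const std::string &ReservedVGPR) {
  return ImplicitVGPR.value_or(ReservedVGPR);
}
```

Under this plan the reserved register would only be touched by copies created without an implicit operand, such as those emitted during live-range splitting, which step 1 aims to make rare.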