This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] More selectively attach implicit operands to agpr spills
ClosedPublic

Authored by jrbyrnes on Jan 5 2023, 5:11 PM.

Details

Summary

Implicit def operands are needed when we spill partially undef super registers by each individual subregister. The implicit-def operands will allow us to lower spills without the verifier complaining. Currently, we are overzeously attaching implicit operands, when we really only need them on the first sub reg spill op. By more selectively attached the implicit ops, we will free up some unneeded dependencies for the post-ra scheduler.

Moreover, this enables a previously incorrect optimization / resolves a correctness issue in indirectCopyToAGPR. When lowering AGPR copies on GFX908, we can improve CodeGen by reusing accvgpr_writes. However, we could not reliably determine which agprs accvgpr_writes actually define due to implicit-defs.

Diff Detail

Event Timeline

jrbyrnes created this revision.Jan 5 2023, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2023, 5:11 PM
jrbyrnes requested review of this revision.Jan 5 2023, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2023, 5:11 PM
jrbyrnes added a comment.EditedJan 5 2023, 5:11 PM

A draft / WIP while I finalize approach & test cases. If this is the direction we take, will supersede https://reviews.llvm.org/D140708

arsenm accepted this revision.Jan 6 2023, 5:44 AM

LGTM with nit. The verifier will catch the cases that are wrong. I'd still prefer to not do this optimization here in the first place, or at least to avoid direct overlaps checks by using regunit liveness checking

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
1563

Lost a space after ||

This revision is now accepted and ready to land.Jan 6 2023, 5:44 AM
jrbyrnes updated this revision to Diff 486900.Jan 6 2023, 9:11 AM

Add tests to capture problematic cases. Be more pessimistic about overlapping agpr copies (since we allocate vgprs in triplets, we cannot special case VGPRForAGPRCopy). Resolve nit.

This revision was landed with ongoing or failed builds.Jan 9 2023, 3:10 PM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll