This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Further reduce attaching of implicit operands to spills / copies
ClosedPublic

Authored by jrbyrnes on Jan 11 2023, 5:22 PM.

Details

Summary

Extension of https://reviews.llvm.org/D141101 to even further reduce the amount of implicit operands we attach. The main benefit is to improve cability of post-ra scheduler, and reduce unneeded dependency resolution (e.g. inserting snops).

Unfortunately, we run into regressions if we completely minimize the amount implicit operands (naively), we run into some regressions (e.g. dual_movs are replaced with multiple calls to v_mov). This is even more reason to switch to LiveRegUnits.

Nonetheless, this patch removes the operands which we can for free (more or less).

Change-Id: Ib4f409202b36bdbc59eed615bc2d19fa8bd8c057

Diff Detail

Event Timeline

jrbyrnes created this revision.Jan 11 2023, 5:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 11 2023, 5:22 PM
jrbyrnes requested review of this revision.Jan 11 2023, 5:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 11 2023, 5:22 PM
arsenm added inline comments.Jan 11 2023, 5:25 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
611–616

Should do logic on the flags operand and have one addReg call

662–666

Ditto

1027

Didn't use the IsLastSubreg defined above

jrbyrnes updated this revision to Diff 488666.Jan 12 2023, 8:28 AM
jrbyrnes marked 3 inline comments as done.

Cleaned up minor code issues.

arsenm added inline comments.Jan 12 2023, 8:31 AM
llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
554

I don't understand what happened here, a use became a def and is breaking the value for sgpr1. I also doubt we needed this implicit use in the first place

jrbyrnes added inline comments.Jan 12 2023, 8:46 AM
llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
554

The implicit-def of sgpr is used to provide def of later V_MOV_B32_e32 $sgpr in the case we have partially undef super sgpr. If we just use implicit-use here, then those later uses may be reschedule to a point where it is undef. This specific test doesn't have the issue.

Currently, we just attach implicit use to every V_MOV_B32_e32 $sgpr which for some reason satisfies the verifier in problematic case.

llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
651

Not sure what has happened here -- I'll look into it.

jrbyrnes updated this revision to Diff 490004.Jan 17 2023, 5:51 PM

Revert attaching one implicit-def instead of implicit use on each sub reg op where applicable.

jrbyrnes added inline comments.Jan 17 2023, 5:53 PM
llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
554

The verifier will be happy if 1. The subreg is not undef (i.e. we have attached implicit-def previously), or 2. The instruction using undef subreg has implicit use of super reg (https://github.com/llvm/llvm-project/blob/5073a622a785e8fd542fd15484970a435ef2e3e5/llvm/lib/CodeGen/MachineVerifier.cpp#L2485).

Since using implicit-def breaks the value, will revert back to having implicit use on each subreg instruction. Moreover, since hazard recognizer checks MI->explicit_uses() for the uses and MI->modifiesRegister for defs, it will pick up implicit_def but not implicit use. However, post-ra scheduler will have less capability with more dependencies.

llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
651

This is coming from a previous commit -- it seems the test was not updated.

jrbyrnes updated this revision to Diff 490330.Jan 18 2023, 4:28 PM

Rebase + Cleanup unecessary variables

arsenm accepted this revision.Jan 18 2023, 6:43 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
1021

Probably should write this as Idx == 0

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
1527

Ditto

1731–1734

Ditto

This revision is now accepted and ready to land.Jan 18 2023, 6:43 PM
jrbyrnes updated this revision to Diff 490627.Jan 19 2023, 12:17 PM
jrbyrnes marked 3 inline comments as done.

Minor code fix, passes psdb

This revision was landed with ongoing or failed builds.Jan 19 2023, 2:31 PM
This revision was automatically updated to reflect the committed changes.