This is an archive of the discontinued LLVM Phabricator instance.

[amdgpu][lds] Remove recalculation of LDS frame from backend
ClosedPublic

Authored by JonChesterfield on Jul 13 2023, 5:30 AM.

Details

Summary

Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.

After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.

Deleted the now dead allocation code from the backend. Left for a later cleanup:

  • enabling lowering for anonymous functions
  • removing the elide-module-lds attribute (test churn, it's not used by llc any more)
  • adjusting the dynamic alignment check to not use symbol names

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2023, 5:30 AM
JonChesterfield requested review of this revision.Jul 13 2023, 5:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2023, 5:30 AM
JonChesterfield edited the summary of this revision. (Show Details)Jul 13 2023, 5:32 AM

This is D155125 with the recalculation deleted, instead of set up to catch regressions under CI.

llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
94

Deleting this association gives lds lowering more data layout freedom and drops the limitation on anonymous functions

103

this association will be removed in a later patch

116

this is the recomputation of the LDS frame being deleted

arsenm added inline comments.Jul 13 2023, 12:54 PM
llvm/docs/AMDGPUUsage.rst
1095

Note frontends shouldn't be setting it?

llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
1109

Typo lnger

llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
87

Probably should go through DiagnosticInfo. I don't know why we don't have sensible predefined types for this sort of thing. DK_Lowering looks promising but is buried in one specific user

llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
87

Maybe, it's not user facing. Currently the only way to reach this is via a bug in LowerModuleLDS. Could be an assert instead, or just delete the checks.

Equivalent to the fatal errors in the allocateKnownAddressLDSGlobal which this patch deletes.

  • review comments
JonChesterfield marked 2 inline comments as done.Jul 13 2023, 1:15 PM
arsenm accepted this revision.Jul 13 2023, 3:33 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
48

Not sure exactly what two variables or "profitable" this is talking about

76

If we did the same thing, could probably fix the bug with amdgpu-gds-size not interacting well with globals

This revision is now accepted and ready to land.Jul 13 2023, 3:33 PM
llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
48

The LDSSize and StaticLDSSize immediately below the comment. I'm pretty sure we don't need both of them. The distinction is meant to elide some padding in some edge cases, but I don't think that edge case exists any more. I'm having some trouble working out exactly how graphics use LDS.

76

Seems likely that gds lowering would work well with absolute addresses, yes.