This is an archive of the discontinued LLVM Phabricator instance.

[amdgpu] Reimplement LDS lowering
ClosedPublic

Authored by JonChesterfield on Dec 6 2022, 7:35 AM.

Details

Summary

Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.

Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.

Diff Detail

Event Timeline

JonChesterfield created this revision.Dec 6 2022, 7:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2022, 7:35 AM
JonChesterfield requested review of this revision.Dec 6 2022, 7:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2022, 7:35 AM

All previous review cycle comments are at D138141 which was sadly eradicated by a arc diff main --update D138141.

JonChesterfield edited the summary of this revision. (Show Details)Dec 6 2022, 7:38 AM
arsenm added inline comments.Dec 6 2022, 7:41 AM
llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
176–185

Can you fix this situation in a later commit?

398

Not sure why this is false, change in a separate commit?

485

Capitalize

llvm/test/CodeGen/AMDGPU/lower-module-lds-single-var-ambiguous.ll
3

Can you use -passes?

18

New tests should use opaque pointers

llvm/test/CodeGen/AMDGPU/lower-module-lds-single-var-unambiguous.ll
21

Opaque pointers

  • write passes= in the new tests
arsenm accepted this revision.Dec 6 2022, 8:04 AM
This revision is now accepted and ready to land.Dec 6 2022, 8:04 AM
JonChesterfield marked an inline comment as done.Dec 6 2022, 8:08 AM
JonChesterfield added inline comments.
llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
176–185

Unclear. It effectively means using this pass for graphics, as well as compute.

398

it's the defaults except for ignoring llvm used iirc, will have to look up what qualifies for the other cases to see if I can get test coverage for them

This revision was landed with ongoing or failed builds.Dec 6 2022, 8:28 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Dec 6 2022, 8:43 AM

This breaks tests: http://45.33.8.238/linux/93392/step_12.txt

Please take a look and revert for now if it takes a while to fix.

Thanks! Just went to revert and collided with the one you'd landed.

Some of the bots failed on the waitcnt test that fails for me on main so it'll take a while for me to work out what do with that. The addrspace one looks like a cherry pick misfire from applying this to main.

JonChesterfield reopened this revision.Dec 7 2022, 8:16 AM
This revision is now accepted and ready to land.Dec 7 2022, 8:16 AM
  • Postpone on the fly constant folding for a later patch, simplify test changes
This revision was landed with ongoing or failed builds.Dec 7 2022, 2:04 PM
This revision was automatically updated to reflect the committed changes.

The hybrid lowering is much faster but requires a resolution to failure described in comments on D131246 first.

cdevadas added inline comments.
llvm/test/CodeGen/AMDGPU/addrspacecast-known-non-null.ll
31

Why this write to v0 is gone now?
The code looks broken without defining v0 before swappc is called.

llvm/test/CodeGen/AMDGPU/addrspacecast-known-non-null.ll
31

This is a function which is reachable by no kernel, thus ignored by the pass that lowers LDS globals. There's no kernel to put the LDS variable into. It'll compile with a warning from llc and replacing the use of the variable with undef.

Making this function a kernel, or making it reachable from a kernel, will have the effect of allocating the LDS variable (and avoiding the warning from the back end).

Codegen for this function should be identical with the LDS lowering pass enabled or disabled.

arsenm added inline comments.Dec 15 2022, 8:59 AM
llvm/test/CodeGen/AMDGPU/addrspacecast-known-non-null.ll
31

Probably not the most helpful behavior for an externally visible function. Probably should treat this like an external LDS relocation