Page MenuHomePhabricator

[amdgpu] Reimplement LDS lowering
AbandonedPublic

Authored by JonChesterfield on Nov 16 2022, 9:26 AM.
Tags
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project
  • Restricted Project

Details

Summary

Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.

Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.

Diff Detail

Unit TestsFailed

TimeTest
0 msx64 windows > LLVM-Unit.Support/_/SupportTests_exe/Unicode::columnWidthUTF8
Script: -- C:\ws\w32-1\llvm-project\premerge-checks\build\unittests\Support\.\SupportTests.exe --gtest_filter=Unicode.columnWidthUTF8
270 msx64 windows > LLVM.CodeGen/AMDGPU::global-variable-relocs.ll
Script: -- : 'RUN: at line 1'; c:\ws\w32-1\llvm-project\premerge-checks\build\bin\llc.exe -mtriple=amdgcn--amdhsa -mcpu=fiji < C:\ws\w32-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\global-variable-relocs.ll | c:\ws\w32-1\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w32-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\global-variable-relocs.ll
110 msx64 windows > LLVM.CodeGen/AMDGPU::waitcnt-looptest.ll
Script: -- : 'RUN: at line 1'; c:\ws\w32-1\llvm-project\premerge-checks\build\bin\llc.exe < C:\ws\w32-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\waitcnt-looptest.ll -mtriple=amdgcn--amdhsa -mcpu=fiji -mattr=-flat-for-global -amdgpu-load-store-vectorizer=0 | c:\ws\w32-1\llvm-project\premerge-checks\build\bin\filecheck.exe --check-prefix=GCN C:\ws\w32-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\waitcnt-looptest.ll

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
JonChesterfield marked 6 inline comments as done.Dec 1 2022, 6:01 AM
  • address Matt's comments

@arsenm thanks, all addressed. Constantexpr were a recurring pain for this pass because their uniqueness interferes with the callgraph walking so are unconditionally rewritten into instructions at the top of run().

llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
14–49

reluctant to document this in user facing fashion - there's a credible risk of it being improved in the future, and none of the flags here are intended to be used on a per application basis

477

sure

491

arrayref is good, thanks

517

got some mileage out of ArrayRef for these arguments, thanks

589

Constant expressions involving LDS variables are unconditionally expanded before this point by eliminateConstantExprUsesOfLDSFromAllInstructions.

Primarily because one constantexpr can be reachable from multiple kernels through instructions which are only reachable from distinct kernels, so eliminating them up front makes the analysis more precise as well as the rest of this pass much simpler.

Constant initializer is indeed harder and is presently out of scope - LDS variables have undef as their initialiser, and non-LDS variables have program scope which is broadly resistant to per-kernel-execution initialisation with the address of LDS variables.

An exception for that can be carved out for invariant LDS addresses (e.g. if we want an attribute to say a given global is always at offset 42, whatever kernel is involved), but we don't have a concept for that implemented yet beyond the emergent feature of the back end calculating offsets into a per-kernel struct.

622

changed it to i32 for now, can always back it back down to i16 with more error reporting later if we want to. Stopped short of dropping inttoptr as it's difficult to test with D131246 applied

732–735

Today I learned temporary lifetime extension doesn't work through the ternary operator. Sad times. Also using nullptr as the key into the map wins me a segfault, so gone with ugly workaround pending a better idea.

759–776

i think this is OK as is - it's an undocumented command line flag, where "kernel" is commented as only applicable in narrow cases - and the report_fatal_error path is otherwise only reachable on a bug in this pass

1152

this was refactoring noise, inlined replaceLDSVariablesWithStructVec into replaceLDSVariablesWithStruct to remove

arsenm accepted this revision.Dec 6 2022, 7:17 AM

LGTM with some nits

llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
893

Capitalize

1176–1177

Junk comment?

llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
139

Capitalize

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
6012

Newlines

6018

More early return?

This revision is now accepted and ready to land.Dec 6 2022, 7:17 AM
  • remove obsolete comment
Herald added a reviewer: andreadb. · View Herald Transcript
Herald added a reviewer: shafik. · View Herald Transcript
Herald added a reviewer: sstefan1. · View Herald Transcript
Herald added a reviewer: herhut. · View Herald Transcript
Herald added a reviewer: rriddle. · View Herald Transcript
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a reviewer: MaskRay. · View Herald Transcript
Herald added a reviewer: ftynse. · View Herald Transcript
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a reviewer: rafauler. · View Herald Transcript
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a reviewer: NoQ. · View Herald Transcript
Herald added a reviewer: ributzka. · View Herald Transcript
Herald added a reviewer: dcaballe. · View Herald Transcript
Herald added a reviewer: njames93. · View Herald Transcript
Herald added projects: Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
This revision now requires review to proceed.Dec 6 2022, 7:25 AM
JonChesterfield removed projects: Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project.
This revision is now accepted and ready to land.Dec 6 2022, 7:31 AM
Herald added a reviewer: andreadb. · View Herald Transcript
Herald added a reviewer: shafik. · View Herald Transcript
Herald added a reviewer: sstefan1. · View Herald Transcript
Herald added a reviewer: herhut. · View Herald Transcript
Herald added a reviewer: rriddle. · View Herald Transcript
Herald added a reviewer: MaskRay. · View Herald Transcript
Herald added a reviewer: ftynse. · View Herald Transcript
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a reviewer: rafauler. · View Herald Transcript
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a reviewer: NoQ. · View Herald Transcript
Herald added a reviewer: ributzka. · View Herald Transcript
Herald added a reviewer: dcaballe. · View Herald Transcript
Herald added a reviewer: njames93. · View Herald Transcript
Herald added projects: Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
This revision now requires review to proceed.Dec 6 2022, 7:31 AM
JonChesterfield abandoned this revision.Dec 6 2022, 7:31 AM

I can't undo what arc has done here. Apologies for the mass spam.

philnik added a subscriber: philnik.Dec 6 2022, 7:40 AM

I can't undo what arc has done here. Apologies for the mass spam.

You have to update the diff first and then remove the people. Otherwise Herald will just re-add them.