Currently the superalign option only increases the alignment of
variables that are moved into the module.lds block. Change that to all LDS
variables. Also only increase the alignment once, instead of once per function.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This will increase the LDS used by some cases (specifically, padding around variables that aren't moved into the module.lds block). I assume that's expected to be a net win. Internal CI running at 611966.
| llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp | ||
|---|---|---|
| 216 | Somewhat misleading diff. processUsedLDS has had the if (SuperAlignLDSGlobals) {} block copied out into a helper function but is otherwise unchanged. | |
| llvm/test/CodeGen/AMDGPU/lower-kernel-lds-super-align.ll | ||
| 1 | preference for explicit here, means the test keeps working if we change the default on the pass | |
If this is supposed to be the guaranteed to work in all cases pass, isn't that an issue? Also it's probably not a net win if the increase pushes over an occupancy window
I thought overaligning these would be a loss in general when it was proposed (hence the flag) but haven't done the benchmarking. Kind of taking it on faith that someone else has since the feature was implemented. epsdb passed fwiw.
To be more precise, I'm not convinced this makes much real world difference as most variables are naturally aligned anyway but it looks like a mistake that we only change the subset of LDS variables that are already slow to access and allocated in many places.
Also totally happy with deleting this feature or defaulting it to off. The point of this exercise is to get it out of the 'lowering' LDS pass and into a to be written 'optimisation' LDS pass, where this is the (minor?) semantic half of that change and moving it to a separate pass would then be non-functional.
Somewhat misleading diff. processUsedLDS has had the if (SuperAlignLDSGlobals) {} block copied out into a helper function but is otherwise unchanged.