One of the memory types being supported within AMD GPU memory hierarchy is
shared memory, also called Local Data Share or LDS for short. LDS memory
is the second fastest memory in the AMD GPU memory hierarchy (with register
file being fastest available memory in the hierarchy). Being faster also
means LDS memory is comparatively costlier and hence is a limited available
memory resource.
Being global scoped, an LDS variable is accessible within kernel functions
and non-kernel functions, but two different kernel execution paths, say
called from two kernels K1 and K2, cannot access the same instance of an LDS
variable, say L. Both K1 and K2 has to own its own instance of L. This puts
some challenges, especially to lower the LDS variables used within non-kernel
functions.
So, the pass - "Lower Module LDS" lowers the LDS globals by packing them
within in a struct type, and by creating an instance of that struct type
within every kerenl at address zero. Though, the pass - "Lower Module LDS"
makes some effort to minimize unnecessary LDS allocation, it is limited by
means of the fundamental basis and assumption upon which the pass is
implemented.
The current pass acts as an helping aid to the pass - "Lower Module LDS" with
the intention of minimizing unnecessary LDS allocation as much as possible.
The main idea behind the current pass is:
(1) To identify the LDS globals used within non-kernel function scope and
global scope.
(2) To push the use of all the above identified LDS globals to kernel
function scope by initializing their addresses to newly created LDS
global pointer variables (within kernel functions).
(3) To replace the uses of original LDS globals within non-kernel functions
by their pointer counter-parts.
(4) This way, the transformation makes sure that the pass "Lower Module LDS"
packs only pointer variables within struct type, and hence significantly
minimizes unnecessary LDS allocation, espacically when the original LDS
globals are big arrays (as this is the common LDS use case).
clang-tidy: warning: invalid case style for function 'AssociateIndirectCallSiteWithPotentialCallees' [readability-identifier-naming]
not useful