The LDS pointers need to be initialized within the entry basic block of
kernel(s) after all alloca, but before any call instruction. If this is
not possible, then we skip running this pass for now.
Ideally, an alloca could appear anywhere within the function and the AMDGPU
backend would handle it, but at the moment it cannot. Once the AMDGPU
backend can robustly handle allocas inserted anywhere, this hack will no
longer be required.
All this patch should do is move the insertion point past the allocas clustered at the start of the entry block. If there are further allocas elsewhere in the program that may be broken, this patch should not concern itself with them. I don't want to add a workaround for non-entry allocas here.
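To make the requested behavior concrete, here is a rough sketch of what "skip past the leading allocas" means. It is a standalone model, not the patch's code: `Opcode` and `skipLeadingAllocas` are hypothetical names, and the entry block is represented as a plain vector rather than an LLVM `BasicBlock`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for instruction kinds; this is not the LLVM API.
enum class Opcode { Alloca, Store, Call, Other };

// Returns the index of the first instruction after the run of allocas
// clustered at the start of the entry block. Allocas that appear later in
// the block are deliberately ignored, matching the scope suggested above.
std::size_t skipLeadingAllocas(const std::vector<Opcode> &entryBlock) {
  std::size_t i = 0;
  while (i < entryBlock.size() && entryBlock[i] == Opcode::Alloca)
    ++i;
  return i;
}
```

With this shape there is no "give up if a later alloca exists" path: the insertion point is simply wherever the initial run of allocas ends, and anything after that is left alone.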