This is a first attempt at implementing this, based on GCC and the X86 LLVM backend. The prologue and dynamic allocas are handled, but some questions remain:
- Prologue:
- GCC emits an 'lgr %r15,%r1' after the loop, which seems redundant since %r15 is already known to hold the value of %r1. Is it required for some reason? (The patch omits it for now.)
- GCC does not seem to probe the residual allocation after the loop. However, if only two (unrolled) allocations were made, the residual is also probed.
- I am not aware of any real reason not to do the probing directly in emitPrologue(), but it seems wisest to follow X86, since inlineStackProbe() is called from common code. Perhaps this relates to implementing shrink-wrapping or other things?
- emitBlockAfter() and splitBlockBefore() are copied from SystemZISelLowering. Should they be made SystemZInstrInfo members instead?
- A little unsure about the use of unsigned vs uint64_t...
- Dynamic allocas:
- I copied the X86 tests over as SystemZ tests and noticed that SystemZ gets these test cases built by SelectionDAGBuilder with dynamic_stackalloc nodes, while X86 seems to get these (constant) allocas merged into the stack frame. This is also true without this patch, but I am not sure why. With probing it seems even more desirable to avoid the dynamic_stackalloc nodes whenever possible...
- With dynamic allocas, it seems wise to always probe no matter what the size, but the "tail" in emitProbedAlloca() is not probed. This seems flawed to me:
First of all, there could be multiple dynamic allocas in a function, and if they are all smaller than ProbeSize, a huge span could be built up without any probing:
      Tail1  Tail2  Tail3  Tail4
  --> |      |      |      |      |   GGGGGGGGG
Second, I am worried about exiting the loop and allocating the remainder, since only the topmost word in each allocated block is probed. If the guard page lies very close to the last probe and the remainder is relatively big, the bottom of the stack could end up way past the guard page:
      Block1   Block2   Tail
  --> |P       |P       |       |   GGGGGGGGG
P = Probe, G = Guard page
This looks bad to me, but I really don't know - is this perhaps considered harmless for some reason?
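To make the two concerns above concrete, here is a minimal model of the allocation pattern described, not the actual emitProbedAlloca() code. The names ProbeModel, allocate, and WordSize are hypothetical, and the model assumes that the stack at function entry has already been touched and that each full block is probed only at its topmost word. It tracks how far the stack pointer can move below the most recently probed address.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of probed dynamic allocation. Depths are measured in
// bytes below the stack pointer at function entry (larger = deeper).
struct ProbeModel {
  static constexpr uint64_t WordSize = 8; // assumed size of one probed word
  uint64_t ProbeSize;
  uint64_t Depth = 0;     // bytes allocated so far
  uint64_t LastProbe = 0; // depth of the most recent probe (entry assumed touched)
  uint64_t MaxGap = 0;    // largest distance from the last probe to the stack pointer

  explicit ProbeModel(uint64_t PS) : ProbeSize(PS) {
    assert(ProbeSize > WordSize && "a block must be larger than one word");
  }

  // Model one dynamic alloca of Size bytes: full blocks are allocated in a
  // loop and probed at their topmost word; the remaining tail is allocated
  // without any probe.
  void allocate(uint64_t Size) {
    while (Size >= ProbeSize) {
      LastProbe = Depth + WordSize; // topmost word of the new block
      Depth += ProbeSize;
      Size -= ProbeSize;
      MaxGap = std::max(MaxGap, Depth - LastProbe);
    }
    Depth += Size; // unprobed tail
    MaxGap = std::max(MaxGap, Depth - LastProbe);
  }
};
```

Under these assumptions, with a ProbeSize of 4096, four allocas of 4000 bytes each leave an unprobed span of 16000 bytes, and a single alloca of 2 * 4096 + 4000 bytes leaves 8088 bytes between the last probe and the stack pointer. Both exceed ProbeSize, which is the situation the diagrams above illustrate.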