We can expand the scope of shrink wrapping by not creating a stack frame in non-leaf functions.
For example, in the call sequence A->B->C, we can help shrink wrapping while compiling B by allowing C to return control directly to A when it exits. I call this direct return.
In this case, B does not need to create a stack frame, which increases the opportunities for shrink wrapping.
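As an illustration only, the source shape that benefits might look like the sketch below. The function names follow the A->B->C chain above and the bodies are hypothetical; whether a compiler actually applies direct return here depends on the target and on the checks in this patch.

```cpp
// Hypothetical functions named after the A -> B -> C chain in the description.

// C has internal linkage (static), so all of its callers are known
// within this translation unit.
static long C(long x) {
  long sum = 0;
  for (long i = 0; i < x; ++i)
    sum += i;
  return sum;
}

// B has a fast path that returns early and a slow path that ends in a
// tail call to C: nothing happens in B after the call returns.
long B(long x) {
  if (x <= 0)
    return 0;   // fast path, no call
  return C(x);  // slow path, tail call
}

// A is the original caller; with direct return, C would hand control
// back here instead of going through B.
long A(long x) { return B(x) + 1; }
```

The point of the shape is that B's fast path makes no call at all, so any frame setup that can be avoided or sunk off that path is saved on every fast-path invocation.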
To apply direct return, the following conditions must hold:
- the invocation B->C must be a tail call (i.e. no instruction between the call and the return in B).
- the callee in the invocation B->C must have internal linkage, since direct return does not comply with the ABI.
- other ABI-specific checks must pass.
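The conditions above can be sketched as a single predicate. This is a minimal model, not the code in the patch: `CallSiteInfo`, its fields, and `canUseDirectReturn` are hypothetical names, and the ABI-specific checks are collapsed into one boolean.

```cpp
// Hypothetical summary of one call site B -> C; not an LLVM data structure.
struct CallSiteInfo {
  bool isTailCall;             // no instruction between the call and B's return
  bool calleeIsInternal;       // C has internal linkage
  bool passesTargetAbiChecks;  // stand-in for the ABI-specific checks
};

// Direct return is considered only when every condition holds.
bool canUseDirectReturn(const CallSiteInfo &cs) {
  return cs.isTailCall && cs.calleeIsInternal && cs.passesTargetAbiChecks;
}
```

Keeping the target-specific part behind one flag mirrors the split described in this summary: the generic conditions stay common, and each platform supplies its own ABI checks.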
In this patch, this optimization is enabled only for PowerPC64 with the ELFv2 ABI, but it might be applicable to other platforms by implementing the platform-specific part.
The original motivation for this patch is to optimize the hot method of tcmalloc, for which GCC applies shrink wrapping with direct return but LLVM cannot. With this patch and basic block deduplication ( https://reviews.llvm.org/D30774 ), LLVM can apply shrink wrapping to this hot method.
Use /// (doxygen style comment)