This patch concerns chains of calls like so:
    pluto();
    foo() { pluto(); }
    wibble() { foo(); }
Suppose foo has internal linkage. Also let's say that the call to foo within wibble is the only call to foo within the module.
Normally, the inlining cost model applies a large bonus to ensure that foo is inlined into wibble: once the only call is inlined, the out-of-line copy of foo can be dropped, so there's basically no cost in doing so. However, if the block in wibble that contains the call is unreachable-terminated, this won't happen. This is because the cost model currently does the following:
- Check if the block containing the call is unreachable-terminated (allowSizeGrowth)
- If it is, set the threshold to 0 and return
This happens before the bonus is applied. Therefore, any "zero-cost" case relying on the bonus won't ever be inlined when we're dealing with unreachable-terminated blocks.
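To make the ordering problem concrete, here is a minimal, self-contained sketch of the flow described above. It is a simplified model, not the actual LLVM code: the struct names, the constants, the `updateBefore` helper, and the final `Cost < max(1, Threshold)` comparison are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified model of a call site; not LLVM's real types.
struct CallInfo {
  bool BlockEndsInUnreachable; // call sits in an unreachable-terminated block
  bool OnlyCallToLocalCallee;  // callee has internal linkage and one call site
  int BodyCost;                // assumed cost of the callee's body
};

struct State {
  int Threshold;
  int Cost;
};

constexpr int DefaultThreshold = 225; // illustrative starting threshold
constexpr int LastCallBonus = 15000;  // illustrative "only call" bonus

// Models the check described above: no size growth for unreachable blocks.
bool allowSizeGrowth(const CallInfo &CI) { return !CI.BlockEndsInUnreachable; }

// Pre-patch ordering: the early return happens before the bonus is applied.
State updateBefore(const CallInfo &CI) {
  assert(CI.BodyCost >= 0 && "body cost is modeled as non-negative");
  State S{DefaultThreshold, CI.BodyCost};
  if (!allowSizeGrowth(CI)) {
    S.Threshold = 0;
    return S; // the bonus below is never reached
  }
  if (CI.OnlyCallToLocalCallee)
    S.Cost -= LastCallBonus;
  return S;
}

// Assumed decision rule for this sketch: inline only if cost is under the
// threshold, and with a threshold of 0 only if the cost is 0 or better.
bool shouldInline(const State &S) {
  return S.Cost < std::max(1, S.Threshold);
}
```

In this model, a call with `BlockEndsInUnreachable` and `OnlyCallToLocalCallee` both set is rejected, because the threshold is 0 and the cost never receives the bonus.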
This commit
- Removes the early return when allowSizeGrowth is false
- Wraps the threshold tweaks in a conditional which is true only when size growth is allowed.
The tweaks are wrapped in a conditional because, when size growth is not allowed, we only want to inline when the cost of inlining is truly 0 or better; any modification to the threshold would break that guarantee. The early return is removed so that the bonus can still be applied, which is what allows the example case to be inlined.
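The restructured flow can be sketched roughly as follows. This is a simplified model under stated assumptions, not the patch itself: the types, the constants, `HypotheticalTweak` (a stand-in for whatever threshold adjustments exist), `updateAfter`, and the `Cost < max(1, Threshold)` decision rule are all invented for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified model of a call site; not LLVM's real types.
struct CallInfo {
  bool BlockEndsInUnreachable; // call sits in an unreachable-terminated block
  bool OnlyCallToLocalCallee;  // callee has internal linkage and one call site
  int BodyCost;                // assumed cost of the callee's body
};

struct State {
  int Threshold;
  int Cost;
};

constexpr int DefaultThreshold = 225;  // illustrative starting threshold
constexpr int HypotheticalTweak = 100; // stand-in for threshold adjustments
constexpr int LastCallBonus = 15000;   // illustrative "only call" bonus

bool allowSizeGrowth(const CallInfo &CI) { return !CI.BlockEndsInUnreachable; }

// Post-patch ordering: no early return, threshold tweaks are guarded, and the
// bonus is applied whether or not size growth is allowed.
State updateAfter(const CallInfo &CI) {
  assert(CI.BodyCost >= 0 && "body cost is modeled as non-negative");
  State S{DefaultThreshold, CI.BodyCost};
  if (allowSizeGrowth(CI)) {
    S.Threshold += HypotheticalTweak; // tweaks only when growth is allowed
  } else {
    S.Threshold = 0; // stays exactly 0: inline only at cost 0 or better
  }
  if (CI.OnlyCallToLocalCallee)
    S.Cost -= LastCallBonus;
  return S;
}

// Assumed decision rule for this sketch.
bool shouldInline(const State &S) {
  return S.Cost < std::max(1, S.Threshold);
}
```

With this ordering, the foo/wibble case is accepted even in an unreachable-terminated block, because the bonus drives the modeled cost below zero, while a callee that does not qualify for the bonus is still rejected against the threshold of 0.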
This produced some minor code size improvements for ARM, AArch64, and x86-64 at Oz.
Output from compare.py here for Oz: https://hastebin.com/fojuquzoru.erl
Edit: More clarity. The word salad from before wasn't that great.
The changes after this point do not seem to be necessary, i.e. could they stay at the same place?