Please, see my comment to PR42084: https://bugs.llvm.org/show_bug.cgi?id=42084#c3.
A core of the problem is in two difference Cost-vs-Threshold checks used in analyzeBlock (Cost >= Threshold) and analyzeCall (Cost < max(1, Threshold)).
I believe a proper fix for this bug would be to use a unified Cost-vs-Threshold check everywhere.
I agree that those two checks seem to be inconsistent with each other, but I insist on my fix. It makes the method analyzeBlock() return same result (negative or positive) regardless of the flag ComputeFullInlineCost.
The root cause is that the analyzeBlock() returns different results (negative or positive) for different ComputeFullInlineCost. And the checks you mentioned if fixed could just hide this difference.
The root cause for what?
Currently as implemented ComputeFullInlineCost implies traversing *all* the instructions in order to:
- find some "interesting" constructs that prohibit inlining, as presence of these constructs is interesting to inline-cost users by itself
- compute full cost (e.g. as though Threshold was infinite)
When ComputeFullInlineCost is false we allow to make shortcuts and skip as many calculations as possible if we can prove that inlining result is negative.
In line with this definition, analyzeBlock's negative return value is only used to terminate the walk through the blocks,
so returning false or true only changes amount of blocks being traversed. Which is a perfectly good and expected effect for ComputeFullInlineCost.
With your suggested change in ComputeFullInlineCost mode we:
- continue traversing a single block in analyzeBlock as before
- start detecting Cost-vs-Threshold violation
- stop traversal of blocks in analyzeCall on Cost-vs-Threshold violation as detected by analyzeBlock
That leads to early termination of the walk through blocks, which I believe is not intended.
An example of inline-cost usage for the purpose of "inline body investigation" on inlining validity is in SampleProfileLoader::inlineCallInstruction.
Comments from it:
// Checks if there is anything in the reachable portion of the callee at // this callsite that makes this inlining potentially illegal.
It is interesting that initial version of D37779 that introduced this usage was initially setting "kinda infinite" Threshold in order to reach the same effect.
It does seem that we need a more disciplined interface for the "investigation" mode, but in absence of that ComputeFullInlineCost mode needs to adhere to the initial idea of traversing *all* the reachable instructions.