This is an archive of the discontinued LLVM Phabricator instance.

[InlineCost] Fix bug 42084: remember negative result when computing full inline cost
Needs RevisionPublic

Authored by yrouban on Jun 24 2019, 2:53 AM.

Details

Summary

This is a minimal fix for the bug https://bugs.llvm.org/show_bug.cgi?id=42084 extracted from the patch D63058.
The rest of D63058 could be treated as a feature.

Diff Detail

Event Timeline

yrouban created this revision.Jun 24 2019, 2:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 24 2019, 2:53 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
yrouban set the repository for this revision to rG LLVM Github Monorepo.Jun 24 2019, 2:54 AM
fedor.sergeev requested changes to this revision.Jun 24 2019, 9:25 AM

Please, see my comment to PR42084: https://bugs.llvm.org/show_bug.cgi?id=42084#c3.
A core of the problem is in two difference Cost-vs-Threshold checks used in analyzeBlock (Cost >= Threshold) and analyzeCall (Cost < max(1, Threshold)).
I believe a proper fix for this bug would be to use a unified Cost-vs-Threshold check everywhere.

This revision now requires changes to proceed.Jun 24 2019, 9:25 AM

Please, see my comment to PR42084: https://bugs.llvm.org/show_bug.cgi?id=42084#c3.
A core of the problem is in two difference Cost-vs-Threshold checks used in analyzeBlock (Cost >= Threshold) and analyzeCall (Cost < max(1, Threshold)).
I believe a proper fix for this bug would be to use a unified Cost-vs-Threshold check everywhere.

I agree that those two checks seem to be inconsistent with each other, but I insist on my fix. It makes the method analyzeBlock() return same result (negative or positive) regardless of the flag ComputeFullInlineCost.
The root cause is that the analyzeBlock() returns different results (negative or positive) for different ComputeFullInlineCost. And the checks you mentioned if fixed could just hide this difference.

Please, see my comment to PR42084: https://bugs.llvm.org/show_bug.cgi?id=42084#c3.
A core of the problem is in two difference Cost-vs-Threshold checks used in analyzeBlock (Cost >= Threshold) and analyzeCall (Cost < max(1, Threshold)).
I believe a proper fix for this bug would be to use a unified Cost-vs-Threshold check everywhere.

I agree that those two checks seem to be inconsistent with each other, but I insist on my fix. It makes the method analyzeBlock() return same result (negative or positive) regardless of the flag ComputeFullInlineCost.
The root cause is that the analyzeBlock() returns different results (negative or positive) for different ComputeFullInlineCost. And the checks you mentioned if fixed could just hide this difference.

The root cause for what?

Currently as implemented ComputeFullInlineCost implies traversing *all* the instructions in order to:

  • find some "interesting" constructs that prohibit inlining, as presence of these constructs is interesting to inline-cost users by itself
  • compute full cost (e.g. as though Threshold was infinite)

When ComputeFullInlineCost is false we allow to make shortcuts and skip as many calculations as possible if we can prove that inlining result is negative.

In line with this definition, analyzeBlock's negative return value is only used to terminate the walk through the blocks,
so returning false or true only changes amount of blocks being traversed. Which is a perfectly good and expected effect for ComputeFullInlineCost.

With your suggested change in ComputeFullInlineCost mode we:

  • continue traversing a single block in analyzeBlock as before
  • start detecting Cost-vs-Threshold violation
  • stop traversal of blocks in analyzeCall on Cost-vs-Threshold violation as detected by analyzeBlock

That leads to early termination of the walk through blocks, which I believe is not intended.

An example of inline-cost usage for the purpose of "inline body investigation" on inlining validity is in SampleProfileLoader::inlineCallInstruction.
Comments from it:

// Checks if there is anything in the reachable portion of the callee at
// this callsite that makes this inlining potentially illegal.

It is interesting that initial version of D37779 that introduced this usage was initially setting "kinda infinite" Threshold in order to reach the same effect.

It does seem that we need a more disciplined interface for the "investigation" mode, but in absence of that ComputeFullInlineCost mode needs to adhere to the initial idea of traversing *all* the reachable instructions.

xbolva00 resigned from this revision.Jul 29 2019, 10:21 AM