This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Add an option to disable (thin)lto internalization.
ClosedPublic

Authored by trentxintong on Oct 15 2018, 10:01 AM.

Details

Summary

LTO and ThinLTO optimizes the IR differently.

One source of differences is the amount of internalizations that
can happen.

Add an option to enable/disable internalization so that other
differences can be studied in isolation. e.g. inlining.

There are other things lto and thinlto do differently, I will add
flags to enable/disable them as needed.

Diff Detail

Event Timeline

trentxintong created this revision.Oct 15 2018, 10:01 AM

Internalization leads to the possibility of changing the function ABI, better global modref AA, better chance of discarding an unused symbol, etc.

ThinLTO usually does not do as good of a job internalizing than LTO.

With this option, we can study the effect of other optimizations, primarily importing+inlining, in (mostly) isolation. i.e. With it disabled, we can study the performance gap between LTO and ThinLTO because of differences in importing and inlining decisions in isolation.

Another benefit we could get is if we disable internalization and LTO, ThinLTO binary performs very similarly (or less differently than before). We can then focus on looking at what have been internalized and what benefits those bring.

The ThinLTO code is shared between the new and old LTO APIs (via thinLTOInternalizeAndPromoteInIndex), but the regular LTO internalization is not. Suggest sharing this option with lib/LTO/LTOCodeGenerator.cpp and guarding the regular LTO internalization there too to cover the old LTO API case.

Address @tejohnson suggestions. Thanks!

tejohnson added inline comments.Oct 15 2018, 12:57 PM
lib/LTO/LTOCodeGenerator.cpp
425 ↗(On Diff #169740)

How about initialize ShouldInternalize in the constructor based on that option value?

Address @tejohnson suggesions. Thanks!

Code looks fine now but just realized there is no test. You can probably create one test with a function that can normally be internalized, and then try it for the 4 combinations of LTO/ThinLTO and new/old LTO with and without your option. See test/ThinLTO/X86/internalize.ll for a ThinLTO-specific internalization test that tests both the old (via llvm-lto) and the new (via llvm-lto2) LTO APIs.

Sure. I will add those tests. Thanks for spending time reviewing this @tejohnson!

Address @tejohson's suggestion to add test cases.

I've added 3/4 suggested scenarios and could not figure out how to add LTO with llvm-lto
More specifically, it seems to me that llvm-lto does not go through the runRegularLTO path
(where GVs are internalized) when processing LTO modules.

Address @tejohson's suggestion to add test cases.

I've added 3/4 suggested scenarios and could not figure out how to add LTO with llvm-lto
More specifically, it seems to me that llvm-lto does not go through the runRegularLTO path
(where GVs are internalized) when processing LTO modules.

That's because llvm-lto invokes the old LTO API (i.e the interfaces in LTO/legacy/LTOCodeGenerator.h), whereas llvm-lto2 invokes the new LTO API (which includes runRegularLTO).

To invoke regular LTO instead of ThinLTO you should not use -module-summary with opt (see my comment in your new test attempting to do regular LTO). You should be able to test regular LTO with the old LTO API. I.e. your change to LTOCodeGenerator.h should cause LTOCodeGenerator::applyScopeRestrictions to exit before invoking internalizeModule.

Note that ThinLTO for both old and new LTO APIs invokes the same thinLTOInternalizeAndPromoteInIndex routine, it's only regular LTO that has very different support for doing the internalization.

test/LTO/X86/internalize.ll
1

Don't use -module-summary for testing regular LTO. With a module summary, the invocations of llvm-lto2 below are both doing ThinLTO.

test/ThinLTO/X86/internalize.ll
7

Nit: line length

17

Nit: line length

Address @tejohnson suggestions.

tejohnson accepted this revision.Nov 5 2018, 6:48 AM

LGTM with some minor comment suggestions

test/LTO/X86/internalize.ll
8

Either change to "enable-lto-internalization=false makes sure..." or maybe just change to "This make sure..." for simplicity, since you already mention the option name on the prior line.

19

Same suggestion: "This makes sure..." (will also fix line length which seems too long).

test/ThinLTO/X86/internalize.ll
7

Same suggestion as above

17

Same suggestion as above

This revision is now accepted and ready to land.Nov 5 2018, 6:48 AM

Update last bit of suggestions. Thanks for review @tejohnson.

This revision was automatically updated to reflect the committed changes.