[ThinLTO] Import globals
ClosedPublic

Authored by evgeny777 on Thu, Feb 8, 8:46 AM.

Details

Summary

This patch imports external global variables references by a function, so that instcombine and globalopt can do some useful work (constant folding and static constructor evaluation) before elim-avail-extern either remove imported declarations or convert them back to externals. To simplify things this version only imports globals which have no outgoing references, so I don't have to care about promotion and importing debug info.

I've tested this on several large in-house projects and got 0.7-0.8% size improvement for each of them. I've also tried building LLVM/clang using this patch and also got some size improvement which is however much less impressive (several KB for some executables, about 56K for all of them in Release configuration). LLVM/clang built with this patch pass all tests successfully.

Diff Detail

Repository
rL LLVM
evgeny777 created this revision.Thu, Feb 8, 8:46 AM

Nice! Glad to hear there are some measurable improvements. A few comments/suggestions below.

lib/Linker/IRMover.cpp
1442 ↗(On Diff #133431)

I'm not as familiar with the struct mapping - what happens without this change? Why did importing variables necessitate it?

lib/Transforms/IPO/FunctionImport.cpp
245 ↗(On Diff #133431)

Why would the modulePath be empty? I think we should only have summaries during the thin link for defs in IR. Oh - answering my own question, I think - this is for regular LTO module summaries which are added to a dummy module. Please add comment.

257 ↗(On Diff #133431)

Add a comment about why we are limiting to only variable refs with external linkage and no other refs.

test/Transforms/FunctionImport/funcimport.ll
109 ↗(On Diff #133431)

This stat is not labeled correctly. It at least needs to change to the number of globals imported. It would be nice to have separate stats for functions and variables imported though.

Please also fix the debug import dumping code in dumpImportListForModule and at the end of ComputeCrossModuleImport to report functions and variables separately.

FYI: Here are some results from running LLVM test suite:

evgeny777 added inline comments.Mon, Feb 12, 1:34 AM
lib/Linker/IRMover.cpp
1442 ↗(On Diff #133431)

Consider this C++ program, which consists of three TUs:

main.cpp:

struct Bar { i64 a,b; } B; // Do not optimize this out
struct Foo { i64 a, b; }; // Foo is isomorphic to Bar

Foo *getFoo(); // We'll import this function from foo.cpp
struct S { Foo *_f } instance = { getFoo(); };

void main() { return instance._f->a; }

foo.cpp

struct Foo { i64 a, b; }; // Same as in main.cpp
struct Baz { i64 a, b; Foo f; }; 

Foo *gFoo;
extern Baz *gBaz;

Foo *getFoo() {
  return gFoo ? gFoo : &gBaz->f;
}

baz.cpp

struct Foo { i64 a, b; };
struct Baz { i64 a, b; Foo f; };  // Same as in foo.cpp

Baz *gBaz;

Now you perform thinlto import and get following definition of Baz in main.o0.3.import.bc when no name mapping used:

%struct.Bar = type { i64, i64 }
%struct.Foo = type { i64, i64 }

; This is incorrect. We should have %struct.Foo in the 3rd field
; This happens because Foo and Bar are recursively isomorphic
; and Bar is seen first by IRLinker.
%struct.Baz = type { i64, i64, %struct.Bar }

The faulted import causes bitcast created in static constructor:

; This static constructor can't be evaluated  due to a bitcast (Bar -> Foo)
define internal void @_GLOBAL__sub_I_main.cpp() #1 section ".text.startup" {
entry:
  %call.i.i = tail call %struct.Foo* bitcast (%struct.Bar* ()* @_Z6getFoov to %struct.Foo* ()*)()
  store %struct.Foo* %call.i.i, %struct.Foo** getelementptr inbounds (%struct.S, %struct.S* @instance, i64 0, i32 0), align 8, !tbaa !2
  ret void
}

Adding name mapping fixes the problem:

%struct.Baz = type { i64, i64, %struct.Foo } ; Good

; No bitcast - evaluator is happy.
define internal void @_GLOBAL__sub_I_main.cpp() section ".text.startup" {
entry:
  %call.i.i = tail call %struct.Foo* @getFoo()
  store %struct.Foo* %call.i.i, %struct.Foo** getelementptr inbounds (%struct.S, %struct.S* @instance, i64 0, i32 0), align 8
  ret void
}

The problem isn't specific at all to importing globals. You can easily rewrite this example to using functions (i.e gBaz -> getBaz) and get the same result. (it probably makes sense to divide this patch into two separate ones).

BTW the negative impact of those bitcasts isn't limited to evaluator. It also affects inliner behavior - see D38896

tejohnson added inline comments.Mon, Feb 12, 7:05 AM
lib/Linker/IRMover.cpp
1442 ↗(On Diff #133431)

Thanks for the detailed explanation. Since this isn't specific to the importing of variables, and can happen already with functions, please do separate out into a different patch.

evgeny777 added inline comments.Mon, Feb 12, 7:17 AM
lib/Transforms/IPO/FunctionImport.cpp
257 ↗(On Diff #133431)

I think we can use !GlobalValue::isInterposableLinkage(RefSummary->linkage()) here just like we use for functions. I'll check

evgeny777 retitled this revision from [ThinLTO] Import external globals to [ThinLTO] Import globals.

Addressed review comments

evgeny777 marked 2 inline comments as done.Wed, Feb 14, 5:00 AM
tejohnson accepted this revision.Wed, Feb 14, 8:38 AM

LGTM with a request for a comment below.

lib/Transforms/IPO/FunctionImport.cpp
259 ↗(On Diff #134079)

Please add comment about why the restriction on having no outgoing refs.

This revision is now accepted and ready to land.Wed, Feb 14, 8:38 AM
This revision was automatically updated to reflect the committed changes.