This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/Linker/
-
Linker/
1
IRMover.cpp
-
test/ThinLTO/X86/
-
ThinLTO/
-
X86/
-
index-const-prop.ll
-
index-const-prop2.ll

Differential D55309

ThinLTO: Do not import debug info for imported global constants
ClosedPublic

Authored by dblaikie on Dec 4 2018, 6:49 PM.

Download Raw Diff

Details

Reviewers

tejohnson
evgeny777

Commits

rG01f1d9b5894d: ThinLTO: Do not import debug info for imported global constants
rL348416: ThinLTO: Do not import debug info for imported global constants

Summary

It looks like this isn't necessary (in any tests I've done, it results
in the global being described with no location or value in the imported
side - while it's still fully described in the place it's imported from)
& results in significant/pathological debug info growth to home these
location-less global variable descriptions on the import side.

This is a rather pressing/important issue to address - this regressed
executable size for one example I'm looking at by 15%, object size is probably
similar though I haven't measured it, and a 22x increase in the number of CUs
in the cu_index in split DWARF DWP files, creating a similarly large regression
in the time it takes llvm-symbolizer to run on such binaries.

(This regression was introduced by r346584 (reverted & recommitted again in r347033) reviewed here: https://reviews.llvm.org/D49362 )

Diff Detail

Repository

rL LLVM

Build Status

Buildable 25705
Build 25704: arc lint + arc unit

Event Timeline

dblaikie created this revision.Dec 4 2018, 6:49 PM

Herald added subscribers: llvm-commits, arphaman, dexonsmith and 5 others. · View Herald TranscriptDec 4 2018, 6:49 PM

Harbormaster completed remote builds in B25705: Diff 176755.Dec 4 2018, 6:49 PM

I'm unaware of any regression for debug info quality that occurs with this change, but I haven't done wide scale testing of it - does anyone have a deeper/practical/feature test that demonstrates the motivation for this part of the change in the first place?

dblaikie edited the summary of this revision. (Show Details)Dec 4 2018, 6:51 PM

Ah, I found a comment from the original review:

"Now importing globals list from DICompileUnit in IRMover. This is needed because we may internalise GV in destination module, so it can reach final link."

If someone could provide me with an example of this happening, I'd love to take a look & see what we need to do. How badly this current patch fails in that situation (if it's not too bad, maybe we go with it for now to fix the regression - while working on a more accurate fix). The more accurate fix, I imagine, would be to only import the DIGlobalVariable for the particular imported variables - same way we would do for functions (but of course for variables it's a bit harder - for functions we fixed this by inverting the relationship between the DICompileUnit and the DISubprogram (so the subprogram points to the CU - subprograms are pulled in whenever their function is - even an optimized-away function has the inlined description that holds it live, global variables don't have those luxuries))

Hello David,

This is an example of importing read-only GV:

// main.c
extern int foo;
int main() { return foo; }

// foo.c
int foo = 42;

The variable foo is imported to main TU and internalized.
I don't know if/how importing variable list from DICompileUnit improves debugging experience. What's your concern? Executable size?

In D55309#1319764, @evgeny777 wrote:
Hello David,

This is an example of importing read-only GV:
// main.c
extern int foo;
int main() { return foo; }
// foo.c
int foo = 42;
The variable foo is imported to main TU and internalized.

Could you tell me the commands I need to run to test this? (I guess in this case I'd have to pass some whitelist of exported functions, so llvm could prove the variable wasn't modified/queried elsewhere)

Will this optimization result in the variable not being emitted into for.c's object file? Or will it still be emitted there, but possibly go unused?

I don't know if/how importing variable list from DICompileUnit improves debugging experience. What's your concern? Executable size?

I'm investigating/trying to address significant regressions in object size, executable size, dwp size, symbolizing performance Caused by this change in thinlto debug info.

Could you tell me the commands I need to run to test this?

You can try:

clang -flto=thin -O3 main.c foo.c -fuse-ld=lld

I'm investigating/trying to address significant regressions in object size, executable size, dwp size, symbolizing performance Caused by this change in thinlto debug info.

After some thinking I lean towards reverting my change in IRMover. Most of imported variables will be const-folded by optimizer, so debug info will be a waste of disk space.

In D55309#1320279, @evgeny777 wrote:
Could you tell me the commands I need to run to test this?

You can try:
clang -flto=thin -O3 main.c foo.c -fuse-ld=lld

I'll give that a go when I get to my desk in an hour or so.

In the mean time perhaps you can help me understand some of the corner cases:

Can this happen multiple times for a single variable? (It be imported and internalized into multiple other modules)
Does this alter the original variable in the original modules at all? Or does it remain untouched/external linkage?
In the above example, how does llvm know there won't be other uses of the variable outside the scope of this compilation command, that might modify the value of the variable? (Where does it decide "main is the only entry point"?)

I'm investigating/trying to address significant regressions in object size, executable size, dwp size, symbolizing performance Caused by this change in thinlto debug info.

After some thinking I lean towards reverting my change in IRMover. Most of imported variables will be const-folded by optimizer, so debug info will be a waste of disk space.

Debug info for global constants is still useful, so if this resulted in no description of the global variable, that could be unfortunate - but might be a useful stop-gap to fix the growth regression and then I can help you fix the debug info loss if there is one. But I wouldn't mind understanding that trade-off before we make it

From what I'm seeing, I would imagine this may regress debug info quality a bit. In the narrow example given, the DWARF still ends up with a description of 'foo', but without any location/value - though this is the same as if 'foo' was in the same translation unit (looks like LLVM fails to track the removing of the global and replacing everything with a constant - I believe the debug info metadata supports describing such a situation, but it isn't being used here - with or without this new change). SO that's not a regression.

But my concern would be if the global variable was in a different static library - then the user could have the variable imported, internalized, the debug info dropped there (with the patch I'm proposing), but then the original object in that static library may never be linked in (since its symbol is no longer needed) - so the resulting program has no DWARF that describes the variable.

I'd like to test this situation, if it's possible - does ThinLTO implement static library-style semantics? Yep, I was able to test that - didn't seem to be a problem. I guess maybe the linker looked at the object dependencies pre-optimization, rather than post-optimization, so the backend foo.o was still linked in, by the looks of it. I guess that makes sense.

So yeah, maybe there is no debug info loss here?

In D55309#1320360, @dblaikie wrote:

In D55309#1320279, @evgeny777 wrote:

I can answer some of these questions.

Could you tell me the commands I need to run to test this?

You can try:
clang -flto=thin -O3 main.c foo.c -fuse-ld=lld

Use -Wl,-save-temps to get bitcode after various ThinLTO phases and the final object files. You probably want to first do clang -c then a separate step to link the .o files to get more meaningful names of these intermediate files.

I'll give that a go when I get to my desk in an hour or so.

In the mean time perhaps you can help me understand some of the corner cases:

Can this happen multiple times for a single variable? (It be imported and internalized into multiple other modules)

Yes

Does this alter the original variable in the original modules at all? Or does it remain untouched/external linkage?

It will also be internalized.

In the above example, how does llvm know there won't be other uses of the variable outside the scope of this compilation command, that might modify the value of the variable? (Where does it decide "main is the only entry point"?)

The thin link is driven off of linker symbol resolution info.

I'm investigating/trying to address significant regressions in object size, executable size, dwp size, symbolizing performance Caused by this change in thinlto debug info.

After some thinking I lean towards reverting my change in IRMover. Most of imported variables will be const-folded by optimizer, so debug info will be a waste of disk space.

Debug info for global constants is still useful, so if this resulted in no description of the global variable, that could be unfortunate - but might be a useful stop-gap to fix the growth regression and then I can help you fix the debug info loss if there is one. But I wouldn't mind understanding that trade-off before we make it

In D55309#1320490, @dblaikie wrote:

From what I'm seeing, I would imagine this may regress debug info quality a bit. In the narrow example given, the DWARF still ends up with a description of 'foo', but without any location/value - though this is the same as if 'foo' was in the same translation unit (looks like LLVM fails to track the removing of the global and replacing everything with a constant - I believe the debug info metadata supports describing such a situation, but it isn't being used here - with or without this new change). SO that's not a regression.

But my concern would be if the global variable was in a different static library - then the user could have the variable imported, internalized, the debug info dropped there (with the patch I'm proposing), but then the original object in that static library may never be linked in (since its symbol is no longer needed) - so the resulting program has no DWARF that describes the variable.

That can't happen. The ThinLink only sees object files out of static libraries that the linker decided to include in the final link (again, this is all driven off of linker info).

I'd like to test this situation, if it's possible - does ThinLTO implement static library-style semantics? Yep, I was able to test that - didn't seem to be a problem. I guess maybe the linker looked at the object dependencies pre-optimization, rather than post-optimization, so the backend foo.o was still linked in, by the looks of it. I guess that makes sense.

So yeah, maybe there is no debug info loss here?

LGTM to address regression, especially since the debug info already isn't maintained for global variables that are const propagated. David, as a follow on it would be good to understand if/when there are cases where the debug info should be imported for these variables (i.e. if they aren't in fact constant propagated after internalization).

lib/Linker/IRMover.cpp
1065	Please add a comment here (the old one that was removed isn't quite accurate now, but could probably be modified).

This revision is now accepted and ready to land.Dec 5 2018, 11:37 AM

Closed by commit rL348416: ThinLTO: Do not import debug info for imported global constants (authored by dblaikie). · Explain WhyDec 5 2018, 1:45 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

Size

lib/

Linker/

IRMover.cpp

1 line

test/

ThinLTO/

X86/

index-const-prop.ll

2 lines

index-const-prop2.ll

2 lines

Diff 176755

lib/Linker/IRMover.cpp

Show First 20 Lines • Show All 1,056 Lines • ▼ Show 20 Lines	for (unsigned I = 0, E = SrcCompileUnits->getNumOperands(); I != E; ++I) {
// Enums, macros, and retained types don't need to be listed on the		// Enums, macros, and retained types don't need to be listed on the
// imported DICompileUnit. This means they will only be imported		// imported DICompileUnit. This means they will only be imported
// if reached from the mapped IR. Do this by setting their value map		// if reached from the mapped IR. Do this by setting their value map
// entries to nullptr, which will automatically prevent their importing		// entries to nullptr, which will automatically prevent their importing
// when reached from the DICompileUnit during metadata mapping.		// when reached from the DICompileUnit during metadata mapping.
ValueMap.MD()[CU->getRawEnumTypes()].reset(nullptr);		ValueMap.MD()[CU->getRawEnumTypes()].reset(nullptr);
ValueMap.MD()[CU->getRawMacros()].reset(nullptr);		ValueMap.MD()[CU->getRawMacros()].reset(nullptr);
ValueMap.MD()[CU->getRawRetainedTypes()].reset(nullptr);		ValueMap.MD()[CU->getRawRetainedTypes()].reset(nullptr);
		ValueMap.MD()[CU->getRawGlobalVariables()].reset(nullptr);
		tejohnsonUnsubmitted Not Done Reply Inline Actions Please add a comment here (the old one that was removed isn't quite accurate now, but could probably be modified). tejohnson: Please add a comment here (the old one that was removed isn't quite accurate now, but could…

// Imported entities only need to be mapped in if they have local		// Imported entities only need to be mapped in if they have local
// scope, as those might correspond to an imported entity inside a		// scope, as those might correspond to an imported entity inside a
// function being imported (any locally scoped imported entities that		// function being imported (any locally scoped imported entities that
// don't end up referenced by an imported function will not be emitted		// don't end up referenced by an imported function will not be emitted
// into the object). Imported entities not in a local scope		// into the object). Imported entities not in a local scope
// (e.g. on the namespace) only need to be emitted by the originating		// (e.g. on the namespace) only need to be emitted by the originating
// module. Create a list of the locally scoped imported entities, and		// module. Create a list of the locally scoped imported entities, and
▲ Show 20 Lines • Show All 411 Lines • Show Last 20 Lines

test/ThinLTO/X86/index-const-prop.ll

	Show All 15 Lines
	; STATS: 2 module-summary-index - Number of live global variables marked read only			; STATS: 2 module-summary-index - Number of live global variables marked read only

	; Check that we don't internalize gBar when it is exported			; Check that we don't internalize gBar when it is exported
	; RUN: llvm-lto -thinlto-action=import -exported-symbol main -exported-symbol gBar %t1.bc -thinlto-index=%t3.index.bc -o %t1.imported2.bc			; RUN: llvm-lto -thinlto-action=import -exported-symbol main -exported-symbol gBar %t1.bc -thinlto-index=%t3.index.bc -o %t1.imported2.bc
	; RUN: llvm-dis %t1.imported2.bc -o - \| FileCheck %s --check-prefix=IMPORT2			; RUN: llvm-dis %t1.imported2.bc -o - \| FileCheck %s --check-prefix=IMPORT2

	; IMPORT: @gFoo.llvm.0 = internal unnamed_addr global i32 1, align 4, !dbg !0			; IMPORT: @gFoo.llvm.0 = internal unnamed_addr global i32 1, align 4, !dbg !0
	; IMPORT-NEXT: @gBar = internal local_unnamed_addr global i32 2, align 4, !dbg !5			; IMPORT-NEXT: @gBar = internal local_unnamed_addr global i32 2, align 4, !dbg !5
	; IMPORT: !DICompileUnit({{.*}}, globals: !{{[0-9]+}})			; IMPORT: !DICompileUnit({{.*}})

	; OPTIMIZE: define i32 @main			; OPTIMIZE: define i32 @main
	; OPTIMIZE-NEXT: ret i32 3			; OPTIMIZE-NEXT: ret i32 3

	; IMPORT2: @gBar = available_externally local_unnamed_addr global i32 2, align 4, !dbg !5			; IMPORT2: @gBar = available_externally local_unnamed_addr global i32 2, align 4, !dbg !5

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-pc-linux-gnu"			target triple = "x86_64-pc-linux-gnu"
	Show All 17 Lines

test/ThinLTO/X86/index-const-prop2.ll

	Show All 28 Lines
	; RUN: -r=%t1.bc,foo, \			; RUN: -r=%t1.bc,foo, \
	; RUN: -r=%t1.bc,bar, \			; RUN: -r=%t1.bc,bar, \
	; RUN: -r=%t1.bc,gBar, \			; RUN: -r=%t1.bc,gBar, \
	; RUN: -o %t3			; RUN: -o %t3
	; RUN: llvm-dis %t3.1.3.import.bc -o - \| FileCheck %s --check-prefix=IMPORT2			; RUN: llvm-dis %t3.1.3.import.bc -o - \| FileCheck %s --check-prefix=IMPORT2

	; IMPORT: @gFoo.llvm.0 = internal unnamed_addr global i32 1, align 4			; IMPORT: @gFoo.llvm.0 = internal unnamed_addr global i32 1, align 4
	; IMPORT-NEXT: @gBar = internal local_unnamed_addr global i32 2, align 4			; IMPORT-NEXT: @gBar = internal local_unnamed_addr global i32 2, align 4
	; IMPORT: !DICompileUnit({{.*}}, globals: !{{[0-9]+}})			; IMPORT: !DICompileUnit({{.*}})

	; CODEGEN: i32 @main()			; CODEGEN: i32 @main()
	; CODEGEN-NEXT: ret i32 3			; CODEGEN-NEXT: ret i32 3

	; IMPORT2: @gBar = available_externally dso_local local_unnamed_addr global i32 2, align 4			; IMPORT2: @gBar = available_externally dso_local local_unnamed_addr global i32 2, align 4

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-pc-linux-gnu"			target triple = "x86_64-pc-linux-gnu"
	Show All 14 Lines