This is an archive of the discontinued LLVM Phabricator instance.

Differential D80765

[ELF] Handle bitcode comdat groups separately to deduplicate thinlto comdat sections
Needs Review · Public

Authored by christylee on May 28 2020, 3:29 PM.

Details

Reviewers

• espindola
MaskRay
ruiu
sbc100
dblaikie
grimar
tejohnson

Summary

This change allows lld to deduplicate bitcode file comdat groups, which will allow it to deduplicate .debug_types sections in thinlto.

For more context, see https://reviews.llvm.org/D62884

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	20 ms	LLVM.tools/llvm-xray/X86::Unknown Unit Message ("")
	50 ms	LLVM.tools/llvm-xray/X86::Unknown Unit Message ("")

Event Timeline

christylee created this revision. May 28 2020, 3:29 PM

Herald added a reviewer: • espindola. May 28 2020, 3:30 PM

Herald added a reviewer: MaskRay.

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, dexonsmith, MaskRay and 3 others.

christylee edited the summary of this revision. May 28 2020, 3:34 PM

christylee added reviewers: ruiu, sbc100, dblaikie.

christylee added subscribers: wenlei, hoy.

(I'm probably not the right person to do detailed/semantic review of this, but here's some basic things I spotted)

lld/ELF/InputFiles.cpp
610	typo? ("proceed" was intended to be "processed", perhaps?)
1569–1572	Unrelated change in this patch - probably best to remove it or commit it separately?

Fixed typo and removed unrelated change.

christylee marked an inline comment as done. May 28 2020, 4:29 PM

Harbormaster failed remote builds in B58337: Diff 267063! May 28 2020, 5:05 PM

Harbormaster failed remote builds in B58346: Diff 267078!

Would you mind making the equivalent changes to lld/wasm?

And a test?

lld/ELF/InputFiles.cpp
393	Perhaps this would be better named something like `isLTOOutput`? Otherwise it sounds like it might be a bitcode object, which is a different class here.

christylee added a comment. May 29 2020, 12:23 PM

This comment was removed by christylee.

@sbc100 What would be a good way to go about writing a test for this? The reason why I noticed this bug was because .debug_types are not deduplicated in thinlto, but writing a lld test case using debug metadata seems clunky.

I've also tried something like this but it doesn't repro:

; REQUIRES: x86
; RUN: llvm-as %s -o %t.o
; RUN: llc %t.o -o %t2.o -filetype=obj
; RUN: ld.lld %t.o %t2.o %t.o %t2.o -o %t3.o --shared
; RUN: llvm-readobj --symbols %t3.o | FileCheck %s

; CHECK:      Name: foo
; CHECK-NEXT: Value:
; CHECK-NEXT: Size: 1
; CHECK-NEXT: Binding: Global
; CHECK-NEXT: Type: Function
; CHECK-NEXT: Other: 0
; CHECK-NEXT: Section: .text

target triple = "x86_64-unknown-linux-gnu"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

$foo = comdat any
define void @foo() comdat {
  ret void
}

I'm sorry I've not looked into this in much detail, but shouldn't this same bug reproduce with all comdats? Maybe there is something different about debug metadata? BTW, the test above doesn't use lto or thinlto, does it?

Addressed comments:

1). Changed isBitcodeFile to isLTOOutput.
2). Added test. Before this patch, lld does not deduplicate comdat sections generated by the backend. .debug_types is an example.

Change for COFF to follow.

Herald added subscribers: steven_wu, hiraditya. May 29 2020, 5:01 PM

Harbormaster completed remote builds in B58520: Diff 267420. May 29 2020, 6:34 PM

Changed variable name bitcodeComdatGroup to ltoOutputComdatGroup for consistency.

Fixed a comment.

We don't need to make the same change to COFF. COFF does not ignore comdats, it attempts to add a comdat symbol as long as the symbol is external. It also tracks whether a symbol is from a regular object file or bitcode file so it won't have duplicates.

christylee retitled this revision from [lld] Handle bitcode comdat groups separately to deduplicate thinlto comdat sections to [ELF] Handle bitcode comdat groups separately to deduplicate thinlto comdat sections. Jun 3 2020, 12:14 PM

Harbormaster completed remote builds in B58966: Diff 268266. Jun 3 2020, 1:46 PM

Harbormaster completed remote builds in B58974: Diff 268275.

hoyFB added a reviewer: grimar. Jun 8 2020, 1:55 PM

grimar added inline comments. Jun 9 2020, 1:42 AM

lld/ELF/SymbolTable.h
66	Double space before "Define".
67	fromt -> from
68	Could you expand the comment to mention why it is important to distinguish them?

Fixed comment.

Harbormaster completed remote builds in B59712: Diff 269693. Jun 9 2020, 5:09 PM

The LLD change looks fine to me. But I am not very familiar with LTO, and hence do not feel I am the right person to approve it.

wenlei added a reviewer: tejohnson. Jun 10 2020, 10:03 AM

@grimar Do you know who might be a good reviewer?

In D80765#2090905, @christylee wrote:

@grimar Do you know who might be a good reviewer?

@tejohnson is a good one.

I'll also get to this patch soon. I tried an earlier version of this patch and noticed issues testing internally. I'll retest since the patch has changed a lot.

In D80765#2090939, @MaskRay wrote:

@tejohnson is a good one.

Since this is very specific to lld, and not in the LLVM LTO handling, I'm not sure I'm the best person to review. I looked at the change but don't have enough understanding of how lld is otherwise handling comdat groups to do a really informed review.

I'll also get to this patch soon. I tried an earlier version of this patch and noticed issues testing internally. I'll retest since the patch has changed a lot.

Sounds good.

lld/test/ELF/lto/debug-types-deduplication.ll
4 ↗	(On Diff #269693)	IMO it is better to test ThinLTO by using "opt -module-summary" to create the bitcode object, and not have the serialized summary. It makes it clearer that it is a ThinLTO test.

dexonsmith removed a subscriber: dexonsmith. Jun 16 2020, 11:14 AM

Apologies for my slowness getting to this patch.

symtab->ltoOutputComdatGroups does work:

% readelf -g a3.o.lto.o a3.o1.lto.o 

File: a3.o.lto.o

COMDAT group section [    4] `.group' [4068369915778327548] contains 2 sections:
   [Index]    Name
   [    5]   .debug_types
   [    6]   .rela.debug_types

File: a3.o1.lto.o

COMDAT group section [    4] `.group' [4068369915778327548] contains 2 sections:
   [Index]    Name
   [    5]   .debug_types
   [    6]   .rela.debug_types

The .debug_types from a3.o1.lto.o can be discarded by the new COMDAT logic. However, I am concerned that making it as general as this patch does can miss some codegen bugs. See @pcc's argument in https://reviews.llvm.org/D56015#1339411. Since .debug_types (notably, a non-SHF_ALLOC section) is the only COMDAT rule this patch will discard, how about special casing .debug_types (i.e. if isLTOOutput && the group is related to .debug_types)?

MaskRay added inline comments. Jun 17 2020, 10:51 PM

lld/test/ELF/lto/debug-types-deduplication.ll
37 ↗	(On Diff #269693)	Can you drop some unneeded metadata nodes here?

In D80765#2099946, @MaskRay wrote:
The .debug_types from a3.o1.lto.o can be discarded by the new COMDAT logic. However, I am concerned that making it as general as this patch does can miss some codegen bugs. See @pcc's argument in https://reviews.llvm.org/D56015#1339411. Since .debug_types (notably, a non-SHF_ALLOC section) is the only COMDAT rule this patch will discard, how about special casing .debug_types (i.e. if isLTOOutput && the group is related to .debug_types)?

Thanks for taking the time to review this change. Whether LTO should introduce new COMDAT symbols is an interesting question. Do we have a conclusion on this? So far .debug_types is the only COMDAT we've seen introduced by LTO, but in theory any LTO-exclusive pass may introduce a new COMDAT symbol, and it's up to the pass how to set the symbol's linkage type.

christylee marked an inline comment as done. Jun 24 2020, 9:34 AM

christylee added inline comments.

lld/ELF/InputFiles.cpp
612	@MaskRay wrote: "Since .debug_types (notably, a non-SHF_ALLOC section) is the only COMDAT rule this patch will discard, how about special casing .debug_types (i.e. if isLTOOutput && the group is related to .debug_types)?" At the time we run this check, not all InputSections have been initialized, so the group might be related to an uninitialized section. Since it might be uninitialized, what's the best way to check whether that section is non-SHF_ALLOC?

MaskRay added inline comments. Jun 24 2020, 10:07 AM

lld/ELF/InputFiles.cpp

612

This is difficult. An alternative is to use ltoOutputComdatGroups but assert that no collision happens:

if (isLTOOutput) {
  CachedHashStringRef ref(signature);
  // Deduplicate this group against other LTO output files only.
  isNew = symtab->ltoOutputComdatGroups.try_emplace(ref, this).second;
  // Report a collision with a signature already recorded in comdatGroups.
  if (symtab->comdatGroups.count(ref))
    errorOrWarn(toString(this) + ": LTO output should not emit a section group whose signature collides with a non-LTO group signature");
}

christylee marked an inline comment as done. Jun 24 2020, 11:10 AM

christylee added inline comments.

lld/ELF/InputFiles.cpp
612	When we link something with thinlto, we first process the comdat groups when they are in bitcode form. We add that to `symtab->comdatGroups` since we want `symtab->comdatGroups` to contain all possible comdatGroups. After compiling the bitcode to object files, we process comdat groups again. At this point, if they are lto output, then we need to check whether we've seen it from other lto outputs. In other words, if something is in `symtab->ltoOutputComdatGroups`, then it must also be in `symtab->comdatGroups`, but we need the two distinct maps to ensure there are no duplicate symbols.
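
The two-map scheme described in this comment can be modelled in isolation. What follows is a minimal sketch, not lld's actual code: `SymtabSketch` and `shouldKeepGroup` are hypothetical names, `std::unordered_map` keyed by `std::string` stands in for lld's map keyed by `CachedHashStringRef`, and file names stand in for `InputFile` pointers. It assumes C++17 for `try_emplace`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for lld's symbol table; only the two maps matter here.
struct SymtabSketch {
  // Signature -> first bitcode or native file that defined the group.
  std::unordered_map<std::string, std::string> comdatGroups;
  // Signature -> first LTO output object that defined the group.
  std::unordered_map<std::string, std::string> ltoOutputComdatGroups;
};

// Returns true if `file` is the first of its kind to define `signature`,
// i.e. its copy of the group should be kept rather than discarded.
bool shouldKeepGroup(SymtabSketch &symtab, const std::string &signature,
                     const std::string &file, bool isLTOOutput) {
  if (isLTOOutput)
    return symtab.ltoOutputComdatGroups.try_emplace(signature, file).second;
  return symtab.comdatGroups.try_emplace(signature, file).second;
}
```

In this model a signature first seen in a bitcode file is recorded in `comdatGroups`. The first LTO output object carrying the same signature is still kept, because it is checked against `ltoOutputComdatGroups` only, and any later LTO output with that signature is discarded.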

christylee marked 2 inline comments as done and an inline comment as not done. Jul 23 2020, 9:36 AM

christylee added inline comments.

lld/ELF/InputFiles.cpp
612	@MaskRay , can we ship this as is?
lld/test/ELF/lto/debug-types-deduplication.ll
37 ↗	(On Diff #269693)	I've tried dropping the metadata but this is as small as I can get it.

In case we are concerned about correctness, we've been using this patch internally for a couple of months with no problems.

@MaskRay , can we ship this as is?

This still looks like a workaround done at an inappropriate layer to me. The .debug_types code generator should probably know that some .debug_types sections should not be emitted.
BTW, do you have statistics about the effectiveness for this change?

BTW, do you have statistics about the effectiveness for this change?

@MaskRay , one of our thinlto binaries went from a 6.9 GiB .debug_types section to 420 MiB, while another went from 7.8 GiB to 370 MiB.

Yeah - one alternative here would be for ThinLTO to "home" debug type information, the same way I believe it "homes" inline functions. (original/incoming IR to the thinlink may have multiple definitions of inline functions - one in every module that has a call to the inline function - but something in the thin link summary basically says "drop this definition" in all but one of those modules and in that one blessed module it says "always produce a weak definition of this function, even if you don't need it" - something like that should be done with type information, which would remove the redundancy in the object files)

In the case of a full executable link (with the homing above implemented), I'd then suggest not enabling type units, since they just add overhead and wouldn't reduce any duplication - in the case of a static library built with ThinLTO (if that's even a thing), then you'd still potentially want to use type units so they could be deduplicated against other libraries.
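
The "homing" idea can be illustrated with a toy model. This is only a sketch of the general approach, not anything ThinLTO implements today: `ModuleTypes`, `pickTypeHomes`, and `shouldEmitType` are hypothetical names, and choosing the first module in link order as the home is an arbitrary assumption made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical input: each module, in link order, with the type signatures
// it would otherwise emit.
using ModuleTypes =
    std::vector<std::pair<std::string, std::vector<std::string>>>;

// Gives each type signature a single home: the first module that mentions it.
std::map<std::string, std::string> pickTypeHomes(const ModuleTypes &modules) {
  std::map<std::string, std::string> home;
  for (const auto &[module, types] : modules)
    for (const std::string &type : types)
      home.emplace(type, module); // No-op if the type already has a home.
  return home;
}

// A backend compile for `module` emits `type` only if it is the type's home.
bool shouldEmitType(const std::map<std::string, std::string> &home,
                    const std::string &module, const std::string &type) {
  auto it = home.find(type);
  return it != home.end() && it->second == module;
}
```

Under this model every type description appears in exactly one object file, so there would be nothing left for the linker to deduplicate within the ThinLTO link itself.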

Enabling compile-time type deduplication is indeed an ideal solution: as you pointed out, it would find a prevailing definition and drop the others, as is done for duplicate functions from different modules. This would require quite some work in parsing type metadata on the IR and might slow down thinLink. Also this may not completely address deduplication of new data introduced during LTO postLink code generation time. The type unit level deduplication could be the last resort to handle the duplicates, as is done in a non-LTO build.

In D80765#2174141, @hoyFB wrote:

Enabling compile-time type deduplication is indeed an ideal solution, as you pointed out finding a prevailing symbol definition and dropping others for duplicate functions from different modules. This would require quite some work in parsing type metadata on the IR and might slow down thinLink.

Fair point - though presumably the first stage compile would do the metadata walk through the IR, provide a list of types in the thin summary file and the thin link would then communicate to the backend compiles specifying which types each backend compile should emit into its respective object file.

Also this may not completely address deduplication of new data introduced during LTO postLink code generation time. The type unit level deduplication could be the last resort to handle the duplicates, like what is done in non-LTO build.

Not sure I follow - what duplication would remain if the thin link homed type descriptions? When linking to non-LTO (normal machine code/object files) libraries? Yeah, fair point (tangential warning: LLVM's type unit hashes are non-DWARF-conforming and not compatible with GCC's type unit hashes... :/)

It /sounds/ like maybe the original fix in D62884 was going in the wrong direction, and instead ThinLTO should be removing the comdat group from "homed" inline functions? Then there wouldn't be a need for a special case in how comdats are handled.

In D80765#2174250, @dblaikie wrote:

It /sounds/ like maybe the original fix in D62884 was going in the wrong direction, and instead ThinLTO should be removing the comdat group from "homed" inline functions? Then there wouldn't be a need for a special case in how comdats are handled.

I am with @dblaikie here. A bit of archaeology finds that rG4de44b7ef87bcef83798eee69fdcbfab9866d52e was probably going in the wrong direction but it was indeed the simplest/least intrusive approach.
D62884 just made some refactorings. Just a thought: for ThinLTO backend, dropping comdat in processGlobalsForThinLTO may achieve some results. However, one issue is that on the MC layer the section group signature is part of ELFSectionKey.

.section foo,"aG",@progbits,aaa,comdat
.section foo,"aG",@progbits,bbb,comdat

are two sections. Dropping the comdat key will make them one combined section. Unique linkage may work around it but there may be a fair amount of complexity in TargetLoweringObjectFileImpl/MC.
I agree with pcc that LTO should not generate additional arbitrary comdat objects.
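
A small model may make the section-key concern concrete. It assumes a section is identified by a (name, group signature) pair, which simplifies the real ELFSectionKey (it has more fields). The type and function names below are illustrative and are not MC APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One `.section name,...,group,comdat` directive: (section name, group signature).
using SectionDirective = std::pair<std::string, std::string>;

// Counts distinct sections when the key is (name, group signature).
std::size_t countWithGroupInKey(const std::vector<SectionDirective> &dirs) {
  std::set<SectionDirective> keys(dirs.begin(), dirs.end());
  return keys.size();
}

// Counts distinct sections when the group signature is dropped from the key.
std::size_t countWithNameOnly(const std::vector<SectionDirective> &dirs) {
  std::set<std::string> keys;
  for (const SectionDirective &d : dirs)
    keys.insert(d.first);
  return keys.size();
}
```

For the two `foo` directives above, with signatures `aaa` and `bbb`, the first function counts two sections and the second counts one, which is the merging effect being described.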

In D80765#2174269, @MaskRay wrote:

I am with @dblaikie here. A bit of archaeology finds that rG4de44b7ef87bcef83798eee69fdcbfab9866d52e was probably going in the wrong direction but it was indeed the simplest/least intrusive approach.
D62884 just made some refactorings. Just a thought: for ThinLTO backend, dropping comdat in processGlobalsForThinLTO may achieve some results.

Yeah, maybe something near/similar to https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Utils/FunctionImportUtils.cpp#L275 - where the comdat is imported as available externally, and its comdat group is stripped. Theory is, maybe we should be stripping the comdat on the other side too.

Hmm, maybe not, though? What about non-whole-program-ThinLTO? There could be a comdat copy of this function in a static library that it should be deduplicated against?

I take it the underlying assumption lld is making is that a comdat group is only kept alive by references from within the object file that defines it? Is that formally specified/required? If not, then maybe we could/should remove that assumption, so that ThinLTO can work naturally with non-(Thin)LTO archives and still do the suitable comdat deduplication or dropping?

However, one issue is that on the MC layer the section group signature is part of ELFSectionKey.
.section foo,"aG",@progbits,aaa,comdat
.section foo,"aG",@progbits,bbb,comdat
are two sections. Dropping the comdat key will make them one combined section.

Is that a problem? When linked they end up in the same section, right?

Unique linkage may work around it but there may be a fair amount of complexity in TargetLoweringObjectFileImpl/MC.
I agree with pcc that LTO should not generate additional arbitrary comdat objects.

What additional arbitrary comdat objects are being generated or proposed to be generated?

In D80765#2177014, @dblaikie wrote:

Hmm, maybe not, though? What about non-whole-program-ThinLTO? There could be a comdat copy of this function in a static library that it should be deduplicated against?

This is probably fine if the static library is a native input to lld , since lld itself can deduplicate comdat groups of its native input.

What additional arbitrary comdat objects are being generated or proposed to be generated?

So far the DWARF types are the only new artifacts generated by the LTO backend. What I was wondering about was new LTO-exclusive codegen passes that may introduce new comdat groups, but it sounds like it is a design principle that new symbols/data should only be generated in the preLink LTO phase.

In D80765#2177208, @hoyFB wrote:

In D80765#2177014, @dblaikie wrote:

In D80765#2174269, @MaskRay wrote:

In D80765#2174250, @dblaikie wrote:

In D80765#2174141, @hoyFB wrote:

In D80765#2170820, @dblaikie wrote:

Yeah - one alternative here would be for ThinLTO To "home" debug type information, the same way I believe it "homes" inline functions. (original/incoming IR to the thinlink may have multiple definitions of inline functions - one in every module that has a call to the inline function - but something in the think link summary basically says "drop this definition" in all but one of those modules and in that one blessed module it says "always produce a weak definition of this function, even if you don't need it" - something like that should be done with type information, which would remove the redundancy in the object files)

In the case of a full executable link (with the homing above implemented), I'd then suggest not enabling type units, since they just add overhead and wouldn't reduce any duplication - in the case of a static library built with ThinLTO (if that's even a thing), then you'd still potentially want to use type units so they could be deduplicated against other libraries.

Enabling compile-time type deduplication is indeed an ideal solution, as you pointed out finding a prevailing symbol definition and dropping others for duplicate functions from different modules. This would require quite some work in parsing type metadata on the IR and might slow down thinLink.

Fair point - though presumably the first stage compile would do the metadata walk through the IR, provide a list of types in the thin summary file and the thin link would then communicate to the backend compiles specifying which types each backend compile should emit into its respective object file.

Also this may not completely address deduplication of new data introduced during LTO postLink code generation time. The type unit level deduplication could be the last resort to handle the duplicates, like what is done in non-LTO build.

Not sure I follow - what duplication would remain if the thin link homed type descriptions? When linking to non-LTO (normal machine code/object files) libraries? Yeah, fair point (tangential warning: LLVM's type unit hashes are non-DWARF-conforming and not compatible with GCC's type unit hashes... :/)

It /sounds/ like maybe the original fix in D62884 was going in the wrong direction, and instead ThinLTO should be removing the comdat group from "homed" inline functions? Then there wouldn't be a need for a special case in how comdats are handled.

I am with @dblaikie here. A bit of archaeology finds that rG4de44b7ef87bcef83798eee69fdcbfab9866d52e was probably going in the wrong direction but it was indeed the simplest/least intrusive approach.
D62884 just made some refactorings. Just a thought: for ThinLTO backend, dropping comdat in processGlobalsForThinLTO may achieve some results.

Yeah, maybe something near/similar to https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Utils/FunctionImportUtils.cpp#L275 - where the comdat is imported as available externally, and its comdat group is stripped. Theory is, maybe we should be stripping the comdat on the other side too.

Hmm, maybe not, though? What about non-whole-program-ThinLTO? There could be a comdat copy of this function in a static library that it should be deduplicated against?

This is probably fine if the static library is a native input to lld, since lld itself can deduplicate comdat groups of its native input.

Oh, yeah, fair point.

I take it the underlying assumption lld is making is that a comdat group is only kept alive by references from within the object file that defines it? Is that formally specified/required? If not, then maybe we could/should remove that assumption, so that ThinLTO can work naturally with non-(Thin)LTO archives and still do the suitable comdat deduplication or dropping?
However, one issue is that on the MC layer the section group signature is part of ELFSectionKey.
.section foo,"aG",@progbits,aaa,comdat
.section foo,"aG",@progbits,bbb,comdat
are two sections. Dropping the comdat key will make them one combined section.
Is that a problem? When linked they end up in the same section, right?
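To illustrate the point about the section key, here is a small sketch that models a section key as a (name, group signature) pair. `SectionKey` and `countSections` are hypothetical stand-ins, not the real MC-layer `ELFSectionKey`; the sketch only shows that clearing the group signature collapses the two `foo` sections above into one.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a section key: section name plus group signature.
struct SectionKey {
  std::string name;
  std::string group; // empty when the section is not in a comdat group
  bool operator<(const SectionKey &o) const {
    return std::tie(name, group) < std::tie(o.name, o.group);
  }
};

// Count the distinct sections produced by a list of .section directives.
// When dropGroups is true, the group signature is cleared before lookup,
// which models dropping the comdat key.
std::size_t countSections(const std::vector<SectionKey> &directives,
                          bool dropGroups) {
  std::set<SectionKey> unique;
  for (SectionKey k : directives) {
    if (dropGroups)
      k.group.clear();
    unique.insert(k);
  }
  return unique.size();
}
```

For the two directives quoted above, this model yields two sections with the group signatures kept and one section once they are dropped.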

Unique linkage may work around it but there may be a fair amount of complexity in TargetLoweringObjectFileImpl/MC.
I agree with pcc that LTO should not generate additional arbitrary comdat objects.

What additional arbitrary comdat objects are being generated or proposed to be generated?
So far the Dwarf types are the only new artifacts generated by the LTO backend.

Ah, OK, "new" in the sense that they weren't present as comdat groups in the ThinLTO input IR.

What do we gain by having this restriction? I guess it removes a certain amount of work the linker would otherwise do when performing the final link step in ThinLTO?

Anyone have a sense of the significance of that restriction compared to linking "as normal"?

ayermolo added a subscriber: ayermolo.Sep 18 2020, 10:38 AM

nickdesaulniers added a subscriber: nickdesaulniers.Mar 24 2021, 1:14 PM

ayermolo mentioned this in D148754: [LLD][RFC] Deduplicate type units with local ThinLTO.Apr 19 2023, 4:14 PM

Revision Contents

Path

Size

lld/

ELF/

2 lines

4 lines

22 lines

8 lines

6 lines

Diff 267063

lld/ELF/Driver.cpp

(1,674 lines not shown)	template <class ELFT> void LinkerDriver::compileBitcodeFiles() {
llvm::TimeTraceScope timeScope("LTO");		llvm::TimeTraceScope timeScope("LTO");
// Compile bitcode files and replace bitcode symbols.		// Compile bitcode files and replace bitcode symbols.
lto.reset(new BitcodeCompiler);		lto.reset(new BitcodeCompiler);
for (BitcodeFile *file : bitcodeFiles)		for (BitcodeFile *file : bitcodeFiles)
lto->add(*file);		lto->add(*file);

for (InputFile *file : lto->compile()) {		for (InputFile *file : lto->compile()) {
auto *obj = cast<ObjFile<ELFT>>(file);		auto *obj = cast<ObjFile<ELFT>>(file);
obj->parse(/*ignoreComdats=*/true);		obj->parse(/*isBitcodeFile=*/true);
for (Symbol *sym : obj->getGlobalSymbols())		for (Symbol *sym : obj->getGlobalSymbols())
sym->parseSymbolVersion();		sym->parseSymbolVersion();
objectFiles.push_back(file);		objectFiles.push_back(file);
}		}
}		}

// The --wrap option is a feature to rename symbols so that you can write		// The --wrap option is a feature to rename symbols so that you can write
// wrappers for existing functions. If you pass `-wrap=foo`, all		// wrappers for existing functions. If you pass `-wrap=foo`, all
(406 lines not shown)

lld/ELF/InputFiles.h

(195 lines not shown)	public:

ArrayRef<Symbol *> getLocalSymbols();		ArrayRef<Symbol *> getLocalSymbols();
ArrayRef<Symbol *> getGlobalSymbols();		ArrayRef<Symbol *> getGlobalSymbols();

ObjFile(MemoryBufferRef m, StringRef archiveName) : ELFFileBase(ObjKind, m) {		ObjFile(MemoryBufferRef m, StringRef archiveName) : ELFFileBase(ObjKind, m) {
this->archiveName = std::string(archiveName);		this->archiveName = std::string(archiveName);
}		}

void parse(bool ignoreComdats = false);		void parse(bool isBitcodeFile = false);

StringRef getShtGroupSignature(ArrayRef<Elf_Shdr> sections,		StringRef getShtGroupSignature(ArrayRef<Elf_Shdr> sections,
const Elf_Shdr &sec);		const Elf_Shdr &sec);

Symbol &getSymbol(uint32_t symbolIndex) const {		Symbol &getSymbol(uint32_t symbolIndex) const {
if (symbolIndex >= this->symbols.size())		if (symbolIndex >= this->symbols.size())
fatal(toString(this) + ": invalid symbol index");		fatal(toString(this) + ": invalid symbol index");
return *this->symbols[symbolIndex];		return *this->symbols[symbolIndex];
(34 lines not shown)	public:

// SHT_LLVM_CALL_GRAPH_PROFILE table		// SHT_LLVM_CALL_GRAPH_PROFILE table
ArrayRef<Elf_CGProfile> cgProfile;		ArrayRef<Elf_CGProfile> cgProfile;

// Get cached DWARF information.		// Get cached DWARF information.
DWARFCache *getDwarf();		DWARFCache *getDwarf();

private:		private:
void initializeSections(bool ignoreComdats);		void initializeSections(bool isBitcodeFile);
void initializeSymbols();		void initializeSymbols();
void initializeJustSymbols();		void initializeJustSymbols();

InputSectionBase *getRelocTarget(const Elf_Shdr &sec);		InputSectionBase *getRelocTarget(const Elf_Shdr &sec);
InputSectionBase *createInputSection(const Elf_Shdr &sec);		InputSectionBase *createInputSection(const Elf_Shdr &sec);
StringRef getSectionName(const Elf_Shdr &sec);		StringRef getSectionName(const Elf_Shdr &sec);

bool shouldMerge(const Elf_Shdr &sec, StringRef name);		bool shouldMerge(const Elf_Shdr &sec, StringRef name);
(143 lines not shown)

lld/ELF/InputFiles.cpp

(384 lines not shown)	if (this->symbols.empty())
return {};		return {};
return makeArrayRef(this->symbols).slice(1, this->firstGlobal - 1);		return makeArrayRef(this->symbols).slice(1, this->firstGlobal - 1);
}		}

template <class ELFT> ArrayRef<Symbol *> ObjFile<ELFT>::getGlobalSymbols() {		template <class ELFT> ArrayRef<Symbol *> ObjFile<ELFT>::getGlobalSymbols() {
return makeArrayRef(this->symbols).slice(this->firstGlobal);		return makeArrayRef(this->symbols).slice(this->firstGlobal);
}		}

template <class ELFT> void ObjFile<ELFT>::parse(bool ignoreComdats) {		template <class ELFT> void ObjFile<ELFT>::parse(bool isBitcodeFile) {
		sbc100 (inline comment, not done): Perhaps this would be better named something like `isLTOOutput`? Otherwise it sounds like it might be a bitcode object, which is a different class here.
// Read a section table. justSymbols is usually false.		// Read a section table. justSymbols is usually false.
if (this->justSymbols)		if (this->justSymbols)
initializeJustSymbols();		initializeJustSymbols();
else		else
initializeSections(ignoreComdats);		initializeSections(isBitcodeFile);

// Read a symbol table.		// Read a symbol table.
initializeSymbols();		initializeSymbols();
}		}

// Sections with SHT_GROUP and comdat bits define comdat section groups.		// Sections with SHT_GROUP and comdat bits define comdat section groups.
// They are identified and deduplicated by group name. This function		// They are identified and deduplicated by group name. This function
// returns a group name.		// returns a group name.
(134 lines not shown)	else
head = s;		head = s;
prev = s;		prev = s;
}		}
if (prev)		if (prev)
prev->nextInSectionGroup = head;		prev->nextInSectionGroup = head;
}		}

template <class ELFT>		template <class ELFT>
void ObjFile<ELFT>::initializeSections(bool ignoreComdats) {		void ObjFile<ELFT>::initializeSections(bool isBitcodeFile) {
const ELFFile<ELFT> &obj = this->getObj();		const ELFFile<ELFT> &obj = this->getObj();

ArrayRef<Elf_Shdr> objSections = CHECK(obj.sections(), this);		ArrayRef<Elf_Shdr> objSections = CHECK(obj.sections(), this);
uint64_t size = objSections.size();		uint64_t size = objSections.size();
this->sections.resize(size);		this->sections.resize(size);
this->sectionStringTable =		this->sectionStringTable =
CHECK(obj.getSectionStringTable(objSections), this);		CHECK(obj.getSectionStringTable(objSections), this);

(44 lines not shown)	case SHT_GROUP: {
// An group with the empty flag doesn't define anything; such sections		// An group with the empty flag doesn't define anything; such sections
// are just skipped.		// are just skipped.
if (entries[0] == 0)		if (entries[0] == 0)
continue;		continue;

if (entries[0] != GRP_COMDAT)		if (entries[0] != GRP_COMDAT)
fatal(toString(this) + ": unsupported SHT_GROUP format");		fatal(toString(this) + ": unsupported SHT_GROUP format");

bool isNew =		// If this is a bitcode file, ignore already proceed comdat groups.
		dblaikie (inline comment, done): typo? ("proceed" was intended to be "processed", perhaps?)
ignoreComdats ||		// Else, check all comdat groups.
symtab->comdatGroups.try_emplace(CachedHashStringRef(signature), this)		bool isNew = isBitcodeFile
		christylee (author, inline comment, not done): @MaskRay
		> Since .debug_types (notably, a non-SHF_ALLOC section) is the only COMDAT rule this patch will discard, how about special casing .debug_types (i.e. if isLTOOutput && the group is related to .debug_types)?
		At the time we run this check, not all InputSections have been initialized, so the group might be related to an uninitialized section. Since it might be uninitialized, what's the best way to check whether that section is non-SHF_ALLOC?
		MaskRay (inline comment, not done): This is difficult. An alternative is to use ltoOutputComdatGroups but assert that no collision happens:
		    if (isLTOOutput) {
		      CachedHashStringRef ref(signature);
		      isNew = symtab->ltoOutputComdatGroups.try_emplace(ref, this).second;
		      if (symtab->comdatGroups.count(ref))
		        errorOrWarn(toString(this) + ": LTO output should not emit a section group whose signature collides with a non-LTO group signature");
		    }
		christylee (author, inline comment, done): When we link something with thinlto, we first process the comdat groups when they are in bitcode form. We add those to `symtab->comdatGroups` since we want `symtab->comdatGroups` to contain all possible comdat groups. After compiling the bitcode to object files, we process comdat groups again. At this point, if they are LTO output, we need to check whether we've already seen them in other LTO outputs. In other words, if something is in `symtab->ltoOutputComdatGroups`, then it must also be in `symtab->comdatGroups`, but we need the two distinct maps to ensure there are no duplicate symbols.
		christylee (author, inline comment, done): @MaskRay, can we ship this as is?
		? symtab->bitcodeComdatGroups
		.try_emplace(CachedHashStringRef(signature), this)
		.second
		: symtab->comdatGroups
		.try_emplace(CachedHashStringRef(signature), this)
.second;		.second;
if (isNew) {		if (isNew) {
if (config->relocatable)		if (config->relocatable)
this->sections[i] = createInputSection(sec);		this->sections[i] = createInputSection(sec);
selectedGroups.push_back(entries);		selectedGroups.push_back(entries);
continue;		continue;
}		}

// Otherwise, discard group members.		// Otherwise, discard group members.
(934 lines not shown)	static Symbol *createBitcodeSymbol(const std::vector<bool> &keptComdats,
Defined newSym(&f, name, binding, visibility, type, 0, 0, nullptr);		Defined newSym(&f, name, binding, visibility, type, 0, 0, nullptr);
if (canOmitFromDynSym)		if (canOmitFromDynSym)
newSym.exportDynamic = false;		newSym.exportDynamic = false;
return symtab->addSymbol(newSym);		return symtab->addSymbol(newSym);
}		}

template <class ELFT> void BitcodeFile::parse() {		template <class ELFT> void BitcodeFile::parse() {
std::vector<bool> keptComdats;		std::vector<bool> keptComdats;
for (StringRef s : obj->getComdatTable())		for (StringRef s : obj->getComdatTable()) {
keptComdats.push_back(		keptComdats.push_back(
symtab->comdatGroups.try_emplace(CachedHashStringRef(s), this).second);		symtab->comdatGroups.try_emplace(CachedHashStringRef(s), this).second);
		}
		dblaikie (inline comment, done): Unrelated change in this patch - probably best to remove it or commit it separately?

for (const lto::InputFile::Symbol &objSym : obj->symbols())		for (const lto::InputFile::Symbol &objSym : obj->symbols())
symbols.push_back(createBitcodeSymbol<ELFT>(keptComdats, objSym, *this));		symbols.push_back(createBitcodeSymbol<ELFT>(keptComdats, objSym, *this));

for (auto l : obj->getDependentLibraries())		for (auto l : obj->getDependentLibraries())
addDependentLibrary(l, this);		addDependentLibrary(l, this);
}		}

(143 lines not shown)

lld/ELF/Relocations.cpp

(688 lines not shown)	static std::string maybeReportDiscarded(Undefined &sym) {
msg += "\n>>> defined in " + toString(file);		msg += "\n>>> defined in " + toString(file);

Elf_Shdr_Impl<ELFT> elfSec = objSections[sym.discardedSecIdx - 1];		Elf_Shdr_Impl<ELFT> elfSec = objSections[sym.discardedSecIdx - 1];
if (elfSec.sh_type != SHT_GROUP)		if (elfSec.sh_type != SHT_GROUP)
return msg;		return msg;

// If the discarded section is a COMDAT.		// If the discarded section is a COMDAT.
StringRef signature = file->getShtGroupSignature(objSections, elfSec);		StringRef signature = file->getShtGroupSignature(objSections, elfSec);
if (const InputFile *prevailing =		const InputFile *prevailing =
symtab->comdatGroups.lookup(CachedHashStringRef(signature)))		symtab->comdatGroups.lookup(CachedHashStringRef(signature));
		if (!prevailing)
		prevailing =
		symtab->bitcodeComdatGroups.lookup(CachedHashStringRef(signature));
		if (prevailing)
msg += "\n>>> section group signature: " + signature.str() +		msg += "\n>>> section group signature: " + signature.str() +
"\n>>> prevailing definition is in " + toString(prevailing);		"\n>>> prevailing definition is in " + toString(prevailing);
return msg;		return msg;
}		}

// Undefined diagnostics are collected in a vector and emitted once all of		// Undefined diagnostics are collected in a vector and emitted once all of
// them are known, so that some postprocessing on the list of undefined symbols		// them are known, so that some postprocessing on the list of undefined symbols
// can happen before lld emits diagnostics.		// can happen before lld emits diagnostics.
(1,337 lines not shown)

lld/ELF/SymbolTable.h

(57 lines not shown)	public:
// Set of .so files to not link the same shared object file more than once.		// Set of .so files to not link the same shared object file more than once.
llvm::DenseMap<StringRef, SharedFile *> soNames;		llvm::DenseMap<StringRef, SharedFile *> soNames;

// Comdat groups define "link once" sections. If two comdat groups have the		// Comdat groups define "link once" sections. If two comdat groups have the
// same name, only one of them is linked, and the other is ignored. This map		// same name, only one of them is linked, and the other is ignored. This map
// is used to uniquify them.		// is used to uniquify them.
llvm::DenseMap<llvm::CachedHashStringRef, const InputFile *> comdatGroups;		llvm::DenseMap<llvm::CachedHashStringRef, const InputFile *> comdatGroups;

		// We link twice in thinlto. Define a separate map of comdat groups for
		grimar (inline comment, done): Double space before "Define".
		// bitcode files to make sure the symbols are distinct fromt the ones found
		grimar (inline comment, not done): fromt -> from
		// in object files
		grimar (inline comment, not done): Could you expand the comment to mention why it is important to distinguish them?
		llvm::DenseMap<llvm::CachedHashStringRef, const InputFile *>
		bitcodeComdatGroups;

private:		private:
std::vector<Symbol *> findByVersion(SymbolVersion ver);		std::vector<Symbol *> findByVersion(SymbolVersion ver);
std::vector<Symbol *> findAllByVersion(SymbolVersion ver);		std::vector<Symbol *> findAllByVersion(SymbolVersion ver);

llvm::StringMap<std::vector<Symbol *>> &getDemangledSyms();		llvm::StringMap<std::vector<Symbol *>> &getDemangledSyms();
void assignExactVersion(SymbolVersion ver, uint16_t versionId,		void assignExactVersion(SymbolVersion ver, uint16_t versionId,
StringRef versionName);		StringRef versionName);
void assignWildcardVersion(SymbolVersion ver, uint16_t versionId);		void assignWildcardVersion(SymbolVersion ver, uint16_t versionId);
(24 lines not shown)
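The two-map scheme in this diff can be sketched in isolation as follows. This is a simplified model under stated assumptions, not lld code: `ComdatTables`, `claim`, and `prevailing` are hypothetical names, file names stand in for `InputFile *`, and the flag is called `isLTOOutput` (as suggested in review) rather than `isBitcodeFile`.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the two maps added to lld's symbol table.
struct ComdatTables {
  std::unordered_map<std::string, std::string> comdatGroups;
  std::unordered_map<std::string, std::string> bitcodeComdatGroups;

  // Returns true if `file` provides the prevailing copy of `signature`.
  // Groups coming from LTO output are deduplicated only against other LTO
  // output; everything else goes through the regular map.
  bool claim(const std::string &signature, const std::string &file,
             bool isLTOOutput) {
    auto &table = isLTOOutput ? bitcodeComdatGroups : comdatGroups;
    return table.emplace(signature, file).second;
  }

  // Mirrors the lookup order in the Relocations.cpp hunk: consult the regular
  // map first, then fall back to the LTO-output map. Returns an empty string
  // when no file has claimed the signature.
  std::string prevailing(const std::string &signature) const {
    auto it = comdatGroups.find(signature);
    if (it != comdatGroups.end())
      return it->second;
    it = bitcodeComdatGroups.find(signature);
    if (it != bitcodeComdatGroups.end())
      return it->second;
    return "";
  }
};
```

In this model a signature first registered while parsing bitcode does not block the first LTO-produced object from keeping its group, while a second LTO-produced object with the same signature is rejected. That is the deduplication behavior the patch summary describes.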
