Page MenuHomePhabricator

Discard debuginfo for object files empty after GC
ClosedPublic

Authored by rocallahan on Nov 20 2018, 3:02 AM.

Details

Summary

Rust projects tend to link in all object files from all dependent libraries and rely on --gc-sections to strip unused code and data. Unfortunately --gc-sections doesn't currently strip any debuginfo associated with GC'ed sections, so lld links in the full debuginfo from all dependencies even if almost all that code has been discarded. See https://github.com/rust-lang/rust/issues/56068 for some details.

Properly stripping debuginfo for discarded sections would be difficult, but a simple approach that helps significantly is to mark debuginfo sections as live only if their associated object file has at least one live code/data section. This patch does that. In a (contrived but not totally artificial) Rust testcase linked above, it reduces the final binary size from 46MB to 5.1MB.

Diff Detail

Repository
rL LLVM

Event Timeline

rocallahan created this revision.Nov 20 2018, 3:02 AM
ruiu added a comment.Nov 28 2018, 2:57 PM

Thank you for the patch.

What you are doing in this patch is not too complicated and makes sense to me. That said, if actual size saving is not significant as you said in https://github.com/rust-lang/rust/issues/56068#issuecomment-440160568, it may not be worth doing. It seems to me that if debug info is already 2.4GB, shrinking it to 2GB doesn't make much difference. Do you have more convincing examples?

If we decide to optimize DWARF garbage collection, something generic will be cool.

This generic-abi thread has some discussion about that https://groups.google.com/d/msg/generic-abi/A-1rbP8hFCA/EDA7Sf3KBwAJ (e.g. using COMDAT but it seems challenging and it comes with its own costs)

Thank you for the patch.

What you are doing in this patch is not too complicated and makes sense to me. That said, if actual size saving is not significant as you said in https://github.com/rust-lang/rust/issues/56068#issuecomment-440160568, it may not be worth doing. It seems to me that if debug info is already 2.4GB, shrinking it to 2GB doesn't make much difference. Do you have more convincing examples?

400MB is 400MB... or, if you'd prefer 17% of overall size that was mentioned there. Additional testing might be nice in order to get a better idea of what we're looking at in practice (I guess a clang build would be another good choice), but the overall patch seems to be small and in a lot of ways simplifying for how we're linking.

grimar added a subscriber: grimar.Nov 28 2018, 11:50 PM
ruiu added a comment.Nov 30 2018, 2:03 PM

But 2GB is perhaps still too big and I guess a large part of it can be for dead sections. If we fix this, I'd like to fix it in a proper way so that we can completely eliminate debug info for dead sections.

rocallahan, can I ask why Rust passes all object files to the linker and use -gc-sections to eliminate unused part of these files?

One possible way to fix it is to pass --start-lib to the linker before any object file paths in the command line. That option gives archive file semantics to object files, so if you option, each object file is treated as if that were in an archive file containing that the single object file. (Note that --start-lib is a positional option just like --start-group, so all files between --start-lib and --end-lib get that special treament.)

But 2GB is perhaps still too big and I guess a large part of it can be for dead sections. If we fix this, I'd like to fix it in a proper way so that we can completely eliminate debug info for dead sections.

There's a few options here, mostly using comdat groups for debug infomation associated with sections that the debug information comes from. That said, it still wouldn't encompass compile units that would have no code in them because we didn't choose a single function from a .o file and so this patch is still a strict improvement over that. In addition, it doesn't involve future work to change how we emit debug info and doesn't depend on everyone emitting debug info the same way.

Thoughts?

ruiu added a comment.Dec 4 2018, 4:21 PM

It seems to me that just adding --start-lib to his command line can fix the issue, so I'm waiting for Robert's response. If it doesn't work for some reason, we can analyze why it doesn't work and then discuss what we can do for his problem.

rocallahan added a comment.EditedDec 28 2018, 11:17 PM

Sorry for the delayed reply. I just discovered that Phabricator nofications were being buried by my mail filters.

What you are doing in this patch is not too complicated and makes sense to me. That said, if actual size saving is not significant as you said in https://github.com/rust-lang/rust/issues/56068#issuecomment-440160568, it may not be worth doing. It seems to me that if debug info is already 2.4GB, shrinking it to 2GB doesn't make much difference.

17% saving for a pretty small and conservative patch seems good to me, but I guess it's subjective.

rocallahan, can I ask why Rust passes all object files to the linker and use -gc-sections to eliminate unused part of these files?

I'm not a Rust compiler contributor, but my guess is developer ergonomics. Once upon a time users manually divided code into translation units, each producing an object file, and garbage collection occurred only at the granularity of object files. Later --gc-sections and -ffunction-sections let compilers automatically subdivide object files into smaller units and GC with a granularity of functions. In Rust each translation unit is a whole module (crate); users aren't expected to subdivide libraries into smaller units for GC purposes, and in fact aren't able to. Emitting one section per function mostly means they don't need to ... except it breaks down with DWARF here.

One possible way to fix it is to pass --start-lib to the linker before any object file paths in the command line. That option gives archive file semantics to object files, so if you option, each object file is treated as if that were in an archive file containing that the single object file. (Note that --start-lib is a positional option just like --start-group, so all files between --start-lib and --end-lib get that special treament.)

I will try that and follow up but I don't see how it would make any difference.

Here are some results for the rusoto test in https://github.com/rust-lang/rust/issues/56068#issue-382175735:

LLDBinary size in bytes
LLD 6.0.143,791,192
LLD 8.0.0 1cfcf404f6b23943405bc3c5bb5940b10b91462443,861,056
LLD 8.0.0 1cfcf404f6b23943405bc3c5bb5940b10b914624 --start-lib43,844,760
LLD 8.0.0 1cfcf404f6b23943405bc3c5bb5940b10b914624 + this patch6,281,488

For --start-lib I wrapped --start-lib/--end-lib around all object files.

As I expected --start-lib didn't make much difference. The dependencies containing debuginfo that I want to be GCed away are already packaged into static libraries before being passed to the final link.

I agree that ideally the linker would be able to do fine-grained GC of DWARF at the granularity of individual DIEs and other data items. However, implementing that would be a huge project. Currently AFAICT lld does very little DWARF processing. Wholesale DWARF rewriting would expand the scope of the linker and require lots of testing against various DWARF producers and consumers. I definitely wouldn't want to implement that.

Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2019, 10:31 PM
MaskRay added inline comments.Mar 21 2019, 1:01 AM
lld/ELF/MarkLive.cpp
192 ↗(On Diff #174739)

This check can be changed to !isa<InputSection> && !isa<MergeInputSection>. But do you just want to exclude EhInputSection?

Here are some results for the rusoto test in https://github.com/rust-lang/rust/issues/56068#issue-382175735:

LLDBinary size in bytes
LLD 6.0.143,791,192
LLD 8.0.0 1cfcf404f6b23943405bc3c5bb5940b10b91462443,861,056
LLD 8.0.0 1cfcf404f6b23943405bc3c5bb5940b10b914624 --start-lib43,844,760
LLD 8.0.0 1cfcf404f6b23943405bc3c5bb5940b10b914624 + this patch6,281,488

For --start-lib I wrapped --start-lib/--end-lib around all object files.

As I expected --start-lib didn't make much difference. The dependencies containing debuginfo that I want to be GCed away are already packaged into static libraries before being passed to the final link.

The improvement of this patch looks really promising! Do you have numbers with ld.bfd and gold?

lld/ELF/MarkLive.cpp
195 ↗(On Diff #174739)

!ObjFile<ELFT>::classof(Sec->File)

Can this happen?

199 ↗(On Diff #174739)

The brace is redundant.

@rocallahan I find that people are discussing a generic approach in D59553

rocallahan marked 3 inline comments as done.Mar 24 2019, 6:24 PM

@rocallahan I find that people are discussing a generic approach in D59553

Thanks for the reference. It seems to me that there is no consensus there yet for any approach that would obsolete this change. Of the three approaches discussed there:

  1. Having the linker rewrite DWARF: seems to have been decisively rejected.
  2. Give each function standalone debuginfo in COMDAT: sounds to me like it would be very inefficient, both in the size of intermediate files and in the size of the final binary. DWARF type information currently shared between functions in the same CU would have to be duplicated, and eliminating that duplication would require the linker rewrite DWARF, which brings us back to point 1.
  3. Optimize away unnecessary debuginfo using a post-link tool like dwz: Considerably less efficient than not writing out the bloated DWARF in the first place.
lld/ELF/MarkLive.cpp
192 ↗(On Diff #174739)

Shouldn't I also be excluding SyntheticSection?

195 ↗(On Diff #174739)

I think not. Happy to remove this check if you don't want it.

199 ↗(On Diff #174739)

Thanks.

Do you have numbers with ld.bfd and gold?

I don't have numbers for that exact test with those linkers but they basically behave the same way as LLD.

To be clear, I think the best long-term solution is for LLD to rewrite the DWARF, but from my (admittedly limited) perspective that seems to be at best a distant prospect.

Another clarification:

DWARF type information currently shared between functions in the same CU would have to be duplicated.

I guess that's not necessary. The compiler could put all debuginfo that's shared between functions into a non-COMDAT section for the compilation unit. The downside of that is that that data is not removed by gc-sections ... although my patch here would remove it, if all functions in the CU are removed.

I guess this is what @echristo was getting at in their comment above:

There's a few options here, mostly using comdat groups for debug infomation associated with sections that the debug information comes from. That said, it still wouldn't encompass compile units that would have no code in them because we didn't choose a single function from a .o file and so this patch is still a strict improvement over that. In addition, it doesn't involve future work to change how we emit debug info and doesn't depend on everyone emitting debug info the same way.

ruiu added inline comments.Mar 25 2019, 11:35 AM
lld/ELF/MarkLive.cpp
190 ↗(On Diff #174739)

Since this file is MarkLive, markSection is perhaps a better name.

192 ↗(On Diff #174739)

This needs a comment.

Do you have to visit each file during the mark phase? Looks like you can mark only sections first, and after marking all sections, you can scan all sections to mark files. Looks like they can be two separate stages.

rocallahan marked 2 inline comments as done.Mar 25 2019, 1:07 PM
rocallahan added inline comments.
lld/ELF/MarkLive.cpp
190 ↗(On Diff #174739)

OK

192 ↗(On Diff #174739)

Maybe I'm wrong but I would have expected adding another pass over all sections to be slower than what I'm doing here.

Also we need to distinguish LSDA sections from other live sections since LSDA does not count as "making the file live". So we'd have to add an LSDA flag to InputSection.

Are you sure you want me to make this change?

Also I just noticed I've written LDSA in several places where it should be LSDA!

ruiu added inline comments.Mar 25 2019, 1:18 PM
lld/ELF/MarkLive.cpp
192 ↗(On Diff #174739)

Ah OK, I thought that you set Sec->Live to true only in setSectionLive() but you manipulated that in EnqueueMaybeLDSA as well. Looks like we have too many callback functions in this function -- this file is organized that way because the callback functions were very simple. Now it's been growing organically and probably get to the point that we should just use the regular class-based abstraction. Let me do that first.

ruiu added a comment.Mar 25 2019, 4:29 PM

I committed https://reviews.llvm.org/D59800 which I believe makes your change easier to follow once rebased. Could you rebase it and upload a patch? Thanks!

avl added a subscriber: avl.Wed, Apr 3, 4:46 AM

@rocallahan Would you mind to share the performance results of that patch, please ? Similar to above table for rusoto test, but with timings ...

rocallahan updated this revision to Diff 194244.Mon, Apr 8, 9:59 PM

Addressed all comments AFAIK. I'll post some performance numbers in a moment.

Updated results for the rusoto test in https://github.com/rust-lang/rust/issues/56068#issue-382175735. The test changed a bit because I'm using an updated Rust toolchain and rusoto_core 0.37.0.

LinkerSize (bytes)Real time (ms)
GNU ld version 2.29.1-23.fc2848,554,7362234
GNU gold (version 2.29.1-23.fc28) 1.1449,888,392813
lld-6.0.1-1.fc28.x86_6449,800,824247
lld d918d74461724a22cedd0b76dc1237392f29565649,873,960224
lld d918d74461724a22cedd0b76dc1237392f295656 + this patch5,390,632158

I am also pleased to report that for my real project, switching from ld.bfd to lld + this patch reduces the total size of our dist built binaries from 2.9GB to 2.0GB.

I'm not sure why it's become more effective, but several variables have changed, including some significant restructuring of the build artifacts for our project. Still, this seems like a pretty good result. (Perhaps even suspiciously good! But I can't find any missing debuginfo with cursory checks.)

ruiu added a comment.Mon, Apr 8, 10:48 PM

Nice. One more thing you might want to try is to add -O2 to the linker command line option. When that option is given, lld attempts to tail-merge strings in the string table. That's not very effective, but you might be able to shave off 0.2% or something like that.

ruiu added a comment.Mon, Apr 8, 10:58 PM

Overall looking good.

lld/ELF/MarkLive.cpp
190 ↗(On Diff #194244)

Move this code after Sec->Live = true so that when we visit the same section more than once, this piece of code runs only once.

190–191 ↗(On Diff #194244)

S is true only when it is an InputSection, so you have duplicate tests for this. It also looks like you don't have to care about whether a section is an input section or a merged section. Isn't this condition sufficient?

if (Sec->File && !IsLSDA)
  Sec->getFile<ELFT>()->HasLiveCodeOrData = true;
313 ↗(On Diff #194244)

Let's remove this optimization -- I don't think we need this.

321 ↗(On Diff #194244)

Then you can simplify the condition to

if (!IsAlloc && !IsLinkOrder && !IsRel && !Sec->Debug)
  Sec->Live = true;
330 ↗(On Diff #194244)

Remove this variable and just visit all sections even if there's no debug info (which shouldn't take too much time anyway.)

rocallahan marked 4 inline comments as done.Mon, Apr 8, 11:17 PM
rocallahan added inline comments.
lld/ELF/MarkLive.cpp
190 ↗(On Diff #194244)

We can't do that. The comment I added tries to explain why. Is it unclear?

190–191 ↗(On Diff #194244)

I want to skip EhInputSection and SyntheticSection here. So I guess the advice to use isa here was wrong and I should revert to checking kind() == Regular || kind() == Merge?

313 ↗(On Diff #194244)

OK

321 ↗(On Diff #194244)

Yep.

ruiu added inline comments.Mon, Apr 8, 11:36 PM
lld/ELF/MarkLive.cpp
190 ↗(On Diff #194244)

Ah, OK, got it.

190–191 ↗(On Diff #194244)

But you don't really care even if it is EhInputSection or SyntheticSection, no? I mean an assigned value would not be used but doesn't harm.

rocallahan marked an inline comment as done.Mon, Apr 8, 11:55 PM
rocallahan added inline comments.
lld/ELF/MarkLive.cpp
190–191 ↗(On Diff #194244)

I'm assuming that .eh_frame sections don't have associated debuginfo so even if such a section is enqueued somehow, it should not cause debuginfo to be included for that object file. Is it impossible for a .eh_frame section to be enqueued here? Ditto for a SyntheticSection?

To put it another way, the logic here is supposed to be: "debuginfo can only be associated with Regular or Merge sections, so it only makes sense to mark files as having 'live code or data' possibly associated with debuginfo for those section types."

rocallahan edited the summary of this revision. (Show Details)

The results are basically unchanged with the section type checks removed, so I've just gone ahead and done that.

ruiu added inline comments.Tue, Apr 9, 12:52 AM
lld/ELF/MarkLive.cpp
190–191 ↗(On Diff #194244)

No debug sections are attached to synthetic sections, but I don't think that logically matters. The logic we want to implement here is

  1. if a file contains one or more debug info sections, and
  2. if at least one of them are marked live, then
  3. all debug sections should be marked live.

If Sec->Debug is true, it is a debug section, and vice versa. If Sec->File is a non-null, it is a file that the section belongs to.

So, in the above logic, there's no logic that depends on the type of the section. That's why I think it is better to not assert any expected type for a debug section. As long as !IsLSDA && Sec->File, that File should be marked live here.

rocallahan marked an inline comment as done.Tue, Apr 9, 3:31 AM
rocallahan added inline comments.
lld/ELF/MarkLive.cpp
190–191 ↗(On Diff #194244)
  1. if a file contains one or more debug info sections, and
  1. if at least one of them are marked live, then
  2. all debug sections should be marked live.

That is not quite right. We are not marking any debuginfo sections live until the final pass. The first condition is "if a file contains one or more non-debuginfo sections marked live for non-LSDA data..."

Anyway the diff I just posted contains the change you wanted here.

ruiu accepted this revision.Tue, Apr 9, 11:45 PM

LGTM

I'd write a comment explaining why we are handling debug sections in a special way, but that can be done later. Please submit. Thank you for doing this!

This revision is now accepted and ready to land.Tue, Apr 9, 11:45 PM

I don't have commit access so I believe someone has to commit this for me?

ruiu added a comment.Wed, Apr 10, 3:27 AM

I will do that.

This revision was automatically updated to reflect the committed changes.