This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Support of compressed input sections implemented.
ClosedPublic

Authored by grimar on May 15 2016, 3:02 AM.

Details

Summary

Patch implements support of zlib style compressed sections.
SHF_COMPRESSED flag is used to recognize that decompression is required.
After that decompression is performed and flag is removed from output.

Diff Detail

Event Timeline

grimar updated this revision to Diff 57292.May 15 2016, 3:02 AM
grimar retitled this revision from to [ELF] - Support of compressed input sections implemented..
grimar updated this object.
grimar added reviewers: ruiu, rafael.
grimar added subscribers: grimar, llvm-commits.
grimar added inline comments.
test/ELF/compressed-debug.s
37

I will update that place with correct constant name as soon as D20273 be landed.

ruiu edited edge metadata.May 15 2016, 8:08 AM

This patch uncompresses all input sections whether they are going to be added to the output or not. A lot of sections are in fact not going to make it to the output because of comdat deduplication and section gc. So uncompressing all inputs beforehand would be a waste of resource.

Until the content is copied to the output, I think we don't have to uncompress inputs. Do you think you can do it lazily?

Until the content is copied to the output, I think we don't have to uncompress inputs. Do you think you can do it lazily?

It should be done on demand transparently. You only need to decompress input
when getting the input contents.

ruiu added a comment.May 15 2016, 9:09 AM

Also please try to minimize the number of data copies. In this patch, compressed data is uncompressed to a buffer in memory and then copied to an mmap'ed output file. So the data is copied twice. It can be just once -- you can uncompress data directly to the mmap'ed output.

ruiu added a comment.May 15 2016, 4:07 PM

It's also worth to mention that you could utilize multiple cores if you uncompress sections in writeTo() since writeTo() is parallelized for each input section when --thread is given. Since uncompressing gzipped data is fairly CPU intensive task, I'd expect it would make the linker noticeable faster when handling compressed sections.

In D20272#430485, @ruiu wrote:

This patch uncompresses all input sections whether they are going to be added to the output or not. A lot of sections are in fact not going to make it to the output because of comdat deduplication and section gc. So uncompressing all inputs beforehand would be a waste of resource.

Until the content is copied to the output, I think we don't have to uncompress inputs. Do you think you can do it lazily?

Thank you for this and other comments about the patch. I want to assure you that this time I was aware about most of details you wrote :)

There were 2 reasons from my side why I implemented it in this way:

  1. MergeOutputSection requires early access to uncompressed data. It uses it in constructor, it uses it in InputSectionBase<ELFT>::getOffset() and in other places. I am currently investigating how it is possible to improve that. I am pretty sure we do not want to have different logic for compressed/uncompressed sections, so I hope to refactor how MergeOutputSection works at first then. I mean that we probably want to delay the access to data as much as we can here. That was technical problem I faced, but I was sure it is not a problem for implementation way I used because of next:
  1. I was thinking that since compressed sections are used for compressing debug stuff, then nobody really cares about perfomance. That is something I am missing probably, but if we are talking about linkage speed during development process, then I assume that compression probably not involved. Compression-decompression takes time, and hdd space it too much cheap to use that. But when we switch to generating "production" binary using some inputs with debugging sources, then I assumed there is not much sence to save few percents (probably) of final time if we can keep code simple instead. Please point me what I missing here.

Anyways, I am currently investigating the MergeOutputSection logic, I hope to be able to refactor it a bit. I am hard in commenting here, I still need to investigate that.

ruiu added a comment.May 16 2016, 6:56 AM

Do compressed and mergeable sections exist? If the combination of the two doesn't exist in the real world, we can simply reject that.

I think performance matters even with compressed debug info. For example, if you are doing distributed build, you may want to reduce the network traffic by compressing large debug sections to reduce the overall build time (sending small files is faster than larger ones).

In D20272#430820, @ruiu wrote:

Do compressed and mergeable sections exist?

That was the main case I debugged..

In D20272#430820, @ruiu wrote:

Do compressed and mergeable sections exist?

So I mean that testcase for patch was created from real world code and it has that, unfortunately or not.

ruiu added a comment.May 16 2016, 7:15 AM

OK, but compressed sections shouldn't be uncompressed until they need to be. At the symbol resolution phase, we don't need any section contents (and at the phase and gc phase, a lot of sections are eliminated.)

In D20272#430837, @ruiu wrote:

OK, but compressed sections shouldn't be uncompressed until they need to be. At the symbol resolution phase, we don't need any section contents (and at the phase and gc phase, a lot of sections are eliminated.)

Right now the first major place I am looking how to avoid is:

>template <class ELFT> void Writer<ELFT>::run() {
>  copyLocalSymbols();
>> includeInSymtab()
....
  if (auto *S = dyn_cast<MergeInputSection<ELFT>>(D->Section))
    if (S->getRangeAndSize(D->Value).first->second ==
        MergeInputSection<ELFT>::PieceDead)
      return false;
}
I probably do not want to comment it because still investigating it, I just have a guess that we can avoid uncompression here and just check if whole section is dead or not instead of if checking if PieceDead. I am not ready to continue that thoughts now, still looking at, I am not familar with that code yet :)

So just please let me to spend some time on this, I am sure will able to suggest something to improve this patch.

I`ll update this right after relanding D20331 (which is currently reverted and waits for D20466, which is under review).

grimar updated this revision to Diff 58251.May 24 2016, 9:41 AM
grimar edited edge metadata.
  • Reworked testcase.
  • Rebased patch to support latest changes about mergable sections.
grimar added a comment.EditedMay 25 2016, 2:19 AM

llvm-mc changes required for testcase of this were reverted in r270638. I am looking into it, will ping/update this one after resolving that.

grimar updated this revision to Diff 58779.May 27 2016, 6:34 AM
  • Updated testcase after r270987, which updated llvm-mc to work with both zlib, zlib-gnu. Now I think this patch is ready to commit.
ruiu added inline comments.Jun 23 2016, 10:31 PM
ELF/InputSection.cpp
112

UncompressedSize

113–114

Cast to Elf{32,64}_Chdr and access ch_size member instead of hard-coding the offsets.

ELF/InputSection.h
91

I'd name this Uncompressed.

It needs a comment.

// If a section is compressed, this vector has uncompressed section data.
ELF/Writer.cpp
1194–1195
H->sh_flags & ~SHF_GROUP & ~SHF_COMPRESSED
grimar updated this revision to Diff 61764.Jun 24 2016, 2:17 AM
grimar marked 4 inline comments as done.
  • Addressed review comments.
ELF/InputSection.cpp
112

Done.

113–114

Done.

ELF/InputSection.h
91

Done.

ELF/Writer.cpp
1194–1195

Done.

ruiu accepted this revision.Jun 24 2016, 3:55 AM
ruiu edited edge metadata.

I think this patch is nice and clean. LGTM with a few nits.

ELF/InputSection.cpp
104

Let's name this Elf_Chdr for consistency with other Elf_ types.

test/ELF/compressed-debug-input.s
6 ↗(On Diff #61764)

Please s/PACK/COMPRESS/g this file.

This revision is now accepted and ready to land.Jun 24 2016, 3:55 AM
This revision was automatically updated to reflect the committed changes.
grimar marked 2 inline comments as done.