This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Decompress debug info sections early
ClosedPublic

Authored by smeenai on Oct 2 2017, 9:33 PM.

Details

Summary

When reporting a symbol conflict, LLD parses the debug info to report
source location information. Sections have not been decompressed at this
point, so if an object file contains zlib compressed debug info, LLD
ends up passing this compressed debug info to the DWARF parser, which
causes debug info parsing failures and can trigger assertions in the
parser (as the test case demonstrates).

Decompress debug sections when constructing the LLDDwarfObj to avoid
this issue. This doesn't handle GNU-style compressed debug info sections
(.zdebug_*), which at present are simply ignored by LLDDwarfObj; those
can be done in a follow-up.

Diff Detail

Repository
rL LLVM

Event Timeline

smeenai created this revision.Oct 2 2017, 9:33 PM
davide added a subscriber: davide.Oct 2 2017, 9:35 PM

what's the link time impact?

ELF/GdbIndex.cpp
32 ↗(On Diff #117468)

Not your fault but M looks like a not so great variable name choice.

what's the link time impact?

I didn't measure it, but I don't think this should make a noticeable difference, since if we don't decompress the sections here, they'll be decompressed later in decompressAndMergeSections anyway.

Additionally, I believe this codepath is only used in error reporting scenarios.

grimar added inline comments.Oct 3 2017, 2:07 AM
ELF/GdbIndex.cpp
38 ↗(On Diff #117468)

We call decompressAndMergeSections at some point in linker, and at first I thought that better would be to just call it earlier or at least to decompress sections early, but that is not so convinent to do. Assuming we still want to use threading for that it is more convinent to do when we have full list of input sections, like now.

And here uncompress is used for few sections in a single file and only when we reporting errors. That is probably looks fine for me.

test/ELF/compressed-debug-conflict.test
77 ↗(On Diff #117468)

I am pretty sure you should not need this and few other sections, symbols and relocations. Testcase can be reduced.
Though I would suggest not to use yaml2obj here, but use asm input instead. Just like we usually do (see compressed-debug-input.s, gdb-index-empty.s for example). Doing that should simplify testcase a lot.

smeenai added inline comments.Oct 3 2017, 8:31 AM
ELF/GdbIndex.cpp
38 ↗(On Diff #117468)

You're right; this loses the parallelism. I think that's okay as long as there are no plans to eventually use this outside of error-reporting scenarios.

test/ELF/compressed-debug-conflict.test
77 ↗(On Diff #117468)

I originally had the YAML because I wanted to reproduce the exact debug info that was causing the assertion failure, but I realize now that you can do so via assembly debug info as well. Lemme play around with reducing this.

smeenai updated this revision to Diff 117537.Oct 3 2017, 8:48 AM

Simplify test

grimar added inline comments.Oct 3 2017, 8:56 AM
test/ELF/compressed-debug-conflict.s
5 ↗(On Diff #117537)

As far I remember, if compressed section size is greater then uncompressed, then
producer (llvm-mc) will not compress it.
I would add check here to verify that input object contains SHF_COMPRESSED section.

19 ↗(On Diff #117537)

I guess that is why you duplicated "repeat" here: to increase debug info size to force section be compressed, right ?

smeenai added inline comments.Oct 3 2017, 8:57 AM
test/ELF/compressed-debug-conflict.s
5 ↗(On Diff #117537)

That's a good idea.

19 ↗(On Diff #117537)

Yup, since zlib will be able to compress the repeats well and get the compressed size below the uncompressed size.

smeenai updated this revision to Diff 117538.Oct 3 2017, 9:03 AM

Ensure section is compressed

grimar edited edge metadata.Oct 3 2017, 9:16 AM

I have no futher comments, this LGTM, please wait for other reviewers approval. Thanks !

ruiu edited edge metadata.Oct 3 2017, 3:40 PM

I'd make two changes:

  1. Decompressor::isCompressedELFSection(Sec->Flags, Sec->Name) is always called before calling uncompress(). It is better to move that check inside uncompress and rename it maybeUncompress.
  2. With this change, it's not clear whether it is guaranteed that uncompress is called at most once. I'd make a change to uncompress so that it doesn't do anything on a second call. You can just check if UncompressBuf is empty.
smeenai updated this revision to Diff 117604.Oct 3 2017, 5:17 PM

Address @ruiu's comments

ruiu accepted this revision.Oct 3 2017, 5:18 PM

LGTM

ELF/GdbIndex.cpp
20 ↗(On Diff #117604)

You can now remove this #include.

This revision is now accepted and ready to land.Oct 3 2017, 5:18 PM
This revision was automatically updated to reflect the committed changes.
smeenai added inline comments.Oct 3 2017, 5:22 PM
ELF/GdbIndex.cpp
20 ↗(On Diff #117604)

Removed before committing. Thanks for the catch.