This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Do not collect SHT_REL[A] sections unconditionally when --gc-sections and --emit-relocs used together.
ClosedPublic

Authored by grimar on Oct 10 2017, 6:16 AM.

Details

Summary

This is "Bug 34836 - --gc-sections remove relocations from --emit-relocs",

When --emit-relocs is used, LLD currently always drops SHT_REL[A] sections from
output if --gc-sections is present. Patch fixes the issue.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.Oct 10 2017, 6:16 AM
ruiu added inline comments.Oct 10 2017, 1:07 PM
ELF/InputSection.cpp
81 ↗(On Diff #118368)

I don't think we want to export this function. Can you move this function to MarkLive.cpp, and leave Live uninitialized in this file? If GC is disabled, you can visit all sections to turn Live bits on in MarkLive.cpp.

ELF/InputSection.h
362–363 ↗(On Diff #118368)

Returns true if a given section is a subject of garbage collection.

364 ↗(On Diff #118368)

isLiveByDefault is not a good name as a non-file-scoped function name. I'd name this canBeCollected.

grimar updated this revision to Diff 118592.Oct 11 2017, 4:45 AM
  • Rebased, addressed comments.
ruiu added inline comments.Oct 11 2017, 12:06 PM
ELF/MarkLive.cpp
165–167 ↗(On Diff #118592)

"reserved" doesn't make sense unless you describe what is considered "reserved". If you name a function "reserved" and write a comment saying that this function returns true if a given object is reserved, it doesn't explain the function. You need to explain to those who don't have a priori knowledge about the function.

If you revisit this code a few year later, you'd wonder why they are considered reserved and why they had to be treated in a different way than the other sections. Generally, your comments should answer these questions.

I'd explain: Some sections are used directly by the loader, so they should never be garbage-collected. This function returns true if a given section is such section.

197 ↗(On Diff #118592)

we also want to remove ...

219–221 ↗(On Diff #118592)

Why do you have to check this? All non-collectible sections have already been marked alive, so you don't need to check for the same condition again in this function.

grimar updated this revision to Diff 118769.Oct 12 2017, 3:42 AM
grimar marked 2 inline comments as done.
  • Addressed review comments.
ELF/MarkLive.cpp
165–167 ↗(On Diff #118592)

Ok, thanks.

219–221 ↗(On Diff #118592)

It is needed for REL[A] sections which are collectible. Initial condition exits
early for non-alloc sections, but REL[A] sections are not-alloc, but collectible,
so I want to let them go through.
Updated comment.

ruiu added inline comments.Oct 13 2017, 11:02 AM
ELF/MarkLive.cpp
219–221 ↗(On Diff #118592)

I do not understand. All non-gc'able objects have already marked alive when the control flow reaches here. Why don't you use Sec->Live instead?

grimar updated this revision to Diff 119147.Oct 16 2017, 6:14 AM
  • Addressed comments, added testcase.
ELF/MarkLive.cpp
219–221 ↗(On Diff #118592)

Let me use following theses to explain:

  1. I can't use Sec->Live. See it is already used at line 230. But we can not use it for early return before we do markLiveAt for MergeInputSections.
  2. markLiveAt would assert if called for non-alloc section.
  3. I could remove whole my/original code check and all tests would pass with this patch.
  4. But I can't do that because it could be a reason of fail. I added testcase to demonstrate that LLD asserts when I am trying to KEEP non-alloc mergeable section from linker script.

So this check used as a guard that does not allow non-alloc sections to pass to prevent asserting in markLiveAt.

  1. REL[A] sections are non-alloc sections that are not marked live yet at this place. And I want them pass this check.
ruiu edited edge metadata.Oct 16 2017, 10:50 AM

I'm not convinced. It may fail on some assert, or it may not work well with the current code, but my comment was from a more high-level design point of view. I'd like you to also think at high-level instead of tracing the existing code to explain why your patch needs to be in the current design.

You are writing a mark-sweep garbage collector. In the mark phase, you mark all root objects as alive, and mark all reachable objects as alive. In this phase, the knowledge of whether some object is a gc root or not is needed only at the beginning of the algorithm. If not, it is probably a sign that something is not correct. If that's the case, we need to find a design flaw and fix it, instead of writing more code on top of a wrong assumption.

Did you take a look why we had the assert? Can you explain why the assertion is there? I wonder if it makes sense.

In D38724#898776, @ruiu wrote:

Did you take a look why we had the assert? Can you explain why the assertion is there? I wonder if it makes sense.

Sure. We have is as a guard for internal logic of MergeInputSection I believe and it can easily be removed.
I prepared cleanup patch: D38996.
(It is almost completely independent from this one though).

This change works on my case, but it names the reloc sections it emits based on the first input section in the output section being relocated rather than on the output section itself.
e.g. in my -ffunction-sections --gc-sections --emit-relocs build, the SHT_RELA section with sh_info pointing to the .text section is called .rela.text.first_function_in_text rather than .rela.text.

ruiu added inline comments.Oct 22 2017, 7:58 PM
ELF/MarkLive.cpp
228 ↗(On Diff #119147)

I took time to investigate this, and looks like the easiest way of doing this is to change

void markLiveAt(uint64_t Offset) {
 assert(this->Flags & llvm::ELF::SHF_ALLOC);
 LiveOffsets.insert(Offset);
}

to

void markLiveAt(uint64_t Offset) {
  if (this->Flags & llvm::ELF::SHF_ALLOC)
    LiveOffsets.insert(Offset);
 }

This change works on my case, but it names the reloc sections it emits based on the first input section in the output section being relocated rather than on the output section itself.
e.g. in my -ffunction-sections --gc-sections --emit-relocs build, the SHT_RELA section with sh_info pointing to the .text section is called .rela.text.first_function_in_text rather than .rela.text.

I am not sure that is an issue as SHT_RELA section name should be insignificant here, but it I would probably fix that to make output nicer and be consistent with GNU linkers. I'll take a look.

ELF/MarkLive.cpp
228 ↗(On Diff #119147)

Thanks for your time. It is probably the easiest way indeed,
but I supposed D38996 is a better cleanup solution for this place.
Sounds like I should abandon it then ?

grimar updated this revision to Diff 119829.Oct 23 2017, 4:04 AM
  • Updated in according to review comments.
ruiu added a comment.Oct 23 2017, 8:53 AM

Well, it doesn't seems like https://reviews.llvm.org/D38996 is particularly a good change. Somewhat neutral, but I wouldn't land that.

ruiu accepted this revision.Oct 23 2017, 10:06 AM

LGTM

This revision is now accepted and ready to land.Oct 23 2017, 10:06 AM
This revision was automatically updated to reflect the committed changes.