This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Assign file offsets of non-SHF_ALLOC after SHF_ALLOC and set sh_addr=0 to non-SHF_ALLOC
ClosedPublic

Authored by MaskRay on Aug 12 2020, 5:13 PM.

Details

Summary
  • GNU ld places non-SHF_ALLOC sections after SHF_ALLOC sections. This has the advantage that the file offsets of a non-SHF_ALLOC cannot be contained in a PT_LOAD. This patch matches the behavior.
  • For non-SHF_ALLOC non-orphan sections, GNU ld may assign non-zero sh_addr and treat them similar to SHT_NOBITS (not advance location counter). This is an alternative approach to what we have done in D85100. By placing non-SHF_ALLOC sections at the end, we can drop special cases in createSection and findOrphanPos added by D85100.

    Different from GNU ld, we set sh_addr to 0 for non-SHF_ALLOC sections. 0 arguably is better because non-SHF_ALLOC sections don't appear in the memory image.

ELF spec says:

sh_addr - If the section will appear in the memory image of a process, this
member gives the address at which the section's first byte should
reside. Otherwise, the member contains 0.

D85100 appeared to take a detour. If we take a combined view on D85100 and this
patch, the overall complexity slightly increases (one more 3-line loop) and
compatibility with GNU ld improves.

The behavior we don't want to match is the special treatment of .symtab
.shstrtab .strtab: they can be matched in LLD but not in GNU ld.

Diff Detail

Event Timeline

MaskRay created this revision.Aug 12 2020, 5:13 PM
MaskRay requested review of this revision.Aug 12 2020, 5:13 PM

Just to check I understand.

In D85100 we moved non SHF_ALLOC orphans after all other SHF_ALLOC sections so that they didn't affect address assignment of SHF_ALLOC sections. Now we don't move the SHF_ALLOC section positions but instead we do not move the location counter, and also move their file offset after all SHF_ALLOC sections so that they have no affect on addresses of SHF_ALLOC sections and cannot be part of a PT_LOAD section.

This is related to pr47094.

I don't have any objections, in the general case I think compatibility with BFD is worthwhile.

lld/ELF/LinkerScript.cpp
858

May be worth writing the condition the other way around as it is the most common case. Only a weak preference though.

if (sec->flags & SHF_ALLOC) {
...
} else {
  // Non-SHF_ALLOC OutputSections start at address 0
  ...
}
925

I think it's worth adding a comment. Suggest something like:
// Non-SHF_ALLOC sections do not affect the addresses of other OutputSections as they are not part of the process image.

lld/ELF/Writer.cpp
1236

I guess we've removed this as it is not strictly necessary any more?

jhenderson added a subscriber: edd.

I think this looks like it makes sense. Having the section header table out of order is a little weird to me, but also not illegal (indeed, I've used it on a few occasions to do crazy things before).

I've added @edd who reported pr47094, who might have further insight.

lld/test/ELF/linkerscript/sections-nonalloc.s
21–23

Just to confirm - these three sections appear here in the section header table (before data2 and .text), because they are being placed near the other non-alloc sections, due to orphan section handling, I presume?

MaskRay marked an inline comment as done.Aug 13 2020, 8:26 AM

Just to check I understand.

In D85100 we moved non SHF_ALLOC orphans after all other SHF_ALLOC sections so that they didn't affect address assignment of SHF_ALLOC sections. Now we don't move the SHF_ALLOC section positions but instead we do not move the location counter, and also move their file offset after all SHF_ALLOC sections so that they have no affect on addresses of SHF_ALLOC sections and cannot be part of a PT_LOAD section.

Yes

lld/test/ELF/linkerscript/sections-nonalloc.s
21–23

Yes.In GNU ld, the three sections are special: they can't be matched by output section descriptions.

Other than orphan section placement, our approach assigning section addresses is identical to GNU ld after this patch.

MaskRay updated this revision to Diff 285392.Aug 13 2020, 9:00 AM
MaskRay marked 4 inline comments as done.

Address comments

MaskRay edited the summary of this revision. (Show Details)Aug 13 2020, 9:01 AM
psmith accepted this revision.Aug 14 2020, 5:38 AM

Thanks for the update. I'm happy with the changes. I've tentatively approved, James/Edd please shout if you have any objections?

This revision is now accepted and ready to land.Aug 14 2020, 5:38 AM
MaskRay added inline comments.Aug 14 2020, 7:59 AM
lld/ELF/LinkerScript.cpp
925

Adopted. Thanks

lld/ELF/Writer.cpp
1236

Correct. Non-SHF_ALLOC sections affected address ranges so we moved orphan sections to the end to avoid their side effect. Now non-SHF_ALLOC orphan sections do not affect SHF_ALLOC.

I have nothing useful to add. Happy if others are happy, the change looks fine to me I think.

jhenderson accepted this revision.Aug 18 2020, 2:12 AM

@edd is now off for the week, and didn't get a chance to look as far as I know, so LGTM'ing myself. We can always fix any unexpected fallout post-commit if needed.

edd added a comment.Aug 24 2020, 6:45 AM

Apologies for not keeping on top of this while I was off work. FTR, looks good. Thanks for taking the time to look at it.