This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Do not ICF two sections with different output sections (by SECTIONS commands)

Authored by MaskRay on Aug 25 2019, 9:13 AM.



Fixes PR39418. Complements D47241 (the non-linker-script case).

processSectionCommands() assigns input sections to output sections.
ICF is called before it, so and may be folded even if
their output sections are made different by SECTIONS commands.

doIcf<ELFT>()                      // During ICF, we don't know the output sections
  script->processSectionCommands() // InputSection -> OutputSection assignment

This patch splits processSectionCommands() into processSectionCommands() and
processSymbolAssignments(), and moves processSectionCommands() before ICF:

doIcf<ELFT>()                      // should remove folded input sections

An alternative approach is to unfold a section sec in
processSectionCommands() when we find sec and sec->repl belong to
different output sections. I feel this patch is superior because this
can fold more sections and the decouple of
SectionCommand/SymbolAssignment gives flexibility:

  • An ExprValue can't be evaluated before its section is assigned to an output section -> we can delete getOutputSectionVA and simplify another place where we had to check if the output section is null. Moreover, a case in linkerscript/early-assign-symbol.s can be handled now.
  • processSectionCommands/processSymbolAssignments can be freely moved around.

Diff Detail


Event Timeline

MaskRay created this revision.Aug 25 2019, 9:13 AM
MaskRay updated this revision to Diff 217069.Aug 25 2019, 7:55 PM
MaskRay edited the summary of this revision. (Show Details)
MaskRay added subscribers: andrewng, bd1976bris.

Add subscribed people in an previous approach D54422

Delete a stale comment
Fix deletion of dead input section members
Improve linkerscript/icf-different-output-sections.s to catch the case

MaskRay updated this revision to Diff 217072.Aug 25 2019, 9:06 PM
MaskRay edited the summary of this revision. (Show Details)

Fix the call tree in the description

Simplify processSymbolAssignments

MaskRay updated this revision to Diff 217081.Aug 25 2019, 11:12 PM
MaskRay edited the summary of this revision. (Show Details)

I realized EhInputSection and .ARM.exidx* filtering is tricky.
Created D66727 as a prerequisite. This shall make overall diff smaller

MaskRay updated this revision to Diff 217083.EditedAug 26 2019, 12:30 AM

Delete a stale .ARM.exidx* check.

I think the new processSymbolAssignments() gives us a simple workaround for

% ld.lld -shared a.o -T a.x -Bsymbolic
ld.lld: error: can't create dynamic relocation R_AARCH64_ABS32 against symbol: __efistub_stext_offset in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
>>> defined in a.x:1
>>> referenced by a.o:(.text+0x4)

If we call script->processSymbolAssignments() the second time, __efistub_stext_offset will be changed to an absolute value.

   // Change values of linker-script-defined symbols from placeholders (assigned
   // by declareSymbols) to actual definitions.
+  script->processSymbolAssignments();
% ld.lld -shared a.o -T a.x -Bsymbolic # works

If we say D66279 solves the the value convergence problem, then adding another processSymbolAssignments() can be said to solve the type (absolute or non-absolute) convergence problem.

Given the direction of how linkerscript support is going forward in LLD last time, this approach looks generally good to me.

11 ↗(On Diff #217083)

Since with this patch we can handle the case above, this comment needs either to be moved/removed and/or updated.

45 ↗(On Diff #217083)

Seems you only need to have a 2 .rodata sections for this test?

I'd suggest testing/demonstrating the output from the 2 scripts here probably:

  1. : { *(*) } : { *(*) }
  1. .rodata: { *(.rodata.*) }

I.e. idea is to show in the test that we can put sections in a single or a multiple sections and
that in the case (2) input sections will be folded and in case (1) they will not.

ruiu added a comment.Aug 26 2019, 2:33 AM

I like this new approach because it seems more modular than before.

503 ↗(On Diff #217083)

This comment is perhaps a bit too terse. Can you explain in the comment why we are doing this?

155 ↗(On Diff #217083)

Maybe this and createSynethticSections should be moved to a new file if they are no longer a part of the writer?

MaskRay marked an inline comment as done.Aug 26 2019, 3:35 AM
MaskRay added inline comments.
155 ↗(On Diff #217083)

Do you have a recommendation what the new file should be called? While this is under review, I am experimenting other reordering this change will enable (e.g. moving addOrphanSections from Writer.cpp to between processSectionCommands/ICF). The definition of "Writer" is indeed unclear to me now...

MaskRay updated this revision to Diff 217113.Aug 26 2019, 4:31 AM
MaskRay marked 4 inline comments as done.

Delete a redundant dyn_cast<OutputSection>
Add more comments to ICF<ELFT>::run()

I agree that this is going in the right direction. I'm particularly happy about splitting out the symbol assignments.

53 ↗(On Diff #217113)

If we end up calling this with sec->getOutputSection() == nullptr we'll get a segfault that could be hard to track down. It may be worth preserving getOutputSectionVA with an assert non nullptr. Alternatively if we can guarantee that getOutputSection() is non nullptr then we could put an assert into getOutputSection().

155 ↗(On Diff #217083)

I've not got a good answer right now, will try and have a think. May be worth going through the overall control flow and seeing what comes out. My high-level view of a linker control flow is:

  • Load all content from files into an address independent representation.
  • Do address independent transformation such as GC and ICF.
  • Create synthetic InputSections.
  • Layout the output file (InputSections -> OutputSections), although not necessarily assign addresses yet.
  • Populate synthetic content such as the .plt and .got. It helps to have removed as much as possible prior to this point.
  • Assign addresses.
  • Perfom address dependent transformations.
  • Finalize addresses (We now have a logical final ELF file).
  • Write ELF file.

At each stage we are adding information, permitting more transformations, but also restricting our freedom as making changes invalidates the information. That is a control flow view of the linker, rather than a structural view of the components though.

I think our Writer.cpp combines everything from create synthetic InputSections down. If I were hazarding a guess at a new name for the moved functions I'd recommend something like layout.cpp. In any case it may be worth taking a step back to see what the options are.

MaskRay updated this revision to Diff 217403.Aug 27 2019, 7:38 AM
MaskRay marked an inline comment as done.

Restore static uint64_t getOutputSectionVA(SectionBase *sec) { with an assert

MaskRay added inline comments.Aug 27 2019, 7:40 AM
53 ↗(On Diff #217113)

We cannot guarantee getOutputSection() returns non nullptr.

merge-gc-piece2.s and gc-sections-non-alloc-to-merge.s are tests where a MergeInputSection may have a null InputSection parent.

I tried adding more asserts to SectionBase::getOutputSection but that seems hard. We probably should do some other layout changes before enforcing nonnullness in getOutputSection.

Restored with an assert.

45 ↗(On Diff #217083)

I think .rodata* are redundant. I will delete them.

Will add another script for your point (1), and add another script for orphans.

Shall we move forward with this change? 🤡

(BTW, Peter's comment at is great.)

I have no more comments/objections, I think it is good to go.

I'm happy to move forward with the change as well.

ruiu accepted this revision.Sep 2 2019, 3:12 AM


I'm happy about this change too.

This revision is now accepted and ready to land.Sep 2 2019, 3:12 AM
This revision was automatically updated to reflect the committed changes.