This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Add OVERWRITE_SECTIONS command
ClosedPublic

Authored by MaskRay on May 28 2021, 1:16 AM.

Details

Summary

This implements https://sourceware.org/bugzilla/show_bug.cgi?id=26404

An OVERWRITE_SECTIONS command is a SECTIONS variant which contains several
output section descriptions. The output sections do not have specify an order.
Similar to INSERT [BEFORE|AFTER], LinkerScript::hasSectionsCommand is not
set, so the built-in rules (see docs/ELF/linker_script.rst) still apply.
OVERWRITE_SECTIONS can be more convenient than INSERT because it does not
need an anchor section.

The initial syntax is intentionally narrow to facilitate backward compatible
extensions in the future. Symbol assignments cannot be used.

This feature is versatile. To list a few usage:

  • Use section : { KEEP(...) } to retain input sections under GC
  • Define encapsulation symbols (start/end) for an output section
  • Use section : ALIGN(...) : { ... } to overalign an output section (similar to ld64 -sectalign)

When an output section is specified by both OVERWRITE_SECTIONS and
INSERT, INSERT is processed after overwrite sections. To make this work,
this patch changes InsertCommand to use name based matching instead of pointer
based matching. (This may cause a difference when INSERT moves one output
section more than once. Such duplicate commands should not be used in practice
(seems that in GNU ld the output sections may just disappear).)

A linker script can be used without -T/--script. The traditional SECTIONS
commands are concatenated, so a wrong rule can be more noticeable from the
section order. This feature if misused can be less noticeable, just like
INSERT.

Diff Detail

Event Timeline

MaskRay created this revision.May 28 2021, 1:16 AM
MaskRay requested review of this revision.May 28 2021, 1:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2021, 1:17 AM
MaskRay edited the summary of this revision. (Show Details)May 28 2021, 12:49 PM

Will take a look this week. Apologies, bit behind due to UK public holiday on Monday.

Sorry for the delay in responding. I'm in favour of adding this as it should combine well with LLD when no default linker script is used.

I'm thinking overwrite-section-script may be a better name than --overwrite-script. It would reinforce the narrow nature of the script, to overwrite specific SECTIONS commands, and also help people not seeing it as overwriting their whole linker script.

With respect to Roland's comment in https://sourceware.org/bugzilla/show_bug.cgi?id=26404#c10 one way of having this work by adding another script fragment would be to add a keyword at the top of the linker script for example:

/* This linker script fragment is an overwrite script */
OVERWRITE_SCRIPT()

Will be good to have a test to show how --overwrite-script combines with INSERT AFTER and INSERT BEFORE

lld/ELF/Driver.cpp
222 ↗(On Diff #348461)

Could we put /* overwrite */ before the false.

lld/ELF/Options.td
369 ↗(On Diff #348461)

I'd recommend being more specific here
"Read a script containing SECTION commands that replace SECTIONS in the main or default linker script."

Will be good to update docs/ld.lld.1 and perhaps the release notes?

lld/ELF/ScriptParser.cpp
284

expect("SECTIONS") will do a good job of preventing other linker script commands, but it may leave people wondering why? Would it be possible to do something like:

expect("SECTIONS")
if (!errorCount())
  // Diagnostic "overwrite script can only contain SECTIONS commands."
lld/test/ELF/linkerscript/overwrite-script.test
106 ↗(On Diff #348461)

Will be good to have a diagnostic for an overwrite script that doesn't contain SECTIONS.

MaskRay updated this revision to Diff 349637.Jun 3 2021, 12:08 PM
MaskRay marked 4 inline comments as done.
MaskRay retitled this revision from [ELF] Add --overwrite-script to [ELF] Add --overwrite-section-script.
MaskRay edited the summary of this revision. (Show Details)
  • Rename to --overwrite-section-script
  • Test INSERT
  • Test A = 1; outside SECTIONS

Sorry for the delay in responding. I'm in favour of adding this as it should combine well with LLD when no default linker script is used.

I'm thinking overwrite-section-script may be a better name than --overwrite-script. It would reinforce the narrow nature of the script, to overwrite specific SECTIONS commands, and also help people not seeing it as overwriting their whole linker script.

Much appreciate your review. I agree that --overwrite-section-script is better.

With respect to Roland's comment in https://sourceware.org/bugzilla/show_bug.cgi?id=26404#c10 one way of having this work by adding another script fragment would be to add a keyword at the top of the linker script for example:

/* This linker script fragment is an overwrite script */
OVERWRITE_SCRIPT()

This would add some complexity to readLinkerScript as we will need to check the first token. Do we really need it?
This is just a matter between $file and -Wl,--overwrite-script=$file in the build system.

This would add some complexity to readLinkerScript as we will need to check the first token. Do we really need it?
This is just a matter between $file and -Wl,--overwrite-script=$file in the build system.

That is no small matter in many build systems. Switches that name input files are especially difficult in general.

Even if you have no sympathy for build systems you don't know about, requiring a switch precludes using this functionality as a "transparent" part of a library implementation. Supporting "just add -lfoo to your compile" as the end-user API for using a library is highly desireable, and often a non-negotiable requirement. Having a libfoo.a input linker script that wraps an INPUT ( libfoo-impl.a ) along with other magic is a well-worn and very successful technique. The use cases you cite as motivations for adding this feature in the first place are exactly the kinds of things one would like to encapsulate as "hidden magic" in a library that's "just linked".

MaskRay edited the summary of this revision. (Show Details)Jun 3 2021, 9:29 PM

Thanks for the updates.

One idea that might work better for the linker script fragment is to use a new keyword. For example instead of using --overwrite-section-script and SECTIONS. Use a new keyword like OVERWRITE_SECTIONS this could essentially be mixed in with any other linker script without needing --overwrite-section-script.

Thanks for the updates.

One idea that might work better for the linker script fragment is to use a new keyword. For example instead of using --overwrite-section-script and SECTIONS. Use a new keyword like OVERWRITE_SECTIONS this could essentially be mixed in with any other linker script without needing --overwrite-section-script.

I think that's an excellent idea. The new command really has different semantics from the old one, so making it a different keyword feels right.
It also makes it much simpler to have clear rules about what linker constructs are allowed in what kinds of contexts. Another syntax that would have equivalent properties from the user's point of view would be a trailing OVERWRITE keyword. That frankly seems less clear than OVERWRITE_SECTIONS, but it's more consistent with the existing SECTIONS { ... } INSERT ... syntax.

MaskRay updated this revision to Diff 349948.Jun 4 2021, 12:52 PM
MaskRay retitled this revision from [ELF] Add --overwrite-section-script to [ELF] Add OVERWRITE_SECTIONS command.
MaskRay edited the summary of this revision. (Show Details)

Use OVERWRITE_SECTIONS
Delete --overwrite-section-script

Introducing new syntax was my initial thought. Now I agree that having overwrite in the name is useful, OVERWRITE_SECTIONS looks great to me.
It now feels to me that --overwrite-section-script is less essential so I just removed the option from the patch.

Thanks very much for the update. I've made a few suggestions for the documentation.

One case I wanted to check was something like the following:

OVERWRITE_SECTIONS {
  orphan_section_name_matching_nothing_seen_before : { ... }
}

SECTIONS {
  another_orphan_section_name_matching_nothing_seen_before_including_overwrite : { ... }
} INSERT AFTER orphan_section_name_matching_nothing_seen_before

Would that place "orphan_section_name_matching_nothing_seen_before" using orphan rules then "orphan_section_name_matching_nothing_seen_before" immediately after it?

lld/docs/ELF/linker_script.rst
57

Suggest adding something like the insertion occurs after input sections have been mapped to output sections but before orphan sections have been processed.

59

A suggestion to avoid the "absence of a non-insert" which is difficult to parse.

In the case where no linker script has been provided or every `SECTIONS` command is followed by INSERT, LLD applies built-in rules which are similar to GNU ld's internal linker scripts.

147

"also appears in a `SECTIONS` command," might read a bit better. Only a weak suggestion though.

151

"also appears in"?

156

I got a little confused about this at first. Maybe worth being more specific:
"will be provided by the description in the `OVERWRITE_SECTIONS` command."

MaskRay updated this revision to Diff 350384.Jun 7 2021, 12:24 PM
MaskRay marked 5 inline comments as done.

Thanks for the comments. Addressed and add another INSERT test.

peter.smith accepted this revision.Jun 8 2021, 1:35 AM

LGTM thanks. One tiny typo spotted in a comment in the test.

lld/test/ELF/linkerscript/overwrite-sections.test
81

typo "than -> then"

This revision is now accepted and ready to land.Jun 8 2021, 1:35 AM
MaskRay updated this revision to Diff 350637.Jun 8 2021, 9:51 AM
MaskRay marked an inline comment as done.

fix a typo

I'll push this next week.

MaskRay edited the summary of this revision. (Show Details)Jun 13 2021, 12:37 PM
MaskRay edited the summary of this revision. (Show Details)Jun 13 2021, 12:39 PM
This revision was landed with ongoing or failed builds.Jun 13 2021, 12:41 PM
This revision was automatically updated to reflect the committed changes.