This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Linkerscript: allow adding start/end symbols to arbitrary section
ClosedPublic

Authored by evgeny777 on Aug 19 2016, 8:42 AM.

Details

Summary

Current revision does not allow adding symbols to special output sections, like .eh_frame, .eh_frame_hdr and others.
This patch implements this feature (which is required by our dynamic linker)

Diff Detail

Repository
rL LLVM

Event Timeline

evgeny777 updated this revision to Diff 68694.Aug 19 2016, 8:42 AM
evgeny777 retitled this revision from to [ELF] Linkerscript: allow adding start/end symbols to arbitrary section.
evgeny777 updated this object.
evgeny777 added a reviewer: ruiu.
evgeny777 set the repository for this revision to rL LLVM.
evgeny777 added a project: lld.
evgeny777 added subscribers: grimar, ikudrin, llvm-commits.
ruiu edited edge metadata.Aug 19 2016, 8:50 AM

This patch seems a bit tricky because it handles symbols at start or end of sections command specially. For example, it works as expected if a command is

.eh_frame_hdr { _foo = .; *(.eh_frame_hdr); _bar = .; }

but doesn't if

.eh_frame_hdr { _foo = .; *(.eh_frame_hdr); _bar1 = .; bar2 = .; }

I can see why you needed to do this (because special output sections are not a list of output sections), but I'm wondering if there's a better way than the way you implemented in this patch.

I agree, but unfortunately I don't see a lot of other options. What comes to my mind is:

  • Implement more sophisticated logic, so it would handle your case.
  • Check for symbol names (section name + "_end" or "_start"), and if there are then add start and/or end symbols.

Any other suggestions are appreciated, of course.

evgeny777 updated this revision to Diff 68831.Aug 22 2016, 1:45 AM
evgeny777 edited edge metadata.
evgeny777 added subscribers: emaste, davide.

An attempt to make it slightly more clever.

ruiu added inline comments.Aug 22 2016, 1:51 AM
ELF/LinkerScript.cpp
313 ↗(On Diff #68831)

I agree we need this function, but this is tricky, so this needs comments.

316 ↗(On Diff #68831)

Add is used only once, so please inline.

evgeny777 updated this revision to Diff 68835.Aug 22 2016, 2:24 AM
evgeny777 removed rL LLVM as the repository for this revision.

Added support for expressions (see test case)

evgeny777 updated this revision to Diff 68838.Aug 22 2016, 2:35 AM

Addressed review comments

ruiu added inline comments.Aug 22 2016, 2:46 AM
ELF/LinkerScript.cpp
314 ↗(On Diff #68838)

I'd also mention that special sections are not a list of regular output sections and therefore the regular mechanism to define symbols doesn't work. It's also good to mention that this approach is not perfect -- it only handles start and end symbols.

321 ↗(On Diff #68838)

Use isa instead of dyn_cast.

321–324 ↗(On Diff #68838)

Let's reduce the indentation level by using early continues.

if (isa<InputSectionDescription>(...)) {
  Start = false;
  continue;
}
auto *AssignCmd = dyn_cast...;
if (!AssignCmd)
  continue;
evgeny777 updated this revision to Diff 68841.Aug 22 2016, 3:18 AM

Addressed review comments

ruiu added inline comments.Aug 23 2016, 11:33 PM
ELF/LinkerScript.cpp
316 ↗(On Diff #68841)

Format.

333–335 ↗(On Diff #68841)

If an output section is defined as

.eh_frame { start = .; foo.o(.eh_frame); bar = .; bar.o(.eh_frame); end = . }

then bar is silently set so that it points to the end of .eh_frame section? I think we want to print an error message for that instead.

evgeny777 updated this revision to Diff 69087.Aug 24 2016, 12:41 AM

Added error checking + test case for error.

evgeny777 updated this revision to Diff 69490.Aug 27 2016, 8:02 AM

Fixed bug in addStartEndSymbols() which caused the following script to fail

.eh_frame_hdr : {
   *(.eh_frame_hdr)
   *(.eh_frame_hdr)
}
ruiu added inline comments.Aug 29 2016, 2:01 PM
ELF/LinkerScript.cpp
321 ↗(On Diff #69490)

It seems unnecessarily complicated to me. Why don't you do something like this?

// Add start symbols.
for (int I = 0, E = Cmd->Commands.size(); I < E; ++I) {
  if (!isa<SymbolAssignment>(Cmd->Commands[I]))
    break;
  ....
}

// Add end symbols.
for (int I = Cmd->Commands.size() - 1; I >= 0; --I) {
  if (!isa<SymbolAssignment>(Cmd->Commands[I]))
    break;
  ....
}
evgeny777 added inline comments.Aug 29 2016, 2:15 PM
ELF/LinkerScript.cpp
321 ↗(On Diff #69490)

That's fine, but what about error diagnostics, you suggested to add last time?

ruiu added inline comments.Aug 29 2016, 2:18 PM
ELF/LinkerScript.cpp
321 ↗(On Diff #69490)

Well, maybe something like this?

 // Add start symbols.
 int I = 0;
 for (int E = Cmd->Commands.size(); I < E; ++I) {
   if (!isa<SymbolAssignment>(Cmd->Commands[I]))
     break;
   ....
 }

 // Add end symbols.
 int J = Cmd->Commands.size() - 1;
 for (; J >= 0; --J) {
   if (!isa<SymbolAssignment>(Cmd->Commands[J]))
     break;
   ....
 }

// Warn on stray assignment expressions.
for (; I <= J; ++I)
  if (auto *Foo = dyn_cast<SymbolAssignment>(Cmd->Commands[I])
    ....
evgeny777 added inline comments.Aug 30 2016, 2:54 AM
ELF/LinkerScript.cpp
321 ↗(On Diff #69490)

The problem in this implementation is that it add 'end' symbols in reverse order. The following one will not work:

.eh_frame_hdr : {
    foo = .;
    *(.eh_frame_hdr)
    bar1 = .;
    bar2 = bar1 + 1;
}
evgeny777 updated this revision to Diff 69657.Aug 30 2016, 2:56 AM

Diff updated.
I've simplified the 'addStartEndSymbols' function and hope it's easier to understand now.

ruiu accepted this revision.Aug 30 2016, 1:11 PM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Aug 30 2016, 1:11 PM
This revision was automatically updated to reflect the committed changes.