This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Allow defined symbols to be assigned from linker script
ClosedPublic

Authored by meadori on Nov 30 2016, 11:52 AM.

Details

Reviewers
ruiu
atanasyan
Summary

Hi All,

I have been working on extending LLD to handle assigning new values to symbols that are already defined via the linker script (e.g. override). This is common in some scripts that have (mainly from bare-metal GCC toolchains). The attached patch mostly works, but it has some problems related to linker-defined symbols that I will describe next. I am *not* submitting this patch as a formal submission, but a request for feedback.

As mentioned, this patch mostly works for overriding a symbol's value from the linker script, but currently falls over with respect to _gp on MIPS. In particular, the test added for D27036 fails. The reason being is that _gp is defined relative to .got. Thus consider a script with something like:

_gp = 0x100;
.got  : { *(.got) }

_gp gets a value of &.got + 0x100. The *new* definition from the linker script gets defined in .got too.

FWIW, in most MIPS toolchains I have work with _gp is an absolute symbol and *not* a .got relative one and the default linker scripts have things like:

HIDDEN (_gp = ALIGN (16) + 0x7ff0);
.got            : { *(.got) }

Comments on the general symbol override functionality and the MIPS related issue are most welcome.

Diff Detail

Event Timeline

meadori updated this revision to Diff 79784.Nov 30 2016, 11:52 AM
meadori retitled this revision from to [ELF, WIP] Allow defined symbols to be assigned from linker script.
meadori updated this object.
meadori added reviewers: ruiu, atanasyan.
meadori added a subscriber: llvm-commits.
ruiu edited edge metadata.Nov 30 2016, 1:32 PM

In the first place, I wonder why you want to overwrite _gp. What is the background of your motivation?

LLD supports ADDR() funciton, so I think you can do this in a straightforward manner like this.

_gp = ADDR(.got) + 0x100;

Does it work for you?

I want to be able to override *any* symbol's value (which is support in GNUnLD, for example). Currently LLD issues multiple definition errors if I say something like:

foo = 1

and foo is defined in an object file or something.

The _gp thing was just a issue I ran into while implementing the general ability to redefine a symbol's value from a linker script because _gp is defined by the linker relative to .got and thus causes problems when completely overriding it from a linker script.

atanasyan edited edge metadata.Nov 30 2016, 1:53 PM

_gp gets a value of &.got + 0x100. The *new* definition from the linker script gets defined in .got too.

FWIW, in most MIPS toolchains I have work with _gp is an absolute symbol and *not* a .got relative one

You are right. It's an old bug in LLD and I regularly postpone to fix it. The _gp symbol should be an absolute symbol. If it is not defined in a linker script we need to assign default value .got + 0x7ff0. It looks like it's time to fix this problem.

To Rui: The _gp symbol marks the base of the small data area. Not only GOT relocations depend on _gp. Users of bare-metal toolchains might want to place small data to specific address.

ruiu added a comment.Nov 30 2016, 3:25 PM

Got it. So that's not specific to _gp, but you are talking about the fact that LLD complains if a symbol you are trying to define is already defined in an object file.

That's just a negligence or a bug. Symbols in linker scripts should take precedence over pre-existing symbols.

FYI

I fixed the problem with _gp symbol in the D27524.

Nice! Thank you! I will check it out.

ruiu added a comment.Dec 7 2016, 10:10 AM

What is the status of this patch? Is this ready for review?

In D27276#616083, @ruiu wrote:

What is the status of this patch? Is this ready for review?

I waiting for D27524 to go through. Once that happens I will rework the patch with better comments and tests. If you prefer, we can close this review and I will open a new one later.

meadori updated this revision to Diff 80809.Dec 8 2016, 12:36 PM
meadori retitled this revision from [ELF, WIP] Allow defined symbols to be assigned from linker script to [ELF] Allow defined symbols to be assigned from linker script.
meadori edited edge metadata.
  • Added a few more comments.
  • Added test cases.
ruiu added inline comments.Dec 8 2016, 12:48 PM
ELF/LinkerScript.cpp
85

What is "it" here?

103

I think this function can be simplified. We do not create a new symbol only when

  • it is in a PROVIDE(), and
  • the existing symbol is not an Undefined symbol.

Otherwise, we always want to create symbol and then assign a value to it. So, we can return early for the PROVIDE() case.

106–107

nit: the ternary operator doesn't look nice if it need too much indentation. Probably this is better.

if (Cmd_>Expression.IsAbsolute())
  Cmd->Sym = addRegular...;
else
  Cmd->Sym = ...;
meadori updated this revision to Diff 80830.Dec 8 2016, 2:42 PM

Address review feedback:

  • Clarify code comments.
  • Simplify logic in addSymbol.
ruiu added inline comments.Dec 8 2016, 2:51 PM
ELF/LinkerScript.cpp
92–96

Use early return and assignment-in-if.

  return;
}
if (const OutputSectionBase *Sec = Cmd->Expression.Section())
  cast<....>... = ...;
110–117

I don't think you need to check if B exists. You can add a new symbol unconditionally.

meadori added inline comments.Dec 8 2016, 3:03 PM
ELF/LinkerScript.cpp
110–117

If you do it unconditionally, then duplicate symbols might be created, e.g. foo is defined in an input object *and* the linker script.

ruiu added inline comments.Dec 8 2016, 3:06 PM
ELF/LinkerScript.cpp
110–117

That shouldn't happen because of the architecture of LLD. Did you actually try that?

meadori added inline comments.Dec 8 2016, 3:29 PM
ELF/LinkerScript.cpp
110–117

I did, and several tests in the test suite fail. Consider making unconditional with the following diff against my current patch:

-  if (!B || B->isUndefined()) {
-    if (Cmd->Expression.IsAbsolute())
-      Cmd->Sym = addRegular<ELFT>(Cmd);
-    else
-      Cmd->Sym = addSynthetic<ELFT>(Cmd);
-  } else {
-    Cmd->Sym = B;
-  }
+  if (Cmd->Expression.IsAbsolute())
+    Cmd->Sym = addRegular<ELFT>(Cmd);
+  else
+    Cmd->Sym = addSynthetic<ELFT>(Cmd);

Then:

$ ./bin/llvm-nm test.o 
0000000000000001 A foo
$ cat test.ld
foo = 12;
$ ./bin/ld.lld test.o -T test.ld
./bin/ld.lld: error: duplicate symbol 'foo' in test.o and (internal)
ruiu added inline comments.Dec 8 2016, 3:40 PM
ELF/LinkerScript.cpp
110–117

Got it. So addRegular and such check for conflicts. But what we want to do here is to replace symbols unconditionally. Otherwise, this code doesn't work for some cases. For example, what if an existing symbol is a common symbol?

I think you want directly call replaceBody<> to replace symbol body unconditionally.

I'm concerned a bit about the XFAIL mips test. Do you have any estimations when this patch be ready?

I'm concerned a bit about the XFAIL mips test. Do you have any estimations when this patch be ready?

I will have a new patch out this week. Sorry for the delay, I hit another bug when implementing Rui's latest feedback.

I'm concerned a bit about the XFAIL mips test. Do you have any estimations when this patch be ready?

I will have a new patch out this week. Sorry for the delay, I hit another bug when implementing Rui's latest feedback.

No problem. Thanks for the information.

meadori updated this revision to Diff 82948.Jan 3 2017, 2:13 PM

Apologies for the delay on this one. @ruiu, if I understand correctly you are suggesting for the unconditional replacement of the symbols in the symbol table. That make sense to me. The latest patch adds a "force create" option to the appropriate symbol table methods. Is that okay?

With that, I can get the desired behavior for replacing symbols *however* there is still one test failing: lld :: ELF/mips-gp-ext.s. The reason this fails is that the expression _gp = . + 0x100; gets put in the wrong section. In particular, it gets put in .text instead of *ABS*. This section assignment problem is due to another bug in lld that needs to be invested.

In suggestion on a way forward? Fixing the aforementioned section assignment problem doesn't seem entirely straight forward.

ruiu added inline comments.Jan 3 2017, 3:13 PM
ELF/SymbolTable.h
63–68 ↗(On Diff #82948)

This is towards a right direction, but you don't need to modify the symbol table. The symbol table is to map names to symbols, but because you already have a symbol object in this case, symbol table lookups are not necessary. Please use replaceBody() to overwrite existing symbols with new ones.

meadori added inline comments.Jan 5 2017, 7:55 PM
ELF/SymbolTable.h
63–68 ↗(On Diff #82948)

Oh, I missed that you can get the Symbol object from the SymbolBody object. I get it now. Thanks.

meadori updated this revision to Diff 83341.EditedJan 5 2017, 7:58 PM
  • Address @ruiu's point about using replaceBody.
  • Add XFAIL back to test/ELF/mips-gp-ext.s -- this test fails now due to another lld bug. I will look into the other bug after pushing this through.
ruiu added inline comments.Jan 5 2017, 8:52 PM
ELF/LinkerScript.cpp
62–63

It is probably easier to add a dummy symbol and then replace that unconditionally like this.

Symbol *Sym = Symtab<ELFT>::X->addUndefined(Cmd->Name);
replaceBody<DefinedRegular<ELFT>>(Sym, Cmd->Name, ...);
return Sym->body();

You can remove Symbol *Sym from the parameter list of this function.

67

Ditto. You don't need to pass Sym.

meadori added inline comments.Jan 6 2017, 11:12 AM
ELF/LinkerScript.cpp
62–63

Much better. Thanks!

meadori updated this revision to Diff 83387.Jan 6 2017, 11:14 AM
  • Address @ruiu's point about using addUndefined. Note that I had to use the long form to control the visibility (for symbols defined in PROVIDE).
  • XFAIL of test/ELF/mips-gp-ext.s remains -- this test fails now due to another lld bug. I will look into the other bug after pushing this through.
ruiu accepted this revision.Jan 7 2017, 12:46 AM
ruiu edited edge metadata.

LGTM

ELF/LinkerScript.cpp
70–71

nit: remove this blank line so that it matches with the style in the above function.

This revision is now accepted and ready to land.Jan 7 2017, 12:46 AM
meadori closed this revision.Jan 9 2017, 10:50 AM

Committed. Thanks for all the review @ruiu. Much appreciated.