This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Linkerscript: fix issue with multiple output sections definitions.
AbandonedPublic

Authored by grimar on Nov 1 2017, 5:20 AM.

Details

Reviewers
ruiu
rafael
Summary

Currently for following script LLD produce broken output silently:

bar : ONLY_IF_RO { *(no_such_section) }
bar : ONLY_IF_RW { *(foo_aw) }
sym1 = SIZEOF(bar); 
sym2 = ADDR(bar);

Values of sym1 and sym2 are zeroes.

That happens because of NameToOutputSection member which currently is a
mapping from name to OutputSection. Though in case above there are two different
output section commands, so I believe it should be a mapping to vector.

Patch changes implementation of mapping what fixes the issue observed.

Diff Detail

Event Timeline

grimar created this revision.Nov 1 2017, 5:20 AM
grimar updated this revision to Diff 123004.Nov 15 2017, 4:16 AM
  • Rebased. Ping.

If I understand correctly is that there is no restriction on OutputSection name so something like:

bar : { *.bar.1 }
bar : { *.bar.2 }

Would be legal, but not very sensible and a user could just change the name of the second bar. Whereas the case you describe:

bar : ONLY_IF_RO { *(no_such_section) }
bar : ONLY_IF_RW { *(foo_aw) }
sym1 = SIZEOF(bar); 
sym2 = ADDR(bar);

Is useful as it permits a program to get the [base, limit) of an OutputSection that can contain either RO or RW? This would make the added complexity of the change worthwhile?

If I've understood their purpose correctly; I'm not too sure that the forward declarations are a good idea. Recent ld manuals prohibit forward references for many of the builtin functions.

ELF/ScriptParser.cpp
974

If I understand correctly this might create a forward reference to some OutputSection Name that we've not seen yet? I don't think that this is allowed in this case: https://sourceware.org/binutils/docs/ld/Builtin-Functions.html

Return the address (VMA) of the named section. Your script must previously have defined the location of that section.

999

If this is a forward reference then the manual says we should give an error message:

If the section has not been allocated when this is evaluated, the linker will report an error.

1050

The docs don't say anything about forward references for this particular builtin. I think that there is a possibility in creating a cycle in address allocation though with something like . = LOADADDR(forward)

1073

Another case where we are supposed to give an error if there is a forward reference:

If the section has not been allocated when this is evaluated, the linker will report an error.

grimar added inline comments.Nov 15 2017, 7:20 AM
ELF/ScriptParser.cpp
974

It is actually what original code did already before my change. getOrCreateOutputSection creates forward reference.
We rely on that in our testcases, for example absolute.s has just:
"PROVIDE(foo = 1 + ABSOLUTE(ADDR(.text)));" and expects we can evaluate that. I believe we have the same
behavior as gnu linkers here.

Below you referenced to spec saying that for ALIGNOF "If the section *has not been allocated* when this is evaluated, the linker will report an error.". I think that is how it should say for all commands you mentioned.
We use checkIfExists to verify that section was allocated at the moment when we call evaluation of ADDR or other commands, and for parsing script we create forward references atm.

ruiu edited edge metadata.Nov 15 2017, 11:15 PM

Before getting into details, I'd like to ask if such linker script is valid in GNU linkers. Also I wonder how you noticed this.

In D39489#927076, @ruiu wrote:

Before getting into details, I'd like to ask if such linker script is valid in GNU linkers.

Yes, both gold and bfd accepts this script and produce valid result.

Also I wonder how you noticed this.

Just found during review of code that our logic is probably weak here and tried to break it
with the testcase from this patch.

ruiu added a comment.Nov 16 2017, 12:59 AM

If this is hypothetical, I wouldn't work this hard to "fix" it.

In D39489#927124, @ruiu wrote:

If this is hypothetical, I wouldn't work this hard to "fix" it.

Its not only fixes the issue. Our current logic of 'createOutputSection' honestly looks very odd for me.
It can depending on conditions:

  1. Create and return new output section and remember it in NameToOutputSection.
  2. Create and return new output section and don't remember it.
  3. Just declare and return previously defined output section.

And It works with name to ouput sections as 1:1 though we know its 1:many semantically.

I believe new logic and naming introduced in this patch is much more clear.
That is what I tried to improve as well here.

ruiu added a comment.Nov 20 2017, 3:49 AM

But with this patch the code is longer than before. If it doesn't have any practical benefits, it is probably not a good idea to add more code.

grimar abandoned this revision.Dec 1 2017, 4:14 AM