This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ELF] - Fix the different behavior of the linker script symbols on different platforms.
ClosedPublic

Authored by grimar on Dec 11 2018, 2:16 AM.

Details

Summary

This should help make D55423 be even more simple and also
fixes the broken behavior shown in one of our test cases
for some targets, like x86-64.

The issue occurs when the forward declarations are used in the script.
One of the samples is:

SECTIONS {
  foo = ADDR(.text) - ABSOLUTE(ADDR(.text));
};

In that case, we have a broken output when output target does
not use thunks. That happens because thunks creating code
calls Script->assignAddresses() at least once one more time,
what fixups the values.

In this patch, I generalize and rename maybeAddThunks to be used
by all targets.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.Dec 11 2018, 2:16 AM

Overall I'm happy with this change as I think it is simpler than adding another call to Writer<ELFT>::run(). The one remaining thought is whether Writer<ELFT>::run() can be moved into finalizeAddressDependentContent() as there shouldn't be anything after that function that changes address. Could we move the remaining Writer<ELFT>::run() to the end of finalizeAddressDependentContent() ?

Overall I'm happy with this change as I think it is simpler than adding another call to Writer<ELFT>::run(). The one remaining thought is whether Writer<ELFT>::run() can be moved into finalizeAddressDependentContent() as there shouldn't be anything after that function that changes address. Could we move the remaining Writer<ELFT>::run() to the end of finalizeAddressDependentContent() ?

I am not sure honestly. I like that we do not spread and have all the finalizeSynthetic in one place (in one method) and I
like that finalizeAddressDependentContent is an isolated piece where the iteration magic happens.
I would probably try to avoid adding the additional code into it. This patch only removes the code from there
and I think it is good.

But If we decide to do something like this, I would suggest doing it in a different patch.

I think this should be landed because at least it simplifies the code. Ping.

E5ten added a subscriber: E5ten.Apr 4 2019, 6:10 PM

As mentioned above, I'm in favour of making this change.

Rui, I am a bit tired of uncertainty in some LLD patches like this one. Could you please either confirm you're not going to accept this, so I'll abandon it. Or if we can proceed with that somehow, please say your word.

Rui, I am a bit tired of uncertainty in some LLD patches like this one. Could you please either confirm you're not going to accept this, so I'll abandon it. Or if we can proceed with that somehow, please say your word.

Somewhat off-topic; I'm wondering if it is worth revisiting this in a wider context to see if we can find a better way of resolving these compatibility problems. At EuroLLVM there was an interesting discussion in the binutils BOF/Round-table on how important compatibility with GNU tools was. The general consensus was that it was particularly important in projects that had to support both sets of tools and that rewriting inputs to work around limitations in one or other of the tools was a significant headache. The overall philosophy to be taken forward was for GNU compatible option names and output wherever possible, i.e. unless the GNU behaviour was impossible to implement or was crazy anyway. I think that it would be a reasonable to come up with a similar position for LLD, perhaps an RFC to LLVM-DEV might be a way forward.

With the linux kernel attempting to use LLD, first with AArch64 and Arm as part of Android, but there are people trying with x86_64 as well, simultaneously there is an increased interest from embedded systems as LLD is naturally part of the Clang/LLVM binutils. I think we are so close to be able to support these projects, but we'll have a long tail of small inconsistencies, like this one that we should fix. It would be good to come up with a common process/design to fix these that didn't get stuck on reviews.

ruiu accepted this revision.Apr 17 2019, 11:50 PM

LGTM

Changes to the linker script processor oftentimes has subtle implications and in order to review such change I have to stop and think. But for this patch I took too much time. Apologies for the delay.

Somewhat off-topic; I'm wondering if it is worth revisiting this in a wider context to see if we can find a better way of resolving these compatibility problems. At EuroLLVM there was an interesting discussion in the binutils BOF/Round-table on how important compatibility with GNU tools was. The general consensus was that it was particularly important in projects that had to support both sets of tools and that rewriting inputs to work around limitations in one or other of the tools was a significant headache. The overall philosophy to be taken forward was for GNU compatible option names and output wherever possible, i.e. unless the GNU behaviour was impossible to implement or was crazy anyway. I think that it would be a reasonable to come up with a similar position for LLD, perhaps an RFC to LLVM-DEV might be a way forward.

I think that's a fair argument and we should take that stance; as long as it's not impossible to implement nor too crazy, we should make lld compatible with GNU bfd linker whenever possible at least in terms of linker script compatibility. I don't personally like the linker script and perhaps the ELF file format itself (I dislike the fact that segments and sections are orthogonal in the ELF format), but in many cases a linker script just does its job, and without that some kind of task is not doable. In practice, GNU compatibility is very important.

One of the reasons why it is not easy to review linker script code change is because it's complicated. I don't feel I fully understand the existing code of the linker script processor. It grows organically, and in many cases a small issue can be resolved by adding another call of assignOffsets(). It would for sure fixes a problem, but it is sometimes hard to say whether that's correct or not. Each stage doesn't seem clearly separated, and it is sometimes not clear whether doing something in some stage is the right thing or not. Is this due to the nature of the complexity of the linker script? I don't think so -- we didn't understand the linker script well when we started implementing it a few years ago, and the way how we organized code in the few years was still there. But now we understand it well including corner cases. I want to sit down and think hard about how to (re)organize the code. However, that idea seems the cause of slow review of linker script patches (I always started thinking how this should really be organized). I'll defer the idea of refactoring and let me review linker script changes more casually. I'm sorry about the delay, again.

ELF/Writer.cpp
1503 ↗(On Diff #177672)

Can you mention that for x86 this loop iterates only once?

This revision is now accepted and ready to land.Apr 17 2019, 11:50 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2019, 1:18 AM