Download Raw Diff

Details

Reviewers

ruiu
• espindola

Commits

rG69186f82f3f5: [LLD][ELF] - A fix for "linker script assignment loses relative nature of…
rLLD358652: [LLD][ELF] - A fix for "linker script assignment loses relative nature of…
rL358652: [LLD][ELF] - A fix for "linker script assignment loses relative nature of…

Summary

This is https://bugs.llvm.org//show_bug.cgi?id=39857.
I added the comment with much more details to the bug page,
the short version is below.

The following script and code demonstrates the issue:

aliasto__text = __text;
 SECTIONS {
  .text 0x1000 : { __text = . ; *(.text) }
 }

call aliasto__text@PLT

LLD fails with "cannot refer to absolute symbol: aliasto__text" error.
It happens because at the moment of scanning the relocations
we do not yet assign the correct/final/any section value for the symbol aliasto__text.
I made a change to Relocations.cpp to workaround that.

I had to remove the symbol-location.s test case completely, because now it does not
trigger any error. Since now all linker scripts symbols are resolved to constants, no errors can be triggered
at all it seems. I checked that it is consistent with the behavior of bfd and gold (they do not trigger errors
for the case from symbol-location.s), so it should be OK. I.e. at least it is probably not the best possible, but
natural behavior.

Diff Detail

Event Timeline

grimar created this revision.Dec 7 2018, 3:20 AM

Herald added a reviewer: • espindola. · View Herald TranscriptDec 7 2018, 3:20 AM

Herald added subscribers: arichardson, emaste. · View Herald Transcript

grimar edited the summary of this revision. (Show Details)Dec 7 2018, 3:26 AM

Updated the comment.

Note that for some targets we already call the assignAddresses at least twice.
That happens because maybeAddThunks calls it.

At first, I thought to reuse maybeAddThunks, it could always run if Script->HasSectionsCommand set,
but it is perhaps not the right place, so I added `assignAddresses to Writer<ELFT>::run().

Thanks for the fix. I kind of thought it might be possible to lazily postpone symbol assignments to other symbols till after linker scripts but this requires quite a bit of contextual information to know when we need the value of a symbol expression. It looks like ld.bfd has a similar post-script fixup for symbol sections and values although it just evaluates all linker defined symbols again rather than all the addresses again. That could be another option if there were performance concerns about redoing the whole of assignAddresses.

One thing I haven't checked through yet is whether there is a more specific check for a relocation to an absolute address. Ideally we wouldn't want to allow a relative relocation to symbol = 0x1000 but we would symbol = some_other_symbol. I don't know whether that is possible though given the context available at the time. Letting it through is probably the best option (rely on the user knowing what they are doing).

In D55423#1323012, @peter.smith wrote:

One thing I haven't checked through yet is whether there is a more specific check for a relocation to an absolute address. Ideally we wouldn't want to allow a relative relocation to symbol = 0x1000 but we would symbol = some_other_symbol. I don't know whether that is possible though given the context available at the time. Letting it through is probably the best option (rely on the user knowing what they are doing).

I agree. In theory, we could do some additional fixup of the symbols values before scanning the relocations, it would allow distinguishing between non-absolute and absolute symbols,
that is what I also tried when experimented with the code for this patch, it was:

void LinkerScript::assignSymbolSections() {
  for (BaseCommand *Base : Script->SectionCommands) {
    auto *Cmd = dyn_cast<SymbolAssignment>(Base);
    if (!Cmd || !Cmd->Sym)
      continue;

    auto Deleter = make_unique<AddressState>();
    Ctx = Deleter.get();
    Ctx->OutSec = Aether;

    ExprValue V = Cmd->Expression();
    if (!V.isAbsolute())
      Cmd->Sym->Section = V.Sec;
  }
}

But it is too complicated and this patch is much simpler without such things.

nickdesaulniers added a subscriber: nickdesaulniers.Dec 7 2018, 2:59 PM

This seems tricky to me. I wonder if you can just fix the problem by reordering symbol assignments in your linker scripts. Also, by allowing forward reference in an assignment, you could theoretically write circular symbol assignments, e.g. foo=bar and bar=foo. That is logically a bit too complicated to me. Maybe we should just keep the linearity of the symbol assignment, instead of supporting this corner case?

In D55423#1324212, @ruiu wrote:

This seems tricky to me. I wonder if you can just fix the problem by reordering symbol assignments in your linker scripts.

Reordering can help to set the correct symbols values. I.e. the following would work for PR:

SECTIONS { .text 0x1000 : { __text = . ; *(.text) } }
aliasto__text = __text;

But it does not look like a final solution. Without reordering we still produce the broken output which might be hard to diagnose.
And I think the initial pattern is used in the wild.

As an example: we have a addr-zero.test test case already (implemented in rL298083) which is broken right now after recent rL332038.
The problem has the same roots, just reordering the sections broke the correctness of symbols evaluation. Doing the one more pass would
solve this once and forever it seems.

Also, by allowing forward reference in an assignment, you could theoretically write circular symbol assignments, e.g. foo=bar and bar=foo.

LLD already accepts this. For script containing foo = boo; boo = foo; we create 2 absolute zero symbols.
For just foo = boo we report a "symbol not found: boo" as expected.

It is not ideal, It happens because we declare the symbols early anyways as a Defined to let
they exist before starting LTO and versioning.

But I think that is just one more example that linker scripts should be used carefully.
I do not think we really should care about a fix for this, because gold has equal behavior it seems (though bfd - not).

Maybe we should just keep the linearity of the symbol assignment, instead of supporting this corner case?

That sure would be the simplest and desired way, if we could drop the existent scripts I think and write LS logic from scratch.
But we have to support non-linearity behavior which is assumed to work I think. Or report an error.

I see one more argument that I find important for the LLD design. See our code in maybeAddThunks:
https://github.com/llvm-mirror/lld/blob/master/ELF/Writer.cpp#L1500

It was implemented to create thunks, but now it does not only that. It is called not only when Target->NeedsThunks,
but also when packed relocations are used. As a result, for some targets, we always call assignAddresses at least once again,
and for others - never do.

This makes the behavior for those platforms to be different from the common case (assuming it is x86-x64).
It is not a problem to call assignAddresses few times, it was designed for that.
And in this patch, I had to add assignAddresses to make other targets, that do not use maybeAddThunks to be happy.
I would prefer to have the consistent behavior of the linker script for all targets. I think we could rename maybeAddThunks
to something else, update its comment and make assignAddresses be always called at least once for each target.
That would remove 2 of 4 code lines this patch has. It is very short for the problem it solves.

I quite like the idea of renaming maybeNeedsThunks(), for example addAddressDependentContent()? We could make sure that there is always at least one pass of assignAddresses done there.

The test case was derived from the linux kernel linker script for AArch64 which would have assignAddresses() called more than once due to AArch64 needing thunks. It is tempting to say just rearrange your linker script but this is not always easy as complex linker scripts are often machine generated. For example the kernel linker script is actually generated from https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/vmlinux.lds.S using macros from https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/image.h .

My understanding is that AArch64 (Arm64 in kernel parlance) is the priority target for using LLD to link the kernel, so I guess the Relocations.cpp would be sufficient on AArch64 and Arm. I don't know if the X86 kernel linker script would suffer from this problem, but it is certainly a possibility.

In D55423#1325166, @peter.smith wrote:

My understanding is that AArch64 (Arm64 in kernel parlance) is the priority target for using LLD to link the kernel, so I guess the Relocations.cpp would be sufficient on AArch64 and Arm. I don't know if the X86 kernel linker script would suffer from this problem, but it is certainly a possibility.

Yes, it worked for AArch64 without changes to Writer.cpp, but I think it is important from many sides to have common logic behavior to be consistent,
so I had to fix this for x64 and hence did the second change.

I created a patch that would allow reducing the amount of changes done here: https://reviews.llvm.org/D55550

In D55423#1325166, @peter.smith wrote:

I quite like the idea of renaming maybeNeedsThunks(), for example addAddressDependentContent()? We could make sure that there is always at least one pass of assignAddresses done there.

I decided to name it finalizeAddressDependentContent in D55550 because finalize seems to be consistent prefix with the functions called around.

I think I'm not still convinced that we should support forward declaration. How often do you really need that? Also, how is it hard/easy to detect if something is a forward reference? How much easy/hard to fix your linker scripts if yours is already using that pattern?

I believe we should at least ban circular definitions such as foo=bar; bar=foo; because it doesn't make any sense.

In D55423#1333787, @ruiu wrote:

I think I'm not still convinced that we should support forward declaration. How often do you really need that? Also, how is it hard/easy to detect if something is a forward reference? How much easy/hard to fix your linker scripts if yours is already using that pattern?

I believe we should at least ban circular definitions such as foo=bar; bar=foo; because it doesn't make any sense.

I am not trying to support the forward declaration. We already do that (example: addr-zero.test).
I think it is a bit hard to detect that given the approach we have (define all linker script symbols early) and I am not sure we should care.
Some of the scripts in the wild (including apps in the BSD world) were using it already and that is why we supported that initially, I think.
Scripts require careful use anyways.

Please take a look at the https://reviews.llvm.org/D55550 first. I think D55550 is a very reasonable change.
With the D55550, this patch becomes really trivial/natural.

nickdesaulniers added inline comments.Apr 2 2019, 3:36 AM

ELF/Relocations.cpp
387–388	@peter.smith suggested that just this part of this patch is what I need in order to link an arm64 Linux kernel with KASLR enabled with LLD (https://bugs.llvm.org/show_bug.cgi?id=39810#c11). I was able to confirm that this change was all that I needed to link a bootable kernel image. Can I please have just this change split out of the patch? Surely this part is less controversial?

Herald added subscribers: jdoerfert, MaskRay. · View Herald TranscriptApr 2 2019, 3:36 AM

grimar marked an inline comment as done.Apr 2 2019, 7:28 AM

grimar added inline comments.

ELF/Relocations.cpp
387–388	If nobody will so that earlier, I'll take a look at how to extract this part with an appropriate test case tomorrow. (Though I still think we just need to land D55550, and then this part. Having a different behavior of linker script on different architectures is just weird).

peter.smith added inline comments.Apr 2 2019, 8:09 AM

ELF/Relocations.cpp
387–388	The pr contains an AArch64 version of the test. It should fail with a link error with -pie. FWIW I agree with George that D55550 is worth landing as well. Linker scripts are often generated from a large set of macros and it isn't always easy to hand edit the files to avoid forward references.

grimar marked an inline comment as done.Apr 3 2019, 5:34 AM

grimar added inline comments.

ELF/Relocations.cpp

387–388

So I ended up with the following test case:

# REQUIRES: aarch64
# RUN: llvm-mc -filetype=obj -triple=aarch64-linux-gnu %s -o %t.o
# RUN: echo "aliasto__text = __text; SECTIONS { .text 0x1000 : { __text = . ; *(.text) } }" > %t.script
# RUN: ld.lld -pie -o %t --script %t.script %t.o
# RUN: llvm-readobj -symbols %t | FileCheck %s

## Check that alias 'aliasto__text' has the correct value.
## (It should belong to the section .text and point to it's start).

## TODO: This could be a test for x86 platform, but currently, LLD
## has an inconsistent behavior of the linker script between x86 and aarch64.
## Under x86, `aliasto__text` would not get the correct value and test would fail.

# CHECK:      Symbol {
# CHECK:        Name: __text
# CHECK-NEXT:   Value: 0x1000
# CHECK-NEXT:   Size: 0
# CHECK-NEXT:   Binding: Global
# CHECK-NEXT:   Type: None
# CHECK-NEXT:   Other: 0
# CHECK-NEXT:   Section: .text
# CHECK-NEXT: }

# CHECK:      Symbol {
# CHECK:        Name: aliasto__text
# CHECK-NEXT:   Value: 0x1000
# CHECK-NEXT:   Size: 0
# CHECK-NEXT:   Binding: Global
# CHECK-NEXT:   Type: None
# CHECK-NEXT:   Other: 0
# CHECK-NEXT:   Section: .text
# CHECK-NEXT: }

.text
.globl _start
.type _start, %function
_start:
.globl aliasto__text
  bl __text
  bl aliasto__text

I.e. it still uses forward declaration for aliasto__text. I found no way to make this piece of the code be useful without
doing that.

Rui's comment above says:
"I think I'm not still convinced that we should support forward declaration. How often do you really need that? Also, how is it hard/easy to detect if something is a forward reference? How much easy/hard to fix your linker scripts if yours is already using that pattern?
I believe we should at least ban circular definitions such as foo=bar; bar=foo; because it doesn't make any sense."

I.e. he was against doing it that time. My conclusion remains the same: if you want to support such forward declarations we want to land D55550 + rebased version of this patch to avoid mess about different behavior between aarch64 and x86 and fix it all at once. I do not see for what another case this piece can be useful. (If it is useful for some other case, please let me know, then it would be probably possible to justify such change).

Thanks for the test suggestion, that looks good to me. My understanding is that this would unblock the Linux Kernel KASLR build on AArch64. The upstream linux maintainers are not willing to change the linker script for what seems like a linker limitation/bug.

My understanding is that only the Arm and AArch64 Linux kernel's are actively using LLD right now and the Linker script is specific to those builds so this half-fix in practice would be a step forward.

E5ten added a subscriber: E5ten.Apr 4 2019, 4:23 PM

Simplified after landing the D55550.
Had to remove symbol-location.s test completely. I updated the patch description to explain why.

grimar edited the summary of this revision. (Show Details)Apr 18 2019, 1:58 AM

Remove @PLT suffixes from the calls in the test. I think I put them by mistake in the initial version of the patch.

LGTM

This revision is now accepted and ready to land.Apr 18 2019, 2:12 AM

Closed by commit rL358652: [LLD][ELF] - A fix for "linker script assignment loses relative nature of… (authored by grimar). · Explain WhyApr 18 2019, 2:58 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2019, 2:58 AM

@grimar - this patch causes a regression in behaviour for references to script symbols. I think https://reviews.llvm.org/D61298 is a reasonable fix. Could you take a look?

peter.smith mentioned this in D61298: [LLD] Emit dynamic relocations for references to script symbols in -pie links.Apr 30 2019, 2:58 AM

Diff 177169

ELF/Relocations.cpp

Show First 20 Lines • Show All 376 Lines • ▼ Show 20 Lines	if (Sym.IsPreemptible)
return false;		return false;
if (!Config->Pic)		if (!Config->Pic)
return true;		return true;

// The size of a non preemptible symbol is a constant.		// The size of a non preemptible symbol is a constant.
if (E == R_SIZE)		if (E == R_SIZE)
return true;		return true;

		// We set the final symbols values for linker script defined symbols later.
		// They always can be computed as a link time constant.
		if (Sym.ScriptDefined)
		return true;
		nickdesaulniersUnsubmitted Not Done Reply Inline Actions @peter.smith suggested that just this part of this patch is what I need in order to link an arm64 Linux kernel with KASLR enabled with LLD (https://bugs.llvm.org/show_bug.cgi?id=39810#c11). I was able to confirm that this change was all that I needed to link a bootable kernel image. Can I please have just this change split out of the patch? Surely this part is less controversial? nickdesaulniers: @peter.smith suggested that just this part of this patch is what I need in order to link an…
		grimarAuthorUnsubmitted Done Reply Inline Actions If nobody will so that earlier, I'll take a look at how to extract this part with an appropriate test case tomorrow. (Though I still think we just need to land D55550, and then this part. Having a different behavior of linker script on different architectures is just weird). grimar: If nobody will so that earlier, I'll take a look at how to extract this part with an…
		peter.smithUnsubmitted Not Done Reply Inline Actions The pr contains an AArch64 version of the test. It should fail with a link error with -pie. FWIW I agree with George that D55550 is worth landing as well. Linker scripts are often generated from a large set of macros and it isn't always easy to hand edit the files to avoid forward references. peter.smith: The pr contains an AArch64 version of the test. It should fail with a link error with -pie.
		grimarAuthorUnsubmitted Done Reply Inline Actions So I ended up with the following test case: # REQUIRES: aarch64 # RUN: llvm-mc -filetype=obj -triple=aarch64-linux-gnu %s -o %t.o # RUN: echo "aliasto__text = __text; SECTIONS { .text 0x1000 : { __text = . ; (.text) } }" > %t.script # RUN: ld.lld -pie -o %t --script %t.script %t.o # RUN: llvm-readobj -symbols %t \| FileCheck %s ## Check that alias 'aliasto__text' has the correct value. ## (It should belong to the section .text and point to it's start). ## TODO: This could be a test for x86 platform, but currently, LLD ## has an inconsistent behavior of the linker script between x86 and aarch64. ## Under x86, `aliasto__text` would not get the correct value and test would fail. # CHECK: Symbol { # CHECK: Name: __text # CHECK-NEXT: Value: 0x1000 # CHECK-NEXT: Size: 0 # CHECK-NEXT: Binding: Global # CHECK-NEXT: Type: None # CHECK-NEXT: Other: 0 # CHECK-NEXT: Section: .text # CHECK-NEXT: } # CHECK: Symbol { # CHECK: Name: aliasto__text # CHECK-NEXT: Value: 0x1000 # CHECK-NEXT: Size: 0 # CHECK-NEXT: Binding: Global # CHECK-NEXT: Type: None # CHECK-NEXT: Other: 0 # CHECK-NEXT: Section: .text # CHECK-NEXT: } .text .globl _start .type _start, %function _start: .globl aliasto__text bl __text bl aliasto__text I.e. it still uses forward declaration for `aliasto__text`. I found no way to make this piece of the code be useful without doing that. Rui's comment above says: "I think I'm not still convinced that we should support forward declaration. How often do you really need that? Also, how is it hard/easy to detect if something is a forward reference? How much easy/hard to fix your linker scripts if yours is already using that pattern? I believe we should at least ban circular definitions such as foo=bar; bar=foo; because it doesn't make any sense." I.e. he was against doing it that time. My conclusion remains the same: if you want to support such forward declarations we want to land D55550 + rebased version of this patch to avoid mess about different behavior between aarch64 and x86 and fix it all at once. I do not see for what another case this piece can be useful. (If it is useful for some other case, please let me know, then it would be probably possible to justify such change). grimar:* So I ended up with the following test case: ``` # REQUIRES: aarch64 # RUN: llvm-mc…

// For the target and the relocation, we want to know if they are		// For the target and the relocation, we want to know if they are
// absolute or relative.		// absolute or relative.
bool AbsVal = isAbsoluteValue(Sym);		bool AbsVal = isAbsoluteValue(Sym);
bool RelE = isRelExpr(E);		bool RelE = isRelExpr(E);
if (AbsVal && !RelE)		if (AbsVal && !RelE)
return true;		return true;
if (!AbsVal && RelE)		if (!AbsVal && RelE)
return true;		return true;
▲ Show 20 Lines • Show All 1,103 Lines • Show Last 20 Lines

ELF/Writer.cpp

Show First 20 Lines • Show All 474 Lines • ▼ Show 20 Lines	template <class ELFT> void Writer<ELFT>::run() {
// finalizeSections does that.		// finalizeSections does that.
finalizeSections();		finalizeSections();
checkExecuteOnly();		checkExecuteOnly();
if (errorCount())		if (errorCount())
return;		return;

Script->assignAddresses();		Script->assignAddresses();

		// Linker script may contain aliases, for example:
		// "alias = foo; SECTIONS { .text 0x1000 : { foo = .; *(.text) } }"
		// Since we process symbol assignment commands one by one, for that script we
		// end up with a situation when alias symbol gets the section value of the
		// symbol `foo` too early, right before 'foo' gets the .text section value by
		// itself. To resolve this we call `assignAddresses` once again. It is safe
		// because `assignAddresses` is designed to run multiple times to support the
		// thunks creation. It would not resolve the possible complex dependencies
		// that would require multiple calls, but we do not need that in practice. GNU
		// linkers have different behavior here, our approach seems to be close or the
		// same to what gold do.
		if (Script->HasSectionsCommand)
		Script->assignAddresses();

// If -compressed-debug-sections is specified, we need to compress		// If -compressed-debug-sections is specified, we need to compress
// .debug_* sections. Do it right now because it changes the size of		// .debug_* sections. Do it right now because it changes the size of
// output sections.		// output sections.
for (OutputSection *Sec : OutputSections)		for (OutputSection *Sec : OutputSections)
Sec->maybeCompress<ELFT>();		Sec->maybeCompress<ELFT>();

Script->allocateHeaders(Phdrs);		Script->allocateHeaders(Phdrs);

▲ Show 20 Lines • Show All 2,014 Lines • Show Last 20 Lines

test/ELF/linkerscript/addr-zero.test

	# REQUIRES: x86			# REQUIRES: x86
	# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o
	# RUN: ld.lld -o %t.so --script %s %t.o -shared			# RUN: ld.lld -o %t.so --script %s %t.o -shared
	# RUN: llvm-readobj --symbols %t.so \| FileCheck %s			# RUN: llvm-readobj --symbols %t.so \| FileCheck %s

	# Test that the script creates a non absolute symbol with value			# Test that the script creates a non absolute symbol with value
	# 0 I.E., a symbol that refers to the load address.			# 0 I.E., a symbol that refers to the load address.

	# CHECK: Symbol {			# CHECK: Symbol {
	# CHECK: Name: foo			# CHECK: Name: foo
	# CHECK-NEXT: Value: 0x70			# CHECK-NEXT: Value: 0x0
	# CHECK-NEXT: Size: 0			# CHECK-NEXT: Size: 0
	# CHECK-NEXT: Binding: Global			# CHECK-NEXT: Binding: Global
	# CHECK-NEXT: Type: None			# CHECK-NEXT: Type: None
	# CHECK-NEXT: Other: 0			# CHECK-NEXT: Other: 0
	# CHECK-NEXT: Section: .text			# CHECK-NEXT: Section: .text
	# CHECK-NEXT: }			# CHECK-NEXT: }

	SECTIONS {			SECTIONS {
	foo = ADDR(.text) - ABSOLUTE(ADDR(.text));			foo = ADDR(.text) - ABSOLUTE(ADDR(.text));
	};			};

test/ELF/linkerscript/symbol-alias-relocation.s

				# REQUIRES: x86
				# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
				# RUN: echo "aliasto__text = __text; SECTIONS { .text 0x1000 : { __text = . ; *(.text) } }" > %t.script
				# RUN: ld.lld -pie -o %t --script %t.script %t.o
				# RUN: llvm-readobj -symbols %t \| FileCheck %s

				## Check that alias 'aliasto__text' has the correct value.
				## (It should belong to the section .text and point to it's start).

				# CHECK: Symbol {
				# CHECK: Name: __text
				# CHECK-NEXT: Value: 0x1000
				# CHECK-NEXT: Size: 0
				# CHECK-NEXT: Binding: Global
				# CHECK-NEXT: Type: None
				# CHECK-NEXT: Other: 0
				# CHECK-NEXT: Section: .text
				# CHECK-NEXT: }

				# CHECK: Symbol {
				# CHECK: Name: aliasto__text
				# CHECK-NEXT: Value: 0x1000
				# CHECK-NEXT: Size: 0
				# CHECK-NEXT: Binding: Global
				# CHECK-NEXT: Type: None
				# CHECK-NEXT: Other: 0
				# CHECK-NEXT: Section: .text
				# CHECK-NEXT: }

				.text
				.globl _start
				.type _start, %function
				_start:
				.globl aliasto__text
				call __text@PLT
				call aliasto__text@PLT

This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ELF] - A fix for "linker script assignment loses relative nature of section" bug.
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 177169

ELF/Relocations.cpp

ELF/Writer.cpp

test/ELF/linkerscript/addr-zero.test

test/ELF/linkerscript/symbol-alias-relocation.s

This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ELF] - A fix for "linker script assignment loses relative nature of section" bug.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 177169

ELF/Relocations.cpp

ELF/Writer.cpp

test/ELF/linkerscript/addr-zero.test

test/ELF/linkerscript/symbol-alias-relocation.s

[LLD][ELF] - A fix for "linker script assignment loses relative nature of section" bug.
ClosedPublic