This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ELF] - Linkerscript: add a support for expressions for section's filling
ClosedPublic

Authored by grimar on Jul 3 2019, 6:10 AM.

Details

Summary

Imagine the script:

.section: {
...
} = FILL_EXPR

LLD assumes that FILL_EXPR is a number, and does not allow
it to be an expression. Though that is allowed by specification:
https://sourceware.org/binutils/docs-2.32/ld/Output-Section-Fill.html

This patch adds a support for cases when FILL_EXPR is simple math expression.
I am not sure it makes sense to support the case when expression uses symbols,
I hope it is not used in the wild, though we can do that in a follow-up, but that would do
our filling code a bit more complicated.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42482.

Diff Detail

Event Timeline

grimar created this revision.Jul 3 2019, 6:10 AM
grimar edited the summary of this revision. (Show Details)Jul 3 2019, 6:51 AM
grimar updated this revision to Diff 207785.Jul 3 2019, 7:12 AM
  • A minor simplification.

It looks like gold and bfd disagree on whether a symbol is allowed in practice.

foo = 0x12345678;

SECTIONS {
 .text : { *(.text) } = foo
}

With ld.bfd I get an error, even if I use ABSOLUTE(foo)

ld.bfd fill.o -T fill.lds
fill.lds:4: nonconstant expression for fill value

With gold the link succeeds

ld.gold fill.o -T fill.lds

The ld.bfd I used was quite old 2.26.1 so this may have changed in later versions. Trying a later version lead to a ld.bfd segfault in one case.

As I think that this is an improvement and I agree that a symbol is unlikely to be encountered in practice so I think this is worth doing.

ELF/ScriptParser.cpp
884

It is worth a comment or FIXME: saying we don't support symbols in fill expressions?

grimar updated this revision to Diff 207807.Jul 3 2019, 8:39 AM
grimar marked an inline comment as done.
  • Addressed review comments.
ELF/ScriptParser.cpp
884

Done. I also added a test case to demonstrate the current behavior. It shows "symbol not found: foo" currently,
because tries to find a symbol during evaluating the expression. And at this moment no symbols are added
to a sym table.

In theory we can change the message to be more meaningful in case we read an expression
that should not use any symbols, but since it would be used only for fillexpr, I am not sure it worth doing.

grimar marked an inline comment as done.Jul 3 2019, 8:48 AM
grimar added inline comments.
ELF/ScriptParser.cpp
884

I.e. I mean we could add a flag like DontAllowSymbols and return a different error message when reading an expression which has a reference to a symbol. But, as I said, It looks a bit too much for this feature.

Thanks for the update. Looks ok to me, will be worth seeing what the others say.

arichardson added inline comments.Jul 3 2019, 9:09 AM
ELF/ScriptLexer.cpp
195

This seems unrelated, maybe split it out into a separate patch?

Looks good to me.

ELF/ScriptLexer.cpp
195

I think tokenizeExpr() did not have to be precise before because readOutputSectionDescription() did not consume tokens that can be prefixed of ==, <<, <=, >= and >> (i.e. =, <, >). Now it did so a preciser tokenizeExpr is needed.

I think keeping the change here is probably fine.

test/ELF/linkerscript/sections-padding.s
45

You can delete the spaces around << to check tokens are correctly split.

grimar updated this revision to Diff 207980.Jul 4 2019, 1:02 AM
grimar marked 2 inline comments as done.
  • Addressed comments.
ELF/ScriptLexer.cpp
195

Yes, this is needed for this patch to work (after adding only = a few of our other test start failing).

MaskRay accepted this revision.Jul 4 2019, 1:18 AM
This revision is now accepted and ready to land.Jul 4 2019, 1:18 AM
ruiu accepted this revision.Jul 4 2019, 3:44 AM

LGTM

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2019, 7:17 AM

This change breaks if you add FILL(0x10101010) before *(.aaa) in fill.test:

# REQUIRES: x86
# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/fill.s -o %t.o
# RUN: ld.lld -o %t --script %s %t.o
# RUN: llvm-objdump -s %t | FileCheck %s

SECTIONS {
  .out : {
   FILL(0x11111111)
   . += 2;
   FILL(0x10101010)
   *(.aaa)
   . += 4;
   *(.bbb)
   . += 4;
   FILL(0x22220000 + 0x2222);
   . += 4;
  }
}

# CHECK:      Contents of section .out:
# CHECK-NEXT: 2222aa22 222222bb 22222222 22222222

That's because the expression for the FILL command should be contained within the parentheses.

This change breaks if you add FILL(0x10101010) before *(.aaa) in fill.test:

# REQUIRES: x86
# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/fill.s -o %t.o
# RUN: ld.lld -o %t --script %s %t.o
# RUN: llvm-objdump -s %t | FileCheck %s

SECTIONS {
  .out : {
   FILL(0x11111111)
   . += 2;
   FILL(0x10101010)
   *(.aaa)
   . += 4;
   *(.bbb)
   . += 4;
   FILL(0x22220000 + 0x2222);
   . += 4;
  }
}

# CHECK:      Contents of section .out:
# CHECK-NEXT: 2222aa22 222222bb 22222222 22222222

That's because the expression for the FILL command should be contained within the parentheses.

I posted D64476 to fix.