This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Postpone the evaluation of DefinedSynthetic value
AbandonedPublic

Authored by grimar on Feb 1 2017, 4:43 AM.

Details

Reviewers
ruiu
rafael
Summary

After applying D29332 and trying to link linux kernel I found next issue.
We still produced symbols value wrong.

Having the next script:

vvar_start = . - 2 * (1 << 12);
vvar_page = vvar_start;
vvar_vsyscall_gtod_data = vvar_page + 128;

pvclock_page = vvar_start + PAGE_SIZE;
. = SIZEOF_HEADERS;
.hash		: { *(.hash) }			:text
...

LLD output was:

Section Headers:
  [Nr] Name              Type             Address           Offset
  [ 1] .hash             HASH             0000000000000120  00000120
       0000000000000050  0000000000000004   A       3     0     4

Symbol table '.symtab' contains 18 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     3: ffffffffffffe1a0     0 NOTYPE  LOCAL  HIDDEN     1 vvar_vsyscall_gtod_data
     5: ffffffffffffe120     0 NOTYPE  LOCAL  DEFAULT    1 vvar_start
     6: ffffffffffffe120     0 NOTYPE  LOCAL  DEFAULT    1 vvar_page

BFD was:

28: ffffffffffffe000     0 NOTYPE  LOCAL  DEFAULT  ABS vvar_page
33: ffffffffffffe000     0 NOTYPE  LOCAL  DEFAULT    1 vvar_start
35: ffffffffffffe080     0 NOTYPE  LOCAL  DEFAULT  ABS vvar_vsyscall_gtod_data

GOLD was:

6: ffffffffffffe000     0 NOTYPE  LOCAL  DEFAULT  ABS vvar_start
7: ffffffffffffe000     0 NOTYPE  LOCAL  DEFAULT  ABS vvar_page
8: ffffffffffffe080     0 NOTYPE  LOCAL  DEFAULT  ABS vvar_vsyscall_gtod_data

The problem was in assignSectionSymbol(). We used Body->Section->Addr too early.
At the point of evaluation Addr was unset and == 0x0. Solution I suggest - to postpone the evaluation.
After that patch values we produce are correct for case above.

Diff Detail

Event Timeline

grimar created this revision.Feb 1 2017, 4:43 AM
grimar edited the summary of this revision. (Show Details)Feb 1 2017, 4:45 AM
ruiu added inline comments.Feb 1 2017, 11:37 AM
ELF/LinkerScript.cpp
387

I don't think I understand this comment. Why Sym value is affected by the following .sec?

I do understand that Sym is affected if it is .sec : { ... }; Sym = ., but this is Sym = .; .sec { ... };. Can a following expression affects the results of preceding expressions?

By the way, why + 0x1?

grimar added inline comments.Feb 2 2017, 1:27 AM
ELF/LinkerScript.cpp
387

but this is Sym = .; .sec { ... };. Can a following expression affects the results of preceding expressions?

Yes. Please take a look on non-absolute.s:

{ A = . - 0x10; B = A + 0x1; . += 0x1000; }

B is non-absolute expression, because depends on A which uses location counter for evaluation.
That way B is non absolute symbol and belongs to next section, which is .text in testcase.

That is a different behavior in compare with gold/bfd. As you can see in description BFD makes vvar_start to belong to .text,
but other symbols are ABS, gold makes all symbols as ABS. It was discussed with Rafael in D29332 thread that it looks as a bug of gnu linkers.

By the way, why + 0x1?

Just shows the common case. Some expression that has location counter and operation. I think it is pretty common case in scripts. I can replace with "Sym = .;" if you want, though my way probably a bit more readable and common to see.

ruiu added inline comments.Feb 2 2017, 10:03 AM
ELF/LinkerScript.cpp
387

I think that doesn't answer my question. My question is how a following expression can affect a preceding expression. Your expression,

A = . - 0x10; B = A + 0x1; . += 0x1000;

can be computed from left to right, so it doesn't seems to be a counter example.

grimar added inline comments.Feb 2 2017, 12:17 PM
ELF/LinkerScript.cpp
387

It should not affect on final value and that what patch fixes.

Current non-absolute.s contains:

SECTIONS { A = . - 0x10; B = A + 0x1; }
# SYMBOL:        Name: B
# SYMBOL-NEXT:   Value: 0xFFFFFFFFFFFFFFF1

If you just do: "SECTIONS { A = . - 0x10; B = A + 0x1; . += 0x1234; }"
and run test, you'll see that value of B changed:

23>    Symbol {
23>      Name: B (1)
23>      Value: 0x1225

That is wrong and happens because B is DefinedSynthetic and it is relative to section VA.
Patch fixes the calculation of B.

ruiu edited edge metadata.Feb 2 2017, 3:03 PM

What I do not understand is why symbols outside an output section description should belong to an output section. Let's use this extremely simple linker script as an example.

SECTIONS { A = . }

This linker script defines symbol A. All output sections are orphan sections and laid out using the default layout rules. IIUC, what you said is symbol A should be defined relative to an orphan section that happens to become the first section of the output. I wonder why. I think symbol A should be defined relative to the image base instead of relative to some section. In other words, symbol A shouldn't belong to any section.

In general, any symbol that is not inside an output section description should be defined relative to the image base. I.e.

SECTIONS { A = . }

is defined relative to the image base, while

SECTIONS { .foo : { A = . } }

is defined relative to section .foo.

grimar added a comment.EditedFeb 3 2017, 3:02 AM
In D29391#665264, @ruiu wrote:

What I do not understand is why symbols outside an output section description should belong to an output section. Let's use this extremely simple linker script as an example.

SECTIONS { A = . }

This linker script defines symbol A. All output sections are orphan sections and laid out using the default layout rules. IIUC, what you said is symbol A should be defined relative to an orphan section that happens to become the first section of the output. I wonder why. I think symbol A should be defined relative to the image base instead of relative to some section. In other words, symbol A shouldn't belong to any section.

So it means you assume it should be absolute, right ?

In general, any symbol that is not inside an output section description should be defined relative to the image base. I.e.

SECTIONS { A = . }

is defined relative to the image base, while

SECTIONS { .foo : { A = . } }

is defined relative to section .foo.

I think Rafael wrote about the same situation in D28857 thread (18 jan):
"That is a bug in bfd. Its handling of what is absolute is broken. On ELF
absolute really means absolute, like a size. Anything that is a position
in the file is not absolute."

and

"The difference is that the '.' makes it a position in the file, and
therefore not absolute."

And patch where it was implemented was: https://reviews.llvm.org/D26161

grimar updated this revision to Diff 87210.Feb 6 2017, 3:29 AM
  • Addressed review comments.
grimar abandoned this revision.Feb 7 2017, 2:18 AM

D29607 was landed instead.