This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Linkerscript: define symbols outside SECTIONS
ClosedPublic

Authored by phosek on Aug 20 2016, 10:55 PM.

Details

Summary

Symbol assignments outside of SECTIONS command need to be created even when SECTIONS command is not present.

Diff Detail

Event Timeline

phosek updated this revision to Diff 68799.Aug 20 2016, 10:55 PM
phosek retitled this revision from to [ELF] Linkerscript: define symbols outside SECTIONS.
phosek updated this object.
phosek added a reviewer: ruiu.
phosek added a project: lld.
phosek added subscribers: llvm-commits, phosek.
ruiu added inline comments.Aug 21 2016, 9:55 PM
ELF/LinkerScript.cpp
264–268

Now this can be removed, no?

phosek added inline comments.Aug 21 2016, 10:51 PM
ELF/LinkerScript.cpp
264–268

Currently, there are two vectors that may contain assignments, Assignments and Command so we need both loops. I was thinking about using just one, but that doesn't seem to work since we need to iterate over assignments done within SECTIONS in assignAddresses in the correct order because of the ..

ruiu accepted this revision.Aug 22 2016, 1:33 AM
ruiu edited edge metadata.

LGTM

ELF/LinkerScript.cpp
261

Please add a comment saying that this function handles symbol definitions outside SECTIONS command.

This revision is now accepted and ready to land.Aug 22 2016, 1:33 AM
phosek updated this revision to Diff 69049.Aug 23 2016, 3:39 PM
phosek edited edge metadata.

The previous revision was still not correct, it'd break in the following case: foo = 0x100; SECTIONS { bar = foo; } baz = bar; when baz would trigger an error for bar being undefined. The new revision seems to be working fine in all cases. The only problem I see with the this revision is the naming which is a little confusing now. One way to improve it would be to rename HasContents to HasSections and possibly also split createSections() into createSymbols() and createSections() so they can be called separately. Would that make sense?

ruiu added a comment.Aug 23 2016, 10:08 PM

There is at most one SECTIONS command, so an alternative approach would be to split SymbolAssignment list into two lists, say, pre-assignments and post-asignments. I feel like it may be slightly better approach. What do you think?

grimar added a subscriber: grimar.Aug 24 2016, 2:37 AM
In D23751#523869, @ruiu wrote:

There is at most one SECTIONS command...

btw that is not true :( linkerscript-sections.s contains case for multiple SECTIONS command. I also was wonder why do we support this wierd case.

I tested the support for multiple SECTIONS commands in Binutils ld and gold using the following script:

foo = 0x1000;
SECTIONS {
  bar = foo + 0x10;
}
baz = bar + 0x20;
SECTIONS {
  qux = baz + 0x30;
}
quux = qux + 0x40

ld handles this correctly producing:

0000000000001000     0 NOTYPE  GLOBAL DEFAULT  ABS foo
0000000000001010     0 NOTYPE  GLOBAL DEFAULT  ABS bar
0000000000001030     0 NOTYPE  GLOBAL DEFAULT  ABS baz
0000000000001060     0 NOTYPE  GLOBAL DEFAULT  ABS qux
00000000000010a0     0 NOTYPE  GLOBAL DEFAULT  ABS quux

gold produces a different output which doesn't seem correct:

0000000000001000     0 NOTYPE  GLOBAL DEFAULT  ABS foo
0000000000001010     0 NOTYPE  GLOBAL DEFAULT  ABS bar
0000000000001030     0 NOTYPE  GLOBAL DEFAULT  ABS baz
0000000000000050     0 NOTYPE  GLOBAL DEFAULT  ABS qux
0000000000000090     0 NOTYPE  GLOBAL DEFAULT  ABS quux

So given that, maybe support for multiple SECTIONS commands is not necessary. Old ld manual even explicitly says that "You may use at most one SECTIONS command in a script file", although I couldn't find the same statement in the current one.

Do you have any opinion on what to do about this? I'd say that since we already support having multiple SECTIONS commands and it works reasonable (as in the same as in BFD ld), there's no reason to break that functionality in which case the current patch is probably a better albeit a slightly uglier solution? I still feel feel that renaming HasContents to HasSections would improve the readability.

Do you have any opinion on what to do about this? I'd say that since we already support having multiple SECTIONS commands and it works reasonable (as in the same as in BFD ld), there's no reason to break that functionality in which case the current patch is probably a better albeit a slightly uglier solution? I still feel feel that renaming HasContents to HasSections would improve the readability.

Question is addressed to Rui, but
FWIW I think multiple 'SECTIONS' is a bit confusing and uncommon. I think I never saw that in real projects though reviewed many scripts in the wild. Anybody really needs it ?
So if breaking support can help to improve the code I would probably do that.

I agree that 'HasSections' is sounds more readable because currently HasContent actually used only to check if we have SECTIONS command or not.

ruiu added a comment.Aug 29 2016, 3:03 PM

I think I agree with George. Have you ever seen a linker script containing multiple SECTIONS?

Personally no, I haven't so I suppose removing this support shouldn't break anyone. Shall I split the SymbolAssignment list into the pre-assignments and post-asignments lists as you suggested then? I can also report an error when we find more than one SECTIONS command.

ruiu added a comment.Aug 29 2016, 4:12 PM

Yup, I think splitting it into pre- and post-assignment makes things easier to understand. What do you think?

phosek updated this revision to Diff 69757.Aug 30 2016, 2:30 PM

Turned out we don't even need post-assignment as a separate list, those can be added directly to Commands and they'll be handled correctly by the existing logic, it's just the pre-assignment ones that need to be handled separately (to cover the case where there is no SECTIONS command).

This revision was automatically updated to reflect the committed changes.