This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Linkerscript: allow setting custom output section for common symbols, instead of .bss
ClosedPublic

Authored by evgeny777 on Jul 26 2016, 8:15 AM.

Details

Reviewers
ruiu
rafael
Summary

This patch allows common symbols to be placed to arbitrary section, specified in linker script. Special rule *(COMMON) is used for this case.
Only one output section is supported.

The patch introduces artificial input section object (CommonInputSection<ELFT>), which is used to reserve certain amount of space in the output section.

Diff Detail

Event Timeline

evgeny777 updated this revision to Diff 65522.Jul 26 2016, 8:15 AM
evgeny777 retitled this revision from to [ELF] Linkerscript: allow setting custom output section for common symbols, instead of .bss.
evgeny777 updated this object.
evgeny777 added reviewers: ruiu, rafael.
evgeny777 set the repository for this revision to rL LLVM.
evgeny777 added a project: lld.
evgeny777 added subscribers: grimar, davide, emaste and 2 others.

May be it is possible just to rename/reuse .bss section if script contains *(COMMON) ?
Though FreeBSD script for example uses .bss for commons:
https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870&view=markup#l157
So even not sure that renaming is really needed.

ruiu added inline comments.Jul 26 2016, 1:26 PM
ELF/InputSection.cpp
689

The size computed here is not always going to be the same as the actual size, because actual size is computed after sorting common symbols.

ELF/InputSection.h
260–262

You introduced this class, but you are using this only for the linker script. If linker scirpts are not used, common symbols directly belong to .bss, but if there are, they belong to some output sectino through this CommonInputSection.

Why don't you use this section both in LinkerScript.cpp and in Writer.cpp? You could create a CommonInputSection, which virtually contains all common symbols in Writer.cpp and add that input sectino to .bss, no?

evgeny777 added inline comments.Jul 26 2016, 1:40 PM
ELF/InputSection.h
260–262

Makes sense.

evgeny777 updated this revision to Diff 65693.Jul 27 2016, 3:51 AM
evgeny777 removed rL LLVM as the repository for this revision.

Review updated. Now common section is instantiated in Writer and is added to .bss in case linker script processor hasn't added it to some other section.
Also test case now checks for common section size and alignment

grimar added inline comments.Jul 27 2016, 5:22 AM
ELF/LinkerScript.cpp
111

llvm::find(Patterns, "COMMON") returns iterator. Condition is always true,
for cases when Patterns does not contain COMMON it is equal to

if (Patterns.end() && ... )

You probably want to update testcase also.

ELF/Symbols.h
188

Since you always call fixSymbols() I think you can skip initialization.

ELF/Writer.cpp
225–229

May be factory should own and create this ?
That way it would be consistent with OwningSections it has and also
you'll dont need to add argument in call below, you will be able to take it directly from factory.

evgeny777 added inline comments.Jul 27 2016, 6:33 AM
ELF/LinkerScript.cpp
111

I think you're not right. This is ArrayRef<T>, which iterator is a pointer to element. This means that if pointer is nullptr then there is no such element in the array. If you try to do this with STL container, for example std::vector you'll get compilation error, because STL iterator does not define operator bool().

ELF/Symbols.h
188

If someone forgets to set Section pointer there will be a crash instead of undefined behavior, which is always good

ELF/Writer.cpp
225–229

This is *input* section and I guess you're talking about OutputSectionFactory, right?

grimar added inline comments.Jul 27 2016, 7:13 AM
ELF/LinkerScript.cpp
111

Right, I missed that it is ArrayRef, but it is even simpler than:
you are searching pointer among pointers, it will always return true
since when find return end() it will be some non-zero pointer.

evgeny777 updated this revision to Diff 65740.Jul 27 2016, 7:23 AM

George found a bug, review updated.

It turned out, that ArrayRef has this code for end():
iterator end() const { return Data + Length; }

So it is never nullptr.

ruiu edited edge metadata.Jul 27 2016, 2:49 PM

Overall looking good. I think I like the idea of CommonInputSection.

ELF/InputSection.cpp
673–677

build is called only from the constructor, so I don't see a reason to separate the two functions. You could merge build with the constructor.

684–687
for (Symbol *S : Symtab<ELFT>::X->getSymbols())
  if (auto *C = dyn_cast<DefinedCommon<ELFT>>(S->body()))
    Symbols.push_back(C);
704

It's weird that we have two variables for one thing -- can you use only this->Alignment?

707–712

If you add an pointer to the CommonInputSection to DefinedCommon symbol type and use it in Symbol::getVA, you dont' need to adjust offsets (i.e. you can remove this function.)

ELF/InputSection.h
272

= {} to initialize with zeros, otherwise this change will trigger asan, because you are passing a pointer to this member to InputSectionBase's ctor and the ctor uses it.

ELF/Symbols.h
188

You can remove this variable, no?

evgeny777 added inline comments.Jul 27 2016, 3:03 PM
ELF/Symbols.h
188

So you suggest to replace it with pointer to CommonInputSection and use OutSec and OutSecOff, correct?

ruiu added inline comments.Jul 27 2016, 3:08 PM
ELF/Symbols.h
188

Yes. Or, more radially, you can make a pointer to a CommonInputSection a global varaible since it's a singleton class, but it's probably too much for this change, so I wouldn't do it at least now.

evgeny777 updated this revision to Diff 65884.Jul 28 2016, 1:17 AM
evgeny777 edited edge metadata.

Review updated

ruiu accepted this revision.Jul 28 2016, 12:03 PM
ruiu edited edge metadata.

LGTM with the following changes.

ELF/InputSection.cpp
674

val -> Val

681

Remove the blank line.

ELF/LinkerScript.cpp
111

Do you need to check the size of the common input section? It seems that you can safely add an empty section to the output section. If so, please remove that check.

ELF/OutputSections.cpp
1451

CS should be named Section for consistency with DefiendRegular class.

ELF/Writer.cpp
741

Remove the size check if not necessary.

This revision is now accepted and ready to land.Jul 28 2016, 12:03 PM
evgeny777 added inline comments.Jul 28 2016, 12:29 PM
ELF/Writer.cpp
741

Looks like this check is really not necessary. After removing it I had to add this->Live = true to CommonInputSection constructor to make tests pass when --gc-sections flag is set.

evgeny777 closed this revision.Jul 29 2016, 5:34 AM