This is an archive of the discontinued LLVM Phabricator instance.

[ELF2] - Basic implementation of -r/--relocatable
ClosedPublic

Authored by grimar on Nov 5 2015, 6:52 AM.

Details

Summary

Currently does not have support for files containing relocation sections with entries that refer to local symbols (like rel[a].eh_frame which refer to sections and not to symbols).

Diff Detail

Repository
rL LLVM

Event Timeline

grimar updated this revision to Diff 39367.Nov 5 2015, 6:52 AM
grimar retitled this revision from to [ELF2] - Basic implementation of -r/--relocatable.
grimar updated this object.
grimar added reviewers: ruiu, rafael.
grimar added subscribers: llvm-commits, grimar.
grimar added inline comments.Nov 5 2015, 7:08 AM
ELF/OutputSections.cpp
642 ↗(On Diff #39367)

Sorry, forgot about that "true ?", will fix in updated patch.

ruiu edited edge metadata.Nov 5 2015, 9:01 AM

I took a quick look at the code. This patch doesn't remove relocations that were successfully resolved within an object file the linker is writing, does this? Is it hard to support that? I think that's the reason why users want to use -r in the first place (so that the number of relocations in a combined object file is much smaller than the total number of relocations in the input files.) If it's going to be more code, it's fine to do in a follow up patch, though.

ELF/Driver.cpp
155 ↗(On Diff #39367)

Move this into the code block above and sort.

ELF/InputFiles.cpp
211–213 ↗(On Diff #39367)

I'd use cast<> instead of dyn_cast<> and remove the check.

grimar added a comment.EditedNov 5 2015, 9:27 AM
In D14382#282370, @ruiu wrote:

I took a quick look at the code. This patch doesn't remove relocations that were successfully resolved within an object file the linker is writing, does this? Is it hard to support that? I think that's the reason why users want to use -r in the first place (so that the number of relocations in a combined object file is much smaller than the total number of relocations in the input files.) If it's going to be more code, it's fine to do in a follow up patch, though.

Actually no relocations are applied atm. See ELF/InputFiles.cpp (215) has "break;" so nothing is added to S->RelocSections list and no relocations resolved during that link because of that. I can put correcting that into plan at first place but would be nice to split this to following patches. That one looks large already.

I am also have plan to add support for rel[a].eh_frame sections in following one(s) as well.

ELF/Driver.cpp
155 ↗(On Diff #39367)

Ok. I just wasn`t sure here how to highlight that Relocatable affects on other options. Thats why put it outside as well.

grimar added a comment.Nov 5 2015, 9:30 AM
In D14382#282370, @ruiu wrote:

I took a quick look at the code. This patch doesn't remove relocations that were successfully resolved within an object file the linker is writing, does this? Is it hard to support that?

I also didnt investigate that case yet. For test case ld behaves in the same way as lld now. No any relocations happen for R_X86_64_32S and I didn`t look at other ones.

ruiu added a comment.Nov 5 2015, 9:36 AM

I'd like to see a patch to apply (and reduce) relocations. This patch seems generally good, but there's a chance that the fields and functions added in this patch will not make sense with the follow-up patch. We don't have to combine all changes into one patch, but can split into a series of patches.

rafael edited edge metadata.Nov 5 2015, 9:57 AM
rafael added a subscriber: rafael.

I don't think traditional linkers resolve relocations during -r.

I would probably delay implementing that if so. A simple -r should help us
with linking freebsd.

Cheers, Rafael

In D14382#282400, @ruiu wrote:

I'd like to see a patch to apply (and reduce) relocations. This patch seems generally good, but there's a chance that the fields and functions added in this patch will not make sense with the follow-up patch. We don't have to combine all changes into one patch, but can split into a series of patches.

Well I tried to build some background. All added functions/variables are mostly used for -r option which implementation is a subject to change anyways since it is initial.
I understand the core idea you want to see in this patch at first though. So if reducing relocations is needed for it initial commit, I can handle that I think. Currently I dont know the valid test case for that yet (which exactly relocations can be applied during relocatable output linking). If you could say what can I use then that info will be really helpfull, if not I`ll try to investigate that tomorrow.

I don't think traditional linkers resolve relocations during -r.

I would probably delay implementing that if so. A simple -r should help us
with linking freebsd.

Cheers, Rafael

Sounds good :)

That said, great to see -r implemented, I plan to take a closer look
at this patch for review in the next days.

Just in case only basic implementation is done atm. I still plan to work on rel[a].eh_frame relocation sections support.

grimar added a comment.Nov 6 2015, 8:08 AM

So according to Joerg in "llvm-commits Digest, Vol 137, Issue 69" this one seems to be useful ? Or not ?

grimar abandoned this revision.Nov 13 2015, 8:45 AM

elf2 does not need that it seems

ruiu added inline comments.Feb 16 2016, 4:14 PM
ELF/Driver.cpp
156–162 ↗(On Diff #39367)

Move this to checkOptions.

ELF/InputFiles.cpp
209 ↗(On Diff #39367)

This needs a description. (Any piece of code that does something obvious needs explanation.)

It is probably better to explain how you are handling object files if -r.

ELF/InputSection.cpp
95–107 ↗(On Diff #39367)

I don't think this is a right place to add this code. This is more like copying existing relocations rather than relocating. You want to create a new function to copy relocation sections instead of embedding here.

149–159 ↗(On Diff #39367)

This code is not very readable -- you defined a vector with only one element and run a for-loop over that vector? It is probably better to just call a different function if we find that we are handling a relocation section.

ELF/InputSection.h
120 ↗(On Diff #39367)
// If -r (or --relocatable) option is given, this member may have a non-null value.
// During partial linking, we do not apply any relocations but just leave them as is,
// expecting the final static linking applies them all at once.
// In that case, relocation sections are handled as regular input sections, and their
// RelocatingSection pointers point to their target sections.
ELF/OutputSections.cpp
174 ↗(On Diff #39367)

Why did you change the function name? If not relevant to this change, please revert.

639 ↗(On Diff #39367)

Remove this temporary variable and directly assign to this->Header.

646–649 ↗(On Diff #39367)

I'm not quite sure what you are doing here.

654–655 ↗(On Diff #39367)

Is it possible that we do not have the symbol table if -r is specified?

942 ↗(On Diff #39367)

Why did you need this change?

948 ↗(On Diff #39367)

What is this for?

ELF/OutputSections.h
235–236 ↗(On Diff #39367)

This needs explanation.

// This section contains input relocation sections so that they are copied
// to the output. Used only when -r is specified (-r lets the linker do partial
// linking, which combines multiple .o files into single .o file).
// Normally, we don't need this class because we interpret and consume
// relocations ourselves instead of copying.
243 ↗(On Diff #39367)

What is this variable for?

ELF/Writer.cpp
829–833 ↗(On Diff #39367)

It looks a bit weird to use ?: in combination with if. I'd write

if (Config->Shared)
  EHdr->e_type = ET_DYN;
else if (Config->Relocatable)
  EHdr->e_type = ET_REL;
else
  EHdr->e_type = ET_EXEC;
837 ↗(On Diff #39367)

Use Config->Relocatable instead of EHdr->e_type==ET_REL

840 ↗(On Diff #39367)

Ditto

Hi Rui,

I am not sure what do you mean, I thought that we discussed that not going to implement "-r/--relocatable" about 3 month ago. This patch is pretty old and abandoned :)
I assume it was reviewed by mistake.

Hi Rui,

I am not sure what do you mean, I thought that we discussed that not going to implement "-r/--relocatable" about 3 month ago. This patch is pretty old and abandoned :)
I assume it was reviewed by mistake.

Sorry, didn`t see the new discussion about it when wrote that. Please ignore comment above.

grimar reclaimed this revision.Feb 17 2016, 1:32 AM

So, since this is needed functionality,
I will rebase and reupload the fresh patch in few days.

grimar updated this revision to Diff 48305.Feb 18 2016, 6:10 AM
grimar edited edge metadata.
grimar marked 20 inline comments as done.
  • Rebased
  • Addressed review comments.
ELF/OutputSections.cpp
174 ↗(On Diff #39367)

There is no more such function in lld code, but I changed the member name also. For explanations why - please see comments below.

646–649 ↗(On Diff #39367)

I need to have the section index for finalize() (The section header index of the section to which the relocation applies.).
So I save output section here for that. Also I perform a check that such output section is the same. As it would be error for them to have different index. Last is just for safety, I am not sure how much is possible.

654–655 ↗(On Diff #39367)

Since StripAll is forced disabled, then looks like not.

942 ↗(On Diff #39367)

I need to set correct symtab index for relacations in
void InputSectionBase<ELFT>::copyRelocatable(uint8_t *Buf, uint8_t *BufEnd, RelIteratorRange<isRela> Rels)
Alternative would be to iterate over bodies in symtab right in place there, but that would be much slower.
So I am using SymIndex (it is no more DynSymIndex) and keep correct index in it. For relocatable output in keeps index in symtab, otherwize - in dynsym, like was before.

948 ↗(On Diff #39367)

Because of written above - I need to skip locals to assign correct indexes. DynSym does not have locals, so it is always 0 for it. But for SymTab I need to start iteration from its value.

ELF/OutputSections.h
243 ↗(On Diff #39367)

Updated name, added comment + please see comments for void StaticRelocSection<ELFT>::addSection(InputSection<ELFT> *C),
where it is used.

ruiu added inline comments.Feb 18 2016, 2:28 PM
ELF/Config.h
70 ↗(On Diff #48305)

Remove ' = false'.

ELF/InputSection.cpp
99 ↗(On Diff #48305)

What does this function do? It needs a comment.

I figured that out myself. So, this function seems to do this.

// This function copies section contents to Buf, assuming that the current section
// contains relocations. Usually this function is not called because we apply and
// consume relocation ourselves as a static linker. However, if -r is specified,
// we are a producer of an object file, so instead of applying, we
// copy all relocations to a resulting object file and let the final static
// linker resolve all relocations.
// We can't use memcpy to copy relocations because we need to update symbol table
// offset and section index for each relocation. So we copy relocations one by one.
103 ↗(On Diff #48305)

s/RelType/RelTy/

(I think we used this name at somewhere else.)

106 ↗(On Diff #48305)

I believe RI originally stands for Relocation Iterator. Using RI as a name of a relocation is probably inappropriate. Please rename Rel.

118 ↗(On Diff #48305)
fatal("Relocation against local symbols is not supported yet")
288–291 ↗(On Diff #48305)

OK, so these functions are called after you copy all relocations to Buf. But inside fixupRelocations, you copy all relocations to Buf again. That's not wasting but confusing. I'd rename fixupRelocations to copyRelocations and move this function call before memcpy in this function.

ELF/InputSection.h
70 ↗(On Diff #48305)

I'm not a big fan of optional parameters because it adds complexity (even though it is small.) And this function converts a given offset to a different offset starting from a different point of origin, so it is weird to call this function without any argument.

ELF/OutputSections.cpp
1402–1403 ↗(On Diff #48305)

Please try to separate the control flow for -r from the regular control flow. (Or, in general, please do not try to add small ifs for various flags to achieve something.)

!StrTabSec.isDynamic() does not make sense for -r. As I understand, what you want to do is to set DynsymIndex for each symbol if -r is given, so that you can use the indices in fixRelocations() later. In that case, you don't actually want to sort anything (is this correct?). So, please add

if (Config->Relocatable) {
  size_t I = 0;
  for (const std::pair<SymbolBody *, size_t> &P : Symbols)
    P.first->DynsymIndex = ++I;
  return;
}

before if (!StrTabSec.isDynamic()).

1410 ↗(On Diff #48305)

I do not understand -- isn't this an incompatible change if you do not use -r?

ELF/OutputSections.h
302 ↗(On Diff #48305)

I do not want to define a derived class of OutputSection. All other subclasses are of OutputSectionBase.

305 ↗(On Diff #48305)

Do not use C for InputSections because C stands for Chunk. We have removed Chunk, so it no longer makes sense. I'd name S.

309 ↗(On Diff #48305)
// The output section to which relocation would be applied.
ELF/Writer.cpp
103 ↗(On Diff #48305)

This needs an explanation.

1310 ↗(On Diff #48305)

This needs a comment.

1472 ↗(On Diff #48305)
if (!Config->Relocatable) {
  EHdr->e_phoff = sizeof(Elf_Ehdr);
  EHdr->e_phentsize = sizeof(Elf_Phdr);
}
grimar updated this revision to Diff 48506.Feb 19 2016, 10:00 AM
grimar marked 15 inline comments as done.
  • Addressed Rui's comments.
ELF/OutputSections.cpp
1402–1403 ↗(On Diff #48305)

Yes, it looks I dont need sorting. Fixed.

1410 ↗(On Diff #48305)

No, it`s not:

When I don't use -r, SymTab execution flow returns earlier, inside if (!StrTabSec.isDynamic()) {...}
For DynSymTab case NumLocals is always 0, so it has no any impact on logic.

ELF/OutputSections.h
305 ↗(On Diff #48305)

Ok. Btw it is still used in

OutputSection::addSection(InputSectionBase<ELFT> *C) override;
rafael added inline comments.Feb 19 2016, 3:16 PM
ELF/InputSection.h
95 ↗(On Diff #48506)

Why do you need to store this? You can always use sh_info, no?
That way you don't add a field to every InputSectionBase.

ELF/OutputSections.cpp
751 ↗(On Diff #48506)

This is remarkably similar to OutputSeciton. Can you use that?

763 ↗(On Diff #48506)

Can you get here? If so, add a test, if not, use an assert.

grimar added inline comments.Feb 20 2016, 3:51 AM
ELF/InputSection.h
95 ↗(On Diff #48506)

I tried to do that, but for me there are too much additional code required instead of the single field. I had to add something like

template <class ELFT>
void StaticRelocSection<ELFT>::addSection(InputSectionBase<ELFT> *Sec) {
  if (!RelocOutSec) {
    const ObjectFile<ELFT> &File = *Sec->getFile();
    ArrayRef<InputSectionBase<ELFT> *> Sections = File.getSections();
    InputSectionBase<ELFT> *RelocatedSection =
        Sections[Sec->getSectionHdr()->sh_info];
    RelocOutSec = RelocatedSection->OutSec;
  }

for places where I was need that. If you insist I will update the patch with such change.
But I would leave code as is for now.

ELF/OutputSections.cpp
751 ↗(On Diff #48506)

Sorry, I am not sure what do you mean here. If you mean I can inhirit from OutputSeciton then I was already asked to not do that. That was initial plan in earlier revision of this patch.
If you meant something else, could you please give more details about that ?

763 ↗(On Diff #48506)

I just assumed that users who will try linker probably will use the release version. So if them meet that error situation, there is no way they see the assert, it will not be debug build for them.
error() is universal way to prompt here. I would use it or fatal() here.

grimar added inline comments.Feb 20 2016, 9:44 PM
ELF/OutputSections.cpp
751 ↗(On Diff #48506)

I also can't just use OutputSection instead of StaticRelocSection because of additional logic involved.


rafael wrote:

Can you get here? If so, add a test, if not, use an assert.

I just assumed that users who will try linker probably will use the release version. So if them meet that error situation, there is no way they see the assert, it will not be debug build for them.
error() is universal way to prompt here. I would use it or fatal() here.

If it cannot be reached this is dead code and should be an assert.
Never add an error or fatal that cannot actually happen.

Cheers,
Rafael

I just tried applying this patch. I noticed two things:

  • It is not git-clang-format clean. In particular some lines are more

than 80 chars long.

  • It doesn't compile on linux:

/home/espindola/llvm/llvm/tools/lld/ELF/OutputSections.h:306:53:
error: unknown type name 'uintX_t'; did you mean 'uint8_t'?

StaticRelocSection(StringRef Name, uint32_t Type, uintX_t Flags, bool IsRela);
                                                  ^~~~~~~
                                                  uint8_t
ruiu added inline comments.Feb 22 2016, 2:03 PM
ELF/InputFiles.cpp
230 ↗(On Diff #48506)

choosen -> chosen

ELF/InputSection.cpp
109 ↗(On Diff #48506)

You are not using BufEnd. Please remove.

112–113 ↗(On Diff #48506)

Why don't you use range-based for?

ELF/OutputSections.cpp
763 ↗(On Diff #48506)

So, the point is that

  • if this is not reachable except by programmer's error (by a bug), assert is better
  • if this is reachable by feeding incompatible files, giving incorrect combination of flags, etc, error is appropriate
ELF/Writer.cpp
844 ↗(On Diff #48506)

Checking for RelocatingSection to see if an input section is a relocation section is an obscure way to distinguish two different kind of sections. Maybe we should define RelInputSection for relocation sections?

grimar updated this revision to Diff 48802.Feb 23 2016, 3:08 AM
grimar marked 9 inline comments as done.
  • Addressed review comments
  • Fixed compilation issues under linux
ELF/OutputSections.cpp
763 ↗(On Diff #48506)

Yep, that was clear for me. I just not always sure which situations are reachable by feeding incompatible files for example. 'Corrupted' files can contain absolutely anything to trigger different unexpected situations.
Replaced with assert as I dont know way how to reach here.

ELF/Writer.cpp
844 ↗(On Diff #48506)

Done. May be RelInput would be better ?
Then we would have:

enum Kind { Regular, RelInput, EHFrame, Merge, MipsReginfo };

I mean all of that are sections kind, IMO no need to specify "section" in enum name again.

emaste added a subscriber: emaste.Feb 23 2016, 6:19 AM
  • Fixed compilation issues under linux

Confirmed it builds on FreeBSD now (previously failed the same as on Linux).

test/ELF/invalid-linkerscript.test needs updating because it currently relies on the -r option is not supported output (to verify functionality unrelated to -r).

As an aside, the Bindings/Go/go.test test gets further but still fails, with FDE doesn't reference another section. I will test the FreeBSD buildworld with this patch incorporated soon.

looks like recent commits invalided the new test in test/ELF/relocatable.s

/tank/emaste/src/llvm/tools/lld/test/ELF/relocatable.s:93:18: error: expected string not found in input
# CHECKEXE-NEXT: SectionHeaderOffset: 0x2190
                 ^
<stdin>:22:2: note: scanning from here
 SectionHeaderOffset: 0x11E8
 ^

As pointed out, the following tests are failing:

lld :: ELF/invalid-linkerscript.test
lld :: ELF/relocatable.s
ruiu added inline comments.Feb 23 2016, 1:38 PM
ELF/InputSection.h
39 ↗(On Diff #48802)

So you just added RelInputSection kind? Adding a new kind and does not add a new class does not make sense. What I suggested was to actually define a new class for a relocation section. See http://reviews.llvm.org/D17551. I applied your patch and updated yours to separate the class from InputSection.

grimar updated this revision to Diff 48917.Feb 24 2016, 5:37 AM
  • Updated diff with changes proposed by Rafael Ávila de Espíndola.
  • Moved getRelocatedSection and copyRelocations from InputSectionBase to InputSection as was suggested by Rui Ueyama.
ruiu accepted this revision.Feb 24 2016, 11:07 AM
ruiu edited edge metadata.

LGTM

ELF/OutputSections.cpp
746 ↗(On Diff #48917)

Use S as a variable name.

This revision is now accepted and ready to land.Feb 24 2016, 11:07 AM
rafael accepted this revision.Feb 24 2016, 2:56 PM
rafael edited edge metadata.

LGTM

This revision was automatically updated to reflect the committed changes.
grimar marked an inline comment as done.

Rafael, Rui, thank you for reviews and help with that !