This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Implement infrastructure for thunk code creation
ClosedPublic

Authored by atanasyan on Mar 7 2016, 8:44 AM.

Details

Summary

Some targets might require creation of thunks. For example, MIPS targets require stubs to call PIC code from non-PIC one. The patch implements infrastructure for thunk code creation and provides support for MIPS LA25 stubs. Any MIPS PIC code function is invoked with its address in register $t9. So if we have a branch instruction from non-PIC code to the PIC one we cannot make the jump directly and need to create a small stub to save the target function address.
See page 3-38 ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf

  • In relocation scanning phase we ask target about thunk creation necessity by calling TagetInfo::needsThunk method. The InputSection class maintains list of Symbols requires thunk creation.
  • Reassigning offsets performed for each input sections after relocation scanning complete because position of each section might change due thunk creation.
  • The patch introduces new dedicated value for DefinedSynthetic symbols DefinedSynthetic::SectionEnd. Synthetic symbol with that value always points to the end of the corresponding output section. That allows to escape updating synthetic symbols if output sections sizes changes after relocation scanning due thunk creation.
  • In the InputSection::writeTo method we write thunks after corresponding input section. Each thunk is written by calling TargetInfo::writeThunk method.
  • The patch supports the only type of thunk code for each target. For now, it is enough.

Diff Detail

Repository
rL LLVM

Event Timeline

atanasyan updated this revision to Diff 49968.Mar 7 2016, 8:44 AM
atanasyan retitled this revision from to [ELF] Implement infrastructure for thunk code creation.
atanasyan updated this object.
atanasyan added reviewers: ruiu, rafael.
atanasyan set the repository for this revision to rL LLVM.
atanasyan added a project: lld.
atanasyan added a subscriber: llvm-commits.
ruiu edited edge metadata.Mar 7 2016, 11:19 AM

I'm still reading this patch, but does it work for all cases? When you want to create a thunk for a short-range jump instruction, you want to locate that thunk within the range of the instruction. Output sections can be large. Putting thunks at end of each output section will work?

In D17934#369150, @ruiu wrote:

I'm still reading this patch, but does it work for all cases? When you want to create a thunk for a short-range jump instruction, you want to locate that thunk within the range of the instruction. Output sections can be large. Putting thunks at end of each output section will work?

I had a solution which puts stubs in the beginning of the corresponding input section. In that case the risk to get too long jump is smaller. But this solution seemed to me over-complicated a bit. Let me think again till tomorrow. Maybe I revert that solution.

ruiu added a subscriber: ruiu.Mar 7 2016, 12:18 PM

This is just an idea and I'm not suggesting that that is better, but you
can scan a list of input sections and insert stub sections to the list,
instead of appending stub sections to input/output sections.

grimar edited edge metadata.Mar 9 2016, 1:53 AM
grimar added a subscriber: grimar.
atanasyan updated this revision to Diff 50131.Mar 9 2016, 6:17 AM
atanasyan updated this object.

Write thunks before corresponding InputSection's to reduce risk of getting too long jump.

ruiu added inline comments.Mar 9 2016, 2:18 PM
ELF/OutputSections.cpp
806 ↗(On Diff #50131)

Using a hash table feels a bit tricky. Can you add a pointer to SymbolBody to point to a thunk?

ELF/Target.cpp
1687 ↗(On Diff #50131)

Needs a comment.

1721 ↗(On Diff #50131)

Please run clang-format-diff.

1733 ↗(On Diff #50131)

We have using namespace llvm::ELF.

1735 ↗(On Diff #50131)

We usually name an object of type Elf_Sym ElfSym. In this case, the return type of RHS is still our own symbol type, so I'd name differently, say just Sym, D, Defined or something like that.

1739–1745 ↗(On Diff #50131)

This can be

return (ElfSym->Section->getFile()->getObj().getHeader()->e_flags & EF_MIPS_PIC) ||
       (ElfSym->Sym.st_other & STO_MIPS_MIPS16) == STO_MIPS_PIC;
ELF/Writer.cpp
346–352 ↗(On Diff #50131)

I don't like to add new code to this function because this function is overloaded. Particularly, when this code will be executed is not very obvious because if a condition above is true and it reaches a continue, this code will be skipped. Will this code be skipped for all correct cases?

Can you create a new loop over relocations and move this code there?

1050–1051 ↗(On Diff #50131)

This seems a bit odd.

If we always reassign offsets, then we don't want to reassign offsets at all. Instead, we want to assign offsets right here only once.

1068–1072 ↗(On Diff #50131)

I don't understand what this is supposed to do?

1183 ↗(On Diff #50131)

What is this change for?

atanasyan added inline comments.Mar 9 2016, 8:56 PM
ELF/OutputSections.cpp
806 ↗(On Diff #50131)

I need a link between input section and corresponding set of thunks to put the thunks in the right place during a "write" phase. I can add pointer to ThunkChunk to the InputSectionBase class. That will allow to remove this DenseMap. Is it acceptable to you?

ELF/Target.cpp
1687 ↗(On Diff #50131)

Will do

1721 ↗(On Diff #50131)

OK

1733 ↗(On Diff #50131)

OK

1735 ↗(On Diff #50131)

OK

1739–1745 ↗(On Diff #50131)

OK

ELF/Writer.cpp
346–352 ↗(On Diff #50131)

OK. I will try to do so.

1050–1051 ↗(On Diff #50131)

You are right. The assignOffsets is better name.

1068–1072 ↗(On Diff #50131)

This and the next change are related. Now we add synthetic symbols pointing to beginning and end of output sections before relocation scanning phase. In this patch output section size might change during the relocation scanning. So synthetic symbols point to the end of sections need to be updated. There is a couple of possible solutions:

  • Add synthetic symbols after the relocation scanning. Not sure that this does not affect something in relocation scanning phase.
  • Use special value DefinedSynthetic<ELFT>::End to mark symbol that need to point to the end of sections. Update such symbols with the correct value later.

In this patch I use the second solution. Do you think it is safe to add synthetic symbols after the relocation scanning?

ruiu added inline comments.Mar 9 2016, 9:46 PM
ELF/OutputSections.cpp
806 ↗(On Diff #50131)

I think that instead of attaching a chunk to an InputSection, we may be able to define a new type of (synthesized) InputSection for a thunk. Then we can keep OutputSection to have a flat list of InputSections instead of InputSections + their attachment. It may be cleaner approach IMO.

In that design, we can make a symbol have a pointer to a thunk input section. getThunkVA() returns the VA of the thunk input section, so it just have to call getVA on the thunk section.

What do you think? I'm not sure if this is really cleaner, but I'd think it's worth to try.

ELF/Writer.cpp
1068–1072 ↗(On Diff #50131)

OK. So if we keep the output section to have a flat list of input sections instead of input sections + their chunks, the size of the output section wouldn't change during relocation scanning phase, right?

atanasyan added inline comments.Mar 9 2016, 10:24 PM
ELF/OutputSections.cpp
806 ↗(On Diff #50131)

Now the root of input section hierarchy InputSectionBase class is tightly related to ELF input section. Its constructor requires reference to ObjectFile and Elf_Shdr. If we want to introduce something like SyntheticInputSection which is unrelated to and file and ELF header we have to create new root base class and inherit SyntheticInputSection and InputSectionBase. One more problem is that OutputSection::Sections container contains pointers to InputSection objects and it is not clear how to mix InputSection and SyntheticInputSection in the same container.

ELF/Writer.cpp
1068–1072 ↗(On Diff #50131)

Before the scanning phase we know nothing about thunks. During the scanning we either add chunks to OutputSections or create synthetic input sections and add them to OutputSections too. In any case the size will change.

atanasyan updated this revision to Diff 50237.Mar 9 2016, 11:06 PM

Fixed minor issues found on review. The open questions are updating synthetic symbols values and necessity to introduce new SyntheticInputSction class.

atanasyan updated this revision to Diff 50424.Mar 11 2016, 6:07 AM

Rebased the patch.

ruiu added a comment.Mar 12 2016, 7:38 PM

This looks still a bit too intrusive. The code to handles thunks appears too many places.

Maybe we should handle thunks as a separate output section? Then we don't need to add addThunk to the regular output section class and we don't have to update the code to calculate section size and offsets.

In D17934#374020, @ruiu wrote:

This looks still a bit too intrusive. The code to handles thunks appears too many places.

Maybe we should handle thunks as a separate output section? Then we don't need to add addThunk to the regular output section class and we don't have to update the code to calculate section size and offsets.

I see some problems here:

  • We can get a really long out-of-range jumps. I remember that my initial solution had the same problem but it was my mistake.
  • The output section should have a name. Putting all thunks into the separate output section is in fact ABI extension because we have to choose dedicated unique name for such section. Moreover if somebody have a linker script to put output sections in exact places (common case for embedded software developers), it will be surprise that he or she have to handle one more unusual output section.
  • There is optimization used by bfd and gold linkers - if input section contains the only function and we need to create a thunk for it, we can place the thunk in the beginning of the section and use just two assembly instructions to save the function address into the $25 registers and do not use additional jump instructions at all. My patch does not implement this optimization yet but moving all thunks into the separate output section make such optimization impossible at all.

I think the less intrusive solution is to concentrate the thunk handling in the InputSection class. We can add a vector of SymbolBody pointers right to this class. In that case only InputSection and TargetInfo descendant will know about synthetic code.

BTW MIPS is not the only target uses thunks. AFAIK ARM has concept of veneer.

atanasyan updated this revision to Diff 50696.Mar 14 2016, 10:22 PM
atanasyan updated this object.
  • Isolated almost all thunk handling code in the InputSection class
  • Reduced number of auxiliary classes, containers etc
atanasyan updated this revision to Diff 52145.Mar 30 2016, 4:05 PM

Rebase the patch.

ruiu added a comment.Mar 30 2016, 4:08 PM

I'm sorry that I missed this one. Taking a look now.

ruiu added inline comments.Mar 30 2016, 4:50 PM
ELF/InputSection.cpp
42–45 ↗(On Diff #52145)

I feel like this is slightly more readable.

if (auto *D = dyn_cast<InputSection<ELFT>>(this))
  if (D->getThunksSize() > 0)
    return D->getThunkOff() + D->getThunksSize();
return Header->sh_size;
121 ↗(On Diff #52145)

Is the thunk need to be aligned? If so, does it have the same alignment as the original section?

317 ↗(On Diff #52145)

This needs a comment.

359 ↗(On Diff #52145)

This needs a comment, saying that a section may have attached data that needs to be written after the section, and that data is usually a thunk to jump to a location that is too far to fit in a short jump instruction.

ELF/InputSection.h
172 ↗(On Diff #52145)

I'd also mention that you can ignore this if you are interested only in x86/x86-64.

190 ↗(On Diff #52145)

TinyPtrVector is probably better as I expect this is empty in most cases.

ELF/Target.cpp
1749 ↗(On Diff #52145)

It is probably better to just say that "see MipsTargetInfo::writeThunk for details" instead of copying the comment.

ELF/Writer.cpp
483–495 ↗(On Diff #52145)

Please make this a separate function and call only for platforms that need thunks.

atanasyan updated this revision to Diff 52211.Mar 31 2016, 8:14 AM
atanasyan marked 7 inline comments as done.
  • Use TinyPtrVector to store in the InputSection class pointers to SymbolBody which requires thunk creation.
  • Create separate Writer::scanRelocsForThunks method for scanning relocations
  • Remove thunks alignment. We do not need to do that.
  • Simplify InputSectionBase::getSize
  • Fix some comments
ruiu accepted this revision.Mar 31 2016, 9:24 AM
ruiu edited edge metadata.

LGTM with a nit. I think it is now simple enough to submit. Thanks!

ELF/Writer.cpp
316 ↗(On Diff #52211)

Invert the condition and use continue.

This revision is now accepted and ready to land.Mar 31 2016, 9:24 AM

Thanks for review.

This revision was automatically updated to reflect the committed changes.