This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ARM] Thunk support framework for ARM and Mips
ClosedPublic

Authored by peter.smith on Jun 30 2016, 5:36 AM.

Details

Summary

Add Thunk support framework for ARM and Mips

Generalise the Mips LA25 Thunk code and implement ARM and Thumb interworking Thunks.

  • Introduce a new module Thunks.cpp to store the mapping between SymbolBody and Thunk
  • SymbolBody now has no Thunk specific fields
  • A Target can more easily have more than one type of Thunk
  • Support PC-relative calls to Thunks
  • Support Thunks to PLT entries
  • Existing Mips LA25 Thunk code integrated with no change in behaviour
  • Support for ARMv7A interworking Thunks
  • Added pc-relative versions of MOVT/MOVW relocations needed by ARM Thunks
  • Tests for new functionality added.

Limitations:

  • Only one Thunk per SymbolBody, this is sufficient for all currently implemented Thunks.
  • ARM thunks assume presence of V6T2 MOVT and MOVW instructions.

Notes for reviewers:

  • It is just about possible to implement ARM/Thumb interworking Thunks in the same framework as the Mips LA25 Thunks. However this does mean adding new fields in RegularSymbol and SharedSymbol to record information about the InputSection that the Thunk will be written to. Given that the majority of Symbols even in ARM will have no Thunks I thought it better to move the Thunk information away from SymbolBody.
  • The code in Thunk.cpp could be moved into Target.cpp although if the ThunkInfo class were merged into TargetInfo it would mean templating TargetInfo with ELFT. I also thought that the Thunk code was less interesting to Targets other than ARM or Mips.
  • In this patch I've chosen to keep the Mips Thunk code in Target undisturbed rather than moving it into Thunks.cpp. This is partly due to the MipsTargetInfo<ELFT>::writeThunk member function calling local static functions in Target.cpp. It also means less rework if Thunks.cpp isn't the right design.
  • Most of the other changes are moving existing code around to deal with the Thunks module.
  • I think that the implementation can be extended to cope with range-extension Thunks. In summary a SymbolBody will map to multiple Thunks, a caller will select the most appropriate Thunk out of the possible candidates.

Diff Detail

Event Timeline

peter.smith retitled this revision from to [LLD][ARM] Thunk support framework for ARM and Mips.
peter.smith updated this object.
peter.smith added reviewers: ruiu, atanasyan, rafael.
peter.smith added a subscriber: rengolin.
ruiu added inline comments.Jul 1 2016, 12:15 AM
ELF/InputSection.cpp
142–145

total -> Total

ELF/Thunks.cpp
47

In general, the class hierarchy defined in this file looks too deep. Please move more features to "leaf" classes to reduce number of classes.

ELF/Thunks.h
32–34

I think this class does not have to be a class. It can be a non-member function instead.

ruiu added inline comments.Jul 1 2016, 12:15 AM
ELF/Thunks.cpp
45

Add a newline.

344

What is this for? If you don't need this now, please remove.

348–355

We want to group them by class by convention.

template class ThunkInfo<ELF32LE>;
template class ThunkInfo<ELF32BE>;
template class ThunkInfo<ELF64LE>;
template class ThunkInfo<ELF64BE>;

template ThunkInfo<ELF32LE> *createThunkInfo<ELF32LE>();
template ThunkInfo<ELF32BE> *createThunkInfo<ELF32BE>();
template ThunkInfo<ELF64LE> *createThunkInfo<ELF64LE>();
template ThunkInfo<ELF64BE> *createThunkInfo<ELF64BE>();
ELF/Thunks.h
23

This is a good place to describe the class a bit more in detail.

// Class to describe an instance of a Thunk. Thunks are pieces of data
// that are attached to input sections. MIPS and ARM needs them to
// place small pieces of code for function calls that cannot directly be called.
40

Add a newline before this line.

45

Ditto

ELF/Writer.cpp
25 ↗(On Diff #62351)

Remove.

Thank you very much for the comments. Apologies for the delay in responding.

Changes:

  • Made all formatting suggestions.
  • Added some more comments describing classes.
  • Flattened the class hierarchy to two levels.
  • Only Thunk::getVA() was being used after a call to getThunk(S) so I simplified to getThunkVA(S).

I've put some further explanations as response to comments.

ruiu added inline comments.Jul 1 2016, 11:24 PM
ELF/Thunks.h
81

What's the reason to manage thunks separately from SymbolBodies? It seems you can add a new member to the symbol to store a thunk.

ruiu added inline comments.Jul 1 2016, 11:27 PM
ELF/Thunks.cpp
295–296

This should be a fatal(), no?

I've added some inline responses to comments.

ELF/InputSection.cpp
142–145

Ok, made change.

ELF/Thunks.cpp
45

Ok, have made change.

47

I've squashed the hierarchy down to two levels. This has meant moving some of the implementation details into the header file. Given that the implementation details moved are not target specific, this is probably better than duplicating functionality in the leaf classes.

295–296

Currently the ThunkInfo() implementation of adjustExprForThunk() is needed to return the expression unaltered. If adjustExprForThunk() were either removed from the class or guarded by something like Target->needsThunk() then creating ThunkInfo would not be needed and this would indeed be fatal.

344

It is needed so that a Target that doesn't need thunks always returns the same RelExpr in adjustExprForThunk(). I've flattened the hierarchy so that ThunkInfoBase is no longer required, but I've kept the ThunkInfo for the moment.

The alternative is to move adjustExprFOrThunk out of the ThunkInfo class, perhaps into TargetInfo? This would mean that only ARM and Mips would create an object but would mean that code very similar to the other Thunks is spread out over two files. Let me know if you'd prefer that?

348–355

Ok, have made change.

ELF/Thunks.h
23

Ok, have added some more comments.

32–34

I'm not sure about this yet. With the class hierarchy flattening I've moved some more information into the header files to remove the need for an intermediate layer so this might be a bit clearer why it is a class.

I think it could be done, with non-member functions forwarding to a class internal to Thunks.cpp. The bit I'm not happy about yet is the initialisation and finalisation of the persistent state. I can give this another try next week if you think it is important.

40

Ok, have made change.

45

Ok, have made change.

81

My reasons for separating out thunks:

  • Avoiding adding a field for all symbols when only a small proportion need thunks, and only for targets that need thunks.
  • Thunk field either has to go in both Shared (thunks to PLT imports) and DefinedRegular or in their shared ancestor Defined. The former means casting to the appropriate Defined sub-class when accessing the thunk and the latter puts a field only needed by a small proportion of symbols into most of them. Neither of these options felt attractive.
  • Range extension thunks will need to support multiple thunks per symbol, a container of thunks, or a pointer to one may not as acceptable.

I don't think any of these are especially strong arguments right now although I think we'll end up separating thunks when multiple thunks per symbol are needed.

ELF/Writer.cpp
25 ↗(On Diff #62351)

Ok, have made change.

peter.smith updated this revision to Diff 62692.Jul 4 2016, 9:43 AM

I had a go at implementing interworking thunks by storing the pointer to Thunk in the Symbol. I think that this simplifies the implementation enough to be worth doing. The implementation can be revisited when range extension thunks are implemented.

Changes:

  • Removed ThunkInfo class and its sub-classes
  • Moved adjustExprForThunk into TargetInfo
  • Shared and DefinedRegular have a pointer to their Thunk
ruiu edited edge metadata.Jul 6 2016, 4:17 PM

I think this is in general looking good. My comments are mostly on style.

ELF/Symbols.h
238

Please use C++11 member initialization.

std::unique_ptr<Thunk<ELFT>> Alt = nullptr;

I don't think Alt is a good name because there are a lot of things associated to symbols that can be called "alt" such as PLT, GOT or copy relocations, etc. Because Thunk is a class name, we can't use the same name, so I don't have a good suggestion. Maybe ThunkData or ThunkObj?

ELF/Target.cpp
1566

Remove else because it is after return.

1956–1960

Can you define variables for the boolean values?

ELFFile<ELFT> &File = D->Section->getFile()->getObj();
bool PicFile = File->e_flags & EF_MIPS_PIC;
bool PicSym = (D->StOther & STO_MIPS_MIPS16) == STO_MIPS_PIC;
return (PicFile || PicSym) ? R_THUNK_ABS : Expr;
ELF/Target.h
60–61

You can remove thunkSize member function from all Targets and return 16 from MipsThunk<ELFT>::size().

ELF/Thunks.cpp
50

ARMThunkDestVA -> armThunkDestVA (or getARMThunkDestVA) because function names should start with lowercase letters.

57

Please enclose the class definitions in this file with an anonymous namespace.

59

We omit virtual if override is given.

59–60

I'd put a blank line between them.

70–71

Ditto

163

armAddThunk (or addThunkARM or anything that starts with a lowercase letter.)

202

Error messages should start with lowercase letters.

205

Lowercase.

224

llvm_unreachable is better because it is unreacahable unless there is a bug in LLD's code. The general guideline is

  • if it is unreachable unless there is a programmer's error, llvm_unreachable (or assert)
  • if it is unreachable unless input files are corrupted, fatal
  • otherwise, error.
227–239

I don't think you need to instantiate these templates because no other compilation units use them.

241–248

Nit: we omit parameter names in template instantiation. E.g.

template void AddThunk<ELF64BE>(uint32_t, SymbolBody &,
                                InputSection<ELF64BE> &);
ELF/Thunks.h
52

AddThunk -> addThunk

ruiu added inline comments.Jul 6 2016, 4:34 PM
ELF/Thunks.h
37

You don't redefine this function in a derived class, so please remove virtual.

ruiu added inline comments.Jul 6 2016, 4:37 PM
ELF/Thunks.h
15

You can remove this #include.

test/ELF/arm-thumb-interwork-thunk.s
82–86

Remove trailing whitespaces.

peter.smith updated this revision to Diff 63051.Jul 7 2016, 5:10 AM
peter.smith edited edge metadata.

Thanks again for the review comments. I've updated the diff to hopefully cover all raised so far.

ruiu added inline comments.Jul 7 2016, 10:09 AM
ELF/InputSection.cpp
133

Ditto.

190–194

I think you want to add A.

ELF/InputSection.h
213

Parameters need to be in camel case as a rule, so thunk is not okay. In this context, I'd just name this T.

ELF/Relocations.cpp
608

AddThunk -> addThunk

ELF/Symbols.h
238

I noticed that you can remove = nullptr as a std::unique_ptr is initialized to a nullptr by default.

314

Ditto

ELF/Target.cpp
185–187

Because we already have getRelExpr, I'd name this getThunkExpr. I believe existing adjust* functions have side effects (mutates parameters) which is different from this function.

ELF/Thunks.cpp
101

What's the naming convention in this file? This class is named ThumbARM but the other classes are named ARMThumb. Can you rename this ARMThumbV7ABSLongThunk?

122

Ditto

229

AddThunk -> addThunk (all function names need to start with lowercase letters.)

235

Remove \n. A newline is automatically appended.

ELF/Thunks.h
15

Remove this #include because it's unused.

Thank you for the extra comments. I've updated diff to hopefully address the most recent comments.

Adding llvm-commits to subscribers. My apologies for forgetting to put this on earlier.

ELF/InputSection.cpp
133

Made corresponding change.

190–194

The A wasn't in the original Mips R_THUNK, I didn't change it just in case it was deliberate. Adding it doesn't cause the mips test to fail, presumably as the addend for jump instruction will be 0. Given that this has been generalised I've added the A.

ELF/InputSection.h
213

Ok, made change to T.

ELF/Relocations.cpp
608

Ok, made change.

ELF/Symbols.h
238

Agreed, I used Alt as short for alternative. I've gone with ThunkData.

238

Ok, made change.

314

Ok, made change.

ELF/Target.cpp
185–187

Ok, I've made the change. I chose adjustThunkExpr to follow adjustRelaxExpr which also doesn't mutate its parameters.

1566

Ok, made change.

1956–1960

Ok, made very similar change to above.

ELF/Thunks.cpp
50

Ok, made change to getARMThunkDestVA

57

Ok, made change.

59

Ok, removed all virtual keywords where override is also present.

59–60

Ok, made change.

70–71

Have added a blank line between all similar cases.

101

There is a comment above:
Specific ARM Thunk implementations. The naming convention is:
Source State, TargetState, Target Requirement, ABS or PI, Range

I've added a "To" in between ARMThumb and ThumbARM to make this clearer: For example the classes are now:
ARMToThumbV7ABSLongThunk and ThumbToARMV7ABSLongThunk

122

Should be addressed by the naming convention comment above.

163

Ok, have gone with addThunkARM.

202

Ok, made change.

205

Used addThunkMips and made static to match addThunkARM

224

Ok, have used llvm_unreachable in this case.

227–239

Yes you are right, I've removed them.

229

Ok, made change.

I thought that in the original review comment AddThunk being globally visible had a different convention.

235

Ok, made change.

241–248

Ok removed parameter names.

ELF/Thunks.h
15

Ok, have made the change.

37

Ok, done.

test/ELF/arm-thumb-interwork-thunk.s
82–86

Ok, done, apologies that those slipped through.

ruiu accepted this revision.Jul 7 2016, 12:34 PM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jul 7 2016, 12:34 PM
peter.smith closed this revision.Jul 8 2016, 10:09 AM

Committed -r 274863 closing this revision.