This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ELF] Add support for local symbols in Synthetic Sections
ClosedPublic

Authored by peter.smith on Jan 20 2017, 8:46 AM.

Details

Summary

A synthetic local symbol is a symbol with STB_LOCAL binding defined in a SyntheticSection. The primary use case for these symbols is to identify Thunks (When support for Thunks as synthetic sections) is added. Another use case is ARM/AArch64 mapping symbols such as $a, $t and $d. These can be added to Thunks and PLT sections to give better disassembly.

For thunks we need to create a symbol in the SyntheticSection representing the Thunk so we can retarget the relocation at it. As functions with STB_LOCAL binding can have the same name, and may need more than one range extension Thunk it simplifies the Thunk implementation if we can make the Thunk symbols have STB_LOCAL binding, and allow non-unique names.

Mapping symbols such as $a, $t and $d are helpful for the disassembler to identify which parts of the Thunk are ARM, Thumb or Data. The ABI allows mapping symbols to have a .suffix which is ignored to account for tools that can't handle non-unique names.

We can already add symbols with STB_LOCAL binding to synthetic sections by using addRegular, however we have to say IsLocal=false in the SymbolBody and we have to have a unique name. These are not fundamental restrictions to implement Thunks or MappingSymbols as in both cases we can make sure the symbol name we add is unique. However I'm not hugely keen on setting IsLocal=false, there is also potential for other SymbolTable processing treating the symbols as if they were global.

As the local symbols are added outside of SymbolTable::Symbols I've incremented NumLocals when adding a synthetic local.

I've added support for adding to mapping symbols to the ARM PLT to test this out.

Diff Detail

Repository
rL LLVM

Event Timeline

peter.smith created this revision.Jan 20 2017, 8:46 AM

I've not added any uses of adding a local symbol to a synthetic section. I've checked that it works by adding mapping symbols to ARM PLT entries, this is quite a large diff due to test updates so I've not included it in the review. I've attached the patch for reference

evgeny777 edited edge metadata.Jan 20 2017, 9:25 AM

Looks good, but wait for others, please.

ELF/SyntheticSections.cpp
1143 ↗(On Diff #85142)

I'd add lambda which takes 'DefinedRegular<ELFT> *' and call it in both places.

grimar added inline comments.Jan 20 2017, 9:33 AM
ELF/SyntheticSections.cpp
1152 ↗(On Diff #85142)

cast<DefinedRegular<ELFT>> instead ?

ruiu edited edge metadata.Jan 20 2017, 9:57 AM

SymbolTable class in SymbolTable.cpp is a class to arbitrate global symbols, and currently it has no interaction with local symbols, because local symbols don't need any symbol name resolution. This patch is using that class as a storage to store synthetic symbols. I found that that's a bit confusing.

Can you move that to SyntheticSections.h or somewhere to make it clear that the symbol table doesn't do any business with local symbols?

Removed all code in SymbolTable and rebased the patch on top of D29021.

I've added a function addSyntheticLocal which creates the symbol body and automatically adds it to the SymTab if it exists. This avoids storing the SymbolBodies in a temporary location.

ruiu added a comment.Jan 23 2017, 10:04 AM

It looks like this patch does not contain all changes. You defined addSyntehticLocal but you are not using it...?

ELF/SyntheticSections.cpp
292 ↗(On Diff #85385)

S -> *S

peter.smith edited the summary of this revision. (Show Details)

Thanks for the comments. I've updated the diff.

At the moment I only really need addSyntheticLocal for Thunks, I thought it would be best to split that part from the first Thunk patch to get the interface to addSyntheticLocal before submitting the Thunk patch.

To test addSyntheticLocal I've made a patch to add mapping symbols to the PLT section. I've uploaded the whole thing with this patch. I think that this is a worthwhile improvement to the disassembly of the PLT section especially on GNU objdump.

I'm happy to abandon this review and roll in addSyntheticLocal to either the first Thunks patch. Or I'm happy to add the PLT mapping symbols if you'd prefer?

ruiu added a comment.Jan 23 2017, 11:52 AM

I think this patch would be useful per se, but I wonder if you emit more than $a or $d? I mean those symbols could include the original symbol names to which these PLT entries jump to at runtime. (But if it is too much or too weird, I'm OK with this patch.)

ELF/Target.cpp
1738–1739 ↗(On Diff #85420)

Since string constants are not on the stack, I think you don't need to use Saver.

That is possible although in many cases the disassembler can derive that information without an additional symbol. For example there is some functionality in GNU objdump to display the symbol name of the PLT from some hard-coded choice of PLT size made when building objdump. For example, from hello world compiled and linked with ld.bfd:

000102ac <puts@plt-0x14>:
   102ac:	e52de004 	push	{lr}		; (str lr, [sp, #-4]!)
   102b0:	e59fe004 	ldr	lr, [pc, #4]	; 102bc <_init+0x1c>
   102b4:	e08fe00e 	add	lr, pc, lr
   102b8:	e5bef008 	ldr	pc, [lr, #8]!
   102bc:	00010288 	.word	0x00010288

000102c0 <puts@plt>:
   102c0:	e28fc600 	add	ip, pc, #0, 12
   102c4:	e28cca10 	add	ip, ip, #16, 20	; 0x10000
   102c8:	e5bcf288 	ldr	pc, [ip, #648]!	; 0x288
...

The lld ARM backend uses 4 word plt entries and doesn't get picked up by objdump's recognizer.

For x86 objdump can recognise lld's plt entries

00000000002011b0 <__libc_start_main@plt-0x10>:
  2011b0:	ff 35 62 0e 00 00    	pushq  0xe62(%rip)        # 202018 <__TMC_END__+0x8>
  2011b6:	ff 25 64 0e 00 00    	jmpq   *0xe64(%rip)        # 202020 <__TMC_END__+0x10>
  2011bc:	0f 1f 40 00          	nopl   0x0(%rax)

00000000002011c0 <__libc_start_main@plt>:
  2011c0:	ff 25 62 0e 00 00    	jmpq   *0xe62(%rip)        # 202028 <__TMC_END__+0x18>
  2011c6:	68 00 00 00 00       	pushq  $0x0
  2011cb:	e9 e0 ff ff ff       	jmpq   2011b0 <_fini+0x10>

So I think it probably isn't worth manually adding the symbols with @plt.

Updated patch to remove Saver.save for mapping symbol entries.

ruiu accepted this revision.Jan 24 2017, 12:42 PM

LGTM

ELF/SyntheticSections.cpp
295 ↗(On Diff #85550)

Please use make<DefinedRegular<ELFT>>(...) instead of new (BAlloc).

1456 ↗(On Diff #85550)

Can you add a comment saying that ARM's .plt section usually contains local symbols? That is a foreign concept for those who are familiar only with x86.

This revision is now accepted and ready to land.Jan 24 2017, 12:42 PM
This revision was automatically updated to reflect the committed changes.