Relocations can be present in non-relocatable files when linking with --emit-relocs. This patch removes the assertion in ELFObjectFile that such files violate, and removes the short-circuit return in llvm-objdump. The existing comment says objdump doesn't print these relocations, but that isn't correct: with the associated test file, GNU objdump does print them.
Please add a comment saying how the binary was created.
The assert is correct. Whatever is calling getRelocationOffset should
instead be trying to get the address.
Testing locally I see that objdump prints relocations that were copied
by --emit-relocs but not dynamic relocations.
Given
void g(void);
void f(void) {
g();
}
objdump on x86_64 will print
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000005 R_X86_64_PLT32 g-0x0000000000000004
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
0000000000000020 R_X86_64_PC32 .text+0x0000000000000270
readelf prints
Relocation section '.rela.plt' at offset 0x238 contains 1 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
00000000000013e8 0000000200000007 R_X86_64_JUMP_SLOT 0000000000000000 g + 0
Relocation section '.rela.text' at offset 0x428 contains 1 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000275 0000000600000004 R_X86_64_PLT32 0000000000000000 g - 4
Relocation section '.rela.eh_frame' at offset 0x440 contains 1 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
00000000000002a0 0000000100000002 R_X86_64_PC32 0000000000000270 .text + 270
Note that:
- The relocations from .rela.plt are missing in objdump.
- The relocations in the .so have addresses. What objdump does is map
address to section and subtract the section address to get an offset.
Please investigate exactly which relocations are skipped by objdump.
Please make sure llvm-objdump skips the same relocations and prints
offsets like objdump.
Cheers,
Rafael
As you already noticed, the ELF spec says r_offset contains the virtual address if the file type is ET_EXEC or ET_DYN, so in those cases we look up the relocated section and subtract its address.
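As a rough illustration of that subtraction, here is a minimal standalone sketch. It does not use the LLVM Object API; `Section`, `FileType` and `relocationSectionOffset` are hypothetical names made up for this example, and the section size in the usage note is assumed.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified view of a section header (not an LLVM type).
struct Section {
  uint64_t Addr; // sh_addr: virtual address of the section
  uint64_t Size; // sh_size: size of the section in bytes
};

// Hypothetical stand-in for e_type (ET_REL, ET_EXEC, ET_DYN).
enum class FileType { Relocatable, Executable, SharedObject };

// Returns the section-relative offset of a relocation that applies to Sec.
// In a relocatable file r_offset is already section-relative. In an
// executable or shared object it is a virtual address, so the section
// address is subtracted. Returns std::nullopt if the address lies outside Sec.
std::optional<uint64_t> relocationSectionOffset(FileType Type,
                                                const Section &Sec,
                                                uint64_t ROffset) {
  if (Type == FileType::Relocatable)
    return ROffset;
  if (ROffset < Sec.Addr || ROffset - Sec.Addr >= Sec.Size)
    return std::nullopt;
  return ROffset - Sec.Addr;
}
```

With the values from the readelf output quoted earlier, .text is at 0x270 and the R_X86_64_PLT32 relocation has r_offset 0x275, so the function yields 5 (given any section size larger than 5), which matches the OFFSET column objdump printed for .text.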
I added a comment in the test on how the binary was created.
To get equivalent functionality for dynamic relocations, we'd need to add the -R flag to llvm-objdump, and SymbolRef would need an accessor like isDynamic. The ELF implementation can use code from getRelocationSymbol to determine whether it's dynamic:
  bool IsDyn = Rel.d.b & 1;
  DataRefImpl SymbolData;
  if (IsDyn)
    SymbolData = toDRI(DotDynSymSec, symbolIdx);
  else
    SymbolData = toDRI(DotSymtabSec, symbolIdx);
COFF and MachO would then need implementations for those file types. The interwoven disassembly/relocations would also need to take this into account.
It'd be good to have equivalent functionality though I think we should do that in a separate revision since it's non-trivial.
I notice that r215844 says GNU objdump doesn't print relocations in non-relocatable files. This doesn't seem correct: with --emit-relocs, objdump does print relocations in executable and SO files.
IMO r_offset is being modeled as per the ELF spec and looks correct. GNU objdump does show relocations when there are relocation sections that point to the section that needs relocation.
Example:
00000000004000c0 <foo>:
4000c0: 55 push %rbp
4000c1: 48 89 e5 mov %rsp,%rbp
4000c4: e8 e7 ff ff ff callq 4000b0 <bar>
4000c5: R_X86_64_PC32 bar-0x4
4000c9: 5d pop %rbp
4000ca: c3 retq
The assert is correct. Whatever is calling getRelocationOffset should
instead be trying to get the address.
I disagree, the assertion isn't correct.
You can ask for a relocation's offset inside an exe or SO file. r_offset is overloaded to contain the address in these, which is why the section address needs to be subtracted in order to get the offset.
I'm not looking for the address, I'm looking for precisely the relocation offset.
http://www.sco.com/developers/gabi/latest/ch4.reloc.html
r_offset This member gives the location at which to apply the relocation action. For a relocatable file, the value is the byte offset from the beginning of the section to the storage unit affected by the relocation. For an executable file or a shared object, the value is the virtual address of the storage unit affected by the relocation.
This shows that the assertion is incorrect, as the ELF ABI says reading relocation records is valid in executables as well.
While the relocation offset is not looked up for shared libraries from the .rela.dyn and .rela.plt sections, there is a use case where relocation sections are generated with the linker switch --emit-relocs.
Removing the assertion, LGTM.
llvm-objdump wants to print the relocationOffset, so it calls RelocationRef::relocationOffset which calls ELFObjectFile::getRelocationOffset
Since objdump wants the relocation offset, and this pipes down to ELFObjectFile::getRelocationOffset, at what point is the address supposed to be asked for?
Is this the recommendation on how to print relocation section offsets inside executable files?
This is much better, but needs a bit more work.
include/llvm/Object/ELFObjectFile.h
687: It is undesirable to expose this. Calling a relocation "dynamic" was a hack to avoid looking at sh_link every time one asks for a symbol for a relocation. I removed the hack in r264624.

test/tools/llvm-objdump/X86/relocations-in-nonrelocatable.test
7: You can produce a much simpler file by running
gcc -c main.c -fno-asynchronous-unwind-tables -fPIC
Please do that and also run "llvm-readobj -r" to show all the relocations in the file.
I investigated a bit exactly which relocations are printed by bfd objdump. The answer is that it prints relocations from sections that use .symtab as the symbol table. That is just a limitation of bfd. Some options we have given the desire to print relocations in executables and shared libraries.

tools/llvm-objdump/llvm-objdump.cpp
444: This is guarding against a relocation with a symbol index of 0? That is an independent change, no?
445: Since moving it, please use the new variable name style: Symb
462: Does gnu objdump print ABS for this? Seems like an odd term. ABS refers to symbols, but in here you have no symbol at all.
1195: You only need a sorted std::vector, no?
1196: Initialize the iterators with "I = " not "I(...)";
1221: Can you use lower_bound?
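The patch lines these last comments refer to are not shown here, so the following is only a sketch of the general idea behind "a sorted std::vector" plus lower_bound, under the assumption that the container is used to find the section containing a given address. `SectionRange` and `findSection` are hypothetical names, not part of llvm-objdump.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical record: one allocated section's address range and its index.
struct SectionRange {
  uint64_t Addr;
  uint64_t Size;
  int Index;
};

// Returns the index of the section containing Addr, or -1 if none does.
// Sorted must be ordered by Addr and the ranges must not overlap.
int findSection(const std::vector<SectionRange> &Sorted, uint64_t Addr) {
  // Find the first range whose end (exclusive) lies past Addr.
  auto I = std::lower_bound(
      Sorted.begin(), Sorted.end(), Addr,
      [](const SectionRange &R, uint64_t A) { return R.Addr + R.Size <= A; });
  if (I == Sorted.end() || Addr < I->Addr)
    return -1;
  return I->Index;
}
```

Compared with a std::map, a vector that is filled once and then sorted keeps the ranges contiguous in memory and still gives logarithmic lookups, which is usually enough when the set of sections does not change after it is built.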
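The review notes that bfd objdump only prints relocations from sections that use .symtab as their symbol table, and that the symbol table of a relocation section is identified through sh_link. A minimal sketch of such a filter follows; `SectionHeader` and `usesStaticSymtab` are hypothetical names for this example, while the numeric section types are the standard ELF values.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Standard ELF section type values, renamed to avoid clashing with <elf.h>.
constexpr uint32_t ShtSymtab = 2;  // SHT_SYMTAB
constexpr uint32_t ShtRela = 4;    // SHT_RELA
constexpr uint32_t ShtRel = 9;     // SHT_REL
constexpr uint32_t ShtDynsym = 11; // SHT_DYNSYM

// Hypothetical, trimmed-down section header: only the fields used here.
struct SectionHeader {
  uint32_t Type; // sh_type
  uint32_t Link; // sh_link: for relocation sections, the symbol table index
};

// True if section RelIdx is a relocation section whose sh_link refers to a
// SHT_SYMTAB section, i.e. the kind of relocation section bfd objdump prints
// according to the review comment.
bool usesStaticSymtab(const std::vector<SectionHeader> &Sections,
                      size_t RelIdx) {
  if (RelIdx >= Sections.size())
    return false;
  const SectionHeader &Rel = Sections[RelIdx];
  if (Rel.Type != ShtRela && Rel.Type != ShtRel)
    return false;
  if (Rel.Link >= Sections.size())
    return false;
  return Sections[Rel.Link].Type == ShtSymtab;
}
```

In the shared object discussed earlier, .rela.plt would link to the dynamic symbol table and be filtered out, while the .rela.text and .rela.eh_frame sections copied by --emit-relocs would link to .symtab and be kept.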