getRelocatedSection interface should not check that the object file is
relocatable, as executable files may have relocations preserved with
--emit-relocs linker flag. The relocations are useful in context of post-link
binary analysis for function reference identification. For example, BOLT relies
on relocations to perform function reordering.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/include/llvm/Object/ELFObjectFile.h | ||
---|---|---|
986–988 | These extra checks are unnecessary on LLVM side. Let's get rid of them and update the diff description. |
Please add tests for such functional changes.
The subject line is usually an imperative sentence. Mention what the patch changes, instead of stating a fact that people agree with (it is unclear what you do in the patch to fulfill the goal)
@MaskRay: agree with the title wording change.
Note that it's a second attempt to make getRelocatedSection work regardless of object file type. The previous attempt was reverted in commit 9219fe79b9842; it's the one that re-introduced the check we want to remove:
commit 9219fe79b9842f3be3524c2e9aeb807ddffdfe3e Author: Rafael Espindola <rafael.espindola@gmail.com> Date: Mon Mar 21 20:59:15 2016 +0000 Revert "[llvm-objdump] Printing relocations in executable and shared object files. This partially reverts r215844 by removing test objdump-reloc-shared.test which s tated GNU objdump doesn't print relocations, it does." This reverts commit r263971. It produces the wrong results for .rela.dyn. I will add a test. llvm-svn: 263987 ... diff --git a/llvm/include/llvm/Object/ELFObjectFile.h b/llvm/include/llvm/Object/ELFObjectFile.h index 5ee90ee9ea71..b01fa1da4b3e 100644 --- a/llvm/include/llvm/Object/ELFObjectFile.h +++ b/llvm/include/llvm/Object/ELFObjectFile.h @@ -640,6 +640,9 @@ ELFObjectFile<ELFT>::section_rel_end(DataRefImpl Sec) const { template <class ELFT> section_iterator ELFObjectFile<ELFT>::getRelocatedSection(DataRefImpl Sec) const { + if (EF.getHeader()->e_type != ELF::ET_REL) + return section_end(); +
Regarding tests: none of the tests in LLVM and subprojects have executable files with relocations, so all tests are passing. But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.
llvm-dwarfdump is relying on getRelocatedSection() to return section_end() for ELF files of types other than relocatable objects. We've changed the function to return relocatable section for other types of ELF files. As a result, llvm-dwarfdump started re-processing relocations for sections that already had relocations applied, e.g. in executable files, and this resulted in wrong values reported.
As a workaround/solution, we can make this function return relocated section for executable (and any non-relocatable objects) files only if the section is allocatable:
Expected<const Elf_Shdr *> SecOrErr = EF.getSection(EShdr->sh_info); if (!SecOrErr) return SecOrErr.takeError(); + if (EF.getHeader()->e_type != ELF::ET_REL && + !((*R)->sh_flags & ELF::SHF_ALLOC)) + return section_end(); return section_iterator(SectionRef(toDRI(*SecOrErr), this)); }
Another potential solution would be to introduce a flag (e.g. CheckFileRelocatable) for getRelocatedSection or another function with intended behavior.
There are options for adding the test, too:
- As a unit test for getRelocatedSection that would check that this function works for executable files with relocations.
- As a test in lld folder that would run llvm-mc; ld.lld with --emit-relocs and then a client of getRelocatedSection (llvm-objdump -dr or something similar). Cons: test is unrelated to lld.
- As a prepared ELF binary test for LLVM clients of getRelocatedSection.
What do you think would be an appropriate solution?
You can use yaml2obj to create an ET_EXEC (or whatever) object with static relocations. There are many good examples of using yaml2obj to create an object in places like the llvm-readobj or even the yaml2obj tests themselves - just find one that has relocations and change the type. The fix isn't related to LLD (directly), so I don't think we should test it in the LLD tests (although it might be a good candidate for my planned cross-project-tests - see D95339 and related patches). However, my personal opinion is that as this behaviour change is for the Object library, it should be a new Object unit test, i.e. option 1.
What's the motivation for this change anyway? An executable (whether built with --emit-relocs or not) will already have had its static relocations applied. Applying dynamic relocations seems likely to be impossible, since by their very nature they can't be applied until the program has been loaded in memory/dynamic libraries loaded etc.
@jhenderson: thanks for the suggestion. I'm going to add the Object unit test for this change.
The motivation is the use of relocations by BOLT for function reordering optimization (specifically, to find references to functions). We recommend that users keep relocations in the final binary: https://github.com/facebookincubator/BOLT#usage
I think I follow now, thanks. Dynamic relocations should probably still be omitted by this lookup, i.e. getRelocatedSection only deals with static relocations. Not all dynamic relocation sections have an associated section, so this function will fail on them.
llvm-dwarfdump is relying on getRelocatedSection() to return section_end() for ELF files of types other than relocatable objects. We've changed the function to return relocatable section for other types of ELF files. As a result, llvm-dwarfdump started re-processing relocations for sections that already had relocations applied, e.g. in executable files, and this resulted in wrong values reported.
It seems to me that for Elf_Rel relocations, you can't really reapply them, because the addend has been overwritten by the final patched value. So, as I understand it, the only way you can reliably know how to reapply relocations is with Elf_Rela relocations. I think this means you need to change llvm-dwarfdump too, to only reapply relocations for ET_REL objects? (It's harmless to do so for other ET_* kinds, iff the relocation types are Elf_Rela of course, but it's pointless work).
Presumably we don't want to reapply them, right? (unless BOLT is modifying the relocations and expecting them to be reapplied) I'm assuming there's some property we should be able to observe that indicates the relocations have already been applied so we should ignore them?
I don't know BOLT at all, but if functions within an executable were to be reordered, the relocations referencing that function would need reapplying so that calls to it are targeting the right address. The resultant values would be different, because the target address would be different.
@jhenderson, @dblaikie: BOLT recreates relocations implicitly as it uses MCStreamer for code emission. We don't reapply existing relocations from the binary. I should have pointed out that BOLT performs not just function reordering (where reapplying relocations might make sense), but rather full binary rewrite: splits text into hot and cold sections, splits functions, reorders basic blocks, reorders functions, applies PLT optimizations, and so on. Hope that makes it clear.
"recreates relocations" - I'm not sure what that means, could you help me understand?
The important thing, I think, is (can you confirm this, please) - static relocations in a linked binary do not need to be reapplied by the DWARF parser when reading the binary, right? They should be ignored. Presumably there's /some/ property of the binary that makes it clear that these are static relocations, already applied by the linker (or any post-link rewriting tool like BOLT) and should be ignored. The DWARF parser is making the mistake that their presence means they should be applied, when it should be checking some other property/feature to determine if they should be applied?
I think implicit assumption that unless binary is relocatable the assumption is relocations have been applied by the linker.
llvm/unittests/Object/ELFObjectFileTest.cpp | ||
---|---|---|
594–595 | Whilst you're adding this test, it probably makes sense for one for ET_REL files too, since you're explicitly calling out the executable version. | |
601 | Tip: you don't need all this whitespace. Just get rid of the excess, so that it nicely lines up, but without a massive gap. | |
613–619 | This looks to be unnecessary. The SectionHeaderTable block is only needed if you want to have a different order for sections in the file versus in the section header table. | |
643 | The ASSERT_TRUE and ASSERT_EQ here and below can probably all be EXPECT_TRUE and EXPECT_EQ - it's harmless for the test to continue to run in these cases. I don't think you actually need the RelSec->isText() check anyway. Really, all that's important is that the section name is correct, since there's only one ".text" section. | |
650 | Do you need this? FoundRela seems like it'd be enough - one of the other checks will fail otherwise. |
Sorry to jump in here, hope I'm not derailing a perfectly functional review.
Could you provide some more context in the patch description? Might be worth knowing what that ET_REL check was originally implemented for? (could you check the patch history of why that was added)
So this getRelocatedSection is meant to retrieve the section that the specified REL or RELA section applies to? But if the ELF file is already relocated (ie: e_type != ET_REL) then the REL/RELA section sort of doesn't apply to the section - no relocations need to be applied to that section, right? But maybe it's the wrong abstraction/interface design? (or callers should be using the interface in some different way) - this REL or RELA section is for whatever section its for, but that section has already had the relocations applied?
I worry that with only this change the callers might be getting the wrong idea - the idea that the relocations do need to be applied, when in fact they don't, because the file isn't relocatable? (previously/witohut this patch they were relying on this function not returning the relocated section to indicate that no relocations needed to be applied)
re:
But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.
Then there should be an llvm-dwarfdump test that demonstrates this fix changes behavior. That can include a linked object file that still has relocations - if such a file can't be produced by LLVM tools (well, a linked file can't be produced by LLVM itself - needs a linker like lld, and LLVM tests can't depend on lld) - then a checked in binary can be used for testing.
Or a libObject unit test might be an option too. Though I personally wouldn't mind seeing how this affects some tool, rather than only a narrow unit test.
The check was added in the initial implementation of getRelocatedSection https://github.com/llvm/llvm-project/commit/4f60a38f18ee68c4abd4cfefb5a4bba0f0917f16.
So your reasoning is most likely correct, that the original intent was to skip non-relocatable files and return section_end().
But we believe that this interface should just return the relocated section and not have this implicit check. As a side effect, this can make llvm-objdump -d -r work the same as objdump -dr which is helpful at times. If you believe that LLVM should rather keep the current interface, we can make the change on the BOLT side.
re:
But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.
Then there should be an llvm-dwarfdump test that demonstrates this fix changes behavior. That can include a linked object file that still has relocations - if such a file can't be produced by LLVM tools (well, a linked file can't be produced by LLVM itself - needs a linker like lld, and LLVM tests can't depend on lld) - then a checked in binary can be used for testing.
Or a libObject unit test might be an option too. Though I personally wouldn't mind seeing how this affects some tool, rather than only a narrow unit test.
The test that broke in llvm-dwarfdump is this one: https://reviews.llvm.org/harbormaster/unit/view/755844/. Looking into how exactly did this change affect XCOFF test.
Ah, I see that the change to test isRelocatableObject in DWARFContext is compensating for this ^ issue. So any other callers would need to make similar adjustments if they are assuming that an "relocated section" is meant to have relocations applied to it.
Might be a bit subtle - maybe there's a chance renaming the function would be suitable if there's a name that might better capture that "this is the section that these relocations are related to, but they might've already been applied" - or I guess there could be a parameter to the function, or two different functions, etc, to better differentiate the situations.
but if it's just the one DWARF fix... I'm not /super/ fussed about it.
The check was added in the initial implementation of getRelocatedSection https://github.com/llvm/llvm-project/commit/4f60a38f18ee68c4abd4cfefb5a4bba0f0917f16.
So your reasoning is most likely correct, that the original intent was to skip non-relocatable files and return section_end().But we believe that this interface should just return the relocated section and not have this implicit check. As a side effect, this can make llvm-objdump -d -r work the same as objdump -dr which is helpful at times. If you believe that LLVM should rather keep the current interface, we can make the change on the BOLT side.
re:
But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.
Then there should be an llvm-dwarfdump test that demonstrates this fix changes behavior. That can include a linked object file that still has relocations - if such a file can't be produced by LLVM tools (well, a linked file can't be produced by LLVM itself - needs a linker like lld, and LLVM tests can't depend on lld) - then a checked in binary can be used for testing.
Or a libObject unit test might be an option too. Though I personally wouldn't mind seeing how this affects some tool, rather than only a narrow unit test.
The test that broke in llvm-dwarfdump is this one: https://reviews.llvm.org/harbormaster/unit/view/755844/. Looking into how exactly did this change affect XCOFF test.
I'll keep an eye out to see what you come up with there.
Provide XCOFFObjectFile::isRelocatableObject for 64-bit XCOFF format
(header format is the same for both 32 and 64 bit).
@dblaikie: I've fixed the broken llvm-dwarfdump test. If there's anything else that you think would be a good idea to test, just let me know.
CC @jhenderson, @MaskRay
I'll leave that to @jhenderson or @MaskRay - they've got some more familiarity with the object library.
LGTM.
llvm/unittests/Object/ELFObjectFileTest.cpp | ||
---|---|---|
612 | Delete Symbols. It is not used. STT_SECTION is not needed if not used as relocation targets (https://sourceware.org/bugzilla/show_bug.cgi?id=27109) | |
646 | Add a comment that this is for ld --emit-relocs. |
I want to check whether @jhenderson may have further comments.
https://llvm.org/docs/CodeReview.html#lgtm-how-a-patch-is-accepted
If it is likely that others will want to review a recently-posted patch, especially if there might be objections, but no one else has done so yet, it is also polite to provide a qualified approval (e.g., “LGTM, but please wait for a couple of days in case others wish to review”).
LGTM too. I've got a lot else on my plate, so if @dblaikie or @MaskRay are able to land this, that's preferable. Otherwise, I'll try to do so in the next couple of days.
Maybe it's worth adding a comment to the getRelocatedSection declaration to say that it returns the section the input relocations patch, highlighting that that section may or may not have already had their relocations applied.
I think moving the behaviour of "should I apply relocations" to a different place makes sense to me.
Added a comment for getRelocatedSection highlighting the fact that relocations
may or may not have been applied to the section.
I've got a disassembler assertion failure when disassembling a Linux kernel built with thinLTO that bisects back to this commit.
llvm-objdump: ../include/llvm/ADT/ArrayRef.h:196: ArrayRef<T> llvm::ArrayRef<unsigned char>::slice(size_t, size_t) const [T = unsigned char]: Assertion `N+M <= size() && "Invalid specifier"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: llvm-objdump -dr vmlinux #0 0x00000000016f4b7a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /android0/llvm-project/llvm/build/../lib/Support/Unix/Signals.inc:565:11 #1 0x00000000016f4d2b PrintStackTraceSignalHandler(void*) /android0/llvm-project/llvm/build/../lib/Support/Unix/Signals.inc:632:1 #2 0x00000000016f33f3 llvm::sys::RunSignalHandlers() /android0/llvm-project/llvm/build/../lib/Support/Signals.cpp:97:5 #3 0x00000000016f5455 SignalHandler(int) /android0/llvm-project/llvm/build/../lib/Support/Unix/Signals.inc:407:1 #4 0x00007f622df59140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140) #5 0x00007f622da31ce1 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:51:1 #6 0x00007f622da1b537 abort ./stdlib/abort.c:81:7 #7 0x00007f622da1b40f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8 #8 0x00007f622da1b40f _nl_load_domain ./intl/loadmsgcat.c:970:34 #9 0x00007f622da2a662 (/lib/x86_64-linux-gnu/libc.so.6+0x34662) #10 0x0000000000f15340 llvm::ArrayRef<unsigned char>::slice(unsigned long, unsigned long) const /android0/llvm-project/llvm/build/../include/llvm/ADT/ArrayRef.h:0:7 #11 0x0000000000ee9fbd disassembleObject(llvm::Target const*, llvm::object::ObjectFile const*, llvm::MCContext&, llvm::MCDisassembler*, llvm::MCDisassembler*, llvm::MCInstrAnalysis const*, llvm::MCInstPrinter*, llvm::MCSubtargetInfo const*, llvm::MCSubtargetInfo const*, (anonymous namespace)::PrettyPrinter&, llvm::objdump::SourcePrinter&, bool) /android0/llvm-project/llvm/build/../tools/llvm-objdump/llvm-objdump.cpp:1437:51 #12 0x0000000000ee72ee disassembleObject(llvm::object::ObjectFile const*, bool) /android0/llvm-project/llvm/build/../tools/llvm-objdump/llvm-objdump.cpp:1711:1 #13 0x0000000000ee56f0 dumpObject(llvm::object::ObjectFile*, llvm::object::Archive const*, llvm::object::Archive::Child const*) /android0/llvm-project/llvm/build/../tools/llvm-objdump/llvm-objdump.cpp:2352:7 #14 0x0000000000ee2dd2 dumpInput(llvm::StringRef) /android0/llvm-project/llvm/build/../tools/llvm-objdump/llvm-objdump.cpp:2432:5 #15 0x0000000000f40a99 void (*std::for_each<__gnu_cxx::__normal_iterator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, void (*)(llvm::StringRef)>(__gnu_cxx::__normal_iterator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, __gnu_cxx::__normal_iterator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, void (*)(llvm::StringRef)))(llvm::StringRef) /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_algo.h:3838:33 #16 0x0000000000ef4e6b void (*llvm::for_each<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, void (*)(llvm::StringRef)>(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, void (*)(llvm::StringRef)))(llvm::StringRef) /android0/llvm-project/llvm/build/../include/llvm/ADT/STLExtras.h:1541:3 #17 0x0000000000ee1268 main /android0/llvm-project/llvm/build/../tools/llvm-objdump/llvm-objdump.cpp:2708:3 #18 0x00007f622da1cd0a __libc_start_main ./csu/../csu/libc-start.c:308:16 #19 0x0000000000edcb9a _start (/android0/llvm-project/llvm/build/bin/llvm-objdump+0xedcb9a) [1] 2784195 abort llvm-objdump -dr vmlinux
Indeed, if I revert 8ec73e96b72d04787ed606cfbb62a7a2a05b3711, I can disassemble my kernel again. This is blocking debugging of https://lore.kernel.org/lkml/5913cdf4-9c8e-38f8-8914-d3b8a3565d73@kernel.org/. This is a 49M binary; my employer makes sharing files excessively difficult...
https://github.com/ClangBuiltLinux/linux/issues/1446
llvm-objdump -dr seems to be broken when invoked on executables with relocations. Prior to this diff, llvm-objdump would simply not print relocations in executables while disassembling instructions, but now it's hitting the assertion since it started seeing the relocations. The problem is that relocation offsets are not adjusted wrt the section address.
Nick, you can drop -r flag to unblock you. The effect should be identical to reverting this diff. But if you want to see relocations interleaved with disassembly, we need to fix llvm-objdump. I can take a look at that.
These extra checks are unnecessary on LLVM side. Let's get rid of them and update the diff description.