Page MenuHomePhabricator

[ELF] getRelocatedSection: remove the check for ET_REL object file
ClosedPublic

Authored by Amir on May 11 2021, 5:48 PM.

Details

Summary

getRelocatedSection interface should not check that the object file is
relocatable, as executable files may have relocations preserved with
--emit-relocs linker flag. The relocations are useful in context of post-link
binary analysis for function reference identification. For example, BOLT relies
on relocations to perform function reordering.

Diff Detail

Event Timeline

Amir created this revision.May 11 2021, 5:48 PM
Amir published this revision for review.May 12 2021, 11:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 12 2021, 11:28 PM
maksfb added inline comments.May 14 2021, 2:22 PM
llvm/include/llvm/Object/ELFObjectFile.h
986–988

These extra checks are unnecessary on LLVM side. Let's get rid of them and update the diff description.

Amir updated this revision to Diff 345609.May 14 2021, 11:21 PM

Remove added checks

Amir marked an inline comment as done.May 14 2021, 11:21 PM
Amir retitled this revision from [ELF] getRelocatedSection: allow allocatable section in non-relocatable file to [ELF] getRelocatedSection should work for executables with --emit-relocs.May 14 2021, 11:49 PM
Amir edited the summary of this revision. (Show Details)
Amir added subscribers: maksfb, rafaelauler.

Please add tests for such functional changes.

The subject line is usually an imperative sentence. Mention what the patch changes, instead of stating a fact that people agree with (it is unclear what you do in the patch to fulfill the goal)

Amir retitled this revision from [ELF] getRelocatedSection should work for executables with --emit-relocs to [ELF] getRelocatedSection: remove the check for ET_REL object file.May 19 2021, 12:05 PM
Amir edited the summary of this revision. (Show Details)
Amir added a comment.May 19 2021, 12:26 PM

@MaskRay: agree with the title wording change.

Note that it's a second attempt to make getRelocatedSection work regardless of object file type. The previous attempt was reverted in commit 9219fe79b9842; it's the one that re-introduced the check we want to remove:

commit 9219fe79b9842f3be3524c2e9aeb807ddffdfe3e                                                                                                                 
Author: Rafael Espindola <rafael.espindola@gmail.com>
Date:   Mon Mar 21 20:59:15 2016 +0000

    Revert "[llvm-objdump] Printing relocations in executable and shared object files.  This partially reverts r215844 by removing test objdump-reloc-shared.test which s
tated GNU objdump doesn't print relocations, it does."

    This reverts commit r263971.
    It produces the wrong results for .rela.dyn. I will add a test.

    llvm-svn: 263987

...

diff --git a/llvm/include/llvm/Object/ELFObjectFile.h b/llvm/include/llvm/Object/ELFObjectFile.h
index 5ee90ee9ea71..b01fa1da4b3e 100644
--- a/llvm/include/llvm/Object/ELFObjectFile.h
+++ b/llvm/include/llvm/Object/ELFObjectFile.h
@@ -640,6 +640,9 @@ ELFObjectFile<ELFT>::section_rel_end(DataRefImpl Sec) const {
 template <class ELFT>
 section_iterator
 ELFObjectFile<ELFT>::getRelocatedSection(DataRefImpl Sec) const {
+  if (EF.getHeader()->e_type != ELF::ET_REL)
+    return section_end();
+

Regarding tests: none of the tests in LLVM and subprojects have executable files with relocations, so all tests are passing. But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.
llvm-dwarfdump is relying on getRelocatedSection() to return section_end() for ELF files of types other than relocatable objects. We've changed the function to return relocatable section for other types of ELF files. As a result, llvm-dwarfdump started re-processing relocations for sections that already had relocations applied, e.g. in executable files, and this resulted in wrong values reported.
As a workaround/solution, we can make this function return relocated section for executable (and any non-relocatable objects) files only if the section is allocatable:

  Expected<const Elf_Shdr *> SecOrErr = EF.getSection(EShdr->sh_info);
  if (!SecOrErr)
    return SecOrErr.takeError();
+  if (EF.getHeader()->e_type != ELF::ET_REL &&
+      !((*R)->sh_flags & ELF::SHF_ALLOC))
+    return section_end();
  return section_iterator(SectionRef(toDRI(*SecOrErr), this));
}

Another potential solution would be to introduce a flag (e.g. CheckFileRelocatable) for getRelocatedSection or another function with intended behavior.

There are options for adding the test, too:

  1. As a unit test for getRelocatedSection that would check that this function works for executable files with relocations.
  2. As a test in lld folder that would run llvm-mc; ld.lld with --emit-relocs and then a client of getRelocatedSection (llvm-objdump -dr or something similar). Cons: test is unrelated to lld.
  3. As a prepared ELF binary test for LLVM clients of getRelocatedSection.

What do you think would be an appropriate solution?

There are options for adding the test, too:

  1. As a unit test for getRelocatedSection that would check that this function works for executable files with relocations.
  2. As a test in lld folder that would run llvm-mc; ld.lld with --emit-relocs and then a client of getRelocatedSection (llvm-objdump -dr or something similar). Cons: test is unrelated to lld.
  3. As a prepared ELF binary test for LLVM clients of getRelocatedSection.

What do you think would be an appropriate solution?

You can use yaml2obj to create an ET_EXEC (or whatever) object with static relocations. There are many good examples of using yaml2obj to create an object in places like the llvm-readobj or even the yaml2obj tests themselves - just find one that has relocations and change the type. The fix isn't related to LLD (directly), so I don't think we should test it in the LLD tests (although it might be a good candidate for my planned cross-project-tests - see D95339 and related patches). However, my personal opinion is that as this behaviour change is for the Object library, it should be a new Object unit test, i.e. option 1.

What's the motivation for this change anyway? An executable (whether built with --emit-relocs or not) will already have had its static relocations applied. Applying dynamic relocations seems likely to be impossible, since by their very nature they can't be applied until the program has been loaded in memory/dynamic libraries loaded etc.

Amir added a comment.May 24 2021, 4:27 PM

@jhenderson: thanks for the suggestion. I'm going to add the Object unit test for this change.
The motivation is the use of relocations by BOLT for function reordering optimization (specifically, to find references to functions). We recommend that users keep relocations in the final binary: https://github.com/facebookincubator/BOLT#usage

@jhenderson: thanks for the suggestion. I'm going to add the Object unit test for this change.
The motivation is the use of relocations by BOLT for function reordering optimization (specifically, to find references to functions). We recommend that users keep relocations in the final binary: https://github.com/facebookincubator/BOLT#usage

I think I follow now, thanks. Dynamic relocations should probably still be omitted by this lookup, i.e. getRelocatedSection only deals with static relocations. Not all dynamic relocation sections have an associated section, so this function will fail on them.

llvm-dwarfdump is relying on getRelocatedSection() to return section_end() for ELF files of types other than relocatable objects. We've changed the function to return relocatable section for other types of ELF files. As a result, llvm-dwarfdump started re-processing relocations for sections that already had relocations applied, e.g. in executable files, and this resulted in wrong values reported.

It seems to me that for Elf_Rel relocations, you can't really reapply them, because the addend has been overwritten by the final patched value. So, as I understand it, the only way you can reliably know how to reapply relocations is with Elf_Rela relocations. I think this means you need to change llvm-dwarfdump too, to only reapply relocations for ET_REL objects? (It's harmless to do so for other ET_* kinds, iff the relocation types are Elf_Rela of course, but it's pointless work).

@jhenderson: thanks for the suggestion. I'm going to add the Object unit test for this change.
The motivation is the use of relocations by BOLT for function reordering optimization (specifically, to find references to functions). We recommend that users keep relocations in the final binary: https://github.com/facebookincubator/BOLT#usage

I think I follow now, thanks. Dynamic relocations should probably still be omitted by this lookup, i.e. getRelocatedSection only deals with static relocations. Not all dynamic relocation sections have an associated section, so this function will fail on them.

llvm-dwarfdump is relying on getRelocatedSection() to return section_end() for ELF files of types other than relocatable objects. We've changed the function to return relocatable section for other types of ELF files. As a result, llvm-dwarfdump started re-processing relocations for sections that already had relocations applied, e.g. in executable files, and this resulted in wrong values reported.

It seems to me that for Elf_Rel relocations, you can't really reapply them, because the addend has been overwritten by the final patched value. So, as I understand it, the only way you can reliably know how to reapply relocations is with Elf_Rela relocations. I think this means you need to change llvm-dwarfdump too, to only reapply relocations for ET_REL objects? (It's harmless to do so for other ET_* kinds, iff the relocation types are Elf_Rela of course, but it's pointless work).

Presumably we don't want to reapply them, right? (unless BOLT is modifying the relocations and expecting them to be reapplied) I'm assuming there's some property we should be able to observe that indicates the relocations have already been applied so we should ignore them?

@jhenderson: thanks for the suggestion. I'm going to add the Object unit test for this change.
The motivation is the use of relocations by BOLT for function reordering optimization (specifically, to find references to functions). We recommend that users keep relocations in the final binary: https://github.com/facebookincubator/BOLT#usage

I think I follow now, thanks. Dynamic relocations should probably still be omitted by this lookup, i.e. getRelocatedSection only deals with static relocations. Not all dynamic relocation sections have an associated section, so this function will fail on them.

llvm-dwarfdump is relying on getRelocatedSection() to return section_end() for ELF files of types other than relocatable objects. We've changed the function to return relocatable section for other types of ELF files. As a result, llvm-dwarfdump started re-processing relocations for sections that already had relocations applied, e.g. in executable files, and this resulted in wrong values reported.

It seems to me that for Elf_Rel relocations, you can't really reapply them, because the addend has been overwritten by the final patched value. So, as I understand it, the only way you can reliably know how to reapply relocations is with Elf_Rela relocations. I think this means you need to change llvm-dwarfdump too, to only reapply relocations for ET_REL objects? (It's harmless to do so for other ET_* kinds, iff the relocation types are Elf_Rela of course, but it's pointless work).

Presumably we don't want to reapply them, right? (unless BOLT is modifying the relocations and expecting them to be reapplied) I'm assuming there's some property we should be able to observe that indicates the relocations have already been applied so we should ignore them?

I don't know BOLT at all, but if functions within an executable were to be reordered, the relocations referencing that function would need reapplying so that calls to it are targeting the right address. The resultant values would be different, because the target address would be different.

Amir added a comment.May 26 2021, 9:24 AM

@jhenderson, @dblaikie: BOLT recreates relocations implicitly as it uses MCStreamer for code emission. We don't reapply existing relocations from the binary. I should have pointed out that BOLT performs not just function reordering (where reapplying relocations might make sense), but rather full binary rewrite: splits text into hot and cold sections, splits functions, reorders basic blocks, reorders functions, applies PLT optimizations, and so on. Hope that makes it clear.

@jhenderson, @dblaikie: BOLT recreates relocations implicitly as it uses MCStreamer for code emission. We don't reapply existing relocations from the binary. I should have pointed out that BOLT performs not just function reordering (where reapplying relocations might make sense), but rather full binary rewrite: splits text into hot and cold sections, splits functions, reorders basic blocks, reorders functions, applies PLT optimizations, and so on. Hope that makes it clear.

"recreates relocations" - I'm not sure what that means, could you help me understand?

The important thing, I think, is (can you confirm this, please) - static relocations in a linked binary do not need to be reapplied by the DWARF parser when reading the binary, right? They should be ignored. Presumably there's /some/ property of the binary that makes it clear that these are static relocations, already applied by the linker (or any post-link rewriting tool like BOLT) and should be ignored. The DWARF parser is making the mistake that their presence means they should be applied, when it should be checking some other property/feature to determine if they should be applied?

@jhenderson, @dblaikie: BOLT recreates relocations implicitly as it uses MCStreamer for code emission. We don't reapply existing relocations from the binary. I should have pointed out that BOLT performs not just function reordering (where reapplying relocations might make sense), but rather full binary rewrite: splits text into hot and cold sections, splits functions, reorders basic blocks, reorders functions, applies PLT optimizations, and so on. Hope that makes it clear.

"recreates relocations" - I'm not sure what that means, could you help me understand?

The important thing, I think, is (can you confirm this, please) - static relocations in a linked binary do not need to be reapplied by the DWARF parser when reading the binary, right? They should be ignored. Presumably there's /some/ property of the binary that makes it clear that these are static relocations, already applied by the linker (or any post-link rewriting tool like BOLT) and should be ignored. The DWARF parser is making the mistake that their presence means they should be applied, when it should be checking some other property/feature to determine if they should be applied?

I think implicit assumption that unless binary is relocatable the assumption is relocations have been applied by the linker.

Amir updated this revision to Diff 348138.May 26 2021, 6:19 PM

Added a test case, addressed DWARF parser issue

Amir updated this revision to Diff 348140.May 26 2021, 6:30 PM

Removed unneeded includes

jhenderson added inline comments.May 27 2021, 1:02 AM
llvm/unittests/Object/ELFObjectFileTest.cpp
594–595

Whilst you're adding this test, it probably makes sense for one for ET_REL files too, since you're explicitly calling out the executable version.

601

Tip: you don't need all this whitespace. Just get rid of the excess, so that it nicely lines up, but without a massive gap.

613–619

This looks to be unnecessary. The SectionHeaderTable block is only needed if you want to have a different order for sections in the file versus in the section header table.

643

The ASSERT_TRUE and ASSERT_EQ here and below can probably all be EXPECT_TRUE and EXPECT_EQ - it's harmless for the test to continue to run in these cases.

I don't think you actually need the RelSec->isText() check anyway. Really, all that's important is that the section name is correct, since there's only one ".text" section.

650

Do you need this? FoundRela seems like it'd be enough - one of the other checks will fail otherwise.

Sorry to jump in here, hope I'm not derailing a perfectly functional review.

Could you provide some more context in the patch description? Might be worth knowing what that ET_REL check was originally implemented for? (could you check the patch history of why that was added)

So this getRelocatedSection is meant to retrieve the section that the specified REL or RELA section applies to? But if the ELF file is already relocated (ie: e_type != ET_REL) then the REL/RELA section sort of doesn't apply to the section - no relocations need to be applied to that section, right? But maybe it's the wrong abstraction/interface design? (or callers should be using the interface in some different way) - this REL or RELA section is for whatever section its for, but that section has already had the relocations applied?

I worry that with only this change the callers might be getting the wrong idea - the idea that the relocations do need to be applied, when in fact they don't, because the file isn't relocatable? (previously/witohut this patch they were relying on this function not returning the relocated section to indicate that no relocations needed to be applied)

re:

But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.

Then there should be an llvm-dwarfdump test that demonstrates this fix changes behavior. That can include a linked object file that still has relocations - if such a file can't be produced by LLVM tools (well, a linked file can't be produced by LLVM itself - needs a linker like lld, and LLVM tests can't depend on lld) - then a checked in binary can be used for testing.

Or a libObject unit test might be an option too. Though I personally wouldn't mind seeing how this affects some tool, rather than only a narrow unit test.

Amir updated this revision to Diff 349365.Jun 2 2021, 1:47 PM

Updated the test case, following suggestions by @jhenderson.

Amir added a comment.Jun 2 2021, 2:53 PM

Sorry to jump in here, hope I'm not derailing a perfectly functional review.

Could you provide some more context in the patch description? Might be worth knowing what that ET_REL check was originally implemented for? (could you check the patch history of why that was added)

So this getRelocatedSection is meant to retrieve the section that the specified REL or RELA section applies to? But if the ELF file is already relocated (ie: e_type != ET_REL) then the REL/RELA section sort of doesn't apply to the section - no relocations need to be applied to that section, right? But maybe it's the wrong abstraction/interface design? (or callers should be using the interface in some different way) - this REL or RELA section is for whatever section its for, but that section has already had the relocations applied?

I worry that with only this change the callers might be getting the wrong idea - the idea that the relocations do need to be applied, when in fact they don't, because the file isn't relocatable? (previously/witohut this patch they were relying on this function not returning the relocated section to indicate that no relocations needed to be applied)

The check was added in the initial implementation of getRelocatedSection https://github.com/llvm/llvm-project/commit/4f60a38f18ee68c4abd4cfefb5a4bba0f0917f16.
So your reasoning is most likely correct, that the original intent was to skip non-relocatable files and return section_end().

But we believe that this interface should just return the relocated section and not have this implicit check. As a side effect, this can make llvm-objdump -d -r work the same as objdump -dr which is helpful at times. If you believe that LLVM should rather keep the current interface, we can make the change on the BOLT side.

re:

But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.

Then there should be an llvm-dwarfdump test that demonstrates this fix changes behavior. That can include a linked object file that still has relocations - if such a file can't be produced by LLVM tools (well, a linked file can't be produced by LLVM itself - needs a linker like lld, and LLVM tests can't depend on lld) - then a checked in binary can be used for testing.

Or a libObject unit test might be an option too. Though I personally wouldn't mind seeing how this affects some tool, rather than only a narrow unit test.

The test that broke in llvm-dwarfdump is this one: https://reviews.llvm.org/harbormaster/unit/view/755844/. Looking into how exactly did this change affect XCOFF test.

Amir edited the summary of this revision. (Show Details)Jun 2 2021, 3:00 PM
Amir marked 5 inline comments as done.Jun 2 2021, 3:28 PM

Sorry to jump in here, hope I'm not derailing a perfectly functional review.

Could you provide some more context in the patch description? Might be worth knowing what that ET_REL check was originally implemented for? (could you check the patch history of why that was added)

So this getRelocatedSection is meant to retrieve the section that the specified REL or RELA section applies to? But if the ELF file is already relocated (ie: e_type != ET_REL) then the REL/RELA section sort of doesn't apply to the section - no relocations need to be applied to that section, right? But maybe it's the wrong abstraction/interface design? (or callers should be using the interface in some different way) - this REL or RELA section is for whatever section its for, but that section has already had the relocations applied?

I worry that with only this change the callers might be getting the wrong idea - the idea that the relocations do need to be applied, when in fact they don't, because the file isn't relocatable? (previously/witohut this patch they were relying on this function not returning the relocated section to indicate that no relocations needed to be applied)

Ah, I see that the change to test isRelocatableObject in DWARFContext is compensating for this ^ issue. So any other callers would need to make similar adjustments if they are assuming that an "relocated section" is meant to have relocations applied to it.

Might be a bit subtle - maybe there's a chance renaming the function would be suitable if there's a name that might better capture that "this is the section that these relocations are related to, but they might've already been applied" - or I guess there could be a parameter to the function, or two different functions, etc, to better differentiate the situations.

but if it's just the one DWARF fix... I'm not /super/ fussed about it.

The check was added in the initial implementation of getRelocatedSection https://github.com/llvm/llvm-project/commit/4f60a38f18ee68c4abd4cfefb5a4bba0f0917f16.
So your reasoning is most likely correct, that the original intent was to skip non-relocatable files and return section_end().

But we believe that this interface should just return the relocated section and not have this implicit check. As a side effect, this can make llvm-objdump -d -r work the same as objdump -dr which is helpful at times. If you believe that LLVM should rather keep the current interface, we can make the change on the BOLT side.

re:

But it's expected that this change may break downstream tests, e.g. for llvm-dwarfdump.

Then there should be an llvm-dwarfdump test that demonstrates this fix changes behavior. That can include a linked object file that still has relocations - if such a file can't be produced by LLVM tools (well, a linked file can't be produced by LLVM itself - needs a linker like lld, and LLVM tests can't depend on lld) - then a checked in binary can be used for testing.

Or a libObject unit test might be an option too. Though I personally wouldn't mind seeing how this affects some tool, rather than only a narrow unit test.

The test that broke in llvm-dwarfdump is this one: https://reviews.llvm.org/harbormaster/unit/view/755844/. Looking into how exactly did this change affect XCOFF test.

I'll keep an eye out to see what you come up with there.

Amir updated this revision to Diff 349412.Jun 2 2021, 4:43 PM

Provide XCOFFObjectFile::isRelocatableObject for 64-bit XCOFF format
(header format is the same for both 32 and 64 bit).

Amir added a comment.Jun 2 2021, 7:23 PM

@dblaikie: I've fixed the broken llvm-dwarfdump test. If there's anything else that you think would be a good idea to test, just let me know.
CC @jhenderson, @MaskRay

@dblaikie: I've fixed the broken llvm-dwarfdump test. If there's anything else that you think would be a good idea to test, just let me know.

Nah, sounds OK to me.

Amir added a comment.Jun 4 2021, 1:24 PM

@dblaikie: mind accepting the diff then?

@dblaikie: mind accepting the diff then?

I'll leave that to @jhenderson or @MaskRay - they've got some more familiarity with the object library.

MaskRay accepted this revision.Jun 4 2021, 2:57 PM

LGTM.

llvm/unittests/Object/ELFObjectFileTest.cpp
612

Delete Symbols. It is not used. STT_SECTION is not needed if not used as relocation targets (https://sourceware.org/bugzilla/show_bug.cgi?id=27109)

646

Add a comment that this is for ld --emit-relocs.

This revision is now accepted and ready to land.Jun 4 2021, 2:57 PM
Amir updated this revision to Diff 349993.Jun 4 2021, 4:47 PM

Addressed comments by @MaskRay

Amir marked 2 inline comments as done.Jun 4 2021, 4:48 PM
Amir added a comment.Jun 4 2021, 11:04 PM

@MaskRay: can you please commit this one for me? I don't have commit rights yet.

@MaskRay: can you please commit this one for me? I don't have commit rights yet.

I want to check whether @jhenderson may have further comments.

https://llvm.org/docs/CodeReview.html#lgtm-how-a-patch-is-accepted

If it is likely that others will want to review a recently-posted patch, especially if there might be objections, but no one else has done so yet, it is also polite to provide a qualified approval (e.g., “LGTM, but please wait for a couple of days in case others wish to review”).

jhenderson accepted this revision.Jun 7 2021, 2:38 AM

LGTM too. I've got a lot else on my plate, so if @dblaikie or @MaskRay are able to land this, that's preferable. Otherwise, I'll try to do so in the next couple of days.

Maybe it's worth adding a comment to the getRelocatedSection declaration to say that it returns the section the input relocations patch, highlighting that that section may or may not have already had their relocations applied.

I think moving the behaviour of "should I apply relocations" to a different place makes sense to me.

Amir updated this revision to Diff 350376.Jun 7 2021, 11:52 AM

Added a comment for getRelocatedSection highlighting the fact that relocations
may or may not have been applied to the section.

MaskRay updated this revision to Diff 350401.Jun 7 2021, 1:16 PM

adjust comment

This revision was landed with ongoing or failed builds.Jun 7 2021, 1:17 PM
This revision was automatically updated to reflect the committed changes.