This is an archive of the discontinued LLVM Phabricator instance.

[Propeller] Make decoding BBAddrMaps trace through relocations
ClosedPublic

Authored by aidengrossman on Feb 11 2023, 11:25 PM.

Details

Summary

Currently when using the LLVM tools (eg llvm-readobj, llvm-objdump) to
find information about basic block locations using the propeller tooling
in relocatable object files function addresses are not mapped properly
which causes problems. In llvm-readobj this means that incorrect
function names will be pulled. In llvm-objdum this means that most BBs
won't show up in the output if --symbolize-operands is used. This patch
changes the behavior of decodeBBAddrMap to trace through relocations
to get correct function addresses if it is going through a relocatable
object file. This fixes the behavior in both tools and also other
consumers of decodeBBAddrMap. Some helper functions have been added
in/refactoring done to aid in grabbing BB address map sections now that
in some cases both relocation and BB address map sections need to be
obtained at the same time.

Regression tests moved around/added.

Diff Detail

Event Timeline

aidengrossman created this revision.Feb 11 2023, 11:25 PM
Herald added a reviewer: MaskRay. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
aidengrossman requested review of this revision.Feb 11 2023, 11:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 11 2023, 11:25 PM

This patch is designed to allow tracing through relocations with the basic block address infrastructure currently in llvm-readobj and llvm-objdump. Currently when using relocatable object files with these tools with basic block sections, the tools almost don't work at all (llvm-objdump with --symbolize-operands will work quite poorly not showing most of the basic block labels), to not having the intended functionality (ie all function addresses in llvm-readobj --bb-addr-map are zero and the name mapping is not right, always mapping to the first function name). This patch fixes that functionality. I'm not completely sure I've taken the right approach here as I'm not too familiar with this section of the code base, so please let me know if there is some common infrastructure or some different approach I should be taking.

Thanks for doing this. It has been on our to-do list. @shenhan can share more insights.

Some structural comments.

llvm/lib/Object/ELF.cpp
642–660

Let's see if we can remove this function too, by incorporating the check into the IsMatch predicate.

688

Let's move ELFDumper::getSectionAndRelocations to ELFFile and use it here instead of writing a custom version.

702–707

The IsRelocatable path should also read the Address (even though it's normally zero). Something like:

uint64_t SectionOffset = Cur.tell();
Address = static_cast<uintX_t>(Data.getAddress(Cur));
if (IsRelocatable) {
  assert(Address == 0); // maybe
  Address += OffsetToRela.at(SectionOffset).r_addend;
}
722

Just define a lambda or a static function for this, and read in Elf_Rela_Range.

724

Let's move this into the caller.

738–741

Change this to assertion (since it's checking something about the code logic).

Thanks for working on this. You mentioned

This patch changes the behavior of decodeBBAddrMap to trace through relocations to get correct function addresses if it is going through a relocatable object file."

How is this achieved? Each BBAddrMap section has a corresponding .rela section, but that .rela section does not contain real address information (the addresses are only known at linker time). (Most of the time) It has entries of <offset_into_bbaddrmap_section, ".text", addend>. Are you computing function addresses as ".text address as exhibited in the object file + addend"?

How is this achieved? Each BBAddrMap section has a corresponding .rela section, but that .rela section does not contain real address information (the addresses are only known at linker time). (Most of the time) It has entries of <offset_into_bbaddrmap_section, ".text", addend>. Are you computing function addresses as ".text address as exhibited in the object file + addend"?

Sorry. I should've been a bit more clear here. You are correct in that I'm computing the function "addresses" as offsets within the .text section. I've modified decodeBBAddrMap to return the offset of the function from the beginning of the text section so that everything can be resolved properly when addresses aren't available.

Sorry for the delay in addressing reviewer feedback. I should have @rahmanl 's comments addressed tonight or tomorrow so that we can move this patch forward.

I've taken a low-level view of the code, as I'm not massively familiar with the higher level stuff. In addition to my various inline comments, please add testing for all the new error/warning paths.

llvm/lib/Object/ELF.cpp
673

Don't use auto here. It's not obvious what the return type is.

680

Are you sure you need the cast here?

694–696

Here and in many other places, don't use braces for single line if statements like this. See https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements.

700
707
728

This probably shouldn't use auto, as the type isn't obvious from the context.

736

Should RelaSec be const & not just const?

744

Again, don't use auto here.

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
6

Why is this unsupported on Windows?

26

This Content is a bit of an opaque blob. Would it be possible and clearer to write this test using assembly instead?

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
5

Again, why is this unsupported on Windows?

81

Again, would it be practical and cleare to write this test using assembly?

llvm/tools/llvm-readobj/ELFDumper.cpp
7180

This warning should provide more context about why it cannot get the section. That context should be available in the error returned by the function you called. Indeed, I wouldn't be surprised if you had a failure in some build configurations because the Error within the Expected has been thrown away without any usage.

Also applies below.

aidengrossman marked 17 inline comments as done.

Addressed reviewer feedback, split patch into two to separate the refactoring
of getSectionAndRelocations from the rest of the patch to hopefully keep
things a little bit easier to review.

Thank you everyone for all the feedback. It has been quite helpful. I believe I have addressed most of the feedback given (and probably opened up room for quite a bit more). I still need to add some tests for some of the new failure cases, but they should be relatively sparse after refactoring to use getSectionAndRelocations() instead of rewriting it. I've also split this patch into two (https://reviews.llvm.org/D144783). I tried to get Phabricator to pick it up as a patch stack, but for whatever reason I couldn't get it working.

llvm/lib/Object/ELF.cpp
722

I'm not sure what you mean here by defining a lambda or static function for this. I've moved it into the caller as is.

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
26

The content is definitely quite opaque. From what I can gather from the yaml2obj tests, it is possible to write these tests in assembly (without piping anything around), but all the content has to be in hex with assembly commentary beside it. The current precedent for tests regarding this feature is to use similar opaque blobs (although maybe this should be changed at some point). However, I'm not sure how much value adding assembly would bring. The assembly is essentially the same as the disassembled input that I'm checking later on plus a couple extra irrelevant annotations. For debugging the test case, I think having the expected output would be the most helpful and adding assembly for the input would not add a lot of value. I might be missing something obvious here though, so please let me know if I'm incorrect in one of my points or there's something I haven't thought of.

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
81

This test can have the content of the .text section removed, so I've removed it completely for clarity.

Haven't looked at the tests yet. Thanks for the revision.

llvm/lib/Object/ELF.cpp
646

Add a comment for this variable. The name is too general and does not convey what it really is.
Also maybe reword to FunctionAddressTranslaton since this is just about function addresses in SHT_LLVM_BB_ADDR_MAP.

650

Append a context to this error (e.g, "Unable to read the relocation section for section: ....").

651
704

Should we report Error here if the translation entry doesn't exist?

llvm/lib/Object/ELFObjectFile.cpp
690–691

Let's use reportError here. SHT_LLVM_BB_ADDR_MAP should always link to a text section. We may have to change the `llvm-objdump tests a bit then.

rahmanl added inline comments.Mar 2 2023, 9:46 AM
llvm/lib/Object/ELFObjectFile.cpp
690–691

Disregard my previous comment. But is there a reason you are changing the error message?

aidengrossman marked 6 inline comments as done.

Address reviewer feedback.

llvm/lib/Object/ELFObjectFile.cpp
690–691

I think I was messing around with the refactoring at some point and changed it. I've changed it back since there's no reason to change it to this.

jhenderson added inline comments.Mar 3 2023, 12:37 AM
llvm/lib/Object/ELF.cpp
648

Nit.

Also, since this comment is tied to the following variable, there should not be a blank line between them (there might want to be one before the comment, but that's up to you).

705–710

You could turn this into a plain if without an else if you flip the two parts around, i.e. make the if clause FOTIterator == FunctionOffsetTranslations.end().

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
6
6

Given we have an open ticket for this, I wonder if this should be an XFAIL, not UNSUPPORTED. The idea being that when the issue is fixed, the test will start x-passing, and not be forgotten about.

jhenderson added inline comments.Mar 3 2023, 12:37 AM
llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
6

Also, see above re. XFAIL vs UNSUPPORTED

llvm/tools/llvm-readobj/ELFDumper.cpp
7182

This (currently) returns an Error, so that return value needs capturing and checking.

aidengrossman marked 6 inline comments as done.

Address reviewer feedback and rebase.

aidengrossman added inline comments.Mar 3 2023, 4:41 PM
llvm/tools/llvm-readobj/ELFDumper.cpp
7182

This was originally using the implementation of getSectionAndRelocations from within ELFDumper.cpp that wrapped around the actual implementation, but I've removed that abstraction due to how we're now handling errors and also to improve clarity.

Switch back to UNSUPPORTED on the tests since the original failure was due to
some new behavior in VS2022 and the minimum version for building LLVM is
currently VS2019 (where the test should pass).

aidengrossman added inline comments.Mar 3 2023, 4:54 PM
llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
6

I think we should stick with UNSUPPORTED here because the original failure was due to some change specific to VS2022, and the minimum version to build LLVM is currently VS2019. This means that there might be some buildbots/developers/users building LLVM on VS2019 where this test would unexpectedly pass.

rahmanl added inline comments.Mar 4 2023, 11:30 PM
llvm/lib/Object/ELFObjectFile.cpp
727

std::nullopt.

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
6

In fact, current build bots still use VS2019 which makes the test succeed on Windows too. The issue with VS2022 was revealed later upon downstream builds with VS2022.

44–55

You can also remove these extra blocks to shorten the test.

80–89

This assembly is irrelevant. right? I mean the object file can be synthesized even if it's not from real code. In that case, I suggest we remove the assembly code.

aidengrossman marked 3 inline comments as done.

Use std::nullopt instead of {} when creating an optional. Also went through
elf-bbaddrmap-symbolize-relocatable.yaml to make it only use a single basic
block per function, removed the opaque content binary blob, and also removed
any assembly checks since they're not relevant to what we are trying to test
here.

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
80–89

Right, the assembly is not relevant to the test. I essentially just copied over the test format from the non-relocatable bbaddrmap symbolization test. I've removed the assembly and am now just checking for the basic block labels (which are what was missing in the first place) and this now prevents duplicate functionality between the tests)

jhenderson added inline comments.Mar 5 2023, 11:55 PM
llvm/lib/Object/ELF.cpp
706
llvm/lib/Object/ELFObjectFile.cpp
727

As far as I understand it, you don't need either - std::optional has a default constructor to initialise as empty. In other words, this line should just be std::optional<Elf_Shdr> RelaSec;

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
6

Not addressed still.

6

Ah, yes, sorry for the noise. It would be nice to have the ability to XFAIL for specific compiler versions, but I don't think that functionality exists in lit.

26

I was referring to making the whole input an assembly file and not using YAML at all. However, with the removal of the opaque blob, that no longer is relevant.

llvm/tools/llvm-readobj/ELFDumper.cpp
7181

Not clang-formatted.

7183

I have a slight concern that this is slightly fragile - if the underlying code changes to emit other kinds of errors, this will fail here. It should probably handle the error base class, in some manner, so that we don't end up with a surprise crash in the future.

aidengrossman marked 5 inline comments as done.

Address reviewer feedback.

aidengrossman added inline comments.Mar 6 2023, 12:24 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
7183

Definitely a valid concern. I fixed up the call sites in ELFDumper.cpp but forgot to do it here. Should be fixed in the latest revision.

Switch from using llvm::MapVector to MapVector.

No major comments. Looks good.

llvm/lib/Object/ELFObjectFile.cpp
690–691

Flip this for more readability, i.e,

if (!TextSectionIndex) return true;

Expected<const Elf_Shdr *> TextSecOrErr = EF.getSection(Sec.sh_link);
...
729

This can be nullptr.

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
102–109

Maybe remove these too.

llvm/tools/llvm-readobj/ELFDumper.cpp
7177

No need to capture anything here.

7194

Change this consistently with the other code.

7196

Again, handle the case where this is nullptr.

Code looks broadly fine, but I don't see any testing for the error cases?

llvm/lib/Object/ELF.cpp
707

I wonder if utohexstr would be better here, to give a hex offset?

aidengrossman marked 7 inline comments as done.

Address reviewer feedback, add regression tests for new possible failure modes.

llvm/lib/Object/ELF.cpp
707

Thank you for the suggestion. I would think it would definitely be better here to be consistent with the rest of the output from the LLVM tools.

llvm/lib/Object/ELFObjectFile.cpp
729

Added some error handling here and a test case.

jhenderson added inline comments.Mar 8 2023, 12:52 AM
llvm/lib/Object/ELF.cpp
707

There's a Twine::utohexstr (that I didn't know about), which would be more appropriate here, I think.

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
72
92

Nit: please add a new line at the end of the file.

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
104

There's no particular need for this, is there? Same elsewhere.

118
151

Isn't this the same as the test case two up?

175

?

Is it worth the following test case: an ET_EXEC or ET_DYN file with a .rela.dyn section that doesn't reference a particular section? To show that we don't get this warning in that situation?

212

Nit: no new line at EOF.

aidengrossman marked 7 inline comments as done.

Address reviewer feedback.

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
175

Fixed the typo. Thank you. When creating a shared object/exectuable, the addresses for the functions are already populated in the SHT_LLVM_BB_ADDR_MAP` sections, so we don't need to do any relocations. The new paths introduced by this patch also only run on ET_REL files.

jhenderson accepted this revision.Mar 9 2023, 12:52 AM

LGTM, barring typo (and also possibly extra test case, though I'm uncertain about that one).

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
148

Still got a typo :)

175

My thinking is mostly along the lines of "what do we have that ensures that the if (ET_REL) check is actually there?" In other words, if that check disappeared, would this warning potentially pop up?

This revision is now accepted and ready to land.Mar 9 2023, 12:52 AM
aidengrossman marked 2 inline comments as done.

Address reviewer feedback.

@rahmanl Do you mind reviewing my most recent changes and accepting this (assuming there are no further comments) when you get a chance? Thank you.

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
175

Ah, that logic makes a lot of sense. I've added an ET_DYN test with a .rela.dyn section to test this. Thanks for the suggestion!

rahmanl added inline comments.Mar 9 2023, 11:13 PM
llvm/lib/Object/ELF.cpp
643

Do not pass arguments by const std::optional& as it will make a copy if we pass it a Elf_Shdr. I suggest we pass by const Elf_Shdr * and let nullptr indicate that the value doesn't exist.

645

Callers may call this function on relocatable sections without providing RelaSec. If assertions are not enabled, we will still get zero addresses for all functions. I think it's better to report error here instead.

In general, assertions should be used to check the invariants of the code, not things like input validation.

648

Here and elsewhere, please use SHT_LLVM_BB_ADDR_MAP when referring to the bb-address-map data. This will make code search easier.

657

Remove braces for this loop.

706

Move this above:

uintX_t Address = static_cast<uintX_t>(Data.getAddress(Cur));
if (!Cur) return Cur.takeError();
if (IsRelocatable) {
   assert(Address == 0);
   auto FOTIterator = FunctionOffsetTranslations.find(SectionOffset);
   if (FOTIterator == FunctionOffsetTranslations.end()) 
             return createError("failed to get relocation data for offset: " +
                          Twine::utohexstr(SectionOffset) + " in section " +
                          describe(*this, Sec));
...
llvm/lib/Object/ELFObjectFile.cpp
711–716

Moving this error handling to decodeBBAddr and passing RelaSec by const Elf_Shdr * will greatly simplify this code:

for (auto const &[Sec, RelocSec] : *SectionRelocMapOrErr) {
    Expected<std::vector<BBAddrMap>> BBAddrMapOrErr =
        EF.decodeBBAddrMap(*Sec, RelocSec);
}
llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
11

nit: unnecessary spaces.

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
168

Change to SHT_LLVM_BB_ADDR_MAP.

jhenderson added inline comments.Mar 9 2023, 11:48 PM
llvm/lib/Object/ELF.cpp
645

Hmm, I think assertions are perfectly valid to use for input validation for function arguments. The question I use to decide on assertion versus error is "can the assertion ever be hit with any of our existing code paths" - in other words, if somebody passed in an arbitrary binary blob to one of our tools that uses this code, could it theoretically get here?

The LLVM coding standard's first example on assert usage actually covers this case. See https://llvm.org/docs/CodingStandards.html#assert-liberally. In that case, the index is passed into the function and validated via an assert.

As an aside, @aidengrossman, you should have a message explaining the assertion (see the coding standards document).

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
11

The spaces should be left there. The idea is to provide an indentation to show that the FileCheck line is a continuation of the previous line (without having to spot that the previous line ends with \. It's a common pattern in many newer tests in the tools (especially ones I've been involved with).

llvm/test/tools/llvm-readobj/ELF/bb-addr-map-relocatable.test
167
rahmanl added inline comments.Mar 10 2023, 12:21 AM
llvm/lib/Object/ELF.cpp
645

Thanks for your clarification @jhenderson. Using assertions for input validation does make sense.

Let me rephrase my opinion. The input validation here is a complex one. It is already being done three times in this code (in the callers ELFObjectFile.cpp:711 , ELFDumper.cpp:7196, and here). The callers already create errors in such cases. I think moving the error handling code here simplifies the code. We don't have to add a message explaining that this function can only be called when RelaSec != std::nullopt || getHeader().e_type != ELF:ET_REL when we already have error handling for the same case in the callers.

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-symbolize-relocatable.yaml
11

You're right. sorry for the noise.

rahmanl added inline comments.Mar 10 2023, 12:48 AM
llvm/lib/Object/ELF.cpp
645

Actually, I take my argument back. This function should block calls with assertions. Errors are relevant only in the context of the callers.

651

Here, add
assert(RelaSec && "Trying to read a SHT_LLVM_BB_ADDR_MAP section in a relocatable object file without providing a relocation section.");
And remove the assertion above.

llvm/lib/Object/ELFObjectFile.cpp
711–716
if (IsRelocatable && !RelocSec) return createError(....)
Expected<std::vector<BBAddrMap>> BBAddrMapOrErr =
        EF.decodeBBAddrMap(*Sec, RelocSec);
aidengrossman marked 17 inline comments as done.

Address reviewer feedback.

rahmanl accepted this revision.Mar 12 2023, 5:45 PM

Appreciate the work @aidengrossman.