This is an archive of the discontinued LLVM Phabricator instance.

[llvm-readobj] - Introduce `forEachRelocationDo` helper. NFCI.
ClosedPublic

Authored by grimar on Nov 16 2020, 4:25 AM.

Details

Summary

Our printStackSize implementation currently uses
API like RelocationRef, object::symbol_iterator.
It is not ideal as it doesn't allow
to handle possible error conditions properly.

Some time ago I started rewriting it and this NFC patch is
a one more step toward to it. Here I am introducing the
forEachRelocationDo helper. With it it is possible to iterate
over all kinds of relocations, what is helpful for improving
the code it printStackSize and around.

I've made some progress in improving this area and going to post
other related follow-up patches really soon.

Diff Detail

Event Timeline

grimar created this revision.Nov 16 2020, 4:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2020, 4:25 AM
Herald added a subscriber: rupprecht. · View Herald Transcript
grimar requested review of this revision.Nov 16 2020, 4:25 AM

I don't get why RelRelaReloc is needed. Can you describe a bit about it?

llvm/tools/llvm-readobj/ELFDumper.cpp
803

It is slightly unfortunate that this cannot use function_ref because the variable used for initialization is local.

I don't get why RelRelaReloc is needed. Can you describe a bit about it?

Previously the Relocation was the struct which allowed to handle Rel/Rela relocations.
I've reworked it so that it how also is able to keep Relr relocations.
That way it allows to represent 3 types of relocations instead of previous 2 and have a common code for iterations over them.

RelRelaReloc is now a helper class that is used to represent Rel/Rela relocations.

I don't get why RelRelaReloc is needed. Can you describe a bit about it?

Previously the Relocation was the struct which allowed to handle Rel/Rela relocations.
I've reworked it so that it how also is able to keep Relr relocations.
That way it allows to represent 3 types of relocations instead of previous 2 and have a common code for iterations over them.

RelRelaReloc is now a helper class that is used to represent Rel/Rela relocations.

OK. Does this change make Relr easier to represent? Note that while REL/RELA type relocations are self-contained (the sh_offset is immediately known), RELR is not - it needs a context ((1) is it an address or a bitmap (2) for a bitmap, what is the starting address). Should it just be treated as a special case?

grimar planned changes to this revision.EditedNov 17 2020, 12:46 AM

I don't get why RelRelaReloc is needed. Can you describe a bit about it?

Previously the Relocation was the struct which allowed to handle Rel/Rela relocations.
I've reworked it so that it how also is able to keep Relr relocations.
That way it allows to represent 3 types of relocations instead of previous 2 and have a common code for iterations over them.

RelRelaReloc is now a helper class that is used to represent Rel/Rela relocations.

OK. Does this change make Relr easier to represent? Note that while REL/RELA type relocations are self-contained (the sh_offset is immediately known), RELR is not - it needs a context ((1) is it an address or a bitmap (2) for a bitmap, what is the starting address). Should it just be treated as a special case?

Well. May be. I'll take a look on what can I do for it.

grimar updated this revision to Diff 305718.Nov 17 2020, 3:28 AM
  • Use special handling for Elf_Relr as suggested.

Seems looks indeed simpler.

jhenderson added inline comments.Nov 18 2020, 1:26 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
5679

I don't really mind either way, but what's the motivation for passing in the RawRelr variable, rather than continuing to use opts::RawRelr?

grimar marked an inline comment as done.Nov 18 2020, 1:32 AM
grimar added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
5679

In my understanding, the opts::RawRelr option is related to relocation printing:
"Do not decode relocations in SHT_RELR section, display raw contents"

forEachRelocationDo is a more general API which can be used for other things as well.
Perhaps it is not ideal to use such flags inside.

jhenderson accepted this revision.Nov 18 2020, 1:34 AM

Thanks for the answer, I think it makes sense. LGTM, but please wait for @MaskRay.

This revision is now accepted and ready to land.Nov 18 2020, 1:34 AM
MaskRay added inline comments.Nov 18 2020, 5:42 PM
llvm/tools/llvm-readobj/ELFDumper.cpp
3791

this->printRelocationsHelper(Sec); has been expanded into

this->forEachRelocationDo(
        Sec, opts::RawRelr,
        [&](const Relocation<ELFT> &R, unsigned Ndx, const Elf_Shdr &Sec,
            const Elf_Shdr *SymTab) { printReloc(R, Ndx, Sec, SymTab); },
        [&](const Elf_Relr &R) { printRelrReloc(R); });

multiple times. Isn't the previous version clearer?

Can you elaborate on the claim in the description?

Our printStackSize implementation currently uses API like RelocationRef, object::symbol_iterator. It is not ideal as it doesn't allow to handle possible error conditions properly.

grimar marked an inline comment as done.Nov 19 2020, 12:55 AM
grimar added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
3791

Isn't the previous version clearer?

Do you compare with the original code or with the previous version of the patch?
Assuming that you compare with the original code: I am using this change in D91624. It allows to avoid
alot of duplication of error handling when iterating over relocations.

Can you elaborate on the claim in the description?

Yeah. I think that API is just totally broken when used for invalid objects. See the description of D91624
about how ELFObjectFile<ELFT>::symbol_end() "works". In the test case for D91624 I've improved one of the messages and
now it provides more details about an error.

One more example:

object::symbol_iterator RelocSym = Reloc.getSymbol();
if (RelocSym != ElfObj.symbol_end()) {

The underlying implementation of Reloc.getSymbol() is:

symbol_iterator
ELFObjectFile<ELFT>::getRelocationSymbol(DataRefImpl Rel) const {
  uint32_t symbolIdx;
  const Elf_Shdr *sec = getRelSection(Rel);
  if (sec->sh_type == ELF::SHT_REL)
    symbolIdx = getRel(Rel)->getSymbol(EF.isMips64EL());
  else
    symbolIdx = getRela(Rel)->getSymbol(EF.isMips64EL());
  if (!symbolIdx)
    return symbol_end();

  // FIXME: error check symbolIdx
  DataRefImpl SymbolData;
  SymbolData.d.a = sec->sh_link;
  SymbolData.d.b = symbolIdx;
  return symbol_iterator(SymbolRef(SymbolData, this));
}

It just doesn't check sec->sh_link or symbolIdx

Another example is that in the ELFObjectFile.h file, the implementation of ELFObjectFile<ELFT> methods
has 5 report_fatal_error calls. They not just doesn't propogate them to user, but also hide real errors behind the
error code messages I think. E.g.:

// Error check sh_link here so that getRelocationSymbol can just use it.
auto SymSecOrErr = EF.getSection(RelSec->sh_link);
if (!SymSecOrErr)
  report_fatal_error(errorToErrorCode(SymSecOrErr.takeError()).message());
grimar updated this revision to Diff 306352.Nov 19 2020, 2:24 AM
grimar marked an inline comment as done.
  • Restored printRelocationsHelper to avoid forEachRelocationDo duplication.
llvm/tools/llvm-readobj/ELFDumper.cpp
3791

this->printRelocationsHelper(Sec); has been expanded into

this->forEachRelocationDo(
        Sec, opts::RawRelr,
        [&](const Relocation<ELFT> &R, unsigned Ndx, const Elf_Shdr &Sec,
            const Elf_Shdr *SymTab) { printReloc(R, Ndx, Sec, SymTab); },
        [&](const Elf_Relr &R) { printRelrReloc(R); });

If the concern is about duplicating of this call, then it is eaily solvable by introducing
a new method to isolate it. I've added the printRelocationsHelper method back,
it is now incapsulates this forEachRelocationDo call.

MaskRay accepted this revision.Nov 20 2020, 12:23 AM
This revision was automatically updated to reflect the committed changes.